You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Create a pipeline which uses DuckDB as a source like so:
conn_string=f"duckdb:///md:{db_name}{access_token_suffix}"credentials=create_engine(
conn_string,
connect_args={
'read_only': True,
}
)
pipeline=dlt.pipeline(
pipeline_name=pipeline_name,
destination='postgres',
dataset_name=dataset_name,
progress="log",
)
source1=sql_database(
credentials,
schema=schema_name, # with our without, same effecttable_names=['foobar'], # or schema_name.foobar or db_name.schema_name.foobarreflection_level="full",
)
source2=sql_database(
credentials,
schema=schema_name, # with our without, same effectreflection_level="full",
).with_resources('foobar')
source3=sql_table(
credentials,
schema="info",
table="foobar",
)
load_info1=pipeline.run(source1, write_disposition=write_disposition)
print(load_info1)
load_info2=pipeline.run(source2, write_disposition=write_disposition)
print(load_info2)
load_info3=pipeline.run(source3, write_disposition=write_disposition)
print(load_info3)
Expected errors to see:
dlt.pipeline.exceptions.PipelineStepFailed: Pipeline execution failed at stage extract when processing package 1742387100.2242408 with exception:
<class 'dlt.extract.exceptions.ResourceExtractionError'>
In processing pipe foo: extraction of resource foo in generator table_rows caused an exception: (duckdb.duckdb.CatalogException) Catalog Error: Table with name foo does not exist!
Did you mean "db_name.schema_name.foo"?
LINE 2: FROM foo
^
[SQL: SELECT foo.id, foo.created_at, foo.bar
FROM foo]
(Background on this error at: https://sqlalche.me/e/20/f405)
When fully qualifying the table name in the functions like db_name.schema_name.table_name, the error returned is:
sqlalchemy.exc.InvalidRequestError: Could not reflect: requested table(s) not available in Engine(duckdb:///foobar): db_name.schema_name.table_name
Operating system
macOS
Runtime environment
Local
Python version
3.12
dlt data source
sql_database
dlt destination
Postgres
Other deployment details
No response
Additional information
No response
The text was updated successfully, but these errors were encountered:
thenaturalist
changed the title
sql_database source not working with DuckDB/ MotherDucksql_database() & sql_table() source functions not working with DuckDB/ MotherDuck
Mar 19, 2025
dlt version
1.8.0
Describe the problem
I want to load data from a DuckDB/ MotherDuck source using either the
sql_database
orsql_table
functions.Both fail as dlt fails to resolve DuckDBs catalog inspection to correctly scope table and column qualifications.
I am using https://github.com/Mause/duckdb_engine verions
0.15.0
.Expected behavior
dlt is generating SQL which correctly identifies a specified table based on DuckDB's catalog conventions and the pipeline completes without error.
Steps to reproduce
Create a DuckDB instance with a synthetic test table as described here: https://duckdb.org/docs/stable/guides/snippets/create_synthetic_data.html
Create a pipeline which uses DuckDB as a source like so:
Expected errors to see:
When fully qualifying the table name in the functions like
db_name.schema_name.table_name
, the error returned is:Operating system
macOS
Runtime environment
Local
Python version
3.12
dlt data source
sql_database
dlt destination
Postgres
Other deployment details
No response
Additional information
No response
The text was updated successfully, but these errors were encountered: