
Teradata source failure #2406

Open
Aaron-Zhou opened this issue Mar 14, 2025 · 1 comment
@Aaron-Zhou

dlt version

1.8.0

Describe the problem

I am using `sql_database` to connect to Teradata as the source; the target is Snowflake.
The pipeline always fails at the 'normalize' step. The error detail is "Pipeline execution failed at stage normalize when processing package 1741946639.82303 with exception:\n\n<class 'dlt.normalize.exceptions.NormalizeJobFailed'>\nJob for dim_customer.7e5ea11f39.typed-jsonl failed terminally in load 1741946639.82303 with message In schema: sql_database: Cannot coerce NULL in table dim_customer column title which is not nullable."
The column `dim_customer.title` is actually nullable in Teradata, so it seems dlt fails to reflect the correct column nullability from the source.

```python
# source definition
source = sql_database(credentials=ConnectionStringCredentials(conn_str), schema=source_schema, table_names=tables)
# pipeline
pipeline = dlt.pipeline(pipeline_name="rdms2snowflake", destination='snowflake', dataset_name=target_schema)
# run
pipeline.run(source, write_disposition="replace", schema_contract={"tables": "evolve", "columns": "evolve", "data_type": "evolve"})
```

Expected behavior

The source column should be reflected as nullable, and the data should be loaded into Snowflake.

Steps to reproduce

Create a source for the Teradata table, create a pipeline to Snowflake, and run it.

```python
# source definition
source = sql_database(credentials=ConnectionStringCredentials(conn_str), schema=source_schema, table_names=tables)
# pipeline
pipeline = dlt.pipeline(pipeline_name="rdms2snowflake", destination='snowflake', dataset_name=target_schema)
# run
pipeline.run(source, write_disposition="replace", schema_contract={"tables": "evolve", "columns": "evolve", "data_type": "evolve"})
```
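A possible workaround sketch (untested, assuming the table/column names from this report): if dlt's reflection marks `dim_customer.title` as NOT NULL, the nullability hint could be overridden on the resource before running the pipeline via `apply_hints`, dlt's mechanism for adjusting per-column schema hints.

```python
# Column-hints mapping forcing the reflected column to be nullable.
# "title" is the column from this issue; the dict shape follows dlt's
# per-column hint format.
title_hint = {"title": {"nullable": True}}

# Applied to the reflected resource before the run (commented out here,
# since it needs a live Teradata connection):
# source = sql_database(credentials=ConnectionStringCredentials(conn_str),
#                       schema=source_schema, table_names=tables)
# source.dim_customer.apply_hints(columns=title_hint)
# pipeline.run(source, write_disposition="replace")
```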

Operating system

macOS

Runtime environment

Local

Python version

3.12

dlt data source

```python
conn_str = 'teradatasql://user:[email protected]/db'

sql_database(credentials=ConnectionStringCredentials(conn_str), schema=source_schema, table_names=tables)
```

dlt destination

Snowflake

Other deployment details

No response

Additional information

No response

@sh-rp (Collaborator)

sh-rp commented Mar 21, 2025

@Aaron-Zhou FYI: We use SQLAlchemy for schema reflection, so there is a fair chance we are getting an incorrect schema from SQLAlchemy. You could check the SQLAlchemy Teradata dialect's GitHub repo for any issues that mention this, or verify yourself with pure SQLAlchemy that this is indeed the problem and open a ticket there.
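The check suggested above can be sketched with plain SQLAlchemy: reflect the table and inspect the `nullable` flag each column reports. An in-memory SQLite table stands in for `dim_customer` here; against Teradata you would instead build the engine from the real connection string (e.g. `sa.create_engine("teradatasql://user:pass@host/db")`) and reflect the actual table.

```python
import sqlalchemy as sa

# Stand-in for the Teradata source: a table with a NOT NULL id
# and a nullable title column, mirroring the column from the issue.
engine = sa.create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(sa.text(
        "CREATE TABLE dim_customer (id INTEGER NOT NULL, title VARCHAR(50))"
    ))

# Reflect column metadata the same way dlt's sql_database source does.
inspector = sa.inspect(engine)
columns = {c["name"]: c for c in inspector.get_columns("dim_customer")}

# SQLite correctly reports title as nullable. If the Teradata dialect
# returns nullable=False for a column that is nullable in the database,
# the bug is in that dialect, not in dlt.
print(columns["title"]["nullable"])
```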

Labels: none yet
Projects: Status: Todo
3 participants