Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CannotCoerceColumnException - Source data changed from int to string and string field has None #2385

Open
shobanaelangok opened this issue Mar 6, 2025 · 1 comment

Comments

@shobanaelangok
Copy link

shobanaelangok commented Mar 6, 2025

dlt version

dlt 1.3.0

Describe the problem

Error:
PipelineStepFailed: Pipeline execution failed at stage extract when processing package 1741243693.0254757 with exception:
<class 'dlt.common.schema.exceptions.CannotCoerceColumnException'>
In schema: table_data_daily: Cannot coerce type in table data_daily column title existing type text coerced type bigint value: None

Description:
Getting above error when dlt pipleine running via airflow tries to load data for a field named title
This field title initially had int data type in source and now they changed it to string (it also contains None) and when it is loading to bigquery it errors with above exception

Things i tried:
removing title column
Altered the title column data type to string
adding title column as string with no data manually
adjust import/export datatypes

Expected behavior

this field is not being automatically converted and loaded as _text suffix

Steps to reproduce

(DLT setup as incremental write diposition based on primary key)
1st run, have int values along with None in source db, load in bigquery as int field
2nd run, change the title field data type to string and keep the None as is, try loading to bigquery

data before change in source (int)

1
2
None

data after change in source (string)

yes
no
None

Operating system

macOS, Other

Runtime environment

Airflow

Python version

3.11

dlt data source

Azure mysql DB

dlt destination

Google BigQuery

Other deployment details

No response

Additional information

No response

@shobanaelangok shobanaelangok changed the title CannotCoerceColumnException - Source data changed from into string and string field has None CannotCoerceColumnException - Source data changed from int to string and string field has None Mar 7, 2025
@sh-rp
Copy link
Collaborator

sh-rp commented Mar 10, 2025

@shobanaelangok dlt does not change the types of existing columns in the destination. For your case you have a couple of options:

  • Load the data to a new dataset that has the correct column type hint from the beginnging (you can set it to str with apply_hints and incoming ints will be converted to strings for you)
  • Rename the updated column to something else in a map function, then you will get a new column with the right type, but it will be a separate column
  • Manually cast the new column to the old int type in a map function, but this will only work if you can actually cast all incoming str values to int.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Todo
Development

No branches or pull requests

2 participants