We need to run a GE checkpoint from Airflow.
The checkpoint is based on a SQL query.
The SQL query must get its parameter values from Airflow, e.g. a datamart should be checked for DQ for a particular date and region after that date and region were refreshed by another Airflow task.
Part of checkpoint.yml looks like:
validations:
  - batch_request:
      datasource_name: snowflake
      data_connector_name: default_runtime_data_connector_name
      data_asset_name: db1.table1
      runtime_parameters:
        query: "SELECT *
          from db1.table1
          WHERE fld1 > $DATE_PARAM_FROM_AIRFLOW and fld2 = $REGION_PARAM_FROM_AIRFLOW
          "
How can this be done properly with GreatExpectationsOperator?
It looks like the operator can't pass just the parameters,
while using query_to_validate or checkpoint_config would break unit tests (you would need Airflow to test your checkpoint!)
Workaround: use environment variables.
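To make the workaround concrete, here is a minimal sketch of the substitution idea using only the Python standard library. It is not the GreatExpectationsOperator API: `render_query` and `QUERY_TEMPLATE` are hypothetical names, and the assumption is that an upstream Airflow task has exported `DATE_PARAM_FROM_AIRFLOW` and `REGION_PARAM_FROM_AIRFLOW` into the environment before the checkpoint runs.

```python
import os
from string import Template

# Same query as in checkpoint.yml, with $NAME placeholders.
QUERY_TEMPLATE = """SELECT *
FROM db1.table1
WHERE fld1 > $DATE_PARAM_FROM_AIRFLOW AND fld2 = $REGION_PARAM_FROM_AIRFLOW
"""


def render_query(template, env=None):
    """Fill $NAME placeholders from a mapping (os.environ by default).

    Values are inserted verbatim, so they must already carry any SQL
    quoting they need. A missing variable raises KeyError, which makes
    a forgotten parameter fail loudly instead of producing a bad query.
    """
    mapping = os.environ if env is None else env
    return Template(template).substitute(mapping)


if __name__ == "__main__":
    # Hypothetical values; in Airflow they would come from the DAG run.
    os.environ["DATE_PARAM_FROM_AIRFLOW"] = "'2023-01-01'"
    os.environ["REGION_PARAM_FROM_AIRFLOW"] = "'EU'"
    print(render_query(QUERY_TEMPLATE))
```

Because the mapping can be passed explicitly, the query can be unit-tested with a plain dict and without Airflow, which is the property the question asks for.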
Thanks!