Releases: edanalytics/edu_edfi_airflow
edu_edfi_airflow v0.4.3
What's Changed
- Make dependency between EM-to-S3 and file-removal mandatory in Earthb… by @jayckaiser in #85
- Feature/optimized bulk copy by @jayckaiser in #84
Full Changelog: v0.4.2...v0.4.3
edu_edfi_airflow v0.4.2
New features
- Add boolean pull_all_deletes argument to EdFiResourceDAG to re-pull all deletes for a resource when any are added (resolves deletes-skipping bug).
- Allow SNOWFLAKE_TENANT_CODE to be overridden in earthmover_kwargs in EarthbeamDAG.
Under the hood
- Simplify taskgroup declaration in EarthbeamDAG.
Fixes
- Fix bug where singleton filepaths in EarthbeamDAG were not converted to lists upon initialization.
- Add dependency between Lightbeam and file-deletion in EarthbeamDAG.
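The singleton-filepath fix can be pictured with a minimal sketch. The helper name ensure_list is hypothetical and is not taken from the package; it only illustrates normalizing a single path into a list at initialization so that later code can always iterate over filepaths.

```python
from typing import List, Optional, Sequence, Union


def ensure_list(value: Optional[Union[str, Sequence[str]]]) -> List[str]:
    """Normalize a filepath argument into a list of filepaths.

    Hypothetical helper for illustration; not the package's own code.
    """
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string would otherwise be iterated character by character.
        return [value]
    return list(value)


single = ensure_list("data/students.csv")
many = ensure_list(["data/a.csv", "data/b.csv"])
```

Without this kind of normalization, a loop over a singleton path would walk its characters instead of treating it as one file.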
Full Changelog: v0.4.1...v0.4.2
edu_edfi_airflow v0.4.1
Under the hood
- Wrap Snowflake stage with single quotes to support filepaths with special characters
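As a rough illustration of the stage-quoting change, the sketch below builds a COPY INTO statement whose stage location is wrapped in single quotes. The function names, stage name, and table name are all hypothetical, and the sketch assumes the path itself contains no single quotes (it does no escaping).

```python
def quoted_stage_path(stage: str, path: str) -> str:
    """Return a single-quoted stage reference, e.g. '@my_stage/dir/file 1.csv'.

    Hypothetical helper; assumes `path` contains no single quotes.
    """
    location = f"@{stage}/{path.lstrip('/')}"
    return f"'{location}'"


def build_copy_statement(table: str, stage: str, path: str) -> str:
    """Build a COPY INTO statement that reads from the quoted stage location."""
    return f"COPY INTO {table} FROM {quoted_stage_path(stage, path)}"


statement = build_copy_statement("raw.students", "my_stage", "2024/file name (1).csv")
```

Quoting the whole location keeps spaces and parentheses in the filepath from being parsed as separate tokens.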
Fixes
- Fix bugs where files written to S3 could be overwritten in EarthbeamDAG
- Fix bug where optional files fail upload to S3
Full Changelog: v0.4.0...v0.4.1
edu_edfi_airflow v0.4.0
New features
- Add EarthbeamDAG.partition_on_tenant_and_year(), a preprocessing function to shard data to parquet on disk. This is useful when a single input file contains multiple years and/or tenants.
- Add EarthbeamDAG.build_dynamic_tenant_year_task_group() to build dynamic Earthbeam task groups for each file to process in a source folder
- Add ID matching sub-taskgroup and arguments to EarthbeamDAG taskgroups, in order to retrieve an assessment's identity columns from Snowflake
- Add optional postprocess Python callable to EarthbeamDAG taskgroups
- Add optional Lightbeam validation to EarthbeamDAG taskgroups
- Add option to log Python preprocess and postprocess outputs to Snowflake
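The idea behind sharding on tenant and year can be sketched in plain Python. This is not the implementation of EarthbeamDAG.partition_on_tenant_and_year(): the function name partition_rows and the column names tenant_code and api_year are assumptions, and the sketch groups rows in memory rather than writing parquet to disk.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Dict[str, str]


def partition_rows(
    rows: Iterable[Row],
    tenant_col: str = "tenant_code",  # assumed column name
    year_col: str = "api_year",       # assumed column name
) -> Dict[Tuple[str, str], List[Row]]:
    """Group rows into one shard per (tenant, year) pair."""
    shards: Dict[Tuple[str, str], List[Row]] = defaultdict(list)
    for row in rows:
        shards[(row[tenant_col], row[year_col])].append(row)
    return dict(shards)


rows = [
    {"tenant_code": "district_a", "api_year": "2023", "student_id": "1"},
    {"tenant_code": "district_a", "api_year": "2024", "student_id": "2"},
    {"tenant_code": "district_b", "api_year": "2024", "student_id": "3"},
    {"tenant_code": "district_a", "api_year": "2023", "student_id": "4"},
]
shards = partition_rows(rows)
```

Each shard can then be processed by its own task group, which is what makes a mixed-tenant or multi-year input file workable.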
Under the hood
- Make accessing the Total-Count of the Ed-Fi /deletes endpoints optional using argument get_deletes_cv_with_deltas (necessary for generic Ed-Fi 5.3 ODSes)
- Refactor EarthbeamDAG to use Airflow TaskFlow syntax and simplify Earthbeam task groups
- Deprecate EarthbeamDAG.build_tenant_year_task_group() argument raw_dir
Full Changelog: v0.3.1...v0.4.0
edu_edfi_airflow v0.3.1
Fixes
- Fix bug where updates to query-parameters persisted across every EdFiResourceDAG
- Add logging of failed endpoints on EdFiResourceDAG task failed_total_counts
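The release note does not state the cause of the query-parameters bug. One common way such state leaks between instances is a shared mutable dictionary, sketched below with hypothetical class and variable names; copying the dictionary per instance is one way to isolate it.

```python
import copy
from typing import Any, Dict, Optional

# Hypothetical module-level defaults shared by every instance.
DEFAULT_QUERY_PARAMS: Dict[str, Any] = {"limit": 500}


class SharedParamsDag:
    """Buggy pattern: every instance holds a reference to the same dict."""

    def __init__(self, query_parameters: Dict[str, Any] = DEFAULT_QUERY_PARAMS):
        self.query_parameters = query_parameters


class IsolatedParamsDag:
    """Fixed pattern: each instance works on its own deep copy."""

    def __init__(self, query_parameters: Optional[Dict[str, Any]] = None):
        source = DEFAULT_QUERY_PARAMS if query_parameters is None else query_parameters
        self.query_parameters = copy.deepcopy(source)
```

With the first class, an update made through one instance is visible from all later instances; with the second, updates stay local to the instance that made them.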
Full Changelog: v0.3.0...v0.3.1
edu_edfi_airflow v0.3.0
New features
- Add /keyChanges ingestion for resource endpoints
- Add new method for EdFiResourceDAG endpoint instantiation using resource_configs and descriptor_configs arguments in init
  - The prior methods EdFiResourceDAG.{add_resource, add_descriptor, add_resource_deletes} are deprecated in favor of this more performant approach.
- Refactor EdFiToS3Operator taskgroup into three options (determined by run_type argument):
  - "default": One EdFiToS3Operator task per resource/deletes/keyChanges endpoint
  - "bulk": One BulkEdFiToS3Operator task in which all endpoints are looped over in one callable
  - "dynamic": One dynamically-mapped EdFiToS3Operator task per resource with deltas to ingest
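The three run_type options differ mainly in how endpoints are distributed across tasks. The sketch below models only that distribution; plan_tasks and the delta_counts mapping are hypothetical and do not reflect how the package builds its Airflow tasks.

```python
from typing import Dict, List


def plan_tasks(run_type: str, delta_counts: Dict[str, int]) -> List[List[str]]:
    """Return one inner list of endpoints per planned task.

    Hypothetical model of the three run types described above.
    """
    endpoints = list(delta_counts)
    if run_type == "default":
        # One task per endpoint.
        return [[endpoint] for endpoint in endpoints]
    if run_type == "bulk":
        # A single task that loops over every endpoint.
        return [endpoints]
    if run_type == "dynamic":
        # One mapped task per endpoint that has deltas to ingest.
        return [[endpoint] for endpoint in endpoints if delta_counts[endpoint] > 0]
    raise ValueError(f"Unknown run_type: {run_type!r}")


counts = {"students": 12, "schools": 0, "staffs": 3}
default_plan = plan_tasks("default", counts)
bulk_plan = plan_tasks("bulk", counts)
dynamic_plan = plan_tasks("dynamic", counts)
```

In this model, "bulk" trades per-endpoint visibility for fewer tasks, while "dynamic" skips endpoints with nothing to pull.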
Under the hood
- Copies from S3 to Snowflake in EdFiResourceDAG are now completed in a single bulk task (instead of one per endpoint)
- EdFiResourceDAG and EarthbeamDAG now inherit from ea_airflow_util DAG factory EACustomDAG
- Streamline XCom passing between tasks in EdFiResourceDAG
- Change-version window delta counts are made when checking change versions in Snowflake.
- Only resources with rows-to-ingest are passed to the Ed-Fi operator.
Full Changelog: v0.2.5...v0.3.0
edu_edfi_airflow v0.2.5
What's Changed
- Add optional argument schedule_interval_full_refresh to specify a CRON syntax for full-refresh Ed-Fi DAG runs by @jayckaiser in #29
- Update Earthbeam DAG logging copy statement to prevent character-escaping issues during copy by @jayckaiser in #31
Full Changelog: v0.2.4...v0.2.5
edu_edfi_airflow v0.2.4
What's Changed
- Add alternative arguments for setting s3_destination_key in S3ToSnowflakeOperator: s3_destination_dir and s3_destination_filename by @rlittle08 in #30
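One plausible way to reconcile the three arguments is sketched below. The function resolve_s3_key is hypothetical, and the precedence shown (an explicit key wins over directory plus filename) is an assumption rather than documented operator behavior.

```python
import posixpath
from typing import Optional


def resolve_s3_key(
    s3_destination_key: Optional[str] = None,
    s3_destination_dir: Optional[str] = None,
    s3_destination_filename: Optional[str] = None,
) -> str:
    """Return the S3 key, built from dir and filename when no key is given."""
    if s3_destination_key is not None:
        return s3_destination_key
    if s3_destination_dir is not None and s3_destination_filename is not None:
        # posixpath keeps forward slashes regardless of the host OS.
        return posixpath.join(s3_destination_dir, s3_destination_filename)
    raise ValueError(
        "Provide s3_destination_key, or both s3_destination_dir and s3_destination_filename."
    )


key = resolve_s3_key(s3_destination_dir="edfi/2024/students", s3_destination_filename="data.jsonl")
```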
Full Changelog: v0.2.3...v0.2.4
edu_edfi_airflow v0.2.3
What's Changed
- Move min_change_version fix from init to execute. by @jayckaiser in #28
Full Changelog: v0.2.2...v0.2.3
edu_edfi_airflow v0.2.2
What's Changed
- Refactor task-group ordering to branch EM/LB logging outside of main task group in EarthbeamDAG by @jayckaiser in #21
- Add optional pool argument when initializing an Ed-Fi task group that overrides default DAG pool by @jayckaiser in #24
Full Changelog: v0.2.1...v0.2.2