Releases: edanalytics/edu_edfi_airflow
edu_edfi_airflow v0.4.0
New features
- Add EarthbeamDAG.partition_on_tenant_and_year(), a preprocessing function to shard data to parquet on disk. This is useful when a single input file contains multiple years and/or tenants.
- Add EarthbeamDAG.build_dynamic_tenant_year_task_group() to build dynamic Earthbeam task groups for each file to process in a source folder
- Add ID matching sub-taskgroup and arguments to EarthbeamDAG taskgroups, in order to retrieve an assessment's identity columns from Snowflake
- Add optional postprocess Python callable to EarthbeamDAG taskgroups
- Add optional Lightbeam validation to EarthbeamDAG taskgroups
- Add option to log Python preprocess and postprocess outputs to Snowflake
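As a rough illustration of what sharding by tenant and year means, the sketch below groups rows by those two columns and maps each group to its own output directory. It is not the library's implementation: the function name, the column names tenant_code and api_year, and the key=value directory layout are all assumptions made for the example, and the step that writes each group to parquet is left out.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def shard_rows_by_tenant_and_year(rows, output_root,
                                  tenant_col="tenant_code", year_col="api_year"):
    """Group rows by (tenant, year) and map each group to an output directory.

    Hypothetical helper: column names and directory layout are assumptions.
    """
    shards = defaultdict(list)
    for row in rows:
        shards[(row[tenant_col], row[year_col])].append(row)

    # One directory per (tenant, year) pair, using key=value path segments.
    return {
        str(PurePosixPath(output_root) / f"{tenant_col}={tenant}" / f"{year_col}={year}"): group
        for (tenant, year), group in shards.items()
    }


# A single input "file" that mixes two tenants and two years.
rows = [
    {"tenant_code": "district_a", "api_year": 2023, "student_id": "1"},
    {"tenant_code": "district_a", "api_year": 2024, "student_id": "2"},
    {"tenant_code": "district_b", "api_year": 2024, "student_id": "3"},
    {"tenant_code": "district_a", "api_year": 2023, "student_id": "4"},
]
shards = shard_rows_by_tenant_and_year(rows, "/tmp/shards")
```

Each resulting shard can then be processed as if it were a single-tenant, single-year input.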
Under the hood
- Make accessing the Total-Count of the Ed-Fi /deletes endpoints optional using argument get_deletes_cv_with_deltas (necessary for generic Ed-Fi 5.3 ODSes)
- Refactor EarthbeamDAG to use Airflow TaskFlow syntax and simplify Earthbeam task groups
- Deprecate EarthbeamDAG.build_tenant_year_task_group() argument raw_dir
Full Changelog: v0.3.1...v0.4.0
edu_edfi_airflow v0.3.1
Fixes
- Fix bug where updates to query-parameters persisted across every EdFiResourceDAG
- Add logging of failed endpoints on EdFiResourceDAG task failed_total_counts
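The release notes do not say what caused the query-parameters bug. One common way this kind of persistence arises in Python is several objects holding a reference to the same mutable dict; the sketch below shows that general pitfall and the usual fix of copying on assignment. The class names and the parameter values are hypothetical and are not taken from the package.

```python
import copy

# Module-level defaults shared by every instance (hypothetical values).
DEFAULT_QUERY_PARAMETERS = {"limit": 500}


class SharedParamsDag:
    """Hypothetical class that stores a reference to the shared dict."""

    def __init__(self, query_parameters=DEFAULT_QUERY_PARAMETERS):
        self.query_parameters = query_parameters


class IsolatedParamsDag:
    """Hypothetical class that stores its own copy of the dict."""

    def __init__(self, query_parameters=DEFAULT_QUERY_PARAMETERS):
        self.query_parameters = copy.deepcopy(query_parameters)


# With a shared reference, an update on one instance shows up on another.
first, second = SharedParamsDag(), SharedParamsDag()
first.query_parameters["schoolYear"] = 2024
leaked = "schoolYear" in second.query_parameters

# Undo the mutation of the shared defaults before the second demonstration.
DEFAULT_QUERY_PARAMETERS.pop("schoolYear")

# With a per-instance copy, the update stays local.
third, fourth = IsolatedParamsDag(), IsolatedParamsDag()
third.query_parameters["schoolYear"] = 2024
isolated = "schoolYear" not in fourth.query_parameters
```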
Full Changelog: v0.3.0...v0.3.1
edu_edfi_airflow v0.3.0
New features
- Add /keyChanges ingestion for resource endpoints
- Add new method for EdFiResourceDAG endpoint instantiation using resource_configs and descriptor_configs arguments in init
  - The prior methods EdFiResourceDAG.{add_resource, add_descriptor, add_resource_deletes} are deprecated in favor of this more performant approach.
- Refactor EdFiToS3Operator taskgroup into three options (determined by run_type argument):
  - "default": One EdFiToS3Operator task per resource/deletes/keyChanges endpoint
  - "bulk": One BulkEdFiToS3Operator task in which all endpoints are looped over in one callable
  - "dynamic": One dynamically-mapped EdFiToS3Operator task per resource with deltas to ingest
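The difference between the three run_type options can be sketched as a plain function that decides how many pull tasks to create. This is only a model of the behaviour described in the list above, written without Airflow: the function name, the task-name strings, and the endpoint names are hypothetical.

```python
def plan_pull_tasks(run_type, endpoints, endpoints_with_deltas):
    """Return the task names a given run_type would produce (illustrative only)."""
    if run_type == "default":
        # One task per endpoint, whether or not it has new rows.
        return [f"pull_{endpoint}" for endpoint in endpoints]
    if run_type == "bulk":
        # A single task that loops over every endpoint internally.
        return ["pull_all_endpoints"]
    if run_type == "dynamic":
        # One mapped task per endpoint that actually has deltas to ingest.
        return [f"pull_{endpoint}" for endpoint in endpoints
                if endpoint in endpoints_with_deltas]
    raise ValueError(f"Unknown run_type: {run_type!r}")


endpoints = ["students", "schools", "staffs"]
endpoints_with_deltas = {"students"}

default_tasks = plan_pull_tasks("default", endpoints, endpoints_with_deltas)
bulk_tasks = plan_pull_tasks("bulk", endpoints, endpoints_with_deltas)
dynamic_tasks = plan_pull_tasks("dynamic", endpoints, endpoints_with_deltas)
```

In this model, "bulk" trades per-endpoint visibility for fewer scheduled tasks, while "dynamic" keeps one task per endpoint but skips endpoints with nothing to ingest.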
Under the hood
- Copies from S3 to Snowflake in EdFiResourceDAG are now completed in a single bulk task (instead of one per endpoint)
- EdFiResourceDAG and EarthbeamDAG now inherit from ea_airflow_util DAG factory EACustomDAG
- Streamline XCom passing between tasks in EdFiResourceDAG
- Change-version window delta counts are made when checking change versions in Snowflake.
- Only resources with rows-to-ingest are passed to the Ed-Fi operator.
Full Changelog: v0.2.5...v0.3.0
edu_edfi_airflow v0.2.5
What's Changed
- Add optional argument schedule_interval_full_refresh to specify a CRON syntax for full-refresh Ed-Fi DAG runs by @jayckaiser in #29
- Update Earthbeam DAG logging copy statement to prevent character-escaping issues during copy by @jayckaiser in #31
Full Changelog: v0.2.4...v0.2.5
edu_edfi_airflow v0.2.4
What's Changed
- Add alternative arguments for setting s3_destination_key in S3ToSnowflakeOperator: s3_destination_dir and s3_destination_filename by @rlittle08 in #30
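One plausible way the alternative arguments could resolve to a single key is sketched below. The argument names come from the release note, but the helper function, the precedence given to an explicit s3_destination_key, and the error handling are assumptions rather than the operator's actual logic.

```python
import posixpath


def resolve_s3_destination_key(s3_destination_key=None,
                               s3_destination_dir=None,
                               s3_destination_filename=None):
    """Build an S3 key from either a full key or a dir + filename pair.

    Hypothetical helper: the precedence rules here are an assumption.
    """
    if s3_destination_key is not None:
        return s3_destination_key
    if s3_destination_dir is None or s3_destination_filename is None:
        raise ValueError(
            "Provide s3_destination_key, or both s3_destination_dir "
            "and s3_destination_filename."
        )
    # S3 keys always use forward slashes, so join with posixpath.
    return posixpath.join(s3_destination_dir, s3_destination_filename)


key_from_parts = resolve_s3_destination_key(
    s3_destination_dir="edfi/2024/students",
    s3_destination_filename="data.jsonl",
)
key_explicit = resolve_s3_destination_key(s3_destination_key="edfi/raw/file.jsonl")
```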
Full Changelog: v0.2.3...v0.2.4
edu_edfi_airflow v0.2.3
What's Changed
- Move min_change_version fix from init to execute. by @jayckaiser in #28
Full Changelog: v0.2.2...v0.2.3
edu_edfi_airflow v0.2.2
What's Changed
- Refactor task-group ordering to branch EM/LB logging outside of main task group in EarthbeamDAG by @jayckaiser in #21
- Add optional pool argument when initializing an Ed-Fi task group that overrides default DAG pool by @jayckaiser in #24
Full Changelog: v0.2.1...v0.2.2
edu_edfi_airflow v0.2.1
What's Changed
- Remove provide_context from kwarg arguments in EarthbeamDAG.build_b… by @jayckaiser in #22
- feature/add database connection for Earthmover source as env var by @sleblanc23 in #23
New Contributors
- @sleblanc23 made their first contribution in #23
Full Changelog: v0.2.0...v0.2.1
edu_edfi_airflow v0.2.0
What's Changed
- Feature/refactor change version post copy by @jayckaiser in #9
- Feature/refactor change version by @jayckaiser in #10
- Feature/dynamic dbt dag trigger by @jayckaiser in #11
- Refactor/edfi dag cleanup by @jayckaiser in #12
- Feature/airflow 2.6 by @jayckaiser in #14
- Rc/1.0.0 snowflake load by @jayckaiser in #16
- Feature/earthbeam dag updates by @jayckaiser in #15
- Feature/earthbeam dag 2.6 by @jayckaiser in #17
- Feature/earthbeam dag by @jayckaiser in #13
- Rc/1.0.0 backwards compatible testing by @jayckaiser in #18
- Refactor EdFiResourceDAG.chain_task_groups_into_dag() to add a final … by @jayckaiser in #20
- Rc/1.0.0 by @jayckaiser in #19
Full Changelog: v0.1.2...v0.2.0
edu_edfi_airflow v0.1.2
What's Changed
- Update docs site link in README.md with correct URL. by @ejoranlienea in #2
- Force full_refresh to True if use_change_version is False. by @jayckaiser in #3
- Comment-out endpoint-ping. by @jayckaiser in #5
- Fix/reset tmp storage after s3 error by @jayckaiser in #4
- Turn off reverse_paging for deletes ingestion. by @jayckaiser in #7
- Add query_parameters to resource definition. by @jayckaiser in #6
New Contributors
- @ejoranlienea made their first contribution in #2
Full Changelog: v0.1.0...v0.1.2