Releases: edanalytics/earthmover
v0.3.2
What's Changed
- feature: Add DebugOperation for logging data head, tail, columns, or metadata midrun
- feature: Add FlattenOperation for splitting and exploding string columns into values
- feature: Add optional 'fill_missing_columns' field to UnionOperation to fill disjunct columns with nulls, instead of raising an error (default False)
- feature: Add git_auth_timeout config when entering Git credentials during package composition
- feature: Add earthmover clean command that removes local project artifacts
- feature: only output compiled template during earthmover compile
- feature: Render full row into JSON lines when template is undefined in FileDestination
- Many bugfixes and compile improvements
Full Changelog: v0.3.1...v0.3.2
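To make the FlattenOperation and UnionOperation notes above concrete, here is a minimal plain-Python sketch of the behavior they describe. It is an illustration only, not earthmover's implementation: the function names `flatten` and `union`, the `separator` parameter, and the use of lists of dicts as rows are all assumptions made for this example.

```python
def flatten(rows, column, separator=","):
    """Split a delimited string column and emit one row per value.

    Hypothetical helper illustrating "splitting and exploding".
    """
    out = []
    for row in rows:
        for value in row[column].split(separator):
            new_row = dict(row)              # copy the other columns unchanged
            new_row[column] = value.strip()  # one value per output row
            out.append(new_row)
    return out


def union(left, right, fill_missing_columns=False):
    """Stack two lists of row dicts.

    With fill_missing_columns=False (the default), differing column sets
    raise an error. With True, columns missing on either side are filled
    with None (standing in for nulls).
    """
    left_cols = set().union(*(row.keys() for row in left))
    right_cols = set().union(*(row.keys() for row in right))
    if left_cols != right_cols and not fill_missing_columns:
        raise ValueError(f"columns differ: {sorted(left_cols ^ right_cols)}")
    all_cols = sorted(left_cols | right_cols)
    return [{col: row.get(col) for col in all_cols} for row in left + right]


exploded = flatten([{"id": "1", "tags": "a, b"}], "tags")
merged = union([{"id": 1, "x": 2}], [{"id": 3, "y": 4}], fill_missing_columns=True)
```

Keeping the default at False mirrors the release note: mismatched columns remain an error unless the fill behavior is explicitly requested.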
v0.3.1
What's Changed
- allow any ordering of Transformations during graph-building in compile by @jayckaiser
- only create a /packages dir when earthmover deps succeeds by @jayckaiser
- explain earthmover_compiled.yaml in README.md by @sleblanc23
Full Changelog: v0.3.0...v0.3.1
earthmover 0.3.0
What's Changed
- feature: add project composition using packages keyword in template file (see README)
- feature: add installation extras for optional libraries, and improve error logging to notify which is missing
- feature: GroupByWithRankOperation cumulatively sums record counts by group-by columns
- feature: setting log_level: DEBUG in template configs or setting debug: True for a node displays the head of the node mid-run
- feature: add optional_fields key to all Sources to add optional empty columns when missing from schema
- feature: add optional ignore_errors and exact_match boolean flags to DateFormatOperation
- internal: remove attempted directory-hashing when a source is a directory (i.e., Parquet)
- internal: Remove unused group_by_with_count and group_by_with_agg operations
Full Changelog: v0.2.1...v0.3.0
Note: This version has slightly different packaging requirements than v0.2.1. Please make sure to re-install the package if using locally.
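As one possible reading of the GroupByWithRankOperation note above ("cumulatively sums record counts by group-by columns"), the sketch below assigns each row a running count within its group. This is an assumption about the semantics, not the project's code: the function name `group_by_with_rank`, the `rank_column` parameter, and the 1-based numbering are hypothetical.

```python
from collections import defaultdict


def group_by_with_rank(rows, group_by_columns, rank_column="rank"):
    """Attach a running per-group record count to each row.

    Hypothetical illustration: rows are dicts, and the count starts at 1
    for the first row seen in each group.
    """
    counts = defaultdict(int)
    ranked = []
    for row in rows:
        key = tuple(row[col] for col in group_by_columns)
        counts[key] += 1                       # cumulative count for this group
        ranked.append({**row, rank_column: counts[key]})
    return ranked


ranked = group_by_with_rank(
    [{"school": "a"}, {"school": "b"}, {"school": "a"}],
    ["school"],
)
```

The input order is preserved, so the rank reflects the order in which rows of each group are encountered.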
v0.2.1 (and v0.2.2)
What's Changed
- feature/sort_rows_operation by @jayckaiser in #56
- fixing typos in docs and comments by @tomreitz in #68
- adding fromjson() function to Jinja by @tomreitz in #75
- bump version to 0.2.1 by @tomreitz in #76
Full Changelog: v0.2.0...v0.2.1
Note: An error in the release of v0.2.1 to PyPI required releasing a v0.2.2 there which is identical in content to v0.2.1 here on GitHub. GitHub releases and PyPI versions will sync back up in the upcoming 0.3.0 release.
v0.2.0
What's Changed
- Rc/1.0.0 yaml version 2 operators by @jayckaiser in #43
- Rc/1.0.0 yaml version 2 snake case operation by @jayckaiser in #44
- Rc/1.0.0 yaml version 2 v2 config refactor by @jayckaiser in #45
- Rc/1.0.0 yaml version 2 remove operation data by @jayckaiser in #49
- Rc/1.0.0 yaml version 2 dask performance improvements by @jayckaiser in #52
- Rc/1.0.0 yaml version 2 post discussion cleanup partition size by @jayckaiser in #54
- Rc/1.0.0 yaml version 2 post discussion cleanup by @jayckaiser in #53
- Rc/1.0.0 yaml version 2 by @jayckaiser in #46
- Rc/1.0.0 by @jayckaiser in #55
- switch logic order for checking if source is remote by @sleblanc23 in #48
New Contributors
- @sleblanc23 made their first contribution in #48
Full Changelog: v0.1.6...v0.2.0