Commit
[BUGFIX] Fix log duplication when using specific super call (#168)
Addressed an issue of duplicate logs when using a `super()` call in the `execute` method of a `Step` class, under the specific condition where `super()` was not pointing to the direct ancestor of the called `Step`. The problem was caused by the `_is_called_through_super` method of the `StepMetaClass` not being able to inspect beyond the immediate ancestor of the called object.

This fix updates the `_is_called_through_super` method to traverse the entire method resolution order (MRO) and correctly identify whether the `execute` method is called through `super()` in any parent class in its ancestry. Additionally, the `_execute_wrapper` method was updated to ensure logging is only triggered once per `execute` call.

While fixing this issue, I came across a few more problems that needed to be addressed. A summary is below. All relevant tests have been updated as well.

## Snowflake

Switched Snowflake classes to use `params` over `options` to stay in line with the rest of the Koheesio classes.

1. `src/koheesio/integrations/snowflake/__init__.py`:
    - Introduced `SF_DEFAULT_PARAMS` with default Snowflake parameters.
    - Renamed `options` to `params` to accommodate the switch to `ExtraParamsMixin` and updated the class to use `default_factory=partial(dict, **SF_DEFAULT_PARAMS)` (this was to make mypy and pydantic happy).
    - Added a property named `options` for backwards compatibility.

## JDBC switch to `ExtraParamsMixin`

1. `spark/readers/jdbc.py`:
    - Introduced `ExtraParamsMixin` to handle additional parameters natively.
    - Renamed the `options` Field to `params` to accommodate the switch to `ExtraParamsMixin` and added `alias="options"`.
    - Added a property named `options` for backwards compatibility.
    - `dbtable` and `query` validation is now handled upon `__init__` rather than at runtime (this is more in line with how Koheesio's other classes work and how it is intended to be used).
    - By default, either `dbtable` or `query` needs to be provided to use JDBC (as was always intended).
2. `spark/readers/hana.py` (depends on jdbc):
    - Renamed the `options` Field to `params` to accommodate the switch to `ExtraParamsMixin` and added `alias="options"`.
3. `spark/readers/teradata.py` (depends on jdbc):
    - Renamed the `options` Field to `params` to accommodate the switch to `ExtraParamsMixin` and added `alias="options"`.

## Hash Transformation

A new error popped up (only while using Spark Connect) that uncovered some bugs in how missing columns are handled.

1. `src/koheesio/spark/transformations/hash.py`:
    - Updated the `sha2` function call to use named parameters.
    - Added a check for missing columns in the `Sha2Hash` class.
    - Improved the `Sha2Hash` class to handle cases when no columns are provided.

## Easier debugging and dev improvements

To make debugging easier, I changed `pyproject.toml` to make it easier to run Spark Connect in a local dev environment:

- Added extra dependencies for `pyspark[connect]==3.5.4`.
- Added environment variables for Spark Connect in the development environment.

Additionally, switched the pytest output to verbose mode:

- Changed pytest options from `-q --color=yes --order-scope=module` to `-vv --color=yes --order-scope=module` (which makes test log output in CICD more readable).

## Related Issue

#167

## Motivation and Context

This change is required to prevent duplicate logs when using `super()` in nested Step classes. The updated logic ensures that the logging mechanism correctly identifies and handles `super()` calls, providing accurate and non-redundant log entries.

The `_is_called_through_super` method was not just used for logs, but also for `Output` validation. Although I did not witness any direct issues with this, the fix ensures that it is also called only once.

---------

Co-authored-by: Danny Meijer <[email protected]>
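
To illustrate the MRO-based check described in this commit message, here is a minimal, self-contained sketch. It is not Koheesio's actual implementation: `StepMeta`, the `LOG` list, and the classes `A`, `B`, `C` are hypothetical names, and the real `StepMetaClass` may detect `super()` calls differently. The idea shown is that a wrapped `execute` only logs when its owning class is the first class in the instance's MRO that defines `execute`; any other wrapped `execute` must have been reached through `super()`, however many ancestors were skipped.

```python
import functools

LOG = []  # hypothetical stand-in for a real logger


class StepMeta(type):
    """Hypothetical metaclass that wraps every `execute` defined on a class."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if "execute" in namespace:
            cls.execute = mcls._wrap(cls, namespace["execute"])
        return cls

    @staticmethod
    def _is_called_through_super(instance, owner):
        # Walk the full MRO, not just the direct parent: the first class
        # that defines `execute` is the entry point of the call chain.
        for klass in type(instance).__mro__:
            if "execute" in vars(klass):
                return klass is not owner
        return False

    @classmethod
    def _wrap(mcls, owner, func):
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            outermost = not mcls._is_called_through_super(self, owner)
            if outermost:
                LOG.append(f"start {type(self).__name__}")
            result = func(self, *args, **kwargs)
            if outermost:
                LOG.append(f"end {type(self).__name__}")
            return result

        return wrapper


class A(metaclass=StepMeta):
    def execute(self):
        return "A"


class B(A):
    pass  # no `execute` here, so super() in C resolves past B to A


class C(B):
    def execute(self):
        return super().execute() + "C"


result = C().execute()
```

In this sketch, `C().execute()` passes through two wrapped methods (`C.execute` and `A.execute`), but only the outermost one appends to `LOG`.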
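
The `params` / `options` change for the JDBC and Snowflake classes can be sketched in a similar way. The example below uses a plain standard-library dataclass as a stand-in for Koheesio's pydantic-based classes, so it does not reproduce `ExtraParamsMixin` or `alias="options"`. `JdbcReaderSketch` and the contents of `DEFAULT_PARAMS` are hypothetical, and treating "either `dbtable` or `query`" as "at least one of the two" is an assumption. It shows three things from the summary: defaults supplied through `default_factory=partial(dict, ...)`, a read-only `options` property kept for backwards compatibility, and validation at construction time rather than at runtime.

```python
from dataclasses import dataclass, field
from functools import partial
from typing import Any, Dict, Optional

# Hypothetical defaults, for illustration only (not the real SF_DEFAULT_PARAMS)
DEFAULT_PARAMS = {"fetchsize": 2000}


@dataclass
class JdbcReaderSketch:
    """Stand-in for a reader class; not Koheesio's actual JdbcReader."""

    url: str
    dbtable: Optional[str] = None
    query: Optional[str] = None
    # partial(dict, **DEFAULT_PARAMS) builds a fresh dict per instance
    params: Dict[str, Any] = field(
        default_factory=partial(dict, **DEFAULT_PARAMS)
    )

    def __post_init__(self) -> None:
        # Validate on construction instead of waiting until the read happens
        if not self.dbtable and not self.query:
            raise ValueError("Provide either `dbtable` or `query`")

    @property
    def options(self) -> Dict[str, Any]:
        # Backwards-compatible accessor for code that still reads `.options`
        return self.params


reader = JdbcReaderSketch(url="jdbc:example", dbtable="my_table")
other = JdbcReaderSketch(url="jdbc:example", query="SELECT 1")
other.params["extra"] = True  # does not leak into `reader`
```

Because each instance gets its own dict from the factory, mutating `other.params` leaves `reader.params` untouched, and constructing the class without `dbtable` or `query` fails immediately.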
1 parent cd13c81, commit 522fd70. Showing 12 changed files with 234 additions and 80 deletions.