FIX #19232 - Correct array type mapping in ORM converter #19241

Merged 1 commit on Jan 7, 2025

pmbrull (Collaborator) commented Jan 6, 2025

Describe your changes:

Fixes #19232

The sampler was failing for Postgres array types. There were two issues:

  1. We were not properly parsing the item types of arrays.
  2. We were not creating the right ORM column type for those item types.

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

@@ -76,7 +76,9 @@ def map_types(self, col: Column, table_service_type):
         """returns an ORM type"""

         if col.arrayDataType:
-            return self._TYPE_MAP.get(col.dataType)(item_type=col.arrayDataType)
+            return self._TYPE_MAP.get(col.dataType)(
+                item_type=self._TYPE_MAP.get(col.arrayDataType)
Review comment from pmbrull (Collaborator, Author):

We were passing OpenMetadata's DataType value as the array item type instead of a real ORM type.
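
To illustrate the comment above, here is a minimal sketch of the failure and the fix. It is not the project's code: `DataType`, `IntegerType`, `ArrayType`, `TYPE_MAP`, `map_types_before` and `map_types_after` are hypothetical stand-ins for OpenMetadata's enum, the ORM types and the converter's `_TYPE_MAP`. Only standard-library Python is used, so the `dialect_impl` method is a simplified imitation of what a real ORM type exposes.

```python
from enum import Enum


class DataType(Enum):
    """Stand-in for OpenMetadata's DataType enum (hypothetical subset)."""

    ARRAY = "ARRAY"
    INT = "INT"


class IntegerType:
    """Stand-in for an ORM scalar type; ORM types expose dialect_impl."""

    def dialect_impl(self, dialect):
        return self


class ArrayType:
    """Stand-in for an ORM array type that defers to its item type."""

    def __init__(self, item_type):
        # Accept either a type class or an instance; instantiate classes.
        self.item_type = item_type() if isinstance(item_type, type) else item_type

    def dialect_impl(self, dialect):
        # This call fails if item_type is not an ORM type.
        self.item_type.dialect_impl(dialect)
        return self


# Hypothetical equivalent of the converter's _TYPE_MAP.
TYPE_MAP = {DataType.ARRAY: ArrayType, DataType.INT: IntegerType}


def map_types_before(data_type, array_data_type):
    # Old behaviour: the enum member itself is passed as the item type.
    return TYPE_MAP.get(data_type)(item_type=array_data_type)


def map_types_after(data_type, array_data_type):
    # New behaviour: the item type is first resolved through the same map.
    return TYPE_MAP.get(data_type)(item_type=TYPE_MAP.get(array_data_type))
```

In this sketch, `map_types_before(DataType.ARRAY, DataType.INT)` builds the array type without complaint and only raises `AttributeError` later, when `dialect_impl` is called on it. That delay is consistent with the error surfacing in the sampler rather than at conversion time.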

@@ -94,6 +94,12 @@ def _process_col_type(self, column: dict, schema: str) -> Tuple:
             parsed_string["name"] = column["name"]
         else:
             col_type = ColumnTypeParser.get_column_type(column["type"])
+            # For arrays, we'll get the item type if possible, or parse the string representation of the column
+            # if SQLAlchemy does not provide any further information
+            if col_type == "ARRAY" and getattr(column["type"], "item_type"):
Review comment from pmbrull (Collaborator, Author):

We were ignoring the item_type property even when it was populated, and relied only on the string representation of the types.
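
The lookup order described in this comment can be sketched as follows. This is an assumption-laden illustration, not the code from the PR: `resolve_item_type`, `IntegerType` and `ArrayType` are hypothetical names, and the fallback regex assumes a Postgres-style string form such as `integer[]`, which may differ from what the real parser handles.

```python
import re
from typing import Optional


class IntegerType:
    """Stand-in for an ORM item type (hypothetical)."""

    def __str__(self) -> str:
        return "INTEGER"


class ArrayType:
    """Stand-in for a reflected array type that may or may not know its item type."""

    def __init__(self, item_type=None):
        self.item_type = item_type


def resolve_item_type(column_type, raw_type: str) -> Optional[str]:
    """Prefer the reflected item_type; fall back to parsing the raw string."""
    # A default of None keeps the lookup safe for objects without the attribute.
    item_type = getattr(column_type, "item_type", None)
    if item_type is not None:
        return str(item_type).upper()
    # Assumed fallback format: "<name>[]", e.g. "text[]".
    match = re.fullmatch(r"\s*(\w+)\s*\[\]\s*", raw_type)
    return match.group(1).upper() if match else None
```

Here the string form is consulted only when the type object carries no item type, which mirrors the intent stated in the code comment of the diff.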

@ayush-shah merged commit 4ee783c into open-metadata:main on Jan 7, 2025
18 of 30 checks passed
@tutte added and removed the "To release" label (Will cherry-pick this PR into the release branch) on Jan 7, 2025
Labels: Ingestion, safe to test, To release

Issue this pull request may close:

Auto Classification Ingestion - AttributeError: 'DataType' object has no attribute 'dialect_impl'