[SPARK-53355][PYTHON][SQL] fix numpy 1.x repr in type tests #52247
7f4de1d to e813915
python/pyspark/sql/tests/udf_type_tests/test_udf_input_types.py
        return tuple(convert_to_numpy_printable(elem) for elem in x)
    elif isinstance(x, dict):
        return {k: convert_to_numpy_printable(v) for k, v in x.items()}
    elif hasattr(x, "dtype") and hasattr(x, "item"):
This will also capture numpy arrays. Are we targeting scalars only, i.e. isinstance(x, np.generic)?
good point, added an isinstance(x, np.generic)
np.generic is the base class for all numpy scalar types, so why do we still need to check hasattr(x, "dtype") and hasattr(x, "item")?
also removed them now
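To make the reviewer's point concrete, here is a minimal sketch comparing the two checks discussed in this thread. The function names `duck_typed_check` and `scalar_only_check` are hypothetical and are not taken from the PR; only `np.generic`, `np.ndarray`, and the `dtype`/`item` attributes are real NumPy API.

```python
import numpy as np


def duck_typed_check(x):
    # The check from the original diff line: true for anything that
    # exposes both `dtype` and `item`, which includes ndarrays.
    return hasattr(x, "dtype") and hasattr(x, "item")


def scalar_only_check(x):
    # The check suggested in review: np.generic is the base class of
    # all NumPy scalar types, and ndarray does not inherit from it.
    return isinstance(x, np.generic)


scalar = np.int64(1)
array = np.array([1, 2, 3])

for name, value in [("scalar", scalar), ("array", array)]:
    print(name, duck_typed_check(value), scalar_only_check(value))
```

The duck-typed check accepts both values, while the `isinstance` check accepts only the scalar, which is why the `hasattr` conditions become redundant once the `isinstance` check is in place.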
@xinrong-meng this is ready from my side. Is there a way to trigger the master tests with the different envs on this PR? I have run this locally, but it would be nice to confirm additionally in CI.
What changes were proposed in this pull request?
__repr__ differently
Why are the changes needed?
Build / Python-only (master, Minimum dependencies of PySpark)
Does this PR introduce any user-facing change?
No
How was this patch tested?
ran tests locally with numpy 1.22.4
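For readers unfamiliar with the underlying issue: NumPy 2.x prints scalars with their type name (for example `np.int64(1)`), whereas NumPy 1.x prints the bare value (`1`), so any test that compares `repr` output against a stored string depends on the installed version. The sketch below shows one way to get version-independent output by converting NumPy scalars to plain Python values. The helper `to_plain_python` is hypothetical and is not the PR's `convert_to_numpy_printable`, whose exact behavior is not shown on this page.

```python
import numpy as np


def to_plain_python(x):
    """Hypothetical helper: recursively replace NumPy scalars with Python values."""
    if isinstance(x, list):
        return [to_plain_python(elem) for elem in x]
    if isinstance(x, tuple):
        return tuple(to_plain_python(elem) for elem in x)
    if isinstance(x, dict):
        return {k: to_plain_python(v) for k, v in x.items()}
    if isinstance(x, np.generic):
        # .item() returns the closest built-in Python type (int, float, bool, ...)
        return x.item()
    return x


value = {"a": [np.int64(1), np.float64(1.5)], "b": (np.bool_(True),)}

raw_repr = repr(value)                      # differs between NumPy 1.x and 2.x
plain_repr = repr(to_plain_python(value))   # same on both

print(np.__version__)
print(raw_repr)
print(plain_repr)
```

Running this under both a NumPy 1.x and a 2.x environment should show `raw_repr` changing while `plain_repr` stays the same.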
Was this patch authored or co-authored using generative AI tooling?
No