[BUG] Reading parquet dataset on GPU throws "cudf engine doesn't support the following keyword arguments: ['strings_to_categorical']" error #1873
Comments
The following workaround loads the data successfully:
However, applying workflow transform fails:
full error:
The workflow includes
Sorry for this ridiculously late response @orlev2 - Just coming across this now. As far as I can tell, the rapids/dask pinning in Merlin has been far too loose. NVTabular 23.8 was definitely not tested with cudf>=23.08 or dask>=2023.8. The merlin 23.08 containers use

NOTE: The lack of upper pinning in NVTabular is indeed a "bug" of sorts - I apologize about that.
Describe the bug
Reading a parquet dataset on GPU throws a "cudf engine doesn't support the following keyword arguments: ['strings_to_categorical']" error. Reading the data on CPU runs successfully:
Steps/Code to reproduce bug
Expected behavior
The dataset should be read from file with both cpu=True and cpu=False.
Environment details (please complete the following information):
nvtabular == 23.8.00
cudf == 23.10.02 (above error was also present under 23.12.01)
dask == 2023.9.2
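
For anyone hitting the same error, a quick way to see whether an environment falls in the range the maintainer describes as untested for NVTabular 23.8 (cudf>=23.08 or dask>=2023.8) is to compare the installed versions against those bounds. The sketch below is only illustrative: the helper names (`parse`, `untested`, `installed_versions`) and the `UNTESTED_FROM` table are hypothetical, the bounds are copied from the comment in this thread and not from any official compatibility matrix, and the distribution name for cudf may differ depending on how it was installed.

```python
from importlib import metadata

# Assumption: lower edges of the untested range, taken from the maintainer
# comment in this thread, not from an official compatibility matrix.
UNTESTED_FROM = {"cudf": "23.08", "dask": "2023.8"}


def parse(version):
    """Turn '23.10.02' into (23, 10, 2), keeping leading numeric parts only."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def untested(installed, bounds=UNTESTED_FROM):
    """Return the package names whose version is at or above the untested bound."""
    return sorted(
        name
        for name, version in installed.items()
        if name in bounds and parse(version) >= parse(bounds[name])
    )


def installed_versions(names):
    """Look up installed versions, skipping packages that are not installed."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass
    return found


if __name__ == "__main__":
    # The environment reported in this issue.
    reported = {"cudf": "23.10.02", "dask": "2023.9.2"}
    print("reported environment:", untested(reported))
    # The current environment (an empty list if neither package is found).
    print("this environment:", untested(installed_versions(UNTESTED_FROM)))
```

With the versions reported here, both cudf and dask land in the untested range, which is consistent with the maintainer's explanation of loose pinning.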
@niraj06