Simplify to_dask_cudf for compatibility with newer versions of dask #6496
I wasn't sure whether to target |
The rapids-dask-dependency for 25.04 pins dask to version 2025.02.0 (https://github.com/rapidsai/rapids-dask-dependency/blob/branch-25.04/conda/recipes/rapids-dask-dependency/meta.yaml), so I see no need to change this in 25.04; we should target 25.06. Do you know when we will bump our dask dependency?
I'm expecting to update the dependency for rapids 25.06, after the next Dask release (in the next couple of weeks).
In 898bf6f I've updated the test to avoid the failure at https://github.com/rapidsai/cuml/actions/runs/14173228016/job/39704398544?pr=6496#step:9:3580
I'm looking a bit more, but this is indicating an issue. We seem to be getting back a pandas DataFrame from |
Actually, maybe this is expected. The return type of
So maybe we're OK. But I'd appreciate a close review of that change, and a confirmation that we want to be passing host inputs (numpy / pandas) rather than device inputs (cupy / cudf).
Has anyone seen https://github.com/rapidsai/cuml/actions/runs/14200165809/job/39787356294?pr=6496#step:9:3056
I can't reproduce that locally, but I haven't tried matching versions exactly yet.
eb0d8c8 to 60c155b
Closing in favor of #6614.
With newer versions of Dask, some tests like
cuml/python/cuml/cuml/tests/dask/test_dask_tsvd.py::test_pca_fit[dataframe-data_info0]
are failing with a message like
This PR simplifies the code used in the tests that converts from a Dask Array to a Dask DataFrame. It will hopefully be compatible with the version of Dask currently used in CI (2025.2.0, I believe), but I'm relying on CI to check that. I've tested that it works with Dask main.