Add python-xxhash #503

Merged
merged 2 commits into from
Nov 28, 2023

Conversation

dcherian
Contributor

This is what's needed for significantly faster hashing with dask, not just `xxhash`

@pangeo-bot
Collaborator

/condalock
Automatically locking new conda environment, building, and testing images...

@scottyhq
Member

@dcherian @jhamman are you able to provide a little more context for this? The dask install docs (https://docs.dask.org/en/stable/install.html?highlight=xxhash#optional-dependencies) list it as an optional dependency: "Use xxHash hash functions for array hashing (~2x faster than MurmurHash, slightly slower than CityHash)"

Seems pretty great if just having this in the environment speeds up any dask-array computation?

@dcherian
Contributor Author

dcherian commented Nov 28, 2023

It'll speed up `dask.array.from_array(numpy_array)`-type calls, and potentially other hashing, but mostly that. And yes, it just needs to be installed.
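
For readers wondering why installing the package is enough: the usual pattern for an optional dependency like this is to try the fast hasher at import time and fall back to a slower stdlib one. The snippet below is only a rough sketch of that pattern, not dask's actual code. The names `token_for_buffer`, `_hash_bytes`, and `HASHER` are made up for illustration, and the SHA-1 fallback is an assumption.

```python
import hashlib

try:
    # Optional dependency: used only if it is installed in the environment.
    import xxhash

    def _hash_bytes(buf: bytes) -> str:
        # xxh64 is a fast non-cryptographic hash; fine for cache keys/tokens.
        return xxhash.xxh64(buf).hexdigest()

    HASHER = "xxhash"
except ImportError:
    def _hash_bytes(buf: bytes) -> str:
        # Fallback (assumed here): slower, but always available in the stdlib.
        return hashlib.sha1(buf).hexdigest()

    HASHER = "sha1"


def token_for_buffer(buf: bytes) -> str:
    """Return a deterministic hex token for the raw bytes of an array."""
    return _hash_bytes(buf)


if __name__ == "__main__":
    data = bytes(range(256)) * 1024  # stand-in for a NumPy array's buffer
    print(HASHER, token_for_buffer(data))
```

With this shape, calling code never changes. Adding `python-xxhash` to the environment just switches which branch runs.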

@scottyhq scottyhq merged commit 722f613 into pangeo-data:master Nov 28, 2023
4 checks passed
@dcherian dcherian deleted the patch-1 branch November 28, 2023 17:08