Use case focused docs pages #11095

Open
scharlottej13 opened this issue May 3, 2024 · 2 comments
Labels
documentation Improve or add to documentation

Comments

@scharlottej13
Contributor

I'm curious what folks would think about adding use case-specific pages to the Dask docs. Specifically, I was thinking about pages for machine learning and workflow orchestration where there is an especially broad ecosystem of libraries that you can use with Dask, but it'd be hard to find these by looking through the Dask docs. Maybe there are other good use cases too.
I'm not quite sure how these would fit into the current docs.dask.org table of contents. Some ideas:

  • add a new "use cases" category in the left TOC
  • create a use cases drop-down under "how to use" that can link out to common use cases
  • ...

machine learning

We have https://ml.dask.org/, which is nice because it covers many ML-specific libraries that integrate well with Dask. A lot of this information is out of date, though, and I think it would be nice to have a single, concise page in docs.dask.org that links out to examples, relevant libraries (XGBoost, LightGBM, RAPIDS, scikit-learn, and the dask-ml functions that are still used/maintained), etc.

workflows/etl

I think the closest thing we have to this right now is the Prefect example in examples.dask.org. I'm imagining this page could link to using Dask with Prefect, but also with other workflow orchestration tools like Airflow and Dagster. Maybe it also mentions things like dask-sql, dask-bigquery, and delta-rs + Dask.

cc @jrbourbeau @fjetter @mrocklin @jacobtomlinson

scharlottej13 added the documentation label May 3, 2024
@jacobtomlinson
Member

I'm very +1 on adding use case examples. As you say, we already have ml.dask.org and examples.dask.org. Are you suggesting putting some effort into getting those up to date, or merging them into the main Dask docs?

@mrocklin
Member

mrocklin commented May 9, 2024

I don't think that ml.dask.org is particularly good. I think that we should have a Machine Learning doc inside docs.dask.org that points people in different directions. My guess is that it points people to ...

  • HPO systems
    • Optuna
    • Futures
  • Gradient boosted trees
    • xgboost
    • lightgbm
  • Batch inference
    • Futures
    • Dask dataframe map_partitions
  • PyTorch training on large models (the saturn thing maybe?)

I think that I would find this more valuable than ml.dask.org, which is today focused on the Dask ML package, which is, as far as I'm aware, largely unused.

mrocklin added a commit that referenced this issue May 13, 2024

3 participants