This repository has been archived by the owner on Apr 12, 2023. It is now read-only.

Ben-Epstein/bulk-labeling-solara


Update

This repo has been sunset. It has been moved to Solara-Examples, in the bulk-labeling folder!

bulk-labeling-solara

A tool for bulk labeling, built in Solara!

I'm rebuilding my original bulk-labeling app, which was built in Streamlit, in Solara so it can be a bit more scalable, customizable, and robust to new features!

I also want to learn how to use Solara :)

Roadmap

  • Allow the user to download the labeled file :D (widgetti/solara#30)

  • Fix the layout of embeddings and dataframe so they are next to each other (not stacked)
    Also, the Solara message on the bottom right is in a bad spot

  • Get a more fun animation for when the embeddings are being calculated. It would also be cool if we could update it with the logs from UMAP or tqdm

  • Fix mypy issues

  • The "reset filters" button should really be a switch, not a checkbox. It doesn't look great

  • Add a nice readme like the one in the Streamlit version. We should wait until the visual issues are fixed so we don't need to redo it

  • Write the blog on how I built it (same as above, wait until it's in a better state)

  • Deploy on Solara cloud?!? 🚀

Development

  1. Set up a virtual env: python -m venv .venv && source .venv/bin/activate
  2. Install the package: pip install -e . && pyenv rehash
  3. Run: solara run bulk_labeling/main.py

Any changes you make to the app should be reflected in real time.

Note: SentenceTransformers doesn't play nicely with Solara

If you are going to be developing, I strongly recommend commenting out these lines in ml.py:

from sentence_transformers import SentenceTransformer
ENCODER = SentenceTransformer("paraphrase-MiniLM-L3-v2")
return ENCODER.encode(samples)

And uncommenting this one:

# return np.random.rand(len(samples), 20)

For some reason, Solara breaks on a page reload when the SentenceTransformer lines are active.
Using random embeddings instead also makes prototyping faster, because no strings are actually encoded.
