Pull requests: NVIDIA/NeMo-Curator

Pull requests list

Hard negative mining for Retriever fine-tuning
#523 opened Feb 5, 2025 by vinay-raman
3 tasks done
NV bug 5025154 fix
#522 opened Feb 5, 2025 by vinay-raman
3 tasks done
Add Partition On Logic (label: gpuci)
#519 opened Feb 4, 2025 by VibhuJawa
Add support for Nemotron-CC quality classifiers
#518 opened Feb 4, 2025 by sarahyurick Draft
10 of 14 tasks
Add improved cleaning methods from Nemotron-CC
#517 opened Feb 4, 2025 by ryantwolf
3 tasks done
Enforce Dataframe Backend Checks
#514 opened Feb 3, 2025 by ryantwolf
3 tasks done
Removal logic for fuzzy / exact (no class abstraction) (label: gpuci)
#509 opened Jan 31, 2025 by praateekmahajan
3 tasks
benchmark readme updates
#508 opened Jan 31, 2025 by lbliii
2 of 3 tasks
Added LookUp error handling during encoding detection.
#502 opened Jan 30, 2025 by ggcr
Update model nomenclature (label: documentation)
#497 opened Jan 24, 2025 by sarahyurick
Clean up Pandas, cuDF, Dask, and Dask-cuDF DocumentDataset type logic (label: gpuci)
#494 opened Jan 23, 2025 by sarahyurick
Add Pooling Strategy Option for embedding creation (label: gpuci)
#491 opened Jan 20, 2025 by VibhuJawa
Standardize text_field and id_field terminology (label: gpuci)
#485 opened Jan 17, 2025 by sarahyurick
Minor CrossFit improvements (label: gpuci)
#483 opened Jan 16, 2025 by sarahyurick
Add nemo-toolkit dependency to gpuCI (label: gpuci)
#480 opened Jan 10, 2025 by sarahyurick
Enable ADD ID to work with CPU/GPU both (label: gpuci)
#479 opened Jan 10, 2025 by VibhuJawa
Support dask_expr migration into dask.dataframe
#477 opened Jan 9, 2025 by rjzamora
3 tasks
[pre-commit.ci] pre-commit suggestions
#470 opened Jan 7, 2025 by pre-commit-ci bot
[WIP] Add RAPIDS Nightly to GPU CI (label: gpuci)
#436 opened Dec 17, 2024 by praateekmahajan Draft
3 tasks
Updating the Quick Example
#432 opened Dec 16, 2024 by stsfaroz
Add TrafilaturaExtractor class
#431 opened Dec 13, 2024 by sarahyurick
Bump nltk from 3.8.1 to 3.9 in /tutorials/dapt-curation/code (label: dependencies)
#429 opened Dec 13, 2024 by dependabot bot