Pull requests: modelscope/data-juicer

Pull requests list

Add unittest for ray text dedup
#540 opened Jan 10, 2025 by chenyushuo
[WIP] refactor of dataset builder and executor
#537 opened Jan 9, 2025 by cyruszhang
fix save_ckpt bug [labels: bug (Something isn't working), dj:core (issues/PRs about the core functions of Data-Juicer)]
#536 opened Jan 9, 2025 by HYLcool
log summarization [labels: enhancement (New feature or request)]
#534 opened Jan 9, 2025 by HYLcool
Refine/llm api op unittest [labels: dj:core, enhancement]
#528 opened Jan 3, 2025 by BeachWang
[Feature] Auto generation for OP docs [labels: dj:ci/cd (issues/PRs about CI/CD of Data-Juicer), documentation (Improvements or additions to documentation), enhancement]
#527 opened Jan 3, 2025 by HYLcool
Add minhash deduplicator based on RAY and Redis [labels: dj:dist (issues/PRs about distributed data processing), dj:efficiency (regarding to efficiency issues and enhancements), dj:op (issues/PRs about some specific OPs)]
#489 opened Nov 15, 2024 by pan-x-c
Automatically split input dataset in ray mode
#415 opened Sep 4, 2024 by pan-x-c
[WIP]Add text tagging by prompt mapper op [labels: dj:op]
#408 opened Aug 30, 2024 by garyzhang99
1 task
Add text_pair_similarity_filter [labels: dj:multimodal (issues/PRs about multimodal data processing), dj:op, enhancement]
#405 opened Aug 28, 2024 by Qirui-jiao Draft
Add sentence_augmentation_mapper [labels: dj:multimodal, dj:op, enhancement]
#401 opened Aug 22, 2024 by Qirui-jiao Draft
Add mllm_mapper [labels: dj:multimodal, dj:op, enhancement]
#400 opened Aug 22, 2024 by Qirui-jiao Draft
Add sdxl_prompt2prompt_mapper [labels: dj:multimodal, dj:op, enhancement]
#395 opened Aug 21, 2024 by Qirui-jiao Draft
[Ready] Add image_segment_mapper [labels: dj:multimodal, dj:op, enhancement]
#394 opened Aug 21, 2024 by Qirui-jiao
Add GPT-4V as evaluator [labels: dj:multimodal, enhancement, stale-pr]
#276 opened Mar 22, 2024 by drcege Draft DJ-SORA