generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Improve GRPO trainer error message for invalid num_generations
#3199
opened Mar 31, 2025 by
AliBakly
Support for Models With Pre-Finetuned LoRA Adapters in GRPO: Add use_peft_as_reference Flag
#3196
opened Mar 31, 2025 by
LoganVegnaSHOP
5 tasks done
[GRPO] Allow the use of the vllm logprobs, rather than recomputing them
#3193
opened Mar 31, 2025 by
edbeeching
🚀 Enhance GRPO VLLM server from sync to async and accelerate training
#3182
opened Mar 30, 2025 by
binary-husky
GRPO Overlong soft punishment based on DAPO
#3177
opened Mar 29, 2025 by
1485840691-eng
Draft
5 tasks
Co-Locating vLLM w/ training to achieve higher throughput and GPU utilization
#3162
opened Mar 26, 2025 by
toslali-ibm
2 of 5 tasks
Fix: Compatibility for formatting_func returning a list
#3147
opened Mar 24, 2025 by
YeFD
4 of 5 tasks
Extend BCO Trainer dataset format support
#3134
opened Mar 22, 2025 by
reihig-ut
1 of 5 tasks
Add GRPO/ Online DPO support for quantitative models when use vllm as infer backbone.
#3133
opened Mar 22, 2025 by
maoulee
improvement(utils.py): simplify repeating completion string
#3122
opened Mar 20, 2025 by
tpoisonooo
feat: Add Interleaved Trainer implementation
#3107
opened Mar 18, 2025 by
ucalyptus2
3 tasks done
Update sft trainer to include better packing
#3100
opened Mar 17, 2025 by
Ishan-Kumar2
4 tasks done
[GRPO] add vlm training capabilities to the trainer
#3072
opened Mar 13, 2025 by
CompN3rd
3 of 5 tasks
Fixing GRPO reward_func being a model with DeepSpeed ZeRO-3
#2984
opened Feb 28, 2025 by
jamesbraza
Feature: Add SGLang as inference backend for generation in GRPO
#2981
opened Feb 28, 2025 by
jhinpan
5 tasks done