Does Ludwig Support PPO? #3949
-
Hi all, I didn't see anything on the doc site so am asking here: does Ludwig support PPO training? And what would an example .yaml config file look like to do this? I assume the config would need to include parameters for a supervised fine-tuned model, a reward model, and any parameters for the PPO loss calculations and gradient updates for the SFT model. Thanks!
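For reference, the "PPO loss calculations" the question mentions usually means the clipped surrogate objective. This is a minimal, framework-agnostic sketch of that loss in NumPy, not Ludwig code — the function name, parameters, and the `clip_eps=0.2` default are illustrative assumptions, not part of any Ludwig API:

```python
import numpy as np

def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Clipped PPO surrogate loss (illustrative sketch, not Ludwig code)."""
    # Probability ratio between the updated policy and the frozen SFT policy.
    ratio = np.exp(logp_new - logp_old)
    # Unclipped and clipped surrogate terms.
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Take the pessimistic (element-wise minimum) objective and negate,
    # since we minimize the loss.
    return -np.mean(np.minimum(unclipped, clipped))

# When the policies agree (logp_new == logp_old), the ratio is 1 and the
# loss reduces to -mean(advantages).
logp = np.log(np.array([0.5, 0.4]))
loss = ppo_clip_loss(logp, logp, np.array([1.0, -1.0]))  # → 0.0
```

In a full PPO loop this loss would be combined with a value-function loss and an entropy bonus, with the reward model supplying the scores from which advantages are computed.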
Answered by arnavgarg1 · Feb 29, 2024
-
Hi @braunagn! Unfortunately, Ludwig currently doesn't support PPO or DPO, but it is something we intend to add in the next few months. Would you be interested in contributing support for either of them?
Answer selected by braunagn