[FEATURE] Add GRPO Support #900

Open
tmostak opened this issue Feb 20, 2025 · 1 comment
Labels
type/feature Feature request

Comments

@tmostak

tmostak commented Feb 20, 2025

🚀 Feature

Add GRPO Support

Motivation

With the release of DeepSeek's R1 model, GRPO has been shown to be a powerful way to instill reasoning capabilities in models for cases where there is either labeled data or a verifier. This request is to add support to train a model with GRPO, perhaps with a focus on building reasoning abilities.
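To make the "labeled data or a verifier" case concrete, here is a minimal sketch of the group-relative advantage step that gives GRPO its name: several completions are sampled for one prompt, each is scored, and each reward is normalized against the mean and standard deviation of its own group. The function names, the exact-match verifier, the use of the population standard deviation, and the `eps` value are all assumptions made for illustration, not an existing API of this project.

```python
from statistics import mean, pstdev


def verifier_reward(completion: str, expected: str) -> float:
    """Hypothetical verifier: 1.0 if the last line equals the label, else 0.0."""
    lines = completion.strip().splitlines()
    return 1.0 if lines and lines[-1].strip() == expected.strip() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the statistics of its own sample group.

    Assumption: population std plus a small eps to avoid division by zero
    when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# One prompt, a group of four sampled completions (made-up examples).
completions = ["2 + 2 = 4\n4", "I think it is 5\n5", "4", "no idea"]
rewards = [verifier_reward(c, "4") for c in completions]
advantages = group_relative_advantages(rewards)

for completion, reward, advantage in zip(completions, rewards, advantages):
    print(f"{completion!r}: reward={reward}, advantage={advantage:.3f}")
```

Completions that beat their group average get a positive advantage and the rest get a negative one, so no separate value model is needed for the baseline. A full trainer would still need sampling, the clipped policy-gradient loss, and a KL penalty against a reference model, none of which is shown here.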

@tmostak tmostak added the type/feature Feature request label Feb 20, 2025
@sarthak247
Contributor

Heyaaaaa!
I would like to take this. I've contributed to llmstudio before (#683), so I'm slightly familiar with the code base. I was a bit occupied with life lately, but I'm ready to start contributing again to h2o and other open source projects, and I think this could be a good point to get back into the open source landscape.

I've read a bit about GRPO and DeepSeek, but I might need some support to pull this through : )
Some reading materials or sample code implementations would be great to begin with.

Regards,
Sarthak
