[FEATURE] Add GRPO Support #900

tmostak · 2025-02-20T21:20:56Z

🚀 Feature

Add GRPO Support

Motivation

With the release of DeepSeek's R1 model, GRPO (Group Relative Policy Optimization) has been shown to be a powerful way to instill reasoning capabilities in models when either labeled data or a verifier is available. This request is to add support for training a model with GRPO, perhaps with a focus on building reasoning abilities.

sarthak247 · 2025-02-24T01:49:05Z

Heyaaaaa!
I would like to take this. I've contributed to llmstudio before (#683), so I'm somewhat familiar with the code base. I was a bit occupied with life lately, but I'm ready to start contributing again to h2o and other open source projects, and I think this would be a good way to get back into the open source landscape.

I've read a bit about GRPO and DeepSeek, but I might need some support to see this through :)
Some reading materials or sample code implementations would be great to begin with.

Regards,
Sarthak

tmostak added the type/feature Feature request label Feb 20, 2025
