Modification Details on VERL #20

Open
zmy opened this issue Feb 13, 2025 · 3 comments
Comments

zmy commented Feb 13, 2025

As the README mentions:

Our training experiments are powered by our heavily modified fork of Verl, an open-source RLHF library.

However, the link https://github.com/agentica-project/verl leads to a 404 page. Do you plan to publish your fork? If not, could you please share more information on the following:

  • Which original commit or release was the customized verl forked from?
  • What key modifications (those that support deepscaler) were made in agentica-project/verl?
  • In the verl copy inside this deepscaler repo, are there further modifications beyond those in agentica-project/verl?

I hope to learn more about your version of verl so that I can better replicate your work. Thanks very much! 😆

zmy changed the title from "Modifications on verl" to "Modification Details on VERL" on Feb 13, 2025
michaelzhiluo (Contributor) commented

The fork is already in the main repo in deepscaler!

The fork was taken from a pretty old VERL commit; I don't remember which one. We mainly fixed GRPO and added new features such as pass of N for validation, PPO epochs (which didn't work), rejection sampling (which didn't work), a fix for John Schulman's KL loss, and the list goes on.

agentica-project/verl does not exist ;)
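
For readers unfamiliar with the KL loss mentioned above, here is a minimal sketch. It assumes that "John Schulman's KL loss" refers to the low-variance "k3" estimator from his note "Approximating KL Divergence"; the thread does not say what the actual fix was. The function name `k3_kl_estimate` and the `clip` safeguard are hypothetical, not taken from the deepscaler or verl code.

```python
import math


def k3_kl_estimate(logprobs, ref_logprobs, clip=10.0):
    """Per-token k3 estimate of KL(policy || reference).

    logprobs:     log-probs of the sampled tokens under the current policy
    ref_logprobs: log-probs of the same tokens under the reference policy
    clip:         hypothetical upper bound to keep outliers from dominating
    """
    estimates = []
    for lp, ref_lp in zip(logprobs, ref_logprobs):
        log_ratio = ref_lp - lp  # log(p_ref / p_policy), tokens sampled from the policy
        k3 = math.exp(log_ratio) - log_ratio - 1.0  # (r - 1) - log r, never negative
        estimates.append(min(k3, clip))
    return estimates


def mean_kl_loss(logprobs, ref_logprobs):
    """Average the per-token estimates into a single scalar penalty."""
    values = k3_kl_estimate(logprobs, ref_logprobs)
    return sum(values) / len(values)
```

The estimate is zero when the two policies assign the same log-probability to a token and grows as they diverge, which is what makes it usable as a penalty term.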

zmy (Author) commented Feb 18, 2025

Thanks for the information! Are these fixes already available in the latest version of verl, so that I can replace the verl folder with it? Or could you please share the SHA of the "pretty old commit" so that I can check for myself?

michaelzhiluo (Contributor) commented

Porting verl into the repo removed the .git directory, so the git log was lost. We would probably need to do a direct diff against upstream verl, which might be too much now...
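
For anyone who wants to attempt that direct diff, a rough sketch using only the Python standard library follows. It assumes you have the vendored verl folder and a checkout of upstream verl side by side; the function name `diff_trees` and both paths are hypothetical.

```python
import filecmp
from pathlib import Path


def diff_trees(vendored, upstream):
    """Compare two directory trees by file content.

    Returns three sorted lists of relative paths:
    (only in vendored, only in upstream, present in both but different).
    """
    only_vendored, only_upstream, changed = [], [], []

    def walk(cmp, rel):
        only_vendored.extend((rel / name).as_posix() for name in cmp.left_only)
        only_upstream.extend((rel / name).as_posix() for name in cmp.right_only)
        # shallow=False forces a byte-by-byte comparison instead of trusting os.stat()
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        changed.extend((rel / name).as_posix() for name in mismatch + errors)
        for name, sub in cmp.subdirs.items():
            walk(sub, rel / name)

    root = filecmp.dircmp(str(vendored), str(upstream), ignore=[".git", "__pycache__"])
    walk(root, Path())
    return sorted(only_vendored), sorted(only_upstream), sorted(changed)
```

Running this against checkouts of several upstream commits and picking the one with the fewest differences is one way to guess the fork point, although nothing in this thread confirms which commit that is.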

Upstream verl already has the fixed KL loss, for example. We also tried a reverse KL loss (https://x.com/kalomaze/status/1891621285894995971) at some point. It didn't work for us, but that Twitter post shows otherwise.
