Modification Details on VERL #20
Comments
The fork is already in the main repo, in deepscaler! It was based on a pretty old commit from VERL; I don't remember which one... We mainly fixed GRPO and added new features such as pass of N for validation, PPO epochs (which didn't work), rejection sampling (which didn't work), a fix for John Schulman's KL loss, and the list goes on. agentica-project/verl does not exist ;)
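For readers unfamiliar with the KL loss mentioned above, here is a minimal sketch of the low-variance estimator commonly attributed to John Schulman (often called k3), assuming per-token log-probabilities from the current policy and a reference policy are available. The function names are hypothetical, and this is not the code from deepscaler or verl; it only illustrates the formula `exp(d) - d - 1` with `d = ref_logprob - logprob`.

```python
import math


def k3_kl_estimate(logprob, ref_logprob):
    """Per-token k3 estimate of KL(policy || reference).

    logprob:     log-probability of the sampled token under the current policy
    ref_logprob: log-probability of the same token under the reference policy
    """
    log_ratio = ref_logprob - logprob
    # exp(d) - d - 1 is non-negative for every real d and zero only at d == 0.
    return math.exp(log_ratio) - log_ratio - 1.0


def mean_k3_kl(logprobs, ref_logprobs):
    """Average the per-token estimate over a sequence of sampled tokens."""
    if len(logprobs) != len(ref_logprobs):
        raise ValueError("logprobs and ref_logprobs must have the same length")
    values = [k3_kl_estimate(lp, rlp) for lp, rlp in zip(logprobs, ref_logprobs)]
    return sum(values) / len(values)
```

When the two policies assign the same log-probability to a token, the estimate is exactly zero, and it grows as they diverge in either direction.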
Thanks for the information! Are these fixes already available in the latest version of verl, so that I can replace the verl folder with it? Or could you please share the SHA of the "pretty old commit" so that I can check for myself?
Porting verl into the repo removed the .git directory, so the git logs were lost. We would probably need to do a direct Git diff, which might be too much now... Verl already has the fixed KL loss, for example. We also tried a reverse KL loss (https://x.com/kalomaze/status/1891621285894995971) at some point; it didn't work for us, though this Twitter post suggests otherwise.
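One way to approach the direct diff without the git history is sketched below: compare the vendored verl folder against a checkout of upstream verl and list the files that differ. This is only an illustration using the Python standard library; the function names and the idea of scoring candidate commits by the number of differing files are assumptions, not something the maintainers described.

```python
import filecmp
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git metadata so a checkout and a vendored copy compare cleanly.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def tree_distance(vendored, upstream):
    """Summarise how two directory trees differ, comparing file contents."""
    left = relative_files(vendored)
    right = relative_files(upstream)
    changed = sorted(
        rel
        for rel in left & right
        if not filecmp.cmp(
            os.path.join(vendored, rel),
            os.path.join(upstream, rel),
            shallow=False,  # compare bytes, not just size and mtime
        )
    )
    return {
        "changed": changed,
        "only_vendored": sorted(left - right),
        "only_upstream": sorted(right - left),
    }
```

Running this against several upstream checkouts and picking the one with the fewest differing files would give a rough guess at the base commit; the remaining `changed` files are then the ones worth reading with a regular `git diff --no-index`.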
As the README mentions:
However, the link https://github.com/agentica-project/verl leads to a 404 page. Do you have plans to publish your fork? Or could you please share more information on:
I hope to learn more about your version of verl so that I can better replicate your work. Thanks very much! 😆