Thanks for the great work! I just wanted to share that verl has added support for vLLM 0.7.x (link), and the performance boost is pretty impressive:
For a typical job like examples/ppo_trainer/run_qwen2-7b_seq_balance.sh, rollout generation takes 115 seconds with vLLM 0.6.3 versus 85 seconds with vLLM 0.7.0. Enabling CUDA graphs reduces the generation time further, to 62 seconds.
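For context, the CUDA-graph speedup in verl is toggled through rollout config overrides rather than code changes. A minimal sketch of how such a run might be launched is below; the exact override names (`enforce_eager`, `free_cache_engine`) are assumptions based on verl's vLLM rollout config and should be checked against the installed version:

```shell
# Hypothetical sketch: launch the PPO example with vLLM rollout and
# CUDA graphs enabled. Option names are assumptions -- verify them
# against your verl version's rollout config before running.
python3 -m verl.trainer.main_ppo \
    actor_rollout_ref.rollout.name=vllm \
    actor_rollout_ref.rollout.enforce_eager=False \
    actor_rollout_ref.rollout.free_cache_engine=False
```

With `enforce_eager=False`, vLLM captures CUDA graphs at startup instead of running eagerly, which is where the 85 s → 62 s reduction mentioned above would come from.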
Are you planning to add support for this in deepscaler? It would be awesome to see these speed improvements in deepscaler too.
Thanks for considering!
Best regards,
Runze