Replies: 1 comment
-
If I use the premade training script in NeMo, will that give me a 128K context length or a 4096 context length?
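Not an authoritative answer, but one way to find out is to build the premade recipe object and read its settings before launching anything. Below is a minimal sketch assuming the NeMo 2.0 recipe API, that your NeMo build ships a `qwen25_7b` recipe module, and that the recipe exposes `model.config.seq_length` and `data.seq_length` at the attribute paths shown (all of these are assumptions to verify against your install):

```python
from nemo.collections import llm

# Build the premade pretraining recipe without running it; in NeMo 2.0 this
# returns a nemo_run configuration object whose fields can be inspected
# and overridden before launch.
recipe = llm.qwen25_7b.pretrain_recipe(   # assumed recipe module name
    name="qwen25_7b_pretrain",
    dir="/tmp/checkpoints",               # placeholder checkpoint directory
    num_nodes=1,
    num_gpus_per_node=8,
)

# Print the sequence lengths the premade script would actually train with.
# If these read 4096, the stock recipe trains at 4K, not 128K.
print("model seq_length:", recipe.model.config.seq_length)
print("data  seq_length:", recipe.data.seq_length)
```

Whatever the defaults turn out to be, both fields can be overridden before launching, as sketched after the question below.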
-
Hi Team,
Thank you for your excellent work.
Could you please tell me how to pretrain Qwen 2.5 7B with a 128K context length?
Where can I find this recipe?
Thanks again!
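There may well be an official long-context recipe I am not aware of, but as a starting point here is a hedged sketch of overriding the stock recipe to a 131072-token (128K) sequence length with nemo-run. The attribute paths (`recipe.model.config.seq_length`, `recipe.data.seq_length`, `recipe.trainer.strategy.context_parallel_size`) and the executor settings follow the NeMo 2.0 recipe pattern but are assumptions, not a confirmed 128K recipe:

```python
import nemo_run as run
from nemo.collections import llm

# Start from the premade Qwen 2.5 7B pretraining recipe (assumed module name).
recipe = llm.qwen25_7b.pretrain_recipe(
    name="qwen25_7b_128k",
    dir="/tmp/checkpoints",   # placeholder checkpoint directory
    num_nodes=1,
    num_gpus_per_node=8,
)

# Override the sequence length on both the model config and the data module;
# 131072 tokens = 128K. These attribute paths are assumptions to verify.
recipe.model.config.seq_length = 131072
recipe.data.seq_length = 131072

# 128K sequences rarely fit on a single GPU's activation memory; splitting
# the sequence across GPUs via Megatron context parallelism is one option.
recipe.trainer.strategy.context_parallel_size = 8
recipe.data.micro_batch_size = 1  # keep per-GPU activation memory down

# Launch locally with torchrun; swap in a SlurmExecutor for multi-node runs.
executor = run.LocalExecutor(ntasks_per_node=8, launcher="torchrun")
run.run(recipe, executor=executor)
```

Worth noting: Qwen 2.5's published 128K support relies on RoPE scaling (YaRN), so simply raising `seq_length` does not by itself reproduce Qwen's long-context setup; the model config's RoPE settings would need checking as well.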