
Can the 360-LLAMA-Factory framework accelerate long-sequence training, or does it only reduce GPU memory usage? #30


Open · StarDewXXX opened this issue Mar 15, 2025 · 1 comment

Comments

@StarDewXXX

For example, native LLaMA-Factory can already run a DPO full fine-tune of a 7B model at 4k sequence length. If I switch to 360-LLAMA-Factory, will training get faster?
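For context: 360-LLAMA-Factory's headline feature is sequence parallelism (its README describes DeepSpeed-Ulysses and ring-attention backends), which shards the sequence dimension of activations across GPUs. Below is a minimal single-process sketch of that sharding idea only; `shard_sequence` and `sp_size` are hypothetical names for illustration, not the project's actual API.

```python
import torch

def shard_sequence(hidden: torch.Tensor, sp_size: int, rank: int) -> torch.Tensor:
    """Split a [batch, seq_len, dim] tensor along the sequence axis.

    With sequence parallelism, each of the sp_size ranks holds only
    seq_len / sp_size tokens, so activation memory per GPU drops by
    roughly that factor; total compute across all ranks stays the same.
    """
    chunks = torch.chunk(hidden, sp_size, dim=1)  # one chunk per rank
    return chunks[rank]

# Toy example: a 4k-token activation sharded 4 ways -> 1k tokens per GPU.
hidden = torch.randn(1, 4096, 4096)
local = shard_sequence(hidden, sp_size=4, rank=0)
print(local.shape)  # torch.Size([1, 1024, 4096])
```

In a real run, the attention layers still need tokens held by other ranks, which Ulysses-style implementations exchange via all-to-all communication. That overhead is why sequence parallelism is primarily a memory technique: at a length that already fits on one GPU (like the 4k case above), it mainly buys headroom for much longer sequences rather than a guaranteed wall-clock speedup.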

@HaoshengZou
Collaborator

HaoshengZou commented Mar 17, 2025 via email
