about shearing params config #67

Open
LoverLost opened this issue Apr 19, 2024 · 1 comment

LoverLost commented Apr 19, 2024

Hello, I would like to ask a question about parameter settings. I want to prune the llama2 model without changing the hidden_size, which means it stays fixed at 4096. However, I want to change the num_heads of attention, which means I want to prune q/k/v/o from 4096 x 4096 to 4096 x 2048. Can I use the code to do this without changing anything? Also, I noticed that zs_block may contain 'qk_head_dim_z'. What does it do?

@xiamengzhou
Contributor

Hi @LoverLost, sorry for the late reply!

qk_head_dim_z is not supported in the current code yet; it was intended to prune head dimensions rather than full heads. The current code does support pruning only the heads without pruning the hidden dimension. To do that, you need to remove hidden from the prune_params. Let me know if you encounter any issues!
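
For reference, the shape arithmetic behind pruning whole heads can be sketched as below. This is only an illustration, not code from the repository: the helper `projection_shapes` is hypothetical, the (in_features, out_features) shape convention is an assumption, and the 32-head starting point assumes the 7B llama2 configuration (hidden_size 4096, head dimension 128).

```python
def projection_shapes(hidden_size: int, num_heads: int, kept_heads: int) -> dict:
    """Hypothetical helper: projection shapes after keeping `kept_heads` whole heads.

    Shapes are written as (in_features, out_features); this convention is an
    assumption made for the illustration.
    """
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    if not 0 < kept_heads <= num_heads:
        raise ValueError("kept_heads must be in the range 1..num_heads")

    # Pruning whole heads leaves the per-head dimension unchanged.
    head_dim = hidden_size // num_heads
    inner_dim = kept_heads * head_dim

    return {
        "head_dim": head_dim,
        # q/k/v map the (unpruned) hidden dimension to the remaining heads.
        "qkv": (hidden_size, inner_dim),
        # o maps the remaining heads back to the hidden dimension.
        "o": (inner_dim, hidden_size),
    }


# Assumed starting point: 32 heads at hidden_size 4096, pruned down to 16 heads.
shapes = projection_shapes(hidden_size=4096, num_heads=32, kept_heads=16)
print(shapes)
```

Under these assumptions, keeping 16 of 32 heads gives the 4096 x 2048 q/k/v projections asked about, while hidden_size stays at 4096 and each head keeps its dimension of 128. Pruning the per-head dimension instead is what qk_head_dim_z was meant for, and that is not supported yet.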
