How to Split AWQ Weights? #626
Azure-Tang changed the title from "Implementing Tensor Parallel by Splitting AWQ Weights in PyTorch" to "How to Split AWQ Weights?" on Sep 28, 2024
Body:
Hello,
I am currently working on implementing tensor parallelism and need some guidance on how to split AWQ weights properly. Here's the current state of the AWQ weights I'm working with:
To split the weights, I used the following approach:
I also created a random input of shape (1, 2048, 4096) and performed a matrix multiplication with both the original and the split weights. However, the results do not match:
Could someone advise on how to correctly split the AWQ weights to achieve effective tensor parallelism? Any help or suggestions would be greatly appreciated!
Thank you!
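For reference, here is a minimal sketch of how a split of group-quantized 4-bit weights can be checked against the unsplit result. It is not the code from this post and not any library's API. It assumes a simplified layout: `qweight` and `qzeros` pack 8 consecutive 4-bit values along the output axis into one integer, and `scales` holds one row per group of `GROUP` input rows. Real AWQ checkpoints may order the nibbles differently inside each packed word. All names (`pack_rows`, `dequantize`, `PACK`, `GROUP`) are hypothetical, and plain Python lists stand in for tensors so the example runs without dependencies.

```python
import random

PACK = 8             # 4-bit values per packed word (assumed layout)
GROUP = 4            # input rows sharing one scale/zero row (tiny for the demo)
IN_F, OUT_F = 8, 16  # toy layer size

def pack_rows(q):
    """Pack each run of PACK 4-bit ints along the output axis into one int."""
    return [[sum(row[c * PACK + k] << (4 * k) for k in range(PACK))
             for c in range(len(row) // PACK)] for row in q]

def nibble(word, j):
    """Extract the 4-bit value for output column j from its packed word."""
    return (word >> (4 * (j % PACK))) & 0xF

def dequantize(qweight, qzeros, scales):
    """Return W[in][out] = (q - zero) * scale, with one zero/scale row per group."""
    out_f = len(scales[0])
    return [[(nibble(row[j // PACK], j) - nibble(qzeros[i // GROUP][j // PACK], j))
             * scales[i // GROUP][j] for j in range(out_f)]
            for i, row in enumerate(qweight)]

def matvec(x, w):
    """Compute x @ W for a single input vector x."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def cols(m, lo, hi):
    """Slice columns lo:hi of a list-of-lists matrix."""
    return [row[lo:hi] for row in m]

random.seed(0)
n_groups = IN_F // GROUP
q = [[random.randrange(16) for _ in range(OUT_F)] for _ in range(IN_F)]
z = [[random.randrange(16) for _ in range(OUT_F)] for _ in range(n_groups)]
scales = [[random.uniform(0.01, 0.1) for _ in range(OUT_F)] for _ in range(n_groups)]
qweight, qzeros = pack_rows(q), pack_rows(z)
x = [random.uniform(-1, 1) for _ in range(IN_F)]

reference = matvec(x, dequantize(qweight, qzeros, scales))

# Column-parallel: split the output axis. Packed tensors are cut in units of
# whole packed words, scales in units of PACK columns; outputs are concatenated.
half_out = OUT_F // 2
half_words = half_out // PACK
col_out = []
for r in range(2):
    w = dequantize(cols(qweight, r * half_words, (r + 1) * half_words),
                   cols(qzeros, r * half_words, (r + 1) * half_words),
                   cols(scales, r * half_out, (r + 1) * half_out))
    col_out += matvec(x, w)

# Row-parallel: split the input axis on a multiple of GROUP so each shard keeps
# whole groups; the input is split the same way and partial outputs are summed.
half_in = IN_F // 2
half_groups = half_in // GROUP
partials = []
for r in range(2):
    w = dequantize(qweight[r * half_in:(r + 1) * half_in],
                   qzeros[r * half_groups:(r + 1) * half_groups],
                   scales[r * half_groups:(r + 1) * half_groups])
    partials.append(matvec(x[r * half_in:(r + 1) * half_in], w))
row_out = [a + b for a, b in zip(*partials)]

print(max(abs(a - b) for a, b in zip(reference, col_out)))
print(max(abs(a - b) for a, b in zip(reference, row_out)))
```

Under these assumptions, the split boundary has to respect the storage granularity: a multiple of the pack factor on the output axis, and a multiple of the group size on the input axis. All three tensors must be sliced consistently. A mismatch on either point is one possible reason for split and unsplit results to disagree, though without the original code that is only a guess.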