Enable torchao.experimental EmbeddingQuantization #1520

Open
Jack-Khuu opened this issue Mar 31, 2025 · 6 comments
Labels: Quantization, triaged

Comments

@Jack-Khuu
Contributor

🚀 The feature, motivation and pitch

Quantization is a technique used to improve the speed and reduce the size or memory requirements of a model, and torchao is PyTorch's native quantization library for inference and training.

There are new experimental quantizations in torchao that we would like to enable in torchchat. Specifically, this task is for enabling EmbeddingQuantizer and SharedEmbeddingQuantizer.

Entrypoint:

def quantize_model(

Task: Using ExecuTorch as a reference (pytorch/executorch#9548), add support for EmbeddingQuantizer and SharedEmbeddingQuantizer.

cc: @metascroy, @manuelcandales

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

@Jack-Khuu added the Quantization and triaged labels on Mar 31, 2025
@dillondesilva

@Jack-Khuu I'd love to have a crack at this if possible! Would you mind assigning it to me?

@Jack-Khuu
Contributor Author

Totally, give it a shot

@Jack-Khuu
Contributor Author

Jack-Khuu commented Apr 7, 2025

Hi @dillondesilva, how's the task going? Any questions?

@dillondesilva

dillondesilva commented Apr 8, 2025

Hey @Jack-Khuu - I've just been busy with mid-semester exams in the past week. Should have time to start sometime this week and will send questions soon 👍 Thanks for checking in!

@dillondesilva

@Jack-Khuu Good news! Here's the PR -> #1525

I don't know if I've oversimplified it, so feel free to correct me if I'm wrong, but to enable the above experimental quantizers, I'm assuming all that was needed was:

  1. A mapping in quantizer_class_dict to enable devs to specify when they are using one of the new experimental options (i.e. experimental:embedding and experimental:shared)
  2. Logic to run the respective quantizers if they have been specified
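
For readers following along, the two steps above can be pictured roughly as below. This is only a sketch of the registry-plus-dispatch pattern and not the code from the PR: the two `Fake...` classes are hypothetical stand-ins for the torchao quantizers, `quantize_model_sketch` is a made-up name rather than torchchat's real `quantize_model`, the "model" is just a dict, and the `bitwidth` option is an illustrative assumption. Only the `quantizer_class_dict` name and the two `experimental:*` keys come from the comment.

```python
from typing import Any, Dict


class FakeEmbeddingQuantizer:
    """Hypothetical stand-in for torchao's experimental EmbeddingQuantizer."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs  # options are stored but not interpreted here

    def quantize(self, model: Dict[str, Any]) -> Dict[str, Any]:
        # A real quantizer would rewrite embedding layers; here we only
        # record that this step ran, so the dispatch can be observed.
        updated = dict(model)
        updated["applied"] = list(model.get("applied", [])) + ["embedding"]
        return updated


class FakeSharedEmbeddingQuantizer(FakeEmbeddingQuantizer):
    """Hypothetical stand-in for SharedEmbeddingQuantizer."""

    def quantize(self, model: Dict[str, Any]) -> Dict[str, Any]:
        updated = dict(model)
        updated["applied"] = list(model.get("applied", [])) + ["shared"]
        return updated


# Step 1: map option names to quantizer classes.
quantizer_class_dict = {
    "experimental:embedding": FakeEmbeddingQuantizer,
    "experimental:shared": FakeSharedEmbeddingQuantizer,
}


def quantize_model_sketch(
    model: Dict[str, Any], quantize_options: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Step 2: run each requested quantizer in the order it was specified."""
    for name, q_kwargs in quantize_options.items():
        if name not in quantizer_class_dict:
            raise ValueError(f"Unknown quantizer: {name!r}")
        quantizer = quantizer_class_dict[name](**q_kwargs)
        model = quantizer.quantize(model)
    return model


if __name__ == "__main__":
    toy_model = {"name": "toy"}
    options = {"experimental:embedding": {"bitwidth": 4}}  # assumed option
    print(quantize_model_sketch(toy_model, options))
```

Keeping the lookup in a single dict means a new quantizer only needs one extra entry, and an unrecognized option fails loudly instead of being silently skipped.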

@Jack-Khuu
Contributor Author

Sweet! I'll take a look
