Enable torchao.experimental EmbeddingQuantization #1520

Jack-Khuu · 2025-03-31T20:48:15Z

🚀 The feature, motivation and pitch

Quantization is a technique used to improve the speed and reduce the size or memory requirements of a model, and torchao is PyTorch's native quantization library for inference and training.

There are new experimental quantizations in torchao that we would like to enable in torchchat. Specifically this task is for enabling EmbeddingQuantizer and SharedEmbeddingQuantizer.

Entrypoint:

torchchat/torchchat/utils/quantize.py

Line 101 in 1384f7d

def quantize_model(

Task: Using ExecuTorch as a reference (pytorch/executorch#9548), add support for EmbeddingQuantizer and SharedEmbeddingQuantizer.

cc: @metascroy, @manuelcandales

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

dillondesilva · 2025-04-01T09:52:27Z

@Jack-Khuu I'd love to have a crack at this if possible! Would you mind assigning it to me?

Jack-Khuu · 2025-04-01T17:00:33Z

Totally, give it a shot

Jack-Khuu · 2025-04-07T21:06:51Z

Hi @dillondesilva, how's the task going? Any questions?

dillondesilva · 2025-04-08T01:52:08Z

Hey @Jack-Khuu - I've just been busy with mid-semester exams in the past week. Should have time to start sometime this week and will send questions soon 👍 Thanks for checking in!

dillondesilva · 2025-04-13T11:40:47Z

@Jack-Khuu Good news! Here's the PR -> #1525

I don't know if I've oversimplified it, so feel free to correct me if I'm wrong, but to enable the above experimental quantizers I'm assuming all that was needed was:

- A mapping in quantizer_class_dict to enable devs to specify when they are using one of the new experimental options (i.e. experimental:embedding and experimental:shared)
- Logic to run the respective quantizers if they have been specified
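For anyone following along, the two points above could look roughly like the sketch below. This is only an illustration, not the code in the PR: the stub classes stand in for the real torchao quantizers, the `quantize_model` signature here is simplified and hypothetical, the model is a plain dict, and the `bitwidth` option name is made up for the example. Only the `quantizer_class_dict` name and the `experimental:embedding` / `experimental:shared` keys come from the comment above.

```python
from typing import Any, Dict, Type


class _StubQuantizer:
    """Hypothetical placeholder for a quantizer; it only records the request."""

    name = "stub"

    def __init__(self, **options: Any) -> None:
        self.options = options

    def quantize(self, model: Dict[str, Any]) -> Dict[str, Any]:
        # A real quantizer would rewrite the embedding layers here.
        model.setdefault("applied", []).append((self.name, self.options))
        return model


class EmbeddingQuantizerStub(_StubQuantizer):
    name = "embedding"


class SharedEmbeddingQuantizerStub(_StubQuantizer):
    name = "shared"


# Point 1: map user-facing option keys to quantizer classes.
quantizer_class_dict: Dict[str, Type[_StubQuantizer]] = {
    "experimental:embedding": EmbeddingQuantizerStub,
    "experimental:shared": SharedEmbeddingQuantizerStub,
}


def quantize_model(
    model: Dict[str, Any], quantize_options: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Point 2: run each requested quantizer, in the order given."""
    for key, options in quantize_options.items():
        if key not in quantizer_class_dict:
            raise ValueError(f"Unknown quantizer: {key}")
        model = quantizer_class_dict[key](**options).quantize(model)
    return model


if __name__ == "__main__":
    result = quantize_model({}, {"experimental:embedding": {"bitwidth": 4}})
    print(result["applied"])
```

Keeping the dispatch table-driven means a new experimental quantizer only needs one more dictionary entry, and an unrecognised key fails loudly instead of being skipped.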

Jack-Khuu · 2025-04-15T17:03:12Z

Sweet! I'll take a look

Jack-Khuu added this to [torchchat] Looking for Contributors Mar 31, 2025

Jack-Khuu moved this to Ready in [torchchat] Looking for Contributors Mar 31, 2025

Jack-Khuu added the Quantization (issues related to quantization or torchao) and triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) labels Mar 31, 2025

Jack-Khuu assigned dillondesilva Apr 1, 2025
