[WIP] Quartet QAT support #38696


Open · wants to merge 1 commit into base: main

Conversation

BlackSamorez (Contributor)

This PR adds support for the Quartet QAT method.

The goal of this PR is to integrate inference and training support for the Quartet QAT method. This would allow both the forward and backward passes to be performed in MXFP4, enabling very fast training on Blackwell GPUs.

Currently, we're working on the kernels here, here, and here (some of the libs aren't public yet). We're planning to release the first version of the kernels this week and to have performance optimized by the end of June.
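
For readers unfamiliar with low-precision QAT, here is a rough, self-contained sketch of the idea (not the PR's code): fake-quantize both matmul operands to MXFP4 (E2M1 elements sharing one power-of-two scale per 32-element block) in the forward pass, and use a straight-through estimator in the backward pass. All names here (fake_quantize_mxfp4, QuartetStyleLinear) are illustrative, not from this PR:

    import torch

    # Positive magnitudes representable in FP4 E2M1 (the MXFP4 element format).
    _E2M1_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

    def fake_quantize_mxfp4(x, block_size=32):
        # Round-trip x through MXFP4: one shared power-of-two scale per block
        # (assumes x.numel() is divisible by block_size).
        flat = x.reshape(-1, block_size)
        amax = flat.abs().amax(dim=1, keepdim=True).clamp(min=1e-12)
        # Largest power of two that maps the block max onto the E2M1 max (6.0).
        scale = torch.exp2(torch.floor(torch.log2(amax / 6.0)))
        scaled = (flat / scale).clamp(-6.0, 6.0)
        grid = _E2M1_GRID.to(device=x.device, dtype=x.dtype)
        # Round each magnitude to the nearest representable grid point.
        idx = (scaled.abs().unsqueeze(-1) - grid).abs().argmin(dim=-1)
        return (grid[idx] * scaled.sign() * scale).reshape_as(x)

    class QuartetStyleLinear(torch.autograd.Function):
        # Straight-through estimator: quantized matmul in the forward pass,
        # gradients computed as if the quantizer were the identity.
        @staticmethod
        def forward(ctx, x, weight):  # x: (n, in), weight: (out, in)
            ctx.save_for_backward(x, weight)
            return fake_quantize_mxfp4(x) @ fake_quantize_mxfp4(weight).t()

        @staticmethod
        def backward(ctx, grad_out):
            x, weight = ctx.saved_tensors
            # Quartet's point is that these two matmuls also run in MXFP4 on
            # Blackwell hardware; full precision keeps the sketch short.
            return grad_out @ weight, grad_out.t() @ x

In the PR itself, this logic presumably lives behind QuartetLinear and dispatches to the qutlass kernels rather than eager PyTorch.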

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@Rocketknight1 (Member)

cc @MekkCyber

@MekkCyber (Contributor) left a comment

Hi @BlackSamorez! Thanks a lot for this addition 🤗! Left a few comments!

@@ -0,0 +1,49 @@
# Copyright 2024 The HuggingFace Team. All rights reserved.
@MekkCyber (Contributor)

Suggested change:
- # Copyright 2024 The HuggingFace Team. All rights reserved.
+ # Copyright 2025 The HuggingFace Team. All rights reserved.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"HIGGS through FLUTE (Flexible Lookup Table Engine for LUT-quantized LLMs) integration file"
@MekkCyber (Contributor)

Suggested change:
- "HIGGS through FLUTE (Flexible Lookup Table Engine for LUT-quantized LLMs) integration file"
+ "Quartet QAT integration file"

Comment on lines +22 to +24
if is_torch_available():
pass

@MekkCyber (Contributor)

we don't need this

@@ -0,0 +1,164 @@
# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
@MekkCyber (Contributor)

Suggested change:
- # Copyright 2024 The HuggingFace Inc. team. All rights reserved.
+ # Copyright 2025 The HuggingFace Inc. team. All rights reserved.

Comment on lines +36 to +38
Quantizer of the HIGGS method. Enables the loading of prequantized models and in-flight quantization of full-precision models.
"""

@MekkCyber (Contributor)

To be updated; this docstring still describes the HIGGS method.
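
Presumably something along these lines, mirroring the docstring it replaces (wording is a suggestion, not from the PR):

    """
    Quantizer of the Quartet QAT method. Enables the loading of prequantized
    models and in-flight quantization of full-precision models.
    """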

Comment on lines +1163 to +1165
def is_qutlass_available():
return _qutlass_available

@MekkCyber (Contributor)

I can't find a distribution for qutlass; is it not released yet?
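
For reference, transformers typically derives such flags from importlib at import time; a minimal sketch of that pattern, assuming the package will eventually be installable as qutlass:

    import importlib.util

    # Computed once at import time, following the pattern used for other
    # optional backends; "qutlass" as the installed package name is an
    # assumption, since no public distribution exists yet.
    _qutlass_available = importlib.util.find_spec("qutlass") is not None

    def is_qutlass_available():
        return _qutlass_available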

Comment on lines +125 to +128
for name, module in tqdm(quartet_qat_modules.items(), desc="Pre-processing Quartet QAT modules", leave=False):
pass
# module.pre_forward()

@MekkCyber (Contributor)

What's meant to happen here, exactly?

Comment on lines +160 to +164
if isinstance(module, QuartetLinear) and tensor_name == "weight":
# Only quantize weights of QuartetLinear modules that are not already quantized
return True
else:
return False

@MekkCyber (Contributor)

Is the bias quantized too?

Comment on lines +96 to +97
assert isinstance(module, QuartetLinear), f"Module {param_name} is not a QuartetLinear somehow..."

@MekkCyber (Contributor)

No need for an assert here; we can just raise an error instead.
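
For instance, the check could raise explicitly (a sketch reusing the names from the snippet above):

    if not isinstance(module, QuartetLinear):
        raise TypeError(
            f"Expected the module owning {param_name} to be a QuartetLinear, "
            f"got {type(module).__name__}."
        )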

Comment on lines +99 to +100
module.pre_forward()

@MekkCyber (Contributor)

What's happening here?

@BlackSamorez (Contributor, Author)

  1. Hadamard transform matrix initialization on the correct devices.
  2. Since it's a QAT method, we may or may not want to keep a full-precision copy of the weights. If the full-precision copy isn't needed, this function also deletes the .weight parameter after quantizing it. Here's the code.
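
Based on that description, pre_forward could look roughly like the following sketch; make_hadamard_matrix, quantize_mxfp4, and keep_master_weights are illustrative names, not the implementation linked above:

    def pre_forward(self):
        # 1. Lazily build the Hadamard transform matrix on the same
        #    device/dtype as the weight (hypothetical helper).
        if self.hadamard_matrix is None:
            self.hadamard_matrix = make_hadamard_matrix(
                self.in_features, device=self.weight.device, dtype=self.weight.dtype
            )
        # 2. Quantize the weight once; optionally drop the full-precision
        #    master copy when QAT doesn't need it for optimizer updates.
        if self.quantized_weight is None:
            self.quantized_weight = quantize_mxfp4(self.weight @ self.hadamard_matrix)
            if not self.keep_master_weights:
                del self.weight  # nn.Module drops it from _parameters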

@SunMarc self-requested a review June 12, 2025 15:31