
quantize to int4 does not work with constants #579


Open
pranavm-nvidia opened this issue Mar 18, 2025 · 0 comments
Labels
tripy Pull request for the tripy project

@pranavm-nvidia (Collaborator)

If we try to quantize a constant tensor to int4:

# assumes Tripy has been imported as `tp`
x = tp.Tensor([1.0, 2.0])                      # constant input
x = tp.quantize(x, scale=1.0, dtype=tp.int4)   # quantize the constant to int4
print(x)

IRs:

==== Trace IR ====
def main() -> (
    %t2 : tensor<?xi4:gpu:0>
):
    %t0 = constant(shape=(2,), dtype=float32, device=gpu:0) : tensor<2xf32:gpu:0>
    %t1 = constant(shape=(), dtype=float32, device=gpu:0) : tensor<f32:gpu:0>
    %t2 = quantize(%t0 : tensor<2xf32:gpu:0>, %t1 : tensor<f32:gpu:0>, dtype=int4, dim=None) : tensor<?xi4:gpu:0>
    return %t2

==== MLIR ====
module @"outs_%t2_0" {
  func.func @main() -> tensor<?xi4> {
    %cst_f32 = tensorrt.constant dense<[1.000000e+00, 2.000000e+00]> : tensor<2xf32>
    %cst_f32_0 = tensorrt.constant dense<1.000000e+00> : tensor<f32>
    %0 = tensorrt.quantize in(%cst_f32 : tensor<2xf32>) scale(%cst_f32_0 : tensor<f32>) -> tensor<?xi4>
    return %0 : tensor<?xi4>
  }
}

We get an error:

MTRTException: failed to run pass pipeline
    Error Code: 9: Skipping tactic 0x0000000000000000 due to exception [ir_verifier.cpp:run:555] Error while verifying IR
    For graph 
    For basic block @1
    For op    32: move: result0'.1-(int4[2][1]so[0]p[0], mem_prop=0) | __mye33-O:(int4[2][1]so[0], mem_prop=0)<entry>, stream = 0 // q_const_move_result0'.1
    For constant __mye33-O:(int4[2][1]so[0], mem_prop=0)<entry>
    Constant has payload element type none.
    IBuilder::buildSerializedNetwork: Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[[tensorrt.constant] (%t0) ...(Unnamed Layer* 2) [Quantize]]}.)
    (%t0) error: failed to translate function 'tensorrt_cluster' to a TensorRT engine
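
One guess on my part (not verified): TensorRT's int4 support is weight-only quantization, so a standalone quantize of a constant with no matching dequantize consumer may simply have no implementation, which would fit the "Could not find any implementation for node" message. A minimal sketch of the Q/DQ pairing to try, assuming tp.dequantize accepts the same scale and a target dtype:

x = tp.Tensor([1.0, 2.0])
q = tp.quantize(x, scale=1.0, dtype=tp.int4)         # int4 quantize of the constant
dq = tp.dequantize(q, scale=1.0, dtype=tp.float32)   # pair it with a dequantize back to float32
print(dq)

If this still fails, the problem more likely lies in the int4 constant payload itself ("Constant has payload element type none") rather than in the missing dequantize.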
@pranavm-nvidia added the tripy label on Mar 18, 2025
@pranavm-nvidia changed the title from "quantize to I4 does not work with constants" to "quantize to int4 does not work with constants" on Mar 18, 2025