AssertionError of act_observer when using SmoothQuant for Llama-13b #2033

Open
@kyang-06

Description

When I tried SmoothQuant with the following sample code snippet:

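from neural_compressor.torch.quantization import SmoothQuantConfig, convert, prepare

# fp32_model and example_inputs must be defined beforehand
# (the traceback below uses `model` and `example_prompts` for them).

def run_fn(model):
    # Calibration pass: run the prepared model on the example inputs.
    model(example_inputs)

quant_config = SmoothQuantConfig(alpha=0.5)
prepared_model = prepare(fp32_model, quant_config=quant_config, example_inputs=example_inputs)
run_fn(prepared_model)
q_model = convert(prepared_model)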

I got the following error:

AssertionError                            Traceback (most recent call last)
Cell In[7], line 11
      9 quant_config = SmoothQuantConfig(alpha=0.5)
     10 print(quant_config)
---> 11 prepared_model = prepare(model, quant_config=quant_config, example_inputs=example_prompts)
     12 run_fn(prepared_model)
     13 q_model = convert(prepared_model)
...
...
File ~/anaconda3/envs/intel-arc-py39/lib/python3.9/site-packages/intel_extension_for_pytorch/quantization/_smooth_quant.py:85, in SmoothQuantActivationObserver.__init__(self, act_observer, act_ic_observer, smooth_quant_enabled, dtype, qscheme, reduce_range, quant_min, quant_max, alpha, factory_kwargs, eps)
     75     self.act_obs = HistogramObserver(
     76         dtype=dtype,
     77         qscheme=qscheme,
   (...)
     82         eps=eps,
     83     )
     84 else:
---> 85     assert isinstance(act_observer, UniformQuantizationObserverBase), 'act_observer:' + str(act_observer)
     86     self.act_obs = act_observer
     87 # if smooth_quant_enabled is false, this observer acts as
     88 # a normal per-tensor observer

AssertionError: act_observer:<class 'torch.ao.quantization.observer.MinMaxObserver'>
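
The message prints `<class 'torch.ao.quantization.observer.MinMaxObserver'>`, which is how Python renders a class object, so it looks like the observer class itself (not an instance of it) reached the assertion. Below is a minimal sketch of that distinction. The two classes and both helper functions are hypothetical stand-ins written for illustration, not the real torch or intel_extension_for_pytorch code, so the snippet runs without either library installed.

```python
# Stand-in classes (assumptions for illustration only); the real ones live in
# torch.ao.quantization.observer.
class UniformQuantizationObserverBase:
    pass


class MinMaxObserver(UniformQuantizationObserverBase):
    pass


def check_act_observer(act_observer):
    # Same shape as the assertion shown at _smooth_quant.py:85 in the traceback.
    assert isinstance(act_observer, UniformQuantizationObserverBase), (
        "act_observer:" + str(act_observer)
    )
    return act_observer


def fails_check(act_observer):
    # Hypothetical helper: report whether the assertion above would fire.
    try:
        check_act_observer(act_observer)
    except AssertionError:
        return True
    return False


# A class object is not an instance of its own base class, so the check fires.
class_fails = fails_check(MinMaxObserver)

# An instance of the same class passes the check.
instance_fails = fails_check(MinMaxObserver())
```

This only illustrates why an `isinstance` check rejects a class object; it does not show where in the library stack the class is handed over instead of an instance.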

My environment is listed below:

torch                            2.1.0a0+cxx11.abi
neural_compressor                3.0.2
neural_compressor_3x_pt          2.6
intel-extension-for-pytorch      2.1.10+xpu
intel-extension-for-transformers 1.2.1
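
For anyone who wants to compare against the versions above, here is a small sketch that reads installed versions with the standard-library `importlib.metadata`. The package names are copied from the list above; the `installed_versions` helper is a hypothetical name introduced here, and a package that is not installed is reported as `None`.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the environment list in this issue.
PACKAGES = [
    "torch",
    "neural_compressor",
    "neural_compressor_3x_pt",
    "intel-extension-for-pytorch",
    "intel-extension-for-transformers",
]


def installed_versions(names):
    # Map each distribution name to its installed version, or None if absent.
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:35s} {ver}")
```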
