Description
When I tried SmoothQuant with the following sample code snippet:
from neural_compressor.torch.quantization import SmoothQuantConfig, convert, prepare
def run_fn(model):
    model(example_inputs)
quant_config = SmoothQuantConfig(alpha=0.5)
prepared_model = prepare(fp32_model, quant_config=quant_config, example_inputs=example_inputs)
run_fn(prepared_model)
q_model = convert(prepared_model)
I got the following error:
AssertionError Traceback (most recent call last)
Cell In[7], line 11
9 quant_config = SmoothQuantConfig(alpha=0.5)
10 print(quant_config)
---> 11 prepared_model = prepare(model, quant_config=quant_config, example_inputs=example_prompts)
12 run_fn(prepared_model)
13 q_model = convert(prepared_model)
...
...
File ~/anaconda3/envs/intel-arc-py39/lib/python3.9/site-packages/intel_extension_for_pytorch/quantization/_smooth_quant.py:85, in SmoothQuantActivationObserver.__init__(self, act_observer, act_ic_observer, smooth_quant_enabled, dtype, qscheme, reduce_range, quant_min, quant_max, alpha, factory_kwargs, eps)
75 self.act_obs = HistogramObserver(
76 dtype=dtype,
77 qscheme=qscheme,
(...)
82 eps=eps,
83 )
84 else:
---> 85 assert isinstance(act_observer, UniformQuantizationObserverBase), 'act_observer:' + str(act_observer)
86 self.act_obs = act_observer
87 # if smooth_quant_enabled is false, this observer acts as
88 # a normal per-tensor observer
AssertionError: act_observer:<class 'torch.ao.quantization.observer.MinMaxObserver'>
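A note on reading the assertion message: the text after 'act_observer:' is `<class '...MinMaxObserver'>`, which is how Python prints a class object, not an instance of that class. An `isinstance` check against a base class fails for the class object itself even when the class is a subclass of that base. The sketch below illustrates this with stand-in classes (hypothetical; they only mimic the inheritance relationship and are not the real torch or intel_extension_for_pytorch observers). Why a class rather than an instance reaches the observer here is not shown by the traceback, and this sketch does not try to answer that.

```python
# Stand-in classes (hypothetical). They only mimic the relationship between
# UniformQuantizationObserverBase and MinMaxObserver; no torch import is needed.
class UniformQuantizationObserverBase:
    pass


class MinMaxObserver(UniformQuantizationObserverBase):
    pass


def describe(act_observer):
    """Report how an object relates to the observer base class."""
    is_class = isinstance(act_observer, type)
    return {
        "is_class": is_class,
        # This is the check the failing assertion performs.
        "isinstance_of_base": isinstance(
            act_observer, UniformQuantizationObserverBase
        ),
        # issubclass only makes sense when the object is itself a class.
        "issubclass_of_base": is_class
        and issubclass(act_observer, UniformQuantizationObserverBase),
    }


passed_class = describe(MinMaxObserver)       # the class object itself
passed_instance = describe(MinMaxObserver())  # an instantiated observer

print(passed_class)
print(passed_instance)
```

With the class object, `isinstance_of_base` is False while `issubclass_of_base` is True, which matches the shape of the message in the traceback; with an instance, the `isinstance` check passes.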
Below is my environment:
torch 2.1.0a0+cxx11.abi
neural_compressor 3.0.2
neural_compressor_3x_pt 2.6
intel-extension-for-pytorch 2.1.10+xpu
intel-extension-for-transformers 1.2.1
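For anyone comparing against the versions listed above, a minimal sketch for collecting them from the active environment with the standard library follows. The package names are copied from the listing; `installed_versions` is a hypothetical helper name, and a package that pip does not know under that exact name is simply reported as not installed.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the environment listing in this report.
PACKAGES = [
    "torch",
    "neural_compressor",
    "neural_compressor_3x_pt",
    "intel-extension-for-pytorch",
    "intel-extension-for-transformers",
]


def installed_versions(names):
    """Map each distribution name to its installed version string."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is absent from this environment.
            found[name] = "not installed"
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:40s} {ver}")
```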