
Model quantize error #598

Open
sailfish009 opened this issue Aug 28, 2024 · 2 comments

sailfish009 commented Aug 28, 2024

Hello. I am getting an error when running the sample below.

The required preprocessor_config.json file does not exist in the original model repository, so I copied the one from a model in the same family (linked in the code comment below).

from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = 'OpenGVLab/ASMv2'
quant_path = 'ASMv2-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }

# preprocessor_config.json copied from:
# https://huggingface.co/liuhaotian/llava-v1.6-34b-tokenizer/blob/main/preprocessor_config.json

# Load model
model = AutoAWQForCausalLM.from_pretrained(
    model_path, device_map="cuda", low_cpu_mem_usage=True, safetensors=False
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Quantize
model.quantize(tokenizer, quant_config=quant_config)

# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)

print(f'Model is quantized and saved at "{quant_path}"')

The source location where the error occurs is shown below.

#     AutoAWQ/awq/quantize/quantizer.py, line 407
        if best_ratio == -1:
            logging.debug(history)
            raise Exception
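
For context, this check sits at the end of AutoAWQ's per-layer grid search over scaling ratios. best_ratio only stays at -1 when no candidate ratio ever beats the initial infinite loss, which in practice means every candidate produced a NaN or inf loss (for example, from fp16 overflow in the activations). The self-contained toy below mimics that logic to show the failure mode; it is a sketch, not the library's actual code, and compute_loss, x_mean, and w_mean are illustrative stand-ins:

import logging
import torch

n_grid = 20
x_mean = torch.rand(16) + 0.1   # stand-in activation magnitudes
w_mean = torch.rand(16) + 0.1   # stand-in weight magnitudes

def compute_loss(scales):
    # stand-in for the MSE between fp16 and quantized layer outputs;
    # returns NaN here to mimic numerical overflow during calibration
    return float("nan")

best_ratio, best_error, history = -1, float("inf"), []
for i in range(n_grid):
    ratio = i / n_grid
    scales = (x_mean.pow(ratio) / w_mean.pow(1 - ratio)).clamp(min=1e-4)
    loss = compute_loss(scales)
    history.append(loss)
    if loss < best_error:   # NaN < x is always False, so a NaN loss never wins
        best_error, best_ratio = loss, ratio

if best_ratio == -1:        # mirrors the check at quantizer.py, line 407
    logging.debug(history)
    raise Exception

In other words, the raised Exception is a symptom: every candidate scale produced a non-finite loss for at least one layer, so the search had nothing to pick.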
raghavgarg97 commented Sep 5, 2024

I am facing the same issue while quantizing gemma-2-27B using

model.quantize(tokenizer, quant_config=quant_config, calib_data=data_final, max_calib_seq_len=4096, max_calib_samples=256, n_parallel_calib_samples=10)

It fails after 35% of the steps are completed.

@casper-hansen any idea how to fix this?
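
For reference, calib_data here is a user-supplied calibration set; AutoAWQ's quantize() accepts either a dataset name or a list of raw text samples. A minimal sketch of how data_final might be built, assuming it is a list of strings (the dataset and length filter below are illustrative, not from the original report):

from datasets import load_dataset

# hypothetical construction of data_final as a list of text samples
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
data_final = [t for t in dataset["text"] if len(t.split()) > 50][:256]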

raghavgarg97 commented Sep 10, 2024

@casper-hansen do you think this could be a Gemma-2 model support issue? I am currently building AWQ from the main branch of this repo.
