LLaMA3.1 with DDM-SFT #8

Open
lmmlzn opened this issue Mar 24, 2025 · 2 comments

Comments


lmmlzn commented Mar 24, 2025

After fine-tuning LLaMA3.1 with DDM-SFT and saving the model, why does it fail to load with the following error?

"The state dictionary of the model you are trying to load is corrupted. Are you sure it was properly saved?"

Loading method:

import torch
from transformers import LlamaForCausalLM

base_model = LlamaForCausalLM.from_pretrained(
    model_name,
    device_map='auto',
    _attn_implementation=args.flash_attn,
    torch_dtype=torch.bfloat16,
)

model = DiscreteDiffusionModel(
    model=base_model,
    config=config,
    tokenizer=tokenizer,
    device='cuda',
)
model.eval()
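One common cause of this kind of load failure is a wrapper class renaming parameters, so the checkpoint's keys no longer match what `from_pretrained` expects. A minimal sketch of diagnosing that, with a hypothetical helper `diff_state_dict_keys` and made-up keys (the `model.` prefix here is only an assumed example of what a wrapper like `DiscreteDiffusionModel` might add):

```python
def diff_state_dict_keys(expected_keys, checkpoint_keys):
    """Report keys missing from the checkpoint and keys the model
    does not expect -- the usual symptom of a wrapper renaming
    parameters (e.g. adding an extra 'model.' prefix)."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    missing = sorted(expected - found)
    unexpected = sorted(found - expected)
    return missing, unexpected

# Hypothetical keys for illustration only: the base model expects the
# first set, while a wrapper may have saved everything under 'model.'.
base = ["model.embed_tokens.weight", "lm_head.weight"]
saved = ["model.model.embed_tokens.weight", "model.lm_head.weight"]

missing, unexpected = diff_state_dict_keys(base, saved)
# missing    -> ['lm_head.weight', 'model.embed_tokens.weight']
# unexpected -> ['model.lm_head.weight', 'model.model.embed_tokens.weight']
```

In practice you would fill `base` from the freshly constructed model's `state_dict().keys()` and `saved` from the checkpoint's keys; if both lists are non-empty with matching suffixes, the problem is a naming mismatch rather than true corruption.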

@summmeer (Contributor) commented

Hi, if you used LLaMA-Factory to do the DDM-SFT, please also use it to load the model and run inference; refer to examples/inference/llama2_full_ddm-gsm-inf.yaml. Otherwise, there may be a mismatch between parameter names. However, judging from your error message, it looks more like a saving bug where the model was not properly saved. You can test the save function by running only a few training steps.
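The suggestion to test the save function after a few training steps amounts to a save/load round-trip check. A minimal stand-in sketch, using plain pickle in place of torch.save/torch.load (which are pickle-based) and lists in place of tensors, so it stays self-contained:

```python
import io
import pickle

def round_trip(state_dict):
    """Serialize and immediately deserialize a state dict, mimicking a
    torch.save / torch.load round trip, to catch corruption at save time
    rather than much later at inference time."""
    buf = io.BytesIO()
    pickle.dump(state_dict, buf)
    buf.seek(0)
    return pickle.load(buf)

# Stand-in state dict: lists instead of torch tensors.
state = {
    "model.embed_tokens.weight": [0.1, 0.2],
    "lm_head.weight": [0.3, 0.4],
}
reloaded = round_trip(state)
assert reloaded == state  # keys and values survive the round trip
```

With the real training code, the same idea is: train for a handful of steps, call the trainer's save routine, then immediately reload the checkpoint and compare keys (and a few tensor values) against the in-memory model before running any long job.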


lmmlzn commented Mar 25, 2025

Thanks!
