LLaMA3.1 with DDM-SFT #8

Open
lmmlzn opened this issue Mar 24, 2025 · 2 comments

Comments


lmmlzn commented Mar 24, 2025

After fine-tuning LLaMA3.1 with DDM-SFT and saving the model, why does it fail to load with the following error?

"The state dictionary of the model you are trying to load is corrupted. Are you sure it was properly saved?"

Loading method:

import torch
from transformers import LlamaForCausalLM

base_model = LlamaForCausalLM.from_pretrained(
    model_name,
    device_map='auto',
    _attn_implementation=args.flash_attn,
    torch_dtype=torch.bfloat16,
)

model = DiscreteDiffusionModel(
    model=base_model,
    config=config,
    tokenizer=tokenizer,
    device='cuda',
)
model.eval()
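One common cause of this kind of load failure is a wrapper class renaming parameters, so the checkpoint's keys no longer match what `from_pretrained` expects. A minimal sketch of diagnosing that, with a hypothetical helper `diff_state_dict_keys` and made-up keys (the `model.` prefix here is only an assumed example of what a wrapper like `DiscreteDiffusionModel` might add):

```python
def diff_state_dict_keys(expected_keys, checkpoint_keys):
    """Report keys missing from the checkpoint and keys the model
    does not expect -- the usual symptom of a wrapper renaming
    parameters (e.g. adding an extra 'model.' prefix)."""
    expected, found = set(expected_keys), set(checkpoint_keys)
    missing = sorted(expected - found)
    unexpected = sorted(found - expected)
    return missing, unexpected

# Hypothetical keys for illustration only: the base model expects the
# first set, while a wrapper may have saved everything under 'model.'.
base = ["model.embed_tokens.weight", "lm_head.weight"]
saved = ["model.model.embed_tokens.weight", "model.lm_head.weight"]

missing, unexpected = diff_state_dict_keys(base, saved)
# missing    -> ['lm_head.weight', 'model.embed_tokens.weight']
# unexpected -> ['model.lm_head.weight', 'model.model.embed_tokens.weight']
```

In practice you would fill `base` from the freshly constructed model's `state_dict().keys()` and `saved` from the checkpoint's keys; if both lists are non-empty with matching suffixes, the problem is a naming mismatch rather than true corruption.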

@summmeer (Contributor) commented

Hi, if you used LLaMA-Factory to do the DDM-SFT, please also use it to load the model and run inference; refer to examples/inference/llama2_full_ddm-gsm-inf.yaml. Otherwise, there may be a mismatch between parameter names. However, judging from your error message, it looks more like a saving bug where the model was not properly saved. You can test the save function by running only a few training steps.
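The suggestion to test the save function after a few training steps amounts to a save/load round-trip check. A minimal stand-in sketch, using plain pickle in place of torch.save/torch.load (which are pickle-based) and lists in place of tensors, so it stays self-contained:

```python
import io
import pickle

def round_trip(state_dict):
    """Serialize and immediately deserialize a state dict, mimicking a
    torch.save / torch.load round trip, to catch corruption at save time
    rather than much later at inference time."""
    buf = io.BytesIO()
    pickle.dump(state_dict, buf)
    buf.seek(0)
    return pickle.load(buf)

# Stand-in state dict: lists instead of torch tensors.
state = {
    "model.embed_tokens.weight": [0.1, 0.2],
    "lm_head.weight": [0.3, 0.4],
}
reloaded = round_trip(state)
assert reloaded == state  # keys and values survive the round trip
```

With the real training code, the same idea is: train for a handful of steps, call the trainer's save routine, then immediately reload the checkpoint and compare keys (and a few tensor values) against the in-memory model before running any long job.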


lmmlzn commented Mar 25, 2025

Thanks!
