
Modify merge_and_overwrite_lora to Support Local Directories to Merge Base Models #100


Open
wants to merge 1 commit into base: nightly

Conversation


@johnray22 johnray22 commented Mar 26, 2025

When training with a locally stored model, several problems arise.
This PR is related to unslothai/unsloth#1695, unslothai/unsloth#2150, and unslothai/unsloth#2140.

This PR fixes an issue in the merge_and_overwrite_lora function, which previously assumed that the model_name parameter was always a Hugging Face Hub repository ID. This assumption caused a FileNotFoundError when a local directory path was provided, as the function attempted to download files from the Hub instead of using the local files.

Changes Made
1. Local Directory Check: Added an os.path.isdir(model_name) check to detect whether model_name is a local directory or a Hugging Face repository ID.
2. Local File Handling: When model_name is a local directory, the function now merges the LoRA weights into the .safetensors files from that directory, avoiding unnecessary downloads from the Hub.
3. Updated _merge_and_overwrite_lora: The internal _merge_and_overwrite_lora function now accepts an optional local_file_path parameter, enabling it to process local files directly when specified.
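The directory check described above can be sketched as a small helper. This is a minimal illustration of the dispatch idea, not code from the PR; the helper name `find_local_shards` is hypothetical, and the caller is assumed to fall back to the original Hub-download path when it returns None.

```python
import glob
import os
import tempfile

def find_local_shards(model_name):
    """If model_name is a local directory, return its .safetensors shards
    (sorted) so the merge can operate on them directly; return None for a
    Hub repository ID so the caller keeps the original download logic."""
    if os.path.isdir(model_name):
        return sorted(glob.glob(os.path.join(model_name, "*.safetensors")))
    return None

# Demo with a throwaway directory standing in for a locally saved base model.
with tempfile.TemporaryDirectory() as d:
    for name in ("model-00002-of-00002.safetensors",
                 "model-00001-of-00002.safetensors",
                 "config.json"):
        open(os.path.join(d, name), "w").close()
    shards = find_local_shards(d)
    print([os.path.basename(p) for p in shards])  # the two shards, sorted
    print(find_local_shards("unsloth/some-repo-id"))  # None: not a local dir
```

Only the .safetensors shards are matched; config files in the same directory are ignored, since the merge only touches weight files.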

**Note: these steps are needed:**

  1. The "base_model_name_or_path" field in adapter_config.json in the LoRA adapter folder must be set to the base model's path (e.g. the 16-bit model path).
  2. After merging, the base model's config files (everything except the .safetensors files) must be copied to the folder containing the merged model.

In theory, a 'base_model_path' parameter could be added to the 'model.save_pretrained_merged' function and passed through to 'merge_and_overwrite_lora', so the JSON file would not need to be modified manually, but that would require changing save.py in the unsloth library.
Also, the config files produced by the merge do not seem to work; after copying the base model's files (except the 'safetensors' ones) to the output path, it does work. (I don't know why the config files generated by the merging process don't work, so I didn't add that logic.)
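The two manual steps can be scripted roughly as follows. This is a sketch in the spirit of the linked Colab script, not code from this PR: all paths are placeholders, the function name is hypothetical, and the merge call itself is elided.

```python
import json
import os
import shutil

def prepare_and_finalize(adapter_dir, base_model_dir, merged_dir):
    # Step 1: point the adapter at the local 16-bit base model before merging.
    cfg_path = os.path.join(adapter_dir, "adapter_config.json")
    with open(cfg_path) as f:
        cfg = json.load(f)
    cfg["base_model_name_or_path"] = base_model_dir
    with open(cfg_path, "w") as f:
        json.dump(cfg, f, indent=2)

    # ... run the merge here, writing the merged weights into merged_dir ...

    # Step 2: copy the base model's config files (everything except the
    # weight shards) next to the merged weights.
    for name in os.listdir(base_model_dir):
        if not name.endswith(".safetensors"):
            shutil.copy2(os.path.join(base_model_dir, name),
                         os.path.join(merged_dir, name))
```

Step 2 deliberately skips the .safetensors files so the merged shards are not overwritten by the unmerged base weights.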

I wrote a merging script to do the two steps: https://colab.research.google.com/drive/1PGr1LwebNfuuu2XIULsruIfI3e8L9_rS#scrollTo=3TSHXt7e0tmq

…h previously assumed that the model_name parameter was always a Hugging Face Hub repository ID. This assumption caused a FileNotFoundError when a local directory path was provided, as the function attempted to download files from the Hub instead of using the local files.

Changes Made

1. Local Directory Check: Added an os.path.isdir(model_name) check to detect whether model_name is a local directory or a Hugging Face repository ID.
2. Local File Handling: When model_name is a local directory, the function now merges the LoRA weights into the .safetensors files from that directory, avoiding unnecessary downloads from the Hub.
3. Updated _merge_and_overwrite_lora: The internal _merge_and_overwrite_lora function now accepts an optional local_file_path parameter, enabling it to process local files directly when specified.
4. Preserved Original Functionality: The existing logic for Hugging Face repository IDs is kept intact, including support for pushing the merged model to the Hub.

Note that the "base_model_name_or_path" field in adapter_config.json in the LoRA adapter folder must be set to the base model's path (e.g. the 16-bit model path).

After merging, the base model's config files (everything except the .safetensors files) must be copied to the folder containing the merged model.
Collaborator

@rolandtannous rolandtannous left a comment


The approach might need to change based on the comments I shared. Happy to discuss, though.

```python
    ]
else:
    try:
        file_list = HfFileSystem(token=token).ls(model_name, detail=True)
```
Collaborator


What happens here with .ls(model_name) in the case where the user trains and saves a model in a local subfolder called, for example, "gemma-3-finetune", and is trying to push it to a repo "user_name/gemma-3-finetune"?
ls(model_name) would try to remotely list the files at the "gemma-3-finetune" HF repo, which might lead to an error, since the "username" for the remote repo appears to be missing from the argument passed to the ls method.
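One way to guard against the missing-namespace case the reviewer describes is to qualify a bare folder name before calling ls. This is a hypothetical helper to illustrate the point, not code from the PR or from huggingface_hub:

```python
def qualify_repo_id(model_name, namespace):
    """If model_name has no namespace (e.g. a local folder name like
    'gemma-3-finetune'), prefix it with the target user/org so a remote
    listing hits the intended repo instead of a bare top-level name."""
    return model_name if "/" in model_name else f"{namespace}/{model_name}"

print(qualify_repo_id("gemma-3-finetune", "user_name"))
# → user_name/gemma-3-finetune
print(qualify_repo_id("user_name/gemma-3-finetune", "user_name"))
# → user_name/gemma-3-finetune (already qualified, unchanged)
```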

Collaborator


Three more considerations here:

  1. Why not implement the change in the parent merge_and_overwrite_lora function, which calls the internal _merge_and_overwrite_lora function you are amending here? The parent function contains the same logic you are trying to change, so check that first.
  2. Your addition of the local_file_path argument to _merge_and_overwrite_lora will always default to None if you don't also edit how the function is currently called by merge_and_overwrite_lora.
  3. Unrelated to this specific bug, but it might be a good idea to add logic making merge_and_overwrite_lora check whether the model has already been merged and overwritten, e.g. in cases where users called the _save_pretrained_merged() method beforehand, in order to avoid duplicate merging effort, since that impacts code in the merge_and_overwrite_lora function.
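The third consideration could look something like the following guard. This is a rough sketch under assumed file-layout conventions (adapter files named adapter_config.json / adapter_model.safetensors, as PEFT commonly writes them); it is not code from the PR or from unsloth:

```python
import os

def looks_already_merged(save_dir):
    """Heuristic: treat save_dir as already merged if it holds full-model
    weight shards but no standalone LoRA adapter files."""
    if not os.path.isdir(save_dir):
        return False
    files = os.listdir(save_dir)
    has_weights = any(f.endswith(".safetensors") for f in files)
    has_adapter = ("adapter_model.safetensors" in files
                   or "adapter_config.json" in files)
    return has_weights and not has_adapter
```

merge_and_overwrite_lora could call such a check on its output directory and return early, avoiding the duplicate merging effort the reviewer mentions.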

@rolandtannous
Collaborator

Hey @johnray22, thank you for your effort.
These issues have now been resolved by this PR that was merged yesterday!
#135

Thank you for being part of the community.

@shimmyshimmer can we please close this as it has been superseded by the PR we merged yesterday?
