
finetune results #396

Open · FHT-hub opened this issue Mar 27, 2024 · 4 comments

Comments

@FHT-hub commented Mar 27, 2024

Following the command in seamless_communication/src/seamless_communication/cli/m4t/finetune/README.md, I get a checkpoint.pt after finetuning. I want to know how to verify the results of this checkpoint.pt.

@zrthxn (Contributor) commented Mar 30, 2024

m4t_evaluate doesn't currently support loading a finetuned model saved locally by passing a path to it, the way Hugging Face models can be loaded. I am working on fixing this.
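As a stopgap, if you can produce translations with the fine-tuned weights by other means, you can score them against references yourself. A minimal sketch, assuming sacrebleu is installed and that hyp.txt / ref.txt are line-aligned hypothesis and reference files (hypothetical file names, not part of this repo):

import sacrebleu

# Read line-aligned hypotheses and references (one sentence per line).
with open("hyp.txt", encoding="utf-8") as f:
    hypotheses = [line.strip() for line in f]
with open("ref.txt", encoding="utf-8") as f:
    references = [line.strip() for line in f]

# corpus_bleu expects a list of hypothesis strings and a list of
# reference streams (here, a single reference per sentence).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")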

@FHT-hub (Author) commented Apr 1, 2024

Thank you for your reply, but I may not have phrased my question clearly, which led to a misunderstanding.
What I mean is: using the command in seamless_communication/src/seamless_communication/cli/m4t/finetune/README.md, I can get a fine-tuned checkpoint.pt. How can I use this fine-tuned checkpoint.pt for tasks like S2ST? I have looked inside this checkpoint.pt and found that its weights are not stored in the format needed to run tasks like S2ST.

@aranemini commented

I found that the fine-tuned checkpoint doesn't have the same format as seamlessM4T_v2_large. Two possible solutions are changing the loading code or saving the checkpoint in the accepted format. Modifying the _save_model function as follows works for me:

def _save_model(self) -> None:
    logger.info("Saving model")
    if dist_utils.is_main_process():
        # Strip the DDP wrapper prefix from parameter names.
        state_dict = {
            key.replace("module.model.", ""): value
            for key, value in self.model.state_dict().items()
        }
        # Nest the weights under a top-level "model" key, matching
        # the layout of the released checkpoints.
        state_dict = {"model": state_dict}
        torch.save(state_dict, self.params.save_model_path)
    if dist_utils.is_dist_initialized():
        dist.barrier()
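A quick way to sanity-check the re-saved file (a sketch; "checkpoint.pt" stands in for whatever save_model_path points to):

import torch

# Load the re-saved checkpoint and confirm the expected layout:
# a top-level "model" key, with names free of the DDP prefix.
ckpt = torch.load("checkpoint.pt", map_location="cpu")
assert "model" in ckpt, "weights should be nested under a 'model' key"
sample_key = next(iter(ckpt["model"]))
assert not sample_key.startswith("module."), sample_key
print(f"OK: {len(ckpt['model'])} tensors, e.g. {sample_key}")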

@zrthxn (Contributor) commented Apr 14, 2024

@aranemini the model loading code calls fairseq2 directly, and getting changes merged there may take a long time. Your change to _save_model looks like the best option.
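Until a proper fix lands, one possible way to run S2ST with the re-saved checkpoint is to build a Translator from the base model card and overwrite its weights in place. A sketch with two assumptions: the checkpoint was re-saved in the nested format above, and the Translator exposes its underlying model as translator.model (the file names and target language below are placeholders):

import torch
from seamless_communication.inference import Translator

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Build the translator from the released card, then swap in the
# fine-tuned weights (assumes translator.model is the underlying model).
translator = Translator("seamlessM4T_v2_large", "vocoder_v2", device=device)
state = torch.load("checkpoint.pt", map_location="cpu")
translator.model.load_state_dict(state["model"])

# Run speech-to-speech translation on a sample file.
text_output, speech_output = translator.predict(
    input="input.wav",  # placeholder audio path
    task_str="S2ST",
    tgt_lang="spa",
)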
