Modifying the model's hyperparameters #124
Yes, you can change the model hyperparameters directly in this file: https://github.com/intel/neural-speed/blob/main/neural_speed/convert/convert_llama.py#L1159-L1176. Take llama as an example: just modify the values and re-run the script to get a customized GGUF file.
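As a rough sketch of what that edit amounts to (the names below, such as `DEFAULT_HPARAMS` and `customized_hparams`, are illustrative and are not the actual identifiers in `convert_llama.py`): the convert script assembles a set of hyperparameters that get written into the GGUF header, and you override entries before the file is written.

```python
# Hypothetical sketch of overriding model hyperparameters before GGUF export.
# DEFAULT_HPARAMS and customized_hparams are illustrative names, not the
# actual identifiers used in neural_speed/convert/convert_llama.py.

DEFAULT_HPARAMS = {
    "n_ctx": 4096,          # context length
    "n_embd": 4096,         # embedding size
    "n_head": 32,           # attention heads
    "n_layer": 32,          # transformer layers
    "rope_freq_base": 10000.0,
}

def customized_hparams(overrides):
    """Return a copy of the defaults with user overrides applied."""
    hparams = dict(DEFAULT_HPARAMS)
    for key, value in overrides.items():
        if key not in hparams:
            raise KeyError(f"unknown hyperparameter: {key}")
        hparams[key] = value
    return hparams

if __name__ == "__main__":
    # e.g. extend the context window before re-running the conversion
    hparams = customized_hparams({"n_ctx": 8192})
    print(hparams["n_ctx"])
```

The point is simply that the hyperparameters are ordinary Python values in the convert script, so changing them and re-running produces a GGUF with the new values baked in.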
Yes. Once you have a GGUF model file, you can run this script; it will use the tokenizer embedded in the GGUF.
Thank you very much, @Zhenzhong1!
Hi @Zhenzhong1, so I tried what you advised, and:
I saw all the parameters for loading the model, but not the inference parameters (temperature, top_p, etc.).
Do I need a customized .gguf file for this command line to run, or would one I just downloaded as-is work? (It currently doesn't.)
Never mind my second question, I forgot it had to be Q4 ^^
Ah, and while I'm at it: when using the CLI, if I get an error, it freezes the terminal. Would you have a trick to avoid that?
@benjamin27315k just type
You don't need a customized GGUF if you only want to modify inference parameters. Inference parameters are input arguments, so just set them on the command line. Please check https://github.com/intel/neural-speed/blob/main/docs/advanced_usage.md for more inference parameters.
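To illustrate the distinction, here is a minimal sketch of assembling such a run command, with inference parameters passed as flags rather than baked into the GGUF. The script path (`scripts/run.py`) and flag spellings (`--temperature`, `--top_p`, `--top_k`) are assumptions modeled on typical llama.cpp-style runners; `docs/advanced_usage.md` has the authoritative list.

```python
# Sketch: building a run command where inference parameters are CLI args.
# The script name "scripts/run.py" and the flag spellings below are
# assumptions; see docs/advanced_usage.md in neural-speed for the real ones.

def build_run_command(model_path, prompt, **params):
    """Assemble an argv list; each inference param becomes a --key value flag."""
    cmd = ["python", "scripts/run.py", model_path, "-p", prompt]
    for key, value in params.items():
        cmd += [f"--{key}", str(value)]
    return cmd

if __name__ == "__main__":
    argv = build_run_command(
        "llama-2-7b-q4.gguf",
        "Once upon a time",
        temperature=0.7,
        top_p=0.9,
        top_k=40,
    )
    print(" ".join(argv))
```

Changing `temperature` or `top_p` here only changes the command line; the GGUF file itself stays untouched, which is why no re-conversion is needed.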
@benjamin27315k, I will close this issue if you don't have any further concerns.
Hello there,
I'm new to Neural Speed, coming from llama-cpp-python, and I've encountered some problems (probably due to a misunderstanding on my side).
I don't want to flood you with issues, so I'll start with my two main questions:
Thank you !