How could I set the stop sequence for inference like in CodeLlama 70B? #666
davideuler
started this conversation in
General
Replies: 1 comment 6 replies
-
This is more of an application-level implementation issue than an mlx issue. I have implemented this in one of my projects; you can take a look here.
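A minimal sketch of the application-level approach: accumulate the streamed token text and cut the output at the first occurrence of any stop sequence. The `token_stream` generator and all names here are illustrative assumptions, not MLX's actual API; searching the accumulated text (rather than individual tokens) also handles stop sequences that span token boundaries.

```python
def generate_with_stop(token_stream, stop_sequences):
    """Accumulate streamed token text; truncate at the first stop sequence.

    token_stream: any iterable yielding decoded text chunks (hypothetical
    stand-in for a model's streaming output).
    stop_sequences: list of strings at which generation should be cut off.
    """
    text = ""
    for chunk in token_stream:
        text += chunk
        for stop in stop_sequences:
            idx = text.find(stop)
            if idx != -1:
                # Drop the stop sequence itself and everything after it.
                return text[:idx]
    return text

# Example with a fake token stream; note the stop sequence is split
# across two chunks, which the accumulated-text search still catches.
chunks = ["Hello", " world", "<st", "ep>", " ignored"]
print(generate_with_stop(chunks, ["<step>"]))  # -> "Hello world"
```

In a real setup you would break out of the model's generation loop as soon as a match is found, so no further tokens are sampled.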
-
When running inference with CodeLlama 70B, I need to specify a stop sequence, as I do in llama.cpp or Ollama.
When I run the CodeLlama 70B 4-bit MLX model, it outputs lots of EOT tokens and never stops. I am not sure whether this is caused by the stop-sequence settings. How can I set the stop sequence in MLX?