Open
Labels: enhancement (New feature or request)
Description
Currently we hard-code prompt templates in ExecuTorch LLM apps. But HF tokenizers already know how to apply the chat template, e.g.:
```python
prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True  # Switches between thinking and non-thinking modes. Default is True.
)
```
This makes it very easy to use HF models from Python with the correct template. Can we have similar logic in our C++ runners?
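To illustrate what a runner-side implementation would need to replicate, here is a minimal hand-rolled sketch of a ChatML-style template in Python. The function name and the template string are illustrative assumptions; the real HF behavior comes from a Jinja2 template stored in the tokenizer's `tokenizer_config.json` (`chat_template` field), which varies per model:

```python
def apply_chat_template_sketch(messages, add_generation_prompt=True):
    # Hypothetical minimal renderer for a ChatML-style template.
    # Real HF tokenizers render a per-model Jinja2 template instead,
    # so a C++ runner would either embed a Jinja engine or support
    # a fixed set of known templates.
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open the assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = "Give me a short introduction to large language model."
text = apply_chat_template_sketch([{"role": "user", "content": prompt}])
```

A C++ runner could take the same approach: read the template from the exported tokenizer artifacts and render the message list before tokenizing, instead of hard-coding the prompt format per app.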