Update README to include chat templating

The existing README examples produce underwhelming or incorrect results (see "Example is underwhelming", dottxt-ai#1347) because prompts sent to instruct models lack chat templating. This change adds special tokens to each instruct model call and adds comments on how to obtain or produce those tokens.
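
For reviewers who want to see the templating without downloading a model, here is a minimal sketch of what the chat template does for a ChatML-style model such as SmolLM2-360M-Instruct. `format_chatml` is a hypothetical helper written for illustration only; it is not part of Outlines or `transformers`, and it assumes the `<|im_start|>` / `<|im_end|>` token layout shown in the new README section. Other model families may use different special tokens, so in practice `tokenizer.apply_chat_template` is the safer route.

```python
from typing import Dict, List


def format_chatml(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Hand-rolled ChatML formatter (hypothetical, for illustration only)."""
    # Each turn is wrapped as: <|im_start|>{role}\n{content}<|im_end|>\n
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    # The generation prompt opens the assistant turn so the model answers next.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "You extract information from text."},
    {"role": "user", "content": "What food does the following text describe?"},
]
prompt = format_chatml(messages)
print(prompt)
```

The resulting string has the same shape as the templated prompt shown in the new "Chat template tokens" section and can be passed directly to an Outlines generator.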
cpfiffer · Jan 10, 2025 · c48f260
1 parent b55d314
commit c48f260
Showing 1 changed file with 78 additions and 17 deletions.
diff --git a/README.md b/README.md
@@ -64,26 +64,48 @@ is to ensure that there is a well-defined interface between their output and
 user-defined code. **Outlines** provides ways to control the generation of
 language models to make their output more predictable.
 
+The following methods of structured generation are supported:
+
+- [Multiple choices](#multiple-choices)
+- [Type constraints](#type-constraints)
+- [Efficient regex-structured generation](#efficient-regex-structured-generation)
+- [Efficient JSON generation following a Pydantic model](#efficient-json-generation-following-a-pydantic-model)
+- [Using context-free grammars to guide generation](#using-context-free-grammars-to-guide-generation)
+- [Open functions](#open-functions)
+
 ### Multiple choices
 
 You can reduce the completion to a choice between multiple possibilities:
 
 ``` python
 import outlines
 
-model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
+model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"
+model = outlines.models.transformers(model_name)
 
-prompt = """You are a sentiment-labelling assistant.
-Is the following review positive or negative?
+# You must apply the chat template tokens to the prompt!
+# See below for an example.
+prompt = """
+<|im_start|>system
+You extract information from text.
+<|im_end|>
 
-Review: This restaurant is just awesome!
+<|im_start|>user
+What food does the following text describe?
+
+Text: I really really really want pizza.
+<|im_end|>
+<|im_start|>assistant
 """
 
-generator = outlines.generate.choice(model, ["Positive", "Negative"])
+generator = outlines.generate.choice(model, ["Pizza", "Pasta", "Salad", "Dessert"])
 answer = generator(prompt)
+print(f'{answer=}')
+
+# answer=Pizza
 ```
 
-### Type constraint
+### Type constraints
 
 You can instruct the model to only return integers or floats:
 
@@ -116,43 +138,49 @@ import outlines
 
 model = outlines.models.transformers("microsoft/Phi-3-mini-4k-instruct")
 
-prompt = "What is the IP address of the Google DNS servers? "
+prompt = """
+<|im_start|>system
+You are a helpful assistant.<|im_end|>
+
+<|im_start|>user
+What is an IP address of the Google DNS servers? 
+<|im_end|>
+<|im_start|>assistant
+The IP address of a Google DNS server is 
+
+"""
 
 generator = outlines.generate.text(model)
 unstructured = generator(prompt, max_tokens=30)
 
 generator = outlines.generate.regex(
     model,
     r"((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)",
+    sampler=outlines.samplers.greedy(),
 )
 structured = generator(prompt, max_tokens=30)
 
 print(unstructured)
-# What is the IP address of the Google DNS servers?
-#
-# Passive DNS servers are at DNS servers that are private.
-# In other words, both IP servers are private. The database
-# does not contain Chelsea Manning
+# 8.8.8.8
+# 
+# <|im_end|>
 
 print(structured)
-# What is the IP address of the Google DNS servers?
-# 2.2.6.1
+# 8.8.8.8
 ```
 
 Unlike other libraries, regex-structured generation in Outlines is almost as fast
 as non-structured generation.
 
 ### Efficient JSON generation following a Pydantic model
 
-Outlines  allows to guide the generation process so the output is *guaranteed* to follow a [JSON schema](https://json-schema.org/) or [Pydantic model](https://docs.pydantic.dev/latest/):
+Outlines users can guide the generation process so the output is *guaranteed* to follow a [JSON schema](https://json-schema.org/) or [Pydantic model](https://docs.pydantic.dev/latest/):
 
 ```python
 from enum import Enum
 from pydantic import BaseModel, constr
 
 import outlines
-import torch
-
 
 class Weapon(str, Enum):
     sword = "sword"
@@ -335,6 +363,39 @@ prompt = labelling("Just awesome", examples)
 answer = outlines.generate.text(model)(prompt, max_tokens=100)
 ```
 
+### Chat template tokens
+
+Outlines does not manage chat templating tokens when using instruct models. You must apply the chat template tokens to the prompt yourself. Chat template tokens are not needed for base models.
+
+You can find the chat template tokens in the model's Hugging Face repo or documentation. For example, the SmolLM2-360M-Instruct special tokens are listed in its [`special_tokens_map.json`](https://huggingface.co/HuggingFaceTB/SmolLM2-360M-Instruct/blob/main/special_tokens_map.json).
+
+A convenient way to do this is to use the `tokenizer` from the `transformers` library:
+
+```python
+from transformers import AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-360M-Instruct")
+prompt = tokenizer.apply_chat_template(
+    [
+        {"role": "system", "content": "You extract information from text."},
+        {"role": "user", "content": "What food does the following text describe?"},
+    ],
+    tokenize=False,
+    add_bos=True,
+    add_generation_prompt=True,
+)
+```
+
+yields
+
+```
+<|im_start|>system
+You extract information from text.<|im_end|>
+<|im_start|>user
+What food does the following text describe?<|im_end|>
+<|im_start|>assistant
+```
+
 ## Join us
 
 - 💡 **Have an idea?** Come chat with us on [Discord][discord]