# Instruction Tuning

This module will guide you through instruction tuning language models. Instruction tuning adapts pre-trained models to follow instructions and perform specific tasks by further training them on task-specific datasets, improving their performance on those targeted tasks.

In this module, we will explore two topics: 1) Chat Templates and 2) Supervised Fine-Tuning.
## 1️⃣ Chat Templates

Chat templates structure interactions between users and AI models, ensuring consistent and contextually appropriate responses. They include components like system prompts and role-based messages. For more detailed information, refer to the [Chat Templates](./chat_templates.md) section.
## 2️⃣ Supervised Fine-Tuning

Supervised Fine-Tuning (SFT) is a critical process for adapting pre-trained language models to specific tasks. It involves training the model on a task-specific dataset with labeled examples. For a detailed guide on SFT, including key steps and best practices, see the [Supervised Fine-Tuning](./supervised_fine_tuning.md) page.
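As a preview, the SFT workflow in TRL boils down to a few lines. The sketch below is minimal and makes some assumptions: it uses the model and dataset from the exercises further down, assumes a recent TRL version that accepts a model name directly, and the `"all"` dataset config name is an assumption; see the Supervised Fine-Tuning page for the full walkthrough.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Load a chat-formatted dataset ("all" is an assumed smoltalk config name)
dataset = load_dataset("HuggingFaceTB/smoltalk", "all", split="train")

trainer = SFTTrainer(
    model="HuggingFaceTB/SmolLM2-135M",          # base model to fine-tune
    train_dataset=dataset,
    args=SFTConfig(output_dir="./smollm2-sft"),  # where checkpoints are written
)
trainer.train()
```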
## Exercise Notebooks
| Title | Description | Exercise | Link | Colab |
|-------|-------------|----------|------|-------|
| Chat Templates | Learn how to use chat templates with SmolLM2 and process datasets into chatml format | 🐢 Convert the `HuggingFaceTB/smoltalk` dataset into chatml format <br> 🐕 Convert the `openai/gsm8k` dataset into chatml format | [Notebook](./notebooks/chat_templates_example.ipynb) | <a target="_blank" href="https://colab.research.google.com/github/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/chat_templates_example.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> |
| Supervised Fine-Tuning | Learn how to fine-tune SmolLM2 using the SFTTrainer | 🐢 Use the `HuggingFaceTB/smoltalk` dataset<br>🐕 Try out the `bigcode/the-stack-smol` dataset<br>🦁 Select a dataset for a real world use case | [Notebook](./notebooks/sft_finetuning_example.ipynb) | <a target="_blank" href="https://colab.research.google.com/github/huggingface/smol-course/blob/main/1_instruction_tuning/notebooks/sft_finetuning_example.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> |
## References

- [Transformers documentation on chat templates](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Script for Supervised Fine-Tuning in TRL](https://github.com/huggingface/trl/blob/main/examples/scripts/sft.py)
- [`SFTTrainer` in TRL](https://huggingface.co/docs/trl/main/en/sft_trainer)
- [Direct Preference Optimization Paper](https://arxiv.org/abs/2305.18290)
- [Supervised Fine-Tuning with TRL](https://huggingface.co/docs/trl/main/en/tutorials/supervised_finetuning)
- [How to fine-tune Google Gemma with ChatML and Hugging Face TRL](https://www.philschmid.de/fine-tune-google-gemma)
- [Fine-tuning LLM to Generate Persian Product Catalogs in JSON Format](https://huggingface.co/learn/cookbook/en/fine_tuning_llm_to_generate_persian_product_catalogs_in_json_format)
# Chat Templates

Chat templates are essential for structuring interactions between language models and users. They provide a consistent format for conversations, ensuring that models understand the context and role of each message while maintaining appropriate response patterns.
## Base Models vs Instruct Models
A base model is trained on raw text data to predict the next token, while an instruct model is fine-tuned specifically to follow instructions and engage in conversations. For example, `SmolLM2-135M` is a base model, while `SmolLM2-135M-Instruct` is its instruction-tuned variant.

To make a base model behave like an instruct model, we need to format our prompts in a consistent way that the model can understand. This is where chat templates come in. ChatML is one such template format that structures conversations with clear role indicators (system, user, assistant).

Note that different instruct models may be fine-tuned with different chat templates, so when using an instruct model you need to make sure you apply the template it was trained with.
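One quick way to check which template an instruct model expects is to look at its tokenizer, which ships with the template attached. A minimal sketch, using the SmolLM2 checkpoint above:

```python
from transformers import AutoTokenizer

# Instruct models store their chat template on the tokenizer
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")
print(tokenizer.chat_template)  # prints the Jinja source of the template
```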
## Understanding Chat Templates
At their core, chat templates define how conversations should be formatted when communicating with a language model. They include system-level instructions, user messages, and assistant responses in a structured format that the model can understand. This structure helps maintain consistency across interactions and ensures the model responds appropriately to different types of inputs. Below is an example of a conversation rendered with a ChatML template:

```
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
```
The `transformers` library applies a model's chat template for you via its tokenizer. Read more about how transformers builds chat templates [here](https://huggingface.co/docs/transformers/en/chat_templating#how-do-i-use-chat-templates). All we have to do is structure our messages in the correct way and the tokenizer will take care of the rest. Here's a basic example of a conversation:

```python
messages = [
    {"role": "system", "content": "You are a helpful assistant focused on technical topics."},
    {"role": "user", "content": "Can you explain what a chat template is?"},
    {"role": "assistant", "content": "A chat template structures conversations between users and AI models..."},
]
```

Let's break down the above example and see how it maps to the chat template format.
## System Messages
System messages set the foundation for how the model should behave. They act as persistent instructions that influence all subsequent interactions. For example:

```python
system_message = {
    "role": "system",
    "content": "You are a professional customer service agent. Always be polite, clear, and helpful.",
}
```
## Conversations

Chat templates maintain context through conversation history, storing previous exchanges between users and the assistant. This allows for more coherent multi-turn conversations:

```python
conversation = [
    {"role": "user", "content": "I need help with my order"},
    {"role": "assistant", "content": "I'd be happy to help. Could you provide your order number?"},
    {"role": "user", "content": "It's ORDER-123"},
]
```
## Implementation with Transformers

The `transformers` library provides built-in support for chat templates. Here's how to use them:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to sort a list"},
]

# Render the conversation as a single prompt string; add_generation_prompt
# appends the assistant header so the model knows to respond next
formatted_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
```
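From here, the formatted conversation can be tokenized and fed to the model. A minimal generation sketch, assuming you also load the matching model locally (parameters like `max_new_tokens` are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "HuggingFaceTB/SmolLM2-135M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function to sort a list"},
]

# apply_chat_template can also tokenize directly and return tensors
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
)
outputs = model.generate(inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```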
## Custom Formatting

You can customize how different message types are formatted. For example, adding special tokens or formatting for different roles:

```python
# A plain format string illustrating per-role markers; the placeholders
# would be filled in with str.format before sending text to the model
template = """
<|system|>{system_message}
<|user|>{user_message}
<|assistant|>{assistant_message}
""".lstrip()
```
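In `transformers`, custom formats like this are expressed as Jinja templates stored on the tokenizer's `chat_template` attribute. A minimal sketch using the same role markers (note that assigning `chat_template` overrides whatever template shipped with the tokenizer):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-135M-Instruct")

# A Jinja version of the format string above
tokenizer.chat_template = (
    "{% for message in messages %}"
    "<|{{ message['role'] }}|>{{ message['content'] }}\n"
    "{% endfor %}"
)

print(tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    tokenize=False,
))  # -> "<|user|>Hello!\n"
```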
## Multi-Turn Support

Templates can handle complex multi-turn conversations while maintaining context:

```python
messages = [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is calculus?"},
    {"role": "assistant", "content": "Calculus is a branch of mathematics..."},
    {"role": "user", "content": "Can you give me an example?"},
]
```
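Applying the template to this history works exactly as before; with `add_generation_prompt=True`, the rendered prompt ends with the assistant header, cueing the model to produce the tutor's next turn:

```python
# Reuses the tokenizer loaded earlier; the whole history rides along in the prompt
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
```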
⏭️ [Next: Supervised Fine-Tuning](./supervised_fine_tuning.md)
## Resources

- [Hugging Face Chat Templating Guide](https://huggingface.co/docs/transformers/main/en/chat_templating)
- [Transformers Documentation](https://huggingface.co/docs/transformers)
- [Chat Templates Examples Repository](https://github.com/chujiezheng/chat_templates)
