Train a TinyLM from scratch (slides)
Train a tiny LLM to write poems in Chinese
Dataset: 56,315 Chinese Tang poems
Modified from https://github.com/karpathy/nanoGPT
Size: 0.1-0.7B
Training loop: the naive training loop from nanoGPT (see the sketch below)
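For reference, a minimal sketch of what that naive loop looks like, in the spirit of nanoGPT's train.py. The toy model, hyperparameters, and get_batch below are placeholders for nanoGPT's GPT class and the tokenized poem corpus, not the actual training code:

```python
# Minimal naive training loop in the spirit of nanoGPT's train.py.
# Real nanoGPT adds LR scheduling, grad accumulation, mixed precision,
# and checkpointing on top of this skeleton.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Placeholder model; swap in the GPT from nanoGPT's model.py.
vocab_size, block_size, n_embd = 8192, 256, 384
model = nn.Sequential(
    nn.Embedding(vocab_size, n_embd),
    nn.Linear(n_embd, vocab_size),
).to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

def get_batch(batch_size=32):
    # Stand-in for sampling (x, y) token windows from the poem corpus.
    x = torch.randint(vocab_size, (batch_size, block_size), device=device)
    y = torch.randint(vocab_size, (batch_size, block_size), device=device)
    return x, y

for step in range(1000):
    x, y = get_batch()
    logits = model(x)                                  # (B, T, vocab)
    loss = nn.functional.cross_entropy(
        logits.view(-1, vocab_size), y.view(-1))
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()
    if step % 100 == 0:
        print(f"step {step}: loss {loss.item():.4f}")
```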
Modified from Mistral-7B
Size: 0.4-0.9B
Training loop: the Hugging Face Transformers Trainer (see the sketch below)
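For reference, a minimal sketch of driving a causal LM with the Transformers Trainer. The scaled-down MistralConfig, the random-token dataset, and the output path are illustrative placeholders, not the actual 0.4-0.9B setup:

```python
# Minimal Trainer-based training sketch with a tiny Mistral-style config.
import torch
from transformers import (AutoModelForCausalLM, MistralConfig, Trainer,
                          TrainingArguments, default_data_collator)

# Placeholder: a scaled-down Mistral architecture for illustration.
config = MistralConfig(
    hidden_size=256, intermediate_size=512, num_hidden_layers=4,
    num_attention_heads=8, num_key_value_heads=8, vocab_size=8192)
model = AutoModelForCausalLM.from_config(config)

class RandomTokens(torch.utils.data.Dataset):
    """Stand-in dataset; replace with the tokenized Tang-poem corpus."""
    def __len__(self):
        return 1024
    def __getitem__(self, idx):
        ids = torch.randint(8192, (256,))
        return {"input_ids": ids, "labels": ids.clone()}

args = TrainingArguments(
    output_dir="tinylm-poem",          # hypothetical output path
    per_device_train_batch_size=8,
    num_train_epochs=1,
    logging_steps=50,
    report_to="none",
)
Trainer(model=model, args=args,
        train_dataset=RandomTokens(),
        data_collator=default_data_collator).train()
```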
- Pretrain
- Single node
- SFT
- Alignment
- Multi-node multi-GPU / FSDP (see the sketch after this list)
- Small models 1-7B
- MoE
- Multimodal
- Fine-tune
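For the multi-node / FSDP item above, a minimal sketch of wrapping a model in PyTorch FSDP and launching with torchrun. The stand-in linear model, dummy loss, and script name are placeholders:

```python
# FSDP wrapping sketch; launch with e.g.:
#   torchrun --nnodes=2 --nproc_per_node=8 train_fsdp.py   (hypothetical script)
import os
import torch
import torch.distributed as dist
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

model = torch.nn.Linear(1024, 1024).cuda()   # stand-in for the LM
model = FSDP(model)                          # shards params/grads/optim state
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

x = torch.randn(8, 1024, device="cuda")
loss = model(x).pow(2).mean()                # dummy loss for illustration
loss.backward()
optimizer.step()

dist.destroy_process_group()
```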
Winston Zhang
2024/03/08