v0.9.3: Llama4, Gemma3, Qwen3, InternVL3, Qwen2.5-Omni

@hiyouga released this 16 Jun 17:21
· 79 commits to main since this release
f3d144f

We will attend the AWS Summit Shanghai 2025 on June 20th! See you in Shanghai 👋

New features

New models

  • Base models
    • SmolLM/SmolLM2 (135M/360M/1.7B) 📄
    • Qwen3 Base (0.6B/1.7B/4B/8B/14B/30B) 📄
    • Gemma 3 (1B/4B/12B/27B) 📄🖼️
    • MedGemma (4B) 📄🩺
    • MiMo Base (7B) 📄
    • Seed-Coder Base (8B) 📄⌨️
    • Mistral-Small-3.1 Base (24B) 📄🖼️
    • GLM-4-0414 Base (32B) 📄
    • Llama 4 (109B/402B) 📄🖼️
  • Instruct/Chat models
    • SmolLM/SmolLM2 Instruct (135M/360M/1.7B) 📄🤖
    • MiniCPM4 (0.5B/8B) 📄🤖
    • Qwen3 (0.6B/1.7B/4B/8B/14B/32B/30B/235B) 📄🤖🧠
    • Gemma 3 Instruct (1B/4B/12B/27B) 📄🤖🖼️
    • InternVL2.5/3 Instruct/MPO (1B/2B/8B/14B/38B/78B) 📄🤖🖼️
    • Qwen2.5-Omni (3B/7B) 📄🤖🖼️🔈
    • MedGemma Instruct (4B/27B) 📄🤖🩺
    • MiMo SFT/RL (7B) 📄🤖
    • MiMo-VL SFT/RL (7B) 📄🤖🖼️
    • Hunyuan Instruct (7B) 📄🤖
    • Seed-Coder Instruct/Reasoning (8B) 📄🤖🧠⌨️
    • GLM-4-0414/GLM-Z1 Instruct (9B/32B) 📄🤖🧠
    • DeepSeek-R1-0528 (8B/671B) 📄🤖🧠
    • Kimi-VL Instruct/Thinking (17B) 📄🤖🧠🖼️
    • Mistral-Small-3.1 Instruct (24B) 📄🤖🖼️
    • Qwen2.5-VL Instruct (32B) 📄🤖🖼️
    • Llama 4 Instruct (109B/402B) 📄🤖🖼️

New datasets

  • Preference datasets
    • COIG-P (zh) 📄

Bug fix

Full Changelog: v0.9.2...v0.9.3