Releases: sgl-project/sglang

Release v0.5.4

26 Oct 02:37
1053e1b

Highlights

What's Changed

Release Gateway-v0.2.1

17 Nov 11:13
8a801ee

🚀 SGLang Model Gateway v0.2.1 Released!

This release focuses on stability, cleanup, and two big new performance features.

🧾 Docs & CI

  • Updated router documentation to reflect recent feature additions

🧹 Code Cleanup

  • Refactored StopSequenceDecoder for cleaner incremental decoding
  • Added spec.rs test harness under spec/ for structured unit tests

🐞 Bug Fixes

  • Fixed UTF-8 boundary handling in stop-sequence decoding
  • Fixed gRPC timeout configuration
  • Fixed worker filtering, tool-choice normalization, and bootstrap-port handling
  • Additional gRPC server warm-up and concurrency fixes
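The release notes do not describe the UTF-8 fix itself. As an illustration of the general class of problem only, the sketch below shows how streamed bytes can split a multi-byte character across chunks and how an incremental decoder holds back the incomplete tail. It uses Python's standard `codecs` module, not the gateway's Rust StopSequenceDecoder; the function name `decode_stream` and the sample chunks are made up for the example.

```python
import codecs

def decode_stream(chunks):
    """Yield text from byte chunks without splitting multi-byte characters."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for chunk in chunks:
        # An incomplete trailing sequence is buffered until its remaining bytes arrive.
        text = decoder.decode(chunk)
        if text:
            yield text
    # Flush; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail

data = "héllo 🚀".encode("utf-8")
# Cut inside "é" (2 bytes) and inside "🚀" (4 bytes) on purpose.
chunks = [data[:2], data[2:8], data[8:]]
pieces = list(decode_stream(chunks))
print(pieces)
```

Decoding each chunk independently with `bytes.decode` would raise on the first chunk here, which is why stop-sequence matching on streamed output generally has to work on complete characters.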

🌟 New Features

  • Two-Level Tokenizer Caching (L0 + L1)
      • L0: exact-match cache for repeated prompts
      • L1: prefix-aware cache at special-token boundaries
  • OpenAI-Style Classification API → new /v1/classifications endpoint (thanks to yanbo for the contribution)
  • Worker Management Workflow Engine → improved async registration, worker self-discovery, and health orchestration
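To make the L0/L1 idea concrete, here is a minimal sketch of a two-level tokenizer cache. It is not the gateway's implementation: the class `TwoLevelCache`, the special token `<|sep|>`, and the toy tokenizer are all hypothetical, and the sketch assumes that tokenizing a prefix ending at a special token and then the remaining suffix gives the same ids as tokenizing the whole prompt.

```python
SPECIAL = "<|sep|>"  # hypothetical special token marking a safe split point

def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: special token -> 0, other chars -> code point."""
    ids = []
    for n, part in enumerate(text.split(SPECIAL)):
        if n:
            ids.append(0)
        ids.extend(ord(c) for c in part)
    return ids

class TwoLevelCache:
    def __init__(self, tokenize):
        self.tokenize = tokenize
        self.l0 = {}  # L0: full prompt -> token ids (exact match)
        self.l1 = {}  # L1: prefix ending at a special token -> token ids
        self.stats = {"l0_hits": 0, "l1_hits": 0, "misses": 0}

    def _boundaries(self, text):
        """End offsets of each special-token occurrence, longest prefix first."""
        ends, start = [], 0
        while (i := text.find(SPECIAL, start)) != -1:
            start = i + len(SPECIAL)
            ends.append(start)
        return ends[::-1]

    def encode(self, text):
        if text in self.l0:
            self.stats["l0_hits"] += 1
            return list(self.l0[text])
        ids = None
        for end in self._boundaries(text):
            cached = self.l1.get(text[:end])
            if cached is not None:
                # Reuse the cached prefix and tokenize only the suffix.
                self.stats["l1_hits"] += 1
                ids = cached + tuple(self.tokenize(text[end:]))
                break
        if ids is None:
            self.stats["misses"] += 1
            ids = tuple(self.tokenize(text))
        self.l0[text] = ids
        # Naive fill: a real cache would slice ids instead of re-tokenizing.
        for end in self._boundaries(text):
            self.l1.setdefault(text[:end], tuple(self.tokenize(text[:end])))
        return list(ids)

cache = TwoLevelCache(toy_tokenize)
system = "You are helpful." + SPECIAL
first = cache.encode(system + "Hi")   # miss: fills L0 and L1
again = cache.encode(system + "Hi")   # L0 hit: identical prompt
other = cache.encode(system + "Bye")  # L1 hit: shared prefix, new suffix
print(cache.stats)
```

The sketch has no eviction or size limit; a production cache would need both.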

What's Changed in Gateway

Gateway Changes (26 commits)

Paths Included

  • sgl-router
  • python/sglang/srt/grpc
  • python/sglang/srt/entrypoints/grpc_server.py

Full Changelog: gateway-v0.2.0...gateway-v0.2.1

Release Gateway-v0.2.0

17 Nov 11:03
74737b2

🚀 Release: SGLang Model Gateway v0.2.0 (formerly “SGLang Router”)

🔥 What’s new

🧠 Multi-Model Inference Gateway (IGW) Mode

IGW turns one router into many — letting you manage multiple models at once, each with its own routing policy, priorities, and metadata. Think of it as running several routers under one roof, with shared reliability, observability, and API surface.
You can dynamically register models via /workers, assign labels like tier or policy, and let the gateway handle routing, health checks, and load balancing.
Whether you’re mixing Llama, Mistral, and DeepSeek, or orchestrating per-tenant routing in enterprise setups, IGW gives you total control.
Your fleet, your rules. ⚡

⚡ gRPC Mode: Rust-Powered, Built for Throughput

This is the heart of 0.2.0. The new gRPC data plane runs entirely in Rust (tokenizer, reasoning parser, and tool parser included), giving you native-speed performance and lower latency.
You can connect to gRPC-based SGLang workers, stream tokens in real time, and even handle OpenAI-compatible APIs like

🌐 OpenAI-Compatible Gateway

Seamlessly proxy requests to OpenAI, while keeping data control local.
Conversation history, responses, and background jobs all flow through the gateway — same API, enterprise privacy.

💾 Pluggable History Storage

Choose between memory, none, or oracle for conversation and /v1/responses data.

  • memory: Fastest for ephemeral runs.
  • none: Zero persistence, zero latency overhead.
  • oracle: Full persistence via Oracle ATP with connection pooling and credentials support.

🧩 Pluggable MCP Integration

The gateway now natively speaks MCP across all transports (STDIO, HTTP, SSE, Streamable), so your tools can plug directly into reasoning and response loops — perfect for agentic workflows and cross-model orchestration.

🛡️ Reliability & Observability Upgrades

Built-in:

  • Retries with exponential backoff + jitter
  • Per-worker circuit breakers
  • Token-bucket rate limiting & FIFO queuing
  • Prometheus metrics for latency, load, queue depth, PD pipelines, tokenizer speed, and MCP activity
  • Structured tracing & request-ID propagation
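
As a rough illustration of the first item, retries with exponential backoff and jitter can be sketched as follows. This is a generic "full jitter" variant, not the gateway's Rust code; the function names, default values, and the choice of `ConnectionError` as the retryable error are assumptions made for the example.

```python
import random
import time

def backoff_delays(base=0.1, factor=2.0, cap=5.0, attempts=4, rng=random.random):
    """Yield one sleep duration per retry: uniform jitter under a capped exponential."""
    for attempt in range(attempts):
        ceiling = min(cap, base * factor ** attempt)
        yield rng() * ceiling

def call_with_retries(fn, *, retries=4, sleep=time.sleep, **backoff):
    """Call fn(); on ConnectionError, wait and retry up to `retries` times."""
    delays = backoff_delays(attempts=retries, **backoff)
    while True:
        try:
            return fn()
        except ConnectionError:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the last error
            sleep(delay)

# Usage: a hypothetical worker call that fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("worker unavailable")
    return "ok"

slept = []
# rng is pinned to 1.0 so the delays are deterministic for the demo.
result = call_with_retries(flaky, sleep=slept.append, rng=lambda: 1.0)
print(result, slept)
```

The random factor spreads out retries from many clients so they do not all hit a recovering worker at the same instant.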

✨ SGLang Model Gateway v0.2.0 — built in Rust, designed for scale, ready for reasoning.

What's Changed in Gateway

Gateway Changes (238 commits)

Release v0.5.3

06 Oct 18:45
a4a3d82

Highlights

What's Changed

Release v0.5.2

12 Sep 03:50
b0d25e7

Highlights

What's Changed

Release v0.5.1

23 Aug 19:57
97a38ee

What's Changed

Release Gateway-v0.1.9

17 Nov 10:58
500b15c

What's Changed in Gateway

Gateway Changes (10 commits)

New Contributors

Paths Included

  • sgl-router
  • python/sglang/srt/grpc
  • python/sglang/srt/entrypoints/grpc_server.py

Full Changelog: gateway-v0.1.8...gateway-v0.1.9

Release Gateway-v0.1.8

17 Nov 10:57
39decec

What's Changed in Gateway

Gateway Changes (4 commits)

New Contributors

Paths Included

  • sgl-router
  • python/sglang/srt/grpc
  • python/sglang/srt/entrypoints/grpc_server.py

Full Changelog: gateway-v0.1.7...gateway-v0.1.8

v0.4.10

31 Jul 18:48
0232886

Highlights

This is a regular release with many new optimizations, features, and fixes. Please check out the following roadmaps and blogs.

What's Changed

Release Gateway-v0.1.7

17 Nov 10:51
aee0ef5

What's Changed in Gateway

Gateway/Router Changes (11 commits)

New Contributors

Paths Included

  • sgl-router
  • python/sglang/srt/grpc
  • python/sglang/srt/entrypoints/grpc_server.py

Full Changelog: gateway-v0.1.6...gateway-v0.1.7