Skip to content

Releases: roboflow/inference

v1.3.1

12 Jun 16:04
1eb4f8f

Choose a tag to compare

🚀 Added

  • Workflows: point prompting — labeled_points kind + SAM 3 Interactive block, and SAM3 PVS multi-prompt bug fixes — #2436 (@hansent)
  • Models: SAM 3 streaming video concept tracker (model class + Workflow block + docs) — #2439 (@hansent)
  • Workflows: Track Class Lock block — #2447 (@alexeialexandrovich)
  • Workflows: Claude Fable 5 as a Workflow model — #2433 (@Erol444)
  • Workflows: enable "best" confidence for semantic segmentation — #2425 (@leeclemnet)
  • WebRTC SDK: surface per-frame server errors to the client — #2438 (@sberan)
  • Serverless: enable SAM 3 3D and embed_image routes — #2424 (@hansent)
  • Batch Processing: Asset Library adjustments — #2399 (@PawelPeczek-Roboflow)
  • Gateway: support scheme-qualified SECURE_GATEWAY URLs — #2429 (@alexnorell)

🐛 Fixes

  • Workflows (OBB): shift oriented bounding box corners in detections_stitch#2427 (@kounelisagis)
  • Workflows (OBB): translate oriented bounding box corners in dynamic_crop#2430 (@kounelisagis)
  • SAM 3: stop IndexError on single-point visual prompts in serialize_prompt#2423 (@bigbitbus)
  • WebRTC: accept str or list[str] for ICE server URLs — #2418 (@ecarrara)
  • Depth estimation: honor query API key for model loads — #2445 (@hansent)
  • Depth estimation: route depth model id through the inference path — #2446 (@hansent)
  • Core: convert masks to bool before np.array wrapping (avoids uint8 deprecation warnings) — #2432 (@lou-roboflow)
  • Core: remove mutable default arguments — #2441 (@kounelisagis)
  • Caching: hash-truncate long cache path segments — #2279 (@hansent)
  • UI: fix mobile drawer/menu showing on the index page — #2444 (@Erol444)

🔒 Security

📦 Build & Platform

  • Shrink JetPack 6.2 image ~20% (6.56 → 5.22 GB); fix Orin (sm87) flash-attn + bitsandbytes — #2440 (@alexnorell)

📚 Docs

🔧 Dependencies

  • Bump shell-quote 1.8.2 → 1.8.4 in /theme_build (npm/yarn group) — #2426 (@dependabot)

👋 New Contributors

Full Changelog: v1.3.0...v1.3.1

v1.3.0

05 Jun 18:48
815c4eb

Choose a tag to compare

🚀 Added

🦾 RF-DETR Keypoints — pose estimation joins the RF-DETR family

The big one this release: thanks to @sergii-bond, RF-DETR now supports keypoint detection alongside the existing detection head — a single model architecture across detection and pose. Thanks to the contribution (#2401, #2416) you can pull a fine-tuned RF-DETR keypoints model and run it through the standard inference + workflows path with no extra plumbing.

🧬 YOLO26 Semantic Segmentation — fine-tuned models + binary head

Following YOLO26's earlier landing, @leeclemnet rounded out the segmentation story this release: fine-tuned YOLO26 sem-seg models are now first-class in inference (#2407, #2419).

image

🔥 New Workflows blocks

Block Type Slug What it does
current_time/v1.py roboflow_core/current_time@v1 Inject the current wall-clock time into the workflow graph as a typed step output
Vision Events (local mode) enterprise Run the Vision Events block in an in-process event-store mode instead of round-tripping through Roboflow infra
  • roboflow_core/current_time@v1 — by @patricknihranz in #2410. Drop it before any block that needs a timestamp (audit trails, time-windowed aggregations, freshness gates) without writing a custom block.
  • Vision Events block — local event-store mode (ENT-1192) — by @rvirani1 in #2402. Useful for on-prem and isolated-network deployments where the central event sink isn't reachable.

🧰 Workflow block improvements

A theme this release: a handful of existing blocks gained selector inputs so you can drive their parameters from upstream step outputs instead of hard-coding at the block level.

  • GLM-OCR — accepts a selector for task_type (@nathan-marraccini, #2409). Switch OCR mode dynamically based on prior workflow signals.
  • Qwen3.5-VL — accepts selectors for prompt and system_prompt (@nathan-marraccini, #2408). Compose prompts from prior steps without an intermediate Python block.
  • NumberInRange operator is now exposed in the Workflow Builder UI (@patricknihranz, #2229) — previously only reachable by hand-editing the YAML.

🌟 Other additions

  • Gemini 2.5 native object-detection format is now parsed by vlm_as_detector, so you can route Gemini 2.5 outputs through the same downstream blocks as any other detector (@dkosowski87, #2400).
  • Volume support added by @nkuneman in #2413 — see the PR for the mount conventions.
  • roboflow/inference-server-experimental image published (@grzegorz-roboflow, #2406) — an opt-in track for early bits before they hit the main image.

🔒 Security — please review your deployment

This release ships security enhancements for local deployments (#2417 by @PawelPeczek-Roboflow) and, alongside it, a new dedicated documentation page that walks through how to harden a self-hosted Inference server:

👉 inference.roboflow.com/install/security

Important

If you run Inference outside of localhost — in a container, on a shared host, on a private network, or anywhere reachable beyond a single developer machine — please take a few minutes to read the new guide. You own the security posture of your deployment. A default-configured server is adjusted to work in development-friendly mode and should not be deployed as is in production grade environments, due to exposing unauthenticated endpoints and ability to run Custom Python Blocks in Workflows Execution Engine.

The guide covers, in short:

  • Restrict network access — bind to localhost, keep on a private network, or place behind a firewall. Never expose the inference port directly to the public internet without authentication and TLS.
  • Enforce authentication — use WORKSPACES_WHITELISTED_FOR_LOCAL_DEPLOYMENT to require valid API keys, or place your own auth layer (OAuth, mTLS) in front.
  • Enable TLS — terminate HTTPS at a reverse proxy or set ENABLE_HTTPS=true on the server itself.
  • Disable custom Python execution — set ALLOW_CUSTOM_PYTHON_EXECUTION_IN_WORKFLOWS=false unless you specifically need it.

If you have a public-facing or multi-tenant deployment, these are not optional. The new docs page is the canonical reference going forward.

🔧 Fixed

  • Core models — forward countinference / service_secret when downloading weights by @iurisilvio in #2398 — keeps usage attribution and gated-weights flows working when models are pulled at runtime.
  • Batch processing fix by @digaobarbosa in #2411.
  • Workflows / Data Aggregator — corrected values_difference aggregation by @madhavcodez in #2388. First-time contribution — thank you!
  • Graceful fallback on ephemeral cache failure by @dkosowski87 in #2387 — the cache layer no longer takes the whole request down when its store is unavailable.
  • Server-side TTL on model-monitoring zset writes by @bigbitbus in #2390 — model-monitoring entries now expire on the cache server even if a client never cleans up.

🚧 Maintenance

👋 New contributors

A warm welcome to two first-time contributors landing in this release:


Full Changelog: v1.2.13...v1.3.0

v1.2.13

02 Jun 16:14
2802efa

Choose a tag to compare

What's Changed

  • fix(workflows): select v0 API for hosted semantic-segmentation remote execution by @leeclemnet in #2393
  • fix(aliases): resolve public yolo26-sem model aliases by @leeclemnet in #2394
  • refactor(gemini): remove deprecated model versions and update default to 2.5-flash by @dkosowski87 in #2395

Full Changelog: v1.2.12...v1.2.13

v1.2.12

29 May 14:50
615f919

Choose a tag to compare

What's Changed

Full Changelog: v1.2.11...v1.2.12

v1.2.11

27 May 20:00
83ef480

Choose a tag to compare

What's Changed

Full Changelog: v1.2.10...v1.2.11

v1.2.10

22 May 17:33
7399c06

Choose a tag to compare

What's Changed

Full Changelog: v1.2.9...v1.2.10

v1.2.9

15 May 19:45
e108b3d

Choose a tag to compare

What's Changed

New Contributors

Full Changelog: v1.2.8...v1.2.9

v1.2.8

13 May 16:15
9c90e4c

Choose a tag to compare

⚠️ Deprecated

🚫 Gaze (L2CS-Net / MediaPipe) detection

Gaze detection — the roboflow_core/gaze@v1 workflow block, the POST /gaze/gaze_detection HTTP route, and the InferenceHTTPClient.detect_gazes() / detect_gazes_async() SDK helpers — has been deprecated in this release. Calling any of these now raises a new FeatureDeprecatedError that surfaces as HTTP 410 Gone with error_type: "FeatureDeprecatedError".

The Gaze pipeline was built on top of MediaPipe, which has dropped support for parts of the hardware matrix Roboflow ships against (notably some Linux/aarch64 and Jetson configurations no longer have compatible wheels), and a transitive protobuf CVE2026-0994 meant we could no longer carry the dependency alongside the rest of the inference stack.

# Old (still loads, returns 410 at execute time):
from inference_sdk import InferenceHTTPClient
client = InferenceHTTPClient(api_url="...", api_key="...")
client.detect_gazes(inference_input="image.jpg")
# → raises inference_sdk.http.errors.FeatureDeprecatedError

Important

The last release supporting Gaze is v1.2.7. If you currently rely on Gaze detection locally — pin to inference==1.2.7 (or the matching Docker image tag) as a short-term bridge while you plan ahead, keeping in mind that vulnerability exists in the build.

Note

The POST /gaze/gaze_detection route remains registered as a 410-Gone stub through end of Q2 2026 so existing client integrations get a structured error rather than a 404. Set CORE_MODEL_GAZE_ENABLED=False to disable it immediately.

If Gaze is important to your workflow and you'd like to discuss bringing it back paired with a different face detector, reach out at support@roboflow.com — we're happy to chat.

PR: #2334


💪 Added

🖼️ Image Stack workflow block

A new Image Stack workflow block lets you accumulate frames across executions — useful for building temporal pipelines, sliding-window inference, or any flow that needs to reason over a buffer of recent images rather than a single frame.

PR: #2307

🤖 OpenAI-compatible LLM block for custom endpoints

A new workflow block that lets you point at any OpenAI-compatible HTTP endpoint — your own self-hosted vLLM/Ollama/LM Studio deployment, a private gateway, or a third-party provider that mirrors the OpenAI API. Combined with the extra_body follow-up, you can also pass provider-specific extensions (reasoning effort, sampling tweaks, custom routing flags) through to the upstream call.

PRs: #2309, #2313

🔐 Opt-in HTTPS for the inference server

The HTTP server now supports terminating TLS directly via SSL environment variables — no reverse proxy required for simple self-hosted setups. Off by default; opt in by setting the relevant SSL env vars.

PR: #2308


🔒 Security & Hardening

  • 🚫 Block legacy fine-tuned SAM3 loads — tighten the SAM3 model loading boundary so legacy fine-tuned artefacts that no longer match the supported package shape are rejected up front. #2278
  • 📦 OpenTelemetry stack bumped to 1.41.x / 0.62b1 to clear a transitive protobuf CVE and unlock alignment with the newer inference-models release. #2335
  • 🛡️ Additional security fixes across the stack. #2336

🚧 Maintenance


👋 New Contributors

Full Changelog: v1.2.7...v1.2.8

v1.2.7

01 May 15:03
b6a6538

Choose a tag to compare

💪 Added

🔌 Improved

📖 Documentation improvements

  • Update workflow benchmarks docs with TRT GPU results by @Erol444 in #2289

🔧 Bug fixes

🚧 Maintenance

🏅 New Contributors

Full Changelog: v1.2.5...v1.2.7

v1.2.6

27 Apr 23:09

Choose a tag to compare

What's Changed

Full Changelog: v1.2.5...v1.2.6