🚀 Added
Qwen2.5-VL 🤝 Workflows
Thanks to @Matvezy, we've enabled the Qwen2.5-VL model in `inference` and the Workflows ecosystem, making it easier than ever to use its powerful vision-language capabilities. 🎉
💡 About Qwen2.5-VL
Qwen2.5-VL is a vision-language model which understands both images and text, allowing it to analyze documents, detect objects, and interpret videos with human-like comprehension. All of those capabilities are now available in Workflows 🤯
Take a look at the docs 📖 for more details.
🚗 Speed improvements in `inference` 🏁
@isaacrob-roboflow is not slowing down - and neither is `inference`. In this release, he added a few important changes:
- 🎯 Torch-based image pre-processing: pre-processing can now run on the GPU using PyTorch, making more efficient use of the underlying hardware.
- 💡 ONNX IO bindings enabled: a technique which minimises data round-trip time to and from memory (especially helpful when pre-processing happens directly on the GPU).
- 🕵️ Details of the change: #941
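The IO-binding idea can be sketched as follows. This is a minimal illustration, not the actual `inference` implementation: it assumes an `onnxruntime.InferenceSession` created with `CUDAExecutionProvider` and an input that already lives on the GPU as an `OrtValue` (the function name and parameter names are ours):

```python
def run_with_io_binding(session, input_name, output_name, gpu_input):
    """Run an ONNX model while keeping tensors on the device.

    `session` is assumed to be an onnxruntime.InferenceSession created with
    CUDAExecutionProvider; `gpu_input` an onnxruntime.OrtValue already on GPU.
    """
    binding = session.io_binding()
    # Bind the device-resident input directly - no host -> device copy.
    binding.bind_ortvalue_input(input_name, gpu_input)
    # Let ONNX Runtime allocate the output on the GPU as well, so results
    # can feed post-processing without a round trip through host memory.
    binding.bind_output(output_name, device_type="cuda")
    session.run_with_iobinding(binding)
    return binding.get_outputs()[0]  # OrtValue, still on the device
```

When pre-processing already happens on the GPU (the PyTorch change above), this avoids copying the pre-processed tensor back to the host just to feed it into the model.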
Want to hear about the results?
- 🏍️ For big images (for instance of size 2048 x 2048) we observe a substantial drop in inference latency - in our example case, latency dropped from 130-140ms 👉 50-60ms - that's nearly a 3x speedup 🤯
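As a rough sanity check on those numbers (our own arithmetic, not additional measurements):

```python
# Latency ranges reported above, in milliseconds
baseline_ms = (130, 140)
optimized_ms = (50, 60)

best_case = baseline_ms[1] / optimized_ms[0]   # 140 / 50 = 2.8
worst_case = baseline_ms[0] / optimized_ms[1]  # 130 / 60 ≈ 2.17

print(f"speedup between {worst_case:.2f}x and {best_case:.2f}x")
```

The best case works out to 2.8x, consistent with "nearly 3x".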
We can't wait for future optimisations 💪
💻 New home page for `inference` docs
You may have already noticed, but just to be sure, take a look at the new design of the `inference` docs home page prepared by @isoceles.
Check it out here
🚨 Deprecated
Caution
We needed to take immediate action. A vulnerability detected in the `transformers` library forced us to introduce changes into `inference` dependencies, effectively removing the components for which we could not prepare security patches. Vulnerability description (CVE-2024-11393):
Hugging Face Transformers MaskFormer Model Deserialization of Untrusted Data Remote Code Execution Vulnerability. This vulnerability allows remote attackers to execute arbitrary code on affected installations of Hugging Face Transformers. User interaction is required to exploit this vulnerability in that the target must visit a malicious page or open a malicious file. The specific flaw exists within the parsing of model files. The issue results from the lack of proper validation of user-supplied data, which can result in deserialization of untrusted data. An attacker can leverage this vulnerability to execute code in the context of the current user.
We advise all clients to migrate to `inference` 0.38.0 and stop using old builds in production.
🐳 CogVLM was removed
The first implication of CVE-2024-11393 was an upgrade to the newest `transformers` version, which conflicted with CogVLM model requirements - as a result, we decided to end support for the model in `inference`.
From what we can tell, CogVLM was not the most popular model in the ecosystem, but if anyone is looking for alternatives, we can suggest other models available in `inference` and the Workflows ecosystem - like the newly introduced Qwen2.5-VL.
Effective immediately, the model was removed from the library; we left usage examples and a stub Workflow block which fires an error with deprecation information if one tries to run it in the Execution Engine.
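The stub's behaviour is roughly as in the hypothetical sketch below (class and error names are ours for illustration, not the actual block's identifiers):

```python
class ModelDeprecatedError(RuntimeError):
    """Raised when a model removed from the library is invoked."""


class CogVLMStub:
    """Illustrative stand-in for the removed CogVLM Workflow block."""

    def run(self, *args, **kwargs):
        # Fail fast with an actionable message instead of silently
        # attempting to load a model that no longer ships.
        raise ModelDeprecatedError(
            "CogVLM was removed from `inference` 0.38.0 due to "
            "CVE-2024-11393. Consider an alternative such as Qwen2.5-VL."
        )
```

Running the block anywhere in a Workflow therefore surfaces the deprecation immediately, rather than failing with an obscure import error.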
🐍 Python 3.8 no longer supported
Python 3.8 has already reached End Of Life and, as a result, many libraries dropped support for this version of the interpreter. We tried to keep our codebase compatible for as long as possible, but we were not able to apply the security patch (as newer versions of the `transformers`-related dependencies had already dropped support). As a result, `inference` will no longer support Python 3.8.
🥡 End Of Life - Jetson with Jetpack 4.5
As a result of the Python 3.8 deprecation, we also needed to abandon builds for Jetson with Jetpack 4.5, which were bound to this Python version.
📗 Other changes
- Prepare version of Workflows EE where thread pool executor is injectable instead of created at each run by @PawelPeczek-Roboflow in #1014
- Add changes to make it possible to register WebHooks by @PawelPeczek-Roboflow in #1020
- Fix docs homepage mobile nav by @yeldarby in #1023
- Aspect ratio operation by @EmilyGavrilenko in #1025
- Workflow error message improvements by @EmilyGavrilenko in #1018
- Add --metrics-enabled and --metrics-disabled to inference server start by @grzegorz-roboflow in #1024
- Expose decoding buffer size and predictions queue size as inference pipeline manager request parameters by @grzegorz-roboflow in #1022
- Update Workflows Changelog by @PawelPeczek-Roboflow in #1027
- Extend dynamic_zones block to expose updated detections as extra output by @grzegorz-roboflow in #1029
- Loginless Builder by @yeldarby in #1030
- Bump esbuild from 0.19.12 to 0.25.0 in /theme in the npm_and_yarn group across 1 directory by @dependabot in #1021
- `gpu_speedups` code review by @grzegorz-roboflow in #1031
- Add an option to use pytorch for GPU-based image preprocessing by @isaacrob-roboflow in #941
- Handle new getWeights in RoboflowInferenceModel by @grzegorz-roboflow in #1028
- Addition of Qwen 2.5 VL to Inference and Workflows by @Matvezy in #1019
- Add rustc to OS dependencies by @grzegorz-roboflow in #975
- Fix problem with assertions by @PawelPeczek-Roboflow in #1033
- Fix broken CI by @PawelPeczek-Roboflow in #1034
- Fix broken parallel GPU CI by @PawelPeczek-Roboflow in #1035
🏅 New Contributors
Full Changelog: v0.37.1...v0.38.0