27 Jun 21:48

mudler

cd2b0c0

v3.1.1 Latest

Latest

What's Changed

Bug fixes 🐛

fix(backends gallery): correctly identify gpu vendor by @mudler in #5739
fix(backends gallery): meta packages do not have URIs by @mudler in #5740

Exciting New Features 🎉

feat(gallery): automatically install missing backends along models by @mudler in #5736

👒 Dependencies

chore: ⬆️ Update ggml-org/whisper.cpp to c88ffbf9baeaae8c2cc0a4f496618314bb2ee9e0 by @localai-bot in #5742
chore: ⬆️ Update ggml-org/llama.cpp to 72babea5dea56c8a8e8420ccf731b12a5cf37854 by @localai-bot in #5743

Other Changes

fix(ci): better handling of latest images for backends by @mudler in #5735
fix(ci): enable tag-latest to auto by @mudler in #5738
docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #5741

Full Changelog: v3.1.0...v3.1.1

Contributors

mudler and localai-bot

Assets 10

26 Jun 20:14

mudler

v3.1.0

6a650e6

v3.1.0

🚀 LocalAI 3.1

🚀 Highlights

Support for Gemma 3n!

Gemma 3n has been released and it's now available in LocalAI (currently only for text generation, install it with:

local-ai run gemma-3n-e2b-it
local-ai run gemma-3n-e4b-it

⚠️ Breaking Changes

Several important changes that reduce image size, simplify the ecosystem, and pave the way for a leaner LocalAI core:

🧰 Container Image Changes

Sources are no longer bundled in the container images. This significantly reduces image sizes.
- Need to rebuild locally? Just follow the docs to build from scratch. We're working towards migrating all backends to the gallery, slimming down the default image further.

📁 Directory Structure Updated

New default model and backend paths for container images:

Models: /models/ (was /build/models)
Backends: /backends/ (was /build/backends)

🏷 Unified Image Tag Naming for `master` (development) builds

We've cleaned up and standardized container image tags for clarity and consistency:

gpu-nvidia-cuda11 and gpu-nvidia-cuda12 (previously cublas-cuda11, cublas-cuda12)
gpu-intel-f16 and gpu-intel-f32 (previously sycl-f16, sycl-f32)

Meta packages in backend galleries

We’ve introduced meta-packages to the backend gallery!
These packages automatically install the most suitable backend depending on the GPU detected in your system — saving time, reducing errors, and ensuring you get the right setup out of the box. These will be added as soon as the 3.1.0 images are going to be published, stay tuned!

For instance, you will be able to install vllm just by installing the vllm backend in the gallery ( no need to select anymore the correct GPU version)

The Complete Local Stack for Privacy-First AI

With LocalAGI rejoining LocalAI alongside LocalRecall, our ecosystem provides a complete, open-source stack for private, secure, and intelligent AI operations:

LocalAI

The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI

LocalAGI

A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI

LocalRecall

A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall

Join the Movement! ❤️

A massive THANK YOU to our incredible community and our sponsors! LocalAI has over 33,500 stars, and LocalAGI has already rocketed past 800+ stars!

As a reminder, LocalAI is real FOSS (Free and Open Source Software) and its sibling projects are community-driven and not backed by VCs or a company. We rely on contributors donating their spare time and our sponsors to provide us the hardware! If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!

👉 Check out the reborn LocalAGI v2 today: https://github.com/mudler/LocalAGI

Full changelog 👇

👉 Click to expand 👈

What's Changed

Breaking Changes 🛠

chore(ci): ⚠️ fix latest tag by using docker meta action by @mudler in #5722
feat: ⚠️ reduce images size and stop bundling sources by @mudler in #5721

Bug fixes 🐛

fix(backends gallery): delete dangling dirs if installation failed by @mudler in #5729

Exciting New Features 🎉

feat(backend gallery): add meta packages by @mudler in #5696

🧠 Models

chore(model gallery): add qwen3-the-josiefied-omega-directive-22b-uncensored-abliterated-i1 by @mudler in #5704
chore(model gallery): add menlo_jan-nano by @mudler in #5705
chore(model gallery): add qwen3-the-xiaolong-omega-directive-22b-uncensored-abliterated-i1 by @mudler in #5706
chore(model gallery): add allura-org_q3-8b-kintsugi by @mudler in #5707
chore(model gallery): add ds-r1-qwen3-8b-arliai-rpr-v4-small-iq-imatrix by @mudler in #5708
chore(model gallery): add mistralai_mistral-small-3.2-24b-instruct-2506 by @mudler in #5714
chore(model gallery): add skywork_skywork-swe-32b by @mudler in #5715
chore(model gallery): add astrosage-70b by @mudler in #5716
chore(model gallery): add delta-vector_austral-24b-winton by @mudler in #5717
chore(model gallery): add menlo_jan-nano-128k by @mudler in #5723
chore(model gallery): add gemma-3n-e2b-it by @mudler in #5730
chore(model gallery): add gemma-3n-e4b-it by @mudler in #5731

👒 Dependencies

chore: ⬆️ Update ggml-org/whisper.cpp to 3e65f518ddf840b13b74794158aa95a2c8aa30cc by @localai-bot in #5691
chore: ⬆️ Update ggml-org/llama.cpp to 8f71d0f3e86ccbba059350058af8758cafed73e6 by @localai-bot in #5692
chore: ⬆️ Update ggml-org/llama.cpp to 06cbedfca1587473df9b537f1dd4d6bfa2e3de13 by @localai-bot in #5697
chore: ⬆️ Update ggml-org/whisper.cpp to e6c10cf3d5d60dc647eb6cd5e73d3c347149f746 by @localai-bot in #5702
chore: ⬆️ Update ggml-org/llama.cpp to aa0ef5c578eef4c2adc7be1282f21bab5f3e8d26 by @localai-bot in #5703
chore: ⬆️ Update ggml-org/llama.cpp to 238005c2dc67426cf678baa2d54c881701693288 by @localai-bot in #5710
chore: ⬆️ Update ggml-org/whisper.cpp to a422176937c5bb20eb58d969995765f90d3c1a9b by @localai-bot in #5713
chore: ⬆️ Update ggml-org/llama.cpp to ce82bd0117bd3598300b3a089d13d401b90279c7 by @localai-bot in #5712
chore: ⬆️ Update ggml-org/llama.cpp to 73e53dc834c0a2336cd104473af6897197b96277 by @localai-bot in #5719
chore: ⬆️ Update ggml-org/whisper.cpp to 0083335ba0e9d6becbe0958903b0a27fc2ebaeed by @localai-bot in #5718
chore: ⬆️ Update leejet/stable-diffusion.cpp to 10c6501bd05a697e014f1bee3a84e5664290c489 by @localai-bot in #4925
chore: ⬆️ Update ggml-org/llama.cpp to 2bf9d539dd158345e3a3b096e16474af535265b4 by @localai-bot in #5724
chore: ⬆️ Update ggml-org/whisper.cpp to 4daf7050ca2bf17f5166f45ac6da651c4e33f293 by @localai-bot in #5725
Revert "chore: ⬆️ Update leejet/stable-diffusion.cpp to 10c6501bd05a697e014f1bee3a84e5664290c489" by @mudler in #5727
chore: ⬆️ Update ggml-org/llama.cpp to 8846aace4934ad29651ea61b8c7e3f6b0556e3d2 by @localai-bot in #5734
chore: ⬆️ Update ggml-org/whisper.cpp to 32cf4e2aba799aff069011f37ca025401433cf9f by @localai-bot in #5733

Other Changes

docs: ⬆️ update docs version mudler/LocalAI by @localai-bot in #5690
chore(ci): try to optimize disk space when tagging latest by @mudler in #5695
chore(ci): add stale bot by @mudler in #5700
Docs: Fix typos by @kilavvy in #5709

**Full...

Contributors

mudler, localai-bot, and kilavvy

Assets 10

19 Jun 15:55

mudler

v3.0.0

f9b968e

v3.0.0

🚀 LocalAI 3.0 – A New Era Begins

Say hello to LocalAI 3.0 — our most ambitious release yet!

We’ve taken huge strides toward making LocalAI not just local, but limitless. Whether you're building LLM-powered agents, experimenting with audio pipelines, or deploying multimodal backends at scale — this release is for you.

Let’s walk you through what’s new. (And yes, there’s a lot to love.)

TL;DR – What’s New in LocalAI 3.0.0 🎉

🧩 Backend Gallery: Install/remove backends on the fly, powered by OCI images — fully customizable and API-driven.
🎙️ Audio Support: Upload audio, PDFs, or text in the UI — plus new audio understanding models like Qwen Omni.
🌐 Realtime API: WebSocket support compatible with OpenAI clients, great for chat apps and agents.
🧠 Reasoning UI Boosts: Thinking indicators now show in chat for smart models.
📊 Dynamic VRAM Handling: Smarter GPU usage with automatic offloading.
🦙 Llama.cpp Upgrades: Now with reranking + multimodal via libmtmd.
📦 50+ New Models: Huge model gallery update with fresh LLMs across categories.
🐞 Bug Fixes: Streamed runes, template stability, better backend gallery UX.
❌ Deprecated: Extras images — replaced by the new backend system.

👉 Dive into the full changelog and docs below to explore more!

🧩 Introducing the Backend Gallery — Plug, Play, Power Up

No more hunting for dependencies or custom hacks.

With the new Backend Gallery, you can now:

Install & remove backends at runtime or startup via API or directly from the WebUI
Use custom galleries, just like you do for models
Enjoy zero-config access to the default LocalAI gallery

Backends are standard OCI images — portable, composable, and totally DIY-friendly. Goodbye to "extras images" — hello to full backend modularity, even with Python-based dependencies.

📖 Explore the Backend Gallery Docs

⚠️ Important: Breaking Changes

From this release we will stop pushing -extra images containing python backends. You can now use standard images, and you will have only to pick the ones that are suited for your GPU. Additional backends can be installed via the backend gallery.

Here below some examples, note that the CI is still publishing the images so won't be available until jobs are processed, and the installation scripts will be updated right after images are publicly available.

CPU only image:

docker run -ti --name local-ai -p 8080:8080 localai/localai:latest

NVIDIA GPU Images:

# CUDA 12
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12

# CUDA 11
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-11

# NVIDIA Jetson (L4T) ARM64
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64

AMD GPU Images (ROCm):

docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas

Intel GPU Images (oneAPI):

# Intel GPU with FP16 support
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f16

# Intel GPU with FP32 support
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f32

Vulkan GPU Images:

docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan

AIO Images (pre-downloaded models):

# CPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu

# NVIDIA CUDA 12 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12

# NVIDIA CUDA 11 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11

# Intel GPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel-f16

# AMD GPU version
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas

For more information about the AIO images and pre-downloaded models, see Container Documentation.

🧠 Smarter Reasoning, Smoother Chat

Realtime WebSocket API: OpenAI-style streaming support via WebSocket is here. Ideal for agents and chat apps.
"Thinking" Tags: Reasoning models now show a visual "thinking" box during inference in the UI. Intuitive and satisfying.

🧠 Model Power-Up: VRAM Savvy + Multimodal Brains

Dynamic VRAM Estimation: LocalAI now adapts and offloads layers depending on your GPU’s capabilities. Optimal performance, no guesswork.
Llama.cpp upgrades also includes:

reranking
Enhanced multimodal support via libmtmd

🧪 New Models!

More than 50 new models joined the gallery, including:

🧠 skywork-or1-32b, rivermind-lux-12b, qwen3-embedding-*, llama3-24b-mullein, ultravox-v0_5, and more
🧬 Multimodal, reasoning, and domain-specific LLMs for every need
📦 Browse the latest additions in the Model Gallery

🐞 Bugfixes & Polish

Rune streaming is now buttery smooth
Countless fixes across templates, inputs, CI, and realtime session updates
Backend gallery UI is more stable and informative

The Complete Local Stack for Privacy-First AI

With LocalAGI rejoining LocalAI alongside LocalRecall, our ecosystem provides a complete, open-source stack for private, secure, and intelligent AI operations:

LocalAI

The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI

LocalAGI

A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI

LocalRecall

A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall

Join the Movement! ❤️

A massive THANK YOU to our incredible community and our sponsors! LocalAI has over 33,300 stars, and LocalAGI has already rocketed past 750+ stars!

👉 Check out the reborn LocalAGI v2 today: https://github.com/mudler/LocalAGI

LocalAI 3.0.0 is here. What will you build next?

Full changelog 👇

👉 Click to expand 👈

What's Changed

Breaking Changes 🛠

feat: Add backend gallery by @mudler in #5607
chore(backends): move bark-cpp to the backend gallery by @mudler in #5682

Bug fixes 🐛

fix(ci): tag latest against cpu-only image by @mudler in #5362
fix(flux): Set CFG=1 so that prompts are followed by @richiejp in #5378
fix(template): we do not always have .Name by @mudler in #5508
fix(input): handle correctly case where we pass by string list as inputs by @mudler in #5521
fix(streaming): stream complete runes by @mudler in #5539
fix(install.sh): vulkan docker tag by @halkeye in #5589
fix(realtime): Use updated model on session update b...

Contributors

TheDarkTrumpet, halkeye, and 10 other contributors

Assets 10

1 Join discussion

12 May 20:31

mudler

v2.29.0

fd17a33

v2.29.0

I am thrilled to announce the release of LocalAI v2.29.0! This update focuses heavily on refining our container image strategy, making default images leaner and providing clearer options for users needing specific features or hardware acceleration. We've also added support for new models like Qwen3, enhanced existing backends, and introduced experimental endpoints, like video generation!

⚠️ Important: Breaking Changes

This release includes significant changes to container image tagging and contents. Please review carefully:

Python Dependencies Moved: Images containing extra Python dependencies (like those for diffusers) now require the -extras suffix (e.g., latest-gpu-nvidia-cuda-12-extras). Default images are now slimmer and do not include these dependencies.
FFmpeg is Now Standard: All core images now include FFmpeg. The separate -ffmpeg tags have been removed. If you previously used an -ffmpeg tagged image, simply switch to the corresponding base image tag (e.g., latest-gpu-hipblas-ffmpeg becomes latest-gpu-hipblas).

CPU only image:

docker run -ti --name local-ai -p 8080:8080 localai/localai:latest

NVIDIA GPU Images:

# CUDA 12.0 with core features
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12

# CUDA 12.0 with extra Python dependencies
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12-extras

# CUDA 11.7 with core features
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-11

# CUDA 11.7 with extra Python dependencies
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-11-extras

# NVIDIA Jetson (L4T) ARM64
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64

AMD GPU Images (ROCm):

# ROCm with core features
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas

# ROCm with extra Python dependencies
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas-extras

Intel GPU Images (oneAPI):

# Intel GPU with FP16 support
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f16

# Intel GPU with FP16 support and extra dependencies
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f16-extras

# Intel GPU with FP32 support
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f32

# Intel GPU with FP32 support and extra dependencies
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f32-extras

Vulkan GPU Images:

# Vulkan with core features
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan

AIO Images (pre-downloaded models):

# CPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu

# NVIDIA CUDA 12 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-12

# NVIDIA CUDA 11 version
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11

# Intel GPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel-f16

# AMD GPU version
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas

For more information about the AIO images and pre-downloaded models, see Container Documentation.

Key Changes in v2.29.0

📦 Container Image Overhaul

-extras Suffix: Images with additional Python dependencies are now identified by the -extras suffix.
Default Images: Standard tags (like latest, latest-gpu-nvidia-cuda-12) now provide core LocalAI functionality without the extra Python libraries.
FFmpeg Inclusion: FFmpeg is bundled in all images, simplifying setup for multimedia tasks.
New latest-* Tags: Added specific latest tags for various GPU architectures:
- latest-gpu-hipblas (AMD ROCm)
- latest-gpu-intel-f16 (Intel oneAPI FP16)
- latest-gpu-intel-f32 (Intel oneAPI FP32)
- latest-gpu-nvidia-cuda-12 (NVIDIA CUDA 12)
- latest-gpu-vulkan (Vulkan)

🚀 New Features & Enhancements

Qwen3 Model Support: Officially integrated support for the Qwen3 model family.
Experimental Auto GPU Offload: LocalAI can now attempt to automatically detect GPUs and configure optimal layer offloading for llama.cpp and CLIP.
Whisper.cpp GPU Acceleration: Updated whisper.cpp and enabled GPU support via cuBLAS (NVIDIA) and Vulkan. SYCL and Hipblas support are in progress.
Experimental Video Generation: Introduced a /video/generations endpoint. Stay tuned for compatible model backends!
Installer Uninstall Option: The install.sh script now includes a --uninstall flag for easy removal.
Expanded Hipblas Targets: Added support for a wider range of AMD GPU architectures. gfx803,gfx900,gfx906,gfx908,gfx90a,gfx942,gfx1010,gfx1030,gfx1032,gfx1100,gfx1101,gfx1102

🧹 Backend Updates

AutoGPTQ Backend Removed: This backend has been dropped due to being discontinued upstream.
llama.cpp experimental support to automatically detect GPU layers offloading.

The Complete Local Stack for Privacy-First AI

With LocalAGI rejoining LocalAI alongside LocalRecall, our ecosystem provides a complete, open-source stack for private, secure, and intelligent AI operations:

LocalAI

The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI

LocalAGI

A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI

LocalRecall

A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall

Join the Movement! ❤️

A massive THANK YOU to our incredible community! LocalAI has over 32,500 stars, and LocalAGI has already rocketed past 650+ stars!

As a reminder, LocalAI is real FOSS (Free and Open Source Software) and its sibling projects are community-driven and not backed by VCs or a company. We rely on contributors donating their spare time. If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!

👉 Check out the reborn LocalAGI v2 today: https://github.com/mudler/LocalAGI

Let's continue building the future of AI, together! 🙌

Full changelog 👇

👉 Click to expand 👈

What's Changed

Breaking Changes 🛠

chore(autogptq): drop archived backend by @mudler in #5214
chore(ci): build only images with ffmpeg included, simplify tags by @mudler in #5251
chore(ci): strip 'core' in the image suffix, identify python-based images with 'extras' by @mudler in #5353

Bug fixes 🐛

fix: bark-cpp: assign FLAG_TTS to bark-cpp backend by @M0Rf30 in #5186
fix(talk): Talk interface sends content-type headers to chatgpt by @baflo in #5200
fix: installation script compatibility with fedora 41 and later, fedora headless unclear errors by @Bloodis94 in #5239
fix(stablediffusion-ggml): Build with DSD CUDA, HIP and Metal ...

Contributors

wyattearp, M0Rf30, and 8 other contributors

Assets 10

0 Join discussion

15 Apr 20:41

mudler

v2.28.0

56f44d4

v2.28.0

🎉 LocalAI v2.28.0: New Look & The Rebirth of LocalAGI! 🎉

Our fresh new look!

Big news, everyone! Not only does LocalAI have a brand new logo, but we're also celebrating the full rebirth of LocalAGI, our powerful agent framework, now completely rewritten and ready to revolutionize your local AI workflows!

Rewinding the Clock: The Journey of LocalAI & LocalAGI

Two years ago, LocalAI emerged as a pioneer in the local AI inferencing space, offering an OpenAI-compatible API layer long before it became common. Around the same time, LocalAGI was born as an experiment in AI agent frameworks – you can even find the original announcement here! Originally built in Python, it inspired many with its local-first approach.

See LocalAGI (Original Python Version) in Action!

Searching the internet (interactive mode):

search.mp4

Planning a road trip (batch mode):

planner.mp4

That early experiment has now evolved significantly!

Introducing LocalAGI v2: The Agent Framework Reborn in Go!

We're thrilled to announce that LocalAGI has been rebuilt from the ground up in Golang! It's now a modern, robust AI Agent Orchestration Platform designed to work seamlessly with LocalAI. Huge thanks to the community, especially @richiejp, for jumping in and helping create a fantastic new WebUI!

LocalAGI leverages all the features that make LocalAI great for agentic tasks. During the refactor, we even spun out the memory layer into its own component: LocalRecall, a standalone REST API for persistent agent memory.

🚀 What Makes LocalAGI v2 Shine?

🎯 OpenAI Responses API Compatible: Integrates perfectly with LocalAI, acting as a drop-in replacement for cloud APIs, keeping your interactions local and secure.
🤖 Next-Gen AI Agent Orchestration: Easily configure, deploy, and manage teams of intelligent AI agents through an intuitive no-code web interface.
🛡️ Privacy-First by Design: Everything runs locally. Your data never leaves your hardware.
📡 Instant Integrations: Comes with built-in connectors for Slack, Telegram, Discord, GitHub Issues, IRC, and more.
⚡ Extensible and Multimodal: Supports multiple models (text, vision) and custom actions, perfectly complementing your LocalAI setup.

✨ Check out the new LocalAGI WebUI:

What's New Specifically in LocalAI v2.28.0?

Beyond the rebranding and the major LocalAGI news, this LocalAI release also brings its own set of improvements:

🖼️ SYCL Support: Added SYCL support for stablediffusion.cpp.
✨ WebUI Enhancements: Continued improvements to the user interface.
🧠 Diffusers Updated: Core diffusers library has been updated.
💡 Lumina Model Support: Now supports the Lumina model family for generating stunning images!
🐛 Bug Fixes: Resolved issues related to setting LOCALAI_SINGLE_ACTIVE_BACKEND to true.

The Complete Local Stack for Privacy-First AI

With LocalAGI rejoining LocalAI alongside LocalRecall, our ecosystem provides a complete, open-source stack for private, secure, and intelligent AI operations:

LocalAI

The free, Open Source OpenAI alternative. Acts as a drop-in replacement REST API compatible with OpenAI specifications for local AI inferencing. No GPU required.

Link: https://github.com/mudler/LocalAI

LocalAGI

A powerful Local AI agent management platform. Serves as a drop-in replacement for OpenAI's Responses API, supercharged with advanced agentic capabilities and a no-code UI.

Link: https://github.com/mudler/LocalAGI

LocalRecall

A RESTful API and knowledge base management system providing persistent memory and storage capabilities for AI agents. Designed to work alongside LocalAI and LocalAGI.

Link: https://github.com/mudler/LocalRecall

Join the Movement! ❤️

A massive THANK YOU to our incredible community! LocalAI has over 31,800 stars, and LocalAGI has already rocketed past 450+ stars!

As a reminder, LocalAI is real FOSS (Free and Open Source Software) and its sibling projects are community-driven and not backed by VCs or a company. We rely on contributors donating their spare time. If you love open-source, privacy-first AI, please consider starring the repos, contributing code, reporting bugs, or spreading the word!

👉 Check out the reborn LocalAGI v2 today: https://github.com/mudler/LocalAGI

Let's continue building the future of AI, together! 🙌

Full changelog 👇

👉 Click to expand 👈

What's Changed

Bug fixes 🐛

fix(stablediffusion): Avoid GGML commit which causes CUDA compile error by @richiejp in #5170

Exciting New Features 🎉

feat(loader): enhance single active backend by treating as singleton by @mudler in #5107

🧠 Models

chore(model gallery): add all-hands_openhands-lm-32b-v0.1 by @mudler in #5111
chore(model gallery): add burtenshaw_gemmacoder3-12b by @mudler in #5112
chore(model gallery): add all-hands_openhands-lm-7b-v0.1 by @mudler in #5113
chore(model gallery): add all-hands_openhands-lm-1.5b-v0.1 by @mudler in #5114
chore(model gallery): add gemma-3-12b-it-qat by @mudler in #5117
chore(model gallery): add gemma-3-4b-it-qat by @mudler in #5118
chore(model gallery): add tesslate_synthia-s1-27b by @mudler in #5119
chore(model gallery): add katanemo_arch-function-chat-7b by @mudler in #5120
chore(model gallery): add katanemo_arch-function-chat-1.5b by @mudler in #5121
chore(model gallery): add katanemo_arch-function-chat-3b by @mudler in #5122
chore(model gallery): add gemma-3-27b-it-qat by @mudler in #5124
chore(model gallery): add open-thoughts_openthinker2-32b by @mudler in #5128
chore(model gallery): add open-thoughts_openthinker2-7b by @mudler in #5129
chore(model gallery): add arliai_qwq-32b-arliai-rpr-v by @mudler in #5137
chore(model gallery): add watt-ai_watt-tool-70b by @mudler in #5138
chore(model gallery): add eurydice-24b-v2-i1 by @mudler in #5139
chore(model gallery): add mensa-beta-14b-instruct-i1 by @mudler in #5140
chore(model gallery): add meta-llama_llama-4-scout-17b-16e-instruct by @mudler in #5141
fix(gemma): improve prompt for tool calls by @mudler in #5142
chore(model gallery): add cogito-v1-preview-qwen-14b by @mudler in #5145
chore(model gallery): add deepcogito_cogito-v1-preview-llama-8b by @mudler in #5147
chore(model gallery): add...

Contributors

richiejp, mudler, and 3 other contributors

Assets 10

0 Join discussion

31 Mar 10:32

mudler

v2.27.0

6d7ac09

v2.27.0

🚀 LocalAI v2.27.0

Welcome to another exciting release of LocalAI v2.27.0! We've been working hard to bring you a fresh WebUI experience and a host of improvements under the hood. Get ready to explore new updates!

🔥 AIO Images Updates

Check out the updated models we're now shipping with our All-in-One images:

CPU All-in-One:

Text-to-Text: llama3.1
Embeddings: granite-embeddings
Vision: minicpm

GPU All-in-One:

Text-to-Text: localai-functioncall-qwen2.5-7b-v0.5 (our tiniest flagship model!)
Embeddings: granite-embeddings
Vision: minicpm

💻 WebUI Overhaul!

We've given the WebUI a brand-new look and feel. Have a look at the stunning new interface:

Talk Interface	Generate Audio

Models Overview	Generate Images

Chat Interface	API Overview

Login	Swarm

How to Use

To get started with LocalAI, you can use our container images. Here’s how to run them with Docker:

# CPU only image:
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-cpu

# Nvidia GPU:
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12

# CPU and GPU image (bigger size):
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest

# AIO images (pre-downloads a set of models ready for use, see https://localai.io/basics/container/)
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu

Check out our Documentation for more information.

Key Highlights:

Complete WebUI Redesign: A fresh, modern interface with enhanced navigation and visuals.
Model Gallery Improvements: Easier exploration with improved pagination and filtering.
AIO Image Updates: Smoother deployments with updated models.
Stability Fixes: Critical bug fixes in model initialization, embeddings handling, and GPU offloading.

What’s New 🎉

Chat Interface Enhancements: Cleaner layout, model-specific UI tweaks, and custom reply prefixes.
Smart Model Detection: Automatically links to relevant model documentation based on use.
Performance Tweaks: GGUF models now auto-detect context size, and Llama.cpp handles batch embeddings and SIGTERM gracefully.
VLLM Config Boost: Added options to disable logging, set dtype, and enforce per-prompt media limits.
New model architecture supported: Gemma 3, Mistral, Deepseek

Bug Fixes 🐛

Resolved model icon display inconsistencies.
Ensured proper handling of generated artifacts without API key restrictions.
Optimized CLIP offloading and Llama.cpp process termination.

Stay Tuned!

We have some incredibly exciting features and updates lined up for you. While we can't reveal everything just yet. Keep an eye out for our upcoming announcements – you won't want to miss them!

Do you like the new webui? let us know in the Github discussions!

Enjoy 🚀

Full changelog 👇

👉 Click to expand 👈

What's Changed

Bug fixes 🐛

fix: change initialization order of llama-cpp-avx512 to go before avx2 variant by @bhulsken in #4837
fix(coqui): pin transformers by @mudler in #4875
fix(ui): not all models have an Icon by @mudler in #4913
fix(models): unify usecases identifications by @mudler in #4914
fix(llama.cpp): correctly handle embeddings in batches by @mudler in #4957
fix(routes): do not gate generated artifacts via key by @mudler in #4971
fix(clip): do not imply GPU offload by default by @mudler in #5010
fix(llama.cpp): properly handle sigterm by @mudler in #5099

Exciting New Features 🎉

feat(ui): detect model usage and display link by @mudler in #4864
feat(vllm): Additional vLLM config options (Disable logging, dtype, and Per-Prompt media limits) by @TheDropZone in #4855
feat(ui): show only text models in the chat interface by @mudler in #4869
feat(ui): do also filter tts and image models by @mudler in #4871
feat(ui): paginate model gallery by @mudler in #4886
feat(ui): small improvements to chat interface by @mudler in #4907
feat(ui): improve chat interface by @mudler in #4910
feat(ui): improvements to index and models page by @mudler in #4918
feat: allow to specify a reply prefix by @mudler in #4931
feat(ui): complete design overhaul by @mudler in #4942
feat(ui): remove api key handling and small ui adjustments by @mudler in #4948
feat(aio): update AIO image defaults by @mudler in #5002
feat(gguf): guess default context size from file by @mudler in #5089

🧠 Models

chore(model gallery): add ozone-ai_0x-lite by @mudler in #4835
chore: update Image generation docs and examples by @mudler in #4841
chore(model gallery): add kubeguru-llama3.2-3b-v0.1 by @mudler in #4858
chore(model gallery): add allenai_llama-3.1-tulu-3.1-8b by @mudler in #4859
chore(model gallery): add nbeerbower_dumpling-qwen2.5-14b by @mudler in #4860
chore(model gallery): add nbeerbower_dumpling-qwen2.5-32b-v2 by @mudler in #4861
chore(model gallery): add nbeerbower_dumpling-qwen2.5-72b by @mudler in #4862
chore(model gallery): add pygmalionai_pygmalion-3-12b by @mudler in #4866
chore(model gallery): add open-r1_openr1-qwen-7b by @mudler in #4867
chore(model gallery): add sentientagi_dobby-unhinged-llama-3.3-70b by @mudler in #4868
chore(model gallery): add internlm_oreal-32b by @mudler in #4872
chore(model gallery): add internlm_oreal-deepseek-r1-distill-qwen-7b by @mudler in #4873
chore(model gallery): add internlm_oreal-7b by @mudler in #4874
chore(model gallery): add smirki_uigen-t1.1-qwen-14b by @mudler in #4877
chore(model gallery): add smirki_uigen-t1.1-qwen-7b by @mudler in #4878
chore(model gallery): add l3.1-8b-rp-ink by @mudler in #4879
chore(model gallery): add pocketdoc_dans-personalityengine-v1.2.0-24b by @mudler in #4880
chore(model gallery): add rombo-org_rombo-llm-v3.0-qwen-72b by @mudler in #4882
chore(model gallery): add ozone-ai_reverb-7b by @mudler in #4883
chore(model gallery): add arcee-ai_arcee-maestro-7b-preview by @mudler in #4884
chore(model gallery): add steelskull_l3.3-mokume-gane-r1-70b by @mudler in #4885
chore(model gallery): add steelskull_l3.3-cu-mai-r1-70b by @mudler in #4892
chore(model gallery): add steelskull_l3.3-san-mai-r1-70b by @mudler in https://git...

Contributors

mudler, bhulsken, and 4 other contributors

Assets 10

0 Join discussion

15 Feb 17:22

mudler

v2.26.0

09941c0

v2.26.0

🦙 LocalAI v2.26.0!

Hey everyone - very excited about this release!

It contains several cleanups, performance improvements and few breaking changes: old backends that are now superseded have been removed (for example, vall-e-x), while new backends have been added to expand the range of model architectures that LocalAI can support. While most of the changes are tested, if you encounter issues with the new backends or migrated ones please file a new issue.

We also now have support for Nvidia L4T devices (for example, Nvidia AGX Orin) with specific container images. See the documentation for more details.

⚠️ Breaking Changes ⚠️

Several backends have been dropped and replaced for improved performance and compatibility.
Vall-e-x and Openvoice were deprecated and dropped.
The stablediffusion-NCN backend was replaced with the stablediffusion-ggml implementation.
Deprecated llama-ggml backend has been dropped in favor of GGUF support.

Check all details!

Backends that were dropped:

Vall-e-x and Openvoice: These projects went silent, and there are better alternatives now. They have been completely superseded by the CoquiTTS community fork, Kokoro, and OutelTTS.
Stablediffusion-NCN: This was the first variant shipped with LocalAI based on the ONNX runtime. It has now been superseded by the stablediffusion-ggml backend, which offers similar capabilities and wider support across more architectures.
Llama-ggml backend: This was the pre-GGUF backend, which is now deprecated. Moving forward, LocalAI will support only GGUF models.

Notable Backend Changes:

Mamba has moved to the transformers backend.
Transformers-Musicgen has moved to the transformers backend.
Sentencetransformers has moved to the transformers backend.

While LocalAI will try to alias to the transformers backend automatically when using these backends, there might be incompatibilies with your configuration files. Please open an issue if you face any problem!

New Backends:

Kokoro (TTS): A new backend for text-to-speech.
OuteTTS: A TTS backend with voice cloning capabilities.
Fast-Whisper: A backend designed for faster whisper model inference.

New Features 🎉

Lazy grammars (llama.cpp): Added grammar triggers for llama.cpp: this allow models trained with specific tokens to enable grammar generation when such tokens are seen: this allows precise JSON generation but also consistent output when the model does not need to answer with a tool. For example, in the config file of the model triggers can be specified as such:

  function:
    grammar:
      triggers:
        word: "<tool_call>"
        at_start: true

Function Argument Parsing Using Named Regex: A new feature that allows parsing function arguments with named regular expressions, simplifying function calls.
Support for New Backends: Added Kokoro, OutelTTS, and Fast-Whisper backends.
Diffusers Update: Added support for Sana pipelines and image generation option overrides.
Machine Tag and Inference Timing: Allows tracking machine performance during inference.
Tokenization: Introduced tokenization support for llama.cpp to improve text processing.
AVX512: There is now bundled support for CPUs supporting AVX512 instruction set
Nvidia L4T: Support for Nvidia devices on arm64, for example Nvidia AGX Orin and alikes. See the documentation. TLDR; You can start container images ready to go with:

docker run -e DEBUG=true \
                    -p 8080:8080 \
                    -v $PWD/models:/build/models  \
                   -ti --restart=always --name local-ai \
                   --runtime nvidia --gpus all quay.io/go-skynet/local-ai:master-nvidia-l4t-arm64-core

Bug Fixes 🐛

Multiple fixes to improve stability, including enabling SYCL support for stablediffusion-ggml and consistent OpenAI stop reason returns.
Improved context shift handling for llama.cpp and fixed gallery store overrides.

🧠 Models:

I've fine-tuned a family of models based on o1-cot and function call datasets to work closely with all LocalAI features regarding function calling. The models are tailored to be conversational and execute function calls:

llama3.2-1b version: https://huggingface.co/mudler/LocalAI-functioncall-llama3.2-1b-v0.4
llama3.2-3b version: https://huggingface.co/mudler/LocalAI-functioncall-llama3.2-3b-v0.5
phi-4 version: https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.3
qwen2.5 (7b) version: https://huggingface.co/mudler/LocalAI-functioncall-qwen2.5-7b-v0.5

Enjoy! All the models are available in the LocalAI gallery:

local-ai run LocalAI-functioncall-phi-4-v0.3
local-ai run LocalAI-functioncall-llama3.2-1b-v0.4
local-ai run LocalAI-functioncall-llama3.2-3b-v0.5
local-ai run localai-functioncall-qwen2.5-7b-v0.5

Other models

Numerous model updates and additions:

New models like nightwing3-10b, rombos-qwen2.5-writer, and negative_llama_70b.
Updated checksum for model galleries.
Added icons and improved prompt templates for various models.
Expanded model gallery with new additions like DeepSeek-R1, Mistral-small-24b, and more.

Full changelog 👇

👉 Click to expand 👈

Breaking Changes 🛠

chore(vall-e-x): Drop backend by @mudler in #4619
feat(transformers): merge musicgen functionalities to a single backend by @mudler in #4620
feat(transformers): merge sentencetransformers backend by @mudler in #4624
chore(stablediffusion-ncn): drop in favor of ggml implementation by @mudler in #4652
feat(transformers): add support to Mamba by @mudler in #4669
chore(openvoice): drop backend by @mudler in #4673
chore: drop embedded models by @mudler in #4715
chore(llama-ggml): drop deprecated backend by @mudler in #4775
fix(llama.cpp): disable mirostat as default by @mudler in #2911

Bug fixes 🐛

fix(stablediffusion-ggml): correctly enable sycl by @mudler in #4591
fix(stablediffusion-ggml): enable oneapi before build by @mudler in #4593
fix(docs): add missing -core suffix to sycl images by @M0Rf30 in #4630
fix(stores): Stores fixes and testing by @richiejp in #4663
fix(gallery): do not return overrides and additional config by @mudler in #4768
fix(openai): consistently return stop reason by @mudler in #4771
fix(llama.cpp): improve context shift handling by @mudler in #4820

Exciting New Features 🎉

feat(stablediffusion-ggml): respect build type by @mudler in #4581
feat(diffusers): add support for Sana pipelines by @mudler in #4603
feat(tts): Add Kokoro backend by @mudler in #4616
feat: add machine tag and inference timings by @mintyleaf in #4577
feat(transformers): add support to OuteTTS by @mudler in #4622
Extra-Usage and Machine-Tag docs by @mintyleaf in #4627
chore: fix some function names in comment by @petercover in #4665
feat(faster-whisper): add backend by @mudler in #4666
chore: detect and enable avx512 builds by @mudler in #4675
chore(downloader): support hf.co and hf:// URIs by @mudler in #4677
feat: function argument parsing using named regex by @mKenfenheuer in #4700
feat(llama.cpp): Add support to grammar triggers by @mudler in #4733
feat: tokenization with llama.cpp by @shraddhazpy in #4724
feat(diffusers): allow to override image gen options by @mudler in #4807

🧠 Models

chore(model-gallery): ⬆️ update checksum by @localai-bot in #4580
chore(model gallery): add nightwing3-10b-v0.1 by @mudler in #4582
chore(model gallery): add qwq-32b-preview-ideawhiz-v1 by @mudler in #4583
chore(...

Contributors

M0Rf30, richiejp, and 8 other contributors

Assets 10

10 Jan 22:02

mudler

v2.25.0

07655c0

v2.25.0

What's Changed

Bug fixes 🐛

chore(llava): update clip.patch by @mudler in #4453

Exciting New Features 🎉

feat(llama.cpp): expose cache_type_k and cache_type_v for quant of kv cache by @mudler in #4329
feat(template): read jinja templates from gguf files by @mudler in #4332
feat: stream tokens usage by @mintyleaf in #4415
feat(Dockerfile): allow to skip driver installation by @mudler in #4447
feat(ui): path prefix support via HTTP header by @mgoltzsche in #4497
feat(dowloader): resume partial downloads by @Saavrm26 in #4537

🧠 Models

chore(model gallery): add rp-naughty-v1.0c-8b by @mudler in #4322
chore(model gallery): add loki-v2.6-8b-1024k by @mudler in #4321
chore(model gallery): add math-iio-7b-instruct by @mudler in #4323
chore(model gallery): add llama-3.3-70b-instruct by @mudler in #4333
chore(model gallery): add mn-chunky-lotus-12b by @mudler in #4337
chore(model gallery): add virtuoso-small by @mudler in #4338
chore(model gallery): add bio-medical-llama-3-8b by @mudler in #4339
chore(model gallery): add qwen2.5-7b-homeranvita-nerdmix by @mudler in #4343
chore(model gallery): add impish_mind_8b by @mudler in #4344
chore(model gallery): add tulu-3.1-8b-supernova-smart by @mudler in #4347
chore(model gallery): add qwen2.5-math-14b-instruct by @mudler in #4355
chore(model gallery): add intellect-1-instruct by @mudler in #4356
chore(model gallery): add b-nimita-l3-8b-v0.02 by @mudler in #4357
chore(model gallery): add sailor2-1b-chat by @mudler in #4363
chore(model gallery): add sailor2-8b-chat by @mudler in #4364
chore(model gallery): add sailor2-20b-chat by @mudler in #4365
chore(model gallery): add 72b-qwen2.5-kunou-v1 by @mudler in #4369
chore(model gallery): add deepthought-8b-llama-v0.01-alpha by @mudler in #4370
chore(model gallery): add l3.3-70b-euryale-v2.3 by @mudler in #4371
chore(model gallery): add l3.3-ms-evayale-70b by @mudler in #4374
chore(model gallery): add evathene-v1.3 by @mudler in #4375
chore(model gallery): add hermes-3-llama-3.2-3b by @mudler in #4376
chore(model gallery): add fusechat-gemma-2-9b-instruct by @mudler in #4379
chore(model gallery): add fusechat-qwen-2.5-7b-instruct by @mudler in #4380
chore(model gallery): add chronos-gold-12b-1.0 by @mudler in #4381
fix: correct gallery/index.yaml by @godsey in #4384
chore(model gallery): add fusechat-llama-3.2-3b-instruct by @mudler in #4386
chore(model gallery): add fusechat-llama-3.1-8b-instruct by @mudler in #4387
chore(model gallery): add neumind-math-7b-instruct by @mudler in #4388
chore(model gallery): add naturallm-7b-instruct by @mudler in #4392
chore(model gallery): add marco-o1-uncensored by @mudler in #4393
chore(model gallery): add qwen2-7b-multilingual-rp by @mudler in #4394
chore(model gallery): add qwq-lcot-7b-instruct by @mudler in #4419
chore(model gallery): add llama-openreviewer-8b by @mudler in #4422
chore(model gallery): add falcon3-1b-instruct by @mudler in #4423
chore(model gallery): add falcon3-3b-instruct by @mudler in #4424
chore(model gallery): add qwen2-vl-72b-instruct by @mudler in #4425
chore(model gallery): add falcon3-10b-instruct by @mudler in #4426
chore(model gallery): add llama-song-stream-3b-instruct by @mudler in #4431
chore(model gallery): add llama-chat-summary-3.2-3b by @mudler in #4432
chore(model gallery): add tq2.5-14b-aletheia-v1 by @mudler in #4440
chore(model gallery): add tq2.5-14b-neon-v1 by @mudler in #4441
chore(model gallery): add orca_mini_v8_1_70b by @mudler in #4444
chore(model gallery): add anubis-70b-v1 by @mudler in #4446
chore(model gallery): add llama-3.3-70b-instruct-ablated by @mudler in #4448
chore(model-gallery): ⬆️ update checksum by @localai-bot in #4487
chore(model gallery): add l3.3-ms-evalebis-70b by @mudler in #4488
chore(model gallery): add tqwendo-36b by @mudler in #4489
chore(model gallery): add rombos-llm-70b-llama-3.3 by @mudler in #4490
chore(model-gallery): ⬆️ update checksum by @localai-bot in #4492
chore(model gallery): add fastllama-3.2-1b-instruct by @mudler in #4493
chore(model gallery): add dans-personalityengine-v1.1.0-12b by @mudler in #4494
chore(model gallery): add llama-3.1-8b-open-sft by @mudler in #4495
chore(model gallery): add qvq-72b-preview by @mudler in #4498
chore(model gallery): add teleut-7b-rp by @mudler in #4499
chore(model gallery): add falcon3-1b-instruct-abliterated by @mudler in #4501
chore(model gallery): add falcon3-3b-instruct-abliterated by @mudler in #4502
chore(model gallery): add falcon3-10b-instruct-abliterated by @mudler in #4503
chore(model gallery): add falcon3-7b-instruct-abliterated by @mudler in #4504
chore(model gallery): add control-nanuq-8b by @mudler in #4506
chore(model gallery): add miscii-14b-1028 by @mudler in #4507
chore(model gallery): add miscii-14b-1225 by @mudler in #4508
chore(model gallery): add qwen2.5-32b-rp-ink by @mudler in #4517
chore(model gallery): add huatuogpt-o1-8b by @mudler in #4518
chore(model gallery): add q2.5-veltha-14b-0.5 by @mudler in #4519
chore(model gallery): add smallthinker-3b-preview by @mudler in #4521
chore(model gallery): add mn-12b-mag-mell-r1-iq-arm-imatrix by @mudler in #4522
chore(model gallery): add captain-eris-diogenes_twilight-v0.420-12b by @mudler in #4523
chore(model gallery): add violet_twilight-v0.2 by @mudler in #4524
chore(model gallery): add qwenwify2.5-32b-v4.5 by @mudler in #4525
chore(model gallery): add sainemo-remix by @mudler in #4526
chore(model gallery): add l3.1-purosani-2-8b by @mudler in #4527
chore(model gallery): add nera_noctis-12b by @mudler in #4530
chore(model gallery): add drt-o1-7b by @mudler in #4533
chore(model gallery): add codepy-deepthink-3b by @mudler in #4534
chore(model gallery): add llama3.1-8b-prm-deepseek-data by @mudler in #4535
chore(model gallery): add experimental-lwd-mirau-rp-14b-iq-imatrix by @mudler in #4539
chore(model gallery): add llama-deepsync-3b by @mudler in #4540
chore(model gallery): add qwentile2.5-32b-instruct by @mudler in #4541
chore(model gallery): add 32b-qwen2.5-kunou-v1 by @mudler in #4545
chore(model gallery): add triangulum-10b by @mudler in #4546
chore(model gallery): add 14b-qwen2.5-kunou-v1 by @mudler in #4547
chore(model gallery): add dolphin3.0-llama3.1-8b by @mudler in https://github.com/mudl...

Contributors

jtwolfe, mudler, and 7 other contributors

Assets 11

10 Dec 14:52

mudler

v2.24.2

59cf30a

v2.24.2

What's Changed

👒 Dependencies

chore: ⬆️ Update ggerganov/llama.cpp to 26a8406ba9198eb6fdd8329fa717555b4f77f05f by @mudler in #4358

Full Changelog: v2.24.1...v2.24.2

Contributors

mudler

Assets 11

08 Dec 16:53

mudler

v2.24.1

184fbc2

v2.24.1

This is a patch release to fix #4334

Full Changelog: v2.24.0...v2.24.1

Assets 11

Uh oh!

Releases: mudler/LocalAI

v3.1.1

What's Changed

Bug fixes 🐛

Exciting New Features 🎉

👒 Dependencies

Other Changes

Contributors

Uh oh!

v3.1.0

🚀 LocalAI 3.1

🚀 Highlights

Support for Gemma 3n!

⚠️ Breaking Changes

🧰 Container Image Changes

📁 Directory Structure Updated

🏷 Unified Image Tag Naming for master (development) builds

Meta packages in backend galleries

The Complete Local Stack for Privacy-First AI

LocalAI

LocalAGI

LocalRecall

Join the Movement! ❤️

Full changelog 👇

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

Exciting New Features 🎉

🧠 Models

👒 Dependencies

Other Changes

Contributors

Uh oh!

v3.0.0

🚀 LocalAI 3.0 – A New Era Begins

TL;DR – What’s New in LocalAI 3.0.0 🎉

🧩 Introducing the Backend Gallery — Plug, Play, Power Up

⚠️ Important: Breaking Changes

CPU only image:

NVIDIA GPU Images:

AMD GPU Images (ROCm):

Intel GPU Images (oneAPI):

Vulkan GPU Images:

AIO Images (pre-downloaded models):

🧠 Smarter Reasoning, Smoother Chat

🧠 Model Power-Up: VRAM Savvy + Multimodal Brains

🧪 New Models!

🐞 Bugfixes & Polish

The Complete Local Stack for Privacy-First AI

LocalAI

LocalAGI

LocalRecall

Join the Movement! ❤️

Full changelog 👇

What's Changed

Breaking Changes 🛠

Bug fixes 🐛

Contributors

Uh oh!

v2.29.0

v2.29.0

⚠️ Important: Breaking Changes

CPU only image:

NVIDIA GPU Images:

AMD GPU Images (ROCm):

Intel GPU Images (oneAPI):

Vulkan GPU Images:

AIO Images (pre-downloaded models):

Key Changes in v2.29.0

📦 Container Image Overhaul

🚀 New Features & Enhancements

🧹 Backend Updates

The Complete Local Stack for Privacy-First AI

LocalAI

LocalAGI

LocalRecall

Join the Movement! ❤️

Full changelog 👇

What's Changed

🏷 Unified Image Tag Naming for `master` (development) builds