⚠️ Notice
This release includes significant changes; please review them and adjust your configuration to fit your specific use case.
- The default parallelism has been increased from 1 to 4, which may increase VRAM usage (see the configuration sketch after this list). (#3832)
- Introduce a new embedding kind, `llama.cpp/before_b4356_embedding`, for llamafile and other embedding services that use the legacy llama.cpp embedding API (a configuration sketch follows this list). (#3828)
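If the new default strains your GPU, you can likely pin parallelism back to its previous value in Tabby's `config.toml`. The sketch below is illustrative only: both the section and the `parallelism` key are assumptions, not confirmed setting names; check the documentation accompanying #3832 for the actual knob.

```toml
# ~/.tabby/config.toml -- illustrative sketch, NOT confirmed syntax.
# The section and key below are assumptions; consult the docs for #3832.
[model.embedding.local]
# Hypothetical knob: restore the previous default of 1 if the new
# default of 4 exhausts VRAM on your GPU.
parallelism = 1
```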
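The new kind slots into the existing HTTP embedding configuration. A minimal sketch, assuming the documented `[model.embedding.http]` layout and a llamafile server listening on its default port:

```toml
# ~/.tabby/config.toml
[model.embedding.http]
# Select the legacy (pre-b4356) llama.cpp embedding API, as exposed by
# llamafile and older llama.cpp servers.
kind = "llama.cpp/before_b4356_embedding"
# Adjust to wherever your llamafile / llama.cpp server is listening.
api_endpoint = "http://localhost:8080"
```

Servers running a llama.cpp build at or after b4356 can keep using the existing `llama.cpp/embedding` kind.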
🚀 Features
- Expose the Answer Engine's thinking process in thread message answers. (#3785) (#3672)
- Enable the Answer Engine to access the repository's directory file list as needed. (#3796)
- Enable the use of `@` to mention a symbol in the Chat Sidebar. (#3778)
- Provide repository-aware default question recommendations in the Answer Engine. (#3815)
🧰 Fixes and Improvements
- Provide a configuration option to truncate text content before dispatching it to the embedding service (see the sketch after this list). (#3816)
- Bump llama.cpp version to b4651. (#3798)
- Automatically retry embedding requests when the service fails intermittently due to llama.cpp issues. (#3805)
- Enhance the user interface experience for Answer Engine. (#3845) (#3794)
- Resolve the deserialization issue related to `finish_reason` in chat responses from the LiteLLM Proxy Server. (#3882)
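Regarding the truncation option above: a minimal sketch of how such a limit might look in `config.toml`. The key name `max_input_length` and its placement are assumptions for illustration, not confirmed names from #3816.

```toml
# Illustrative sketch only -- section and key names are assumptions.
[model.embedding.local]
# Hypothetical limit: truncate each chunk to this many characters
# before it is dispatched to the embedding service.
max_input_length = 4096
```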
💫 New Contributors
@zhanba made their first contribution in #3675
@faceCutWall made their first contribution in #3812
Full Changelog: v0.24.0...v0.25.0