6.3.1 #14721
DevinTDHa announced in Announcement
📢 Spark NLP 6.3.1: LLM Backend Upgrade and Document Processing Improvements
Spark NLP 6.3.1 focuses on strengthening distributed local LLM inference by upgrading the jsl-llamacpp backend to a newer llama.cpp release, while also delivering important improvements in document structure handling, ONNX model compatibility, and metadata consistency.
🔥 Highlights
Upgraded the jsl-llamacpp backend to llama.cpp tag b7247, bringing upstream performance improvements, stability fixes, and expanded model compatibility for local LLM inference.
Reader2Image integration with AutoGGUFVisionModel
🚀 New Features & Enhancements
LLM Backend Upgrade (llama.cpp)
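The jsl-llamacpp backend has been upgraded to llama.cpp tag b7247.
This upgrade brings upstream llama.cpp performance improvements, stability fixes, and expanded model compatibility for local LLM inference.
This is the most impactful change in this release for users running LLM workloads locally or in restricted environments.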
Structural Metadata for Document Readers
All supported document readers now store structural position metadata for tables and images. Newly added metadata fields include:
domPath
orderTableIndex
orderImageIndex
These additions enable layout-aware downstream processing and more precise document understanding, especially for HTML and rich document formats.
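As a rough illustration of layout-aware post-processing, the sketch below restores document order for tables and images from these fields. Only the key names domPath, orderTableIndex, and orderImageIndex come from this release; the sample elements, the dict layout, the helper name in_document_order, and the assumption that the index values arrive as numeric strings are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical reader output, already collected to plain Python dicts.
# The values and the overall shape are illustrative, not taken from Spark NLP.
SAMPLE_ELEMENTS: List[Dict[str, Any]] = [
    {"result": "table B",
     "metadata": {"domPath": "/html/body/div[2]/table[1]", "orderTableIndex": "1"}},
    {"result": "figure 1",
     "metadata": {"domPath": "/html/body/div[1]/img[1]", "orderImageIndex": "0"}},
    {"result": "table A",
     "metadata": {"domPath": "/html/body/div[1]/table[1]", "orderTableIndex": "0"}},
    {"result": "intro text", "metadata": {}},
]


def in_document_order(elements: List[Dict[str, Any]], index_key: str) -> List[Dict[str, Any]]:
    """Keep elements that carry index_key and sort them by its integer value."""
    selected = [e for e in elements if index_key in e["metadata"]]
    # Assumption: the index is stored as a numeric string, so cast before sorting.
    return sorted(selected, key=lambda e: int(e["metadata"][index_key]))


tables = in_document_order(SAMPLE_ELEMENTS, "orderTableIndex")
images = in_document_order(SAMPLE_ELEMENTS, "orderImageIndex")
print([t["result"] for t in tables])  # ['table A', 'table B']
print([i["metadata"]["domPath"] for i in images])
```

Elements without the requested key are simply skipped, so the same helper can be applied to mixed text, table, and image output.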
Reader2Image Integration with AutoGGUFVisionModel
Reader2Image now supports interoperability with AutoGGUFVisionModel by introducing flexible handling of encoded vs. decoded image bytes and optional prompt output.
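To make the encoded vs. decoded distinction concrete, here is a minimal, library-independent sketch. It does not show how Spark NLP stores images internally; the helper names looks_encoded and decoded_size are hypothetical, and only the PNG and JPEG file signatures are standard facts.

```python
# Standard file signatures: encoded image files start with a format header.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def looks_encoded(image_bytes: bytes) -> bool:
    """Return True if the buffer starts with a PNG or JPEG file signature."""
    return image_bytes.startswith(PNG_SIGNATURE) or image_bytes.startswith(JPEG_SIGNATURE)


def decoded_size(width: int, height: int, channels: int = 3) -> int:
    """Size in bytes of a decoded pixel matrix with one byte per channel."""
    return width * height * channels


# A decoded 2x2 RGB image is just raw pixel values, with no file header.
raw_pixels = bytes(decoded_size(2, 2))
# Truncated stand-in for an encoded PNG file (signature only, not a valid image).
encoded_stub = PNG_SIGNATURE + b"..."

print(looks_encoded(encoded_stub))  # True
print(looks_encoded(raw_pixels))    # False
```

Encoded bytes stay compressed and need the consumer to decode them, while a decoded matrix grows with width, height, and channel count and can be consumed directly as pixels.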
useEncodedImageBytes controls whether the image result stores:
true: Encoded (compressed) file bytes for models like AutoGGUFVisionModel
false: Decoded pixel matrix for models such as Qwen2VLTransformer
Platform Setup Documentation
Added official documentation and instructions for setting up and running Spark NLP on Microsoft Fabric, simplifying configuration and improving developer onboarding on the platform. You can find them on the Spark NLP - Installation page.
🐛 Bug Fixes
DocumentAssembler outputs when using LightPipeline.
ResourceDownloader could fail under certain conditions.
BertEmbeddings models with non-standard output tensor names.
❤️ Community Support
💻 Installation
Python
Spark Packages
CPU
GPU
Apple Silicon
AArch64
Maven
Supported on Apache Spark 3.x.
spark-nlp
spark-nlp-gpu
spark-nlp-silicon
spark-nlp-aarch64
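The four artifact names above can be turned into Maven-style coordinates with a small helper. This is a sketch under stated assumptions: the group ID com.johnsnowlabs.nlp and the _2.12 Scala suffix are not given in this announcement and should be checked against the installation page, and the helper name maven_coordinate and the variant keys are hypothetical.

```python
SPARK_NLP_VERSION = "6.3.1"
GROUP_ID = "com.johnsnowlabs.nlp"  # assumption: not stated in this announcement
SCALA_SUFFIX = "_2.12"             # assumption: Scala 2.12 build for Spark 3.x

# Artifact names as listed in the Maven section of this release.
ARTIFACTS = {
    "cpu": "spark-nlp",
    "gpu": "spark-nlp-gpu",
    "silicon": "spark-nlp-silicon",
    "aarch64": "spark-nlp-aarch64",
}


def maven_coordinate(variant: str, version: str = SPARK_NLP_VERSION) -> str:
    """Build a group:artifact:version coordinate for the chosen variant."""
    if variant not in ARTIFACTS:
        raise ValueError(f"unknown variant {variant!r}; expected one of {sorted(ARTIFACTS)}")
    return f"{GROUP_ID}:{ARTIFACTS[variant]}{SCALA_SUFFIX}:{version}"


print(maven_coordinate("gpu"))  # com.johnsnowlabs.nlp:spark-nlp-gpu_2.12:6.3.1
```

A coordinate in this form is what Spark expects in the spark.jars.packages setting or the --packages command-line option.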
FAT JARs
What's Changed
Full Changelog: 6.3.0...6.3.1
This discussion was created from the release 6.3.1.