
Commit 978f55b

[SPARKNLP-1091] AutoGGUFModel embeddings support (#14433)
* Split HasLlamaCppProperties to HasLlamaCppModelProperties and HasLlamaCppInferenceProperties
* Refactor automatic gpu support
* [SPARKNLP-1091] AutoGGUFEmbeddings scala side
  - also adds embedding support for AutoGGUFModel, as it already has the parameter
* [SPARKNLP-1091] AutoGGUFEmbeddings python side
* [SPARKNLP-1091] Update Documentation
* [SPARKNLP-1091] Update AutoGGUFEmbeddingsTests
* [SPARKNLP-1091] AutoGGUFEmbeddings python side
  - Also adds example notebook
* [SPARKNLP-1080] AutoGGUFEmbeddings change default pretrained model
* [SPARKNLP-1091] AutoGGUFEmbeddings Set Defaults

---------

Co-authored-by: Maziyar Panahi <maziyar.panahi@iscpif.fr>
1 parent 6d3b273 commit 978f55b

19 files changed, +3025 -1361 lines changed

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
+{%- capture title -%}
+AutoGGUFEmbeddings
+{%- endcapture -%}
+
+{%- capture description -%}
+Annotator that uses the llama.cpp library to generate text embeddings with large language
+models.
+
+The type of embedding pooling can be set with the `setPoolingType` method. The default is
+`"MEAN"`. The available options are `"NONE"`, `"MEAN"`, `"CLS"`, and `"LAST"`.
+
+If the parameters are not set, the annotator will default to use the parameters provided by
+the model.
+
+Pretrained models can be loaded with `pretrained` of the companion object:
+
+```scala
+val autoGGUFEmbeddings = AutoGGUFEmbeddings.pretrained()
+  .setInputCols("document")
+  .setOutputCol("embeddings")
+```
+
+The default model is `"nomic-embed-text-v1.5.Q8_0.gguf"`, if no name is provided.
+
+For available pretrained models please see the [Models Hub](https://sparknlp.org/models).
+
+For extended examples of usage, see the
+[AutoGGUFEmbeddingsTest](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/test/scala/com/johnsnowlabs/nlp/annotators/seq2seq/AutoGGUFEmbeddingsTest.scala)
+and the
+[example notebook](https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/llama.cpp/llama.cpp_in_Spark_NLP_AutoGGUFEmbeddings.ipynb).
+
+**Note**: To use GPU inference with this annotator, make sure to use the Spark NLP GPU package and set
+the number of GPU layers with the `setNGpuLayers` method.
+
+When using larger models, we recommend adjusting GPU usage with `setNCtx` and `setNGpuLayers`
+according to your hardware to avoid out-of-memory errors.
+{%- endcapture -%}
+
+{%- capture input_anno -%}
+DOCUMENT
+{%- endcapture -%}
+
+{%- capture output_anno -%}
+SENTENCE_EMBEDDINGS
+{%- endcapture -%}
+
+{%- capture python_example -%}
+>>> import sparknlp
+>>> from sparknlp.base import *
+>>> from sparknlp.annotator import *
+>>> from pyspark.ml import Pipeline
+>>> document = DocumentAssembler() \
+...     .setInputCol("text") \
+...     .setOutputCol("document")
+>>> autoGGUFEmbeddings = AutoGGUFEmbeddings.pretrained() \
+...     .setInputCols(["document"]) \
+...     .setOutputCol("embeddings") \
+...     .setBatchSize(4) \
+...     .setNGpuLayers(99) \
+...     .setPoolingType("MEAN")
+>>> pipeline = Pipeline().setStages([document, autoGGUFEmbeddings])
+>>> data = spark.createDataFrame([["The moons of Jupiter are 77 in total, with 79 confirmed natural satellites and 2 man-made ones."]]).toDF("text")
+>>> result = pipeline.fit(data).transform(data)
+>>> result.select("embeddings.embeddings").show(1, truncate=80)
++--------------------------------------------------------------------------------+
+|                                                                      embeddings|
++--------------------------------------------------------------------------------+
+|[[-0.034486726, 0.07770534, -0.15982522, -0.017873349, 0.013914132, 0.0365736...|
++--------------------------------------------------------------------------------+
+{%- endcapture -%}
+
+{%- capture scala_example -%}
+import com.johnsnowlabs.nlp.base._
+import com.johnsnowlabs.nlp.annotator._
+import org.apache.spark.ml.Pipeline
+import spark.implicits._
+
+val document = new DocumentAssembler().setInputCol("text").setOutputCol("document")
+
+val autoGGUFEmbeddings = AutoGGUFEmbeddings
+  .pretrained()
+  .setInputCols("document")
+  .setOutputCol("embeddings")
+  .setBatchSize(4)
+  .setPoolingType("MEAN")
+
+val pipeline = new Pipeline().setStages(Array(document, autoGGUFEmbeddings))
+
+val data = Seq(
+  "The moons of Jupiter are 77 in total, with 79 confirmed natural satellites and 2 man-made ones.")
+  .toDF("text")
+val result = pipeline.fit(data).transform(data)
+result.select("embeddings.embeddings").show(1, truncate=80)
++--------------------------------------------------------------------------------+
+|                                                                      embeddings|
++--------------------------------------------------------------------------------+
+|[[-0.034486726, 0.07770534, -0.15982522, -0.017873349, 0.013914132, 0.0365736...|
++--------------------------------------------------------------------------------+
+{%- endcapture -%}
+
+{%- capture api_link -%}
+[AutoGGUFEmbeddings](/api/com/johnsnowlabs/nlp/embeddings/AutoGGUFEmbeddings)
+{%- endcapture -%}
+
+{%- capture python_api_link -%}
+[AutoGGUFEmbeddings](/api/python/reference/autosummary/sparknlp/annotator/embeddings/auto_gguf_embeddings/index.html)
+{%- endcapture -%}
+
+{%- capture source_link -%}
+[AutoGGUFEmbeddings](https://github.com/JohnSnowLabs/spark-nlp/tree/master/src/main/scala/com/johnsnowlabs/nlp/embeddings/AutoGGUFEmbeddings.scala)
+{%- endcapture -%}
+
+{% include templates/anno_template.md
+title=title
+description=description
+input_anno=input_anno
+output_anno=output_anno
+python_example=python_example
+scala_example=scala_example
+api_link=api_link
+python_api_link=python_api_link
+source_link=source_link
+%}
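
Note on the pooling options documented in the new file above: the description lists `"NONE"`, `"MEAN"`, `"CLS"`, and `"LAST"` without saying what each one computes. The sketch below is a plain-Python illustration of the conventional meaning of those names (average over tokens, first token, last token, or no reduction), assuming per-token vectors are already available. The `pool` helper is hypothetical and is not part of Spark NLP or llama.cpp; the annotator performs pooling internally.

```python
from typing import List, Union

Vector = List[float]


def pool(token_embeddings: List[Vector], pooling_type: str = "MEAN") -> Union[Vector, List[Vector]]:
    """Illustrative reduction of per-token vectors (hypothetical helper, not a library API)."""
    if not token_embeddings:
        raise ValueError("token_embeddings must contain at least one vector")

    kind = pooling_type.upper()
    if kind == "NONE":
        # No reduction: one vector per token is returned.
        return [list(v) for v in token_embeddings]
    if kind == "CLS":
        # The first token's vector represents the whole sequence.
        return list(token_embeddings[0])
    if kind == "LAST":
        # The last token's vector represents the whole sequence.
        return list(token_embeddings[-1])
    if kind == "MEAN":
        # Element-wise average over all token vectors.
        count = len(token_embeddings)
        dim = len(token_embeddings[0])
        return [sum(v[i] for v in token_embeddings) / count for i in range(dim)]
    raise ValueError(f"unknown pooling type: {pooling_type}")


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mean_vec = pool(tokens, "MEAN")
cls_vec = pool(tokens, "CLS")
last_vec = pool(tokens, "LAST")
all_vecs = pool(tokens, "NONE")
```

With `"NONE"` the result keeps one vector per token, so its shape differs from the other three options, which each return a single vector for the input.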
File renamed without changes.

docs/en/annotators.md

Lines changed: 1 addition & 0 deletions
@@ -45,6 +45,7 @@ There are two types of Annotators:
 {:.table-model-big}
 |Annotator|Description|Version |
 |---|---|---|
+{% include templates/anno_table_entry.md path="" name="AutoGGUFEmbeddings" summary="Annotator that uses the llama.cpp library to generate text embeddings with large language models."%}
 {% include templates/anno_table_entry.md path="" name="AutoGGUFModel" summary="Annotator that uses the llama.cpp library to generate text completions with large language models."%}
 {% include templates/anno_table_entry.md path="" name="BGEEmbeddings" summary="Sentence embeddings using BGE."%}
 {% include templates/anno_table_entry.md path="" name="BigTextMatcher" summary="Annotator to match exact phrases (by token) provided in a file against a Document."%}

examples/python/llama.cpp/PromptAssember_with_AutoGGUFModel.ipynb

Lines changed: 3 additions & 2 deletions
@@ -251,7 +251,7 @@
 "provenance": []
 },
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "sparknlp_dev",
 "language": "python",
 "name": "python3"
 },
@@ -264,7 +264,8 @@
 "mimetype": "text/x-python",
 "name": "python",
 "nbconvert_exporter": "python",
-"pygments_lexer": "ipython3"
+"pygments_lexer": "ipython3",
+"version": "3.10.12"
 }
 },
 "nbformat": 4,
