C# implementation of Sentence Transformers All-MiniLM-L6-v2
Use it as a .NET Standard 2.1 library.
Includes a tokenizer and an ONNX model.
The NuGet package does not include the ONNX model or vocab.txt. Both can be downloaded from Hugging Face (see the tested models below).
By default, the Embedder looks for model.onnx and vocab.txt in the .\model folder.
You may also supply a custom ONNX model or a custom vocab.
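Because the NuGet package ships without model.onnx and vocab.txt, it can help to confirm both files are in place before constructing the embedder. The following is a minimal sketch, not part of the library: `ModelFiles` and `FindMissing` are hypothetical names, and it assumes the model folder is resolved relative to a base directory that you pass in (for example the application's base directory or the working directory).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the library.
public static class ModelFiles
{
    // Returns the names of the default files that are missing
    // from the "model" folder under baseDirectory.
    public static List<string> FindMissing(string baseDirectory)
    {
        var missing = new List<string>();
        foreach (var name in new[] { "model.onnx", "vocab.txt" })
        {
            // Assumption: the default folder is <baseDirectory>/model.
            var path = Path.Combine(baseDirectory, "model", name);
            if (!File.Exists(path))
            {
                missing.Add(name);
            }
        }
        return missing;
    }
}
```

Calling `ModelFiles.FindMissing(AppContext.BaseDirectory)` and checking that the result is empty gives an early, readable error instead of a failure inside the embedder constructor.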
- Single Sentence
var sentence = "This is an example sentence";
using var embedder = new AllMiniLmL6V2Embedder();
var embedding = embedder.GenerateEmbedding(sentence);
- Multiple Sentences
string[] sentences = ["This is an example sentence", "Here is another"];
using var embedder = new AllMiniLmL6V2Embedder();
var embeddings = embedder.GenerateEmbeddings(sentences);
- Custom All-MiniLM-L6-v2 ONNX model
var sentence = "This is an example sentence";
using var embedder = new AllMiniLmL6V2Embedder(modelPath: "path/to/model.onnx");
var embedding = embedder.GenerateEmbedding(sentence);
- Custom vocab
var sentence = "This is an example sentence";
BertTokenizer tokenizer = new("path/to/vocab.txt");
using var embedder = new AllMiniLmL6V2Embedder(tokenizer: tokenizer);
var embedding = embedder.GenerateEmbedding(sentence);
- Custom Tokenizer
var sentence = "This is an example sentence";
ITokenizer tokenizer = new CustomTokenizer();
using var embedder = new AllMiniLmL6V2Embedder(tokenizer: tokenizer);
var embedding = embedder.GenerateEmbedding(sentence);
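- Comparing embeddings (sketch)
Sentence embeddings are commonly compared with cosine similarity. The helper below is a minimal sketch and not part of the library: `EmbeddingMath` and `CosineSimilarity` are hypothetical names, and it assumes each embedding has been materialized as a `float[]` (for example with `ToArray()` if the embedder returns a sequence).

```csharp
using System;

// Hypothetical helper, not part of the library.
public static class EmbeddingMath
{
    // Cosine similarity of two equal-length vectors, in the range [-1, 1].
    // Returns 0 when either vector has zero length (norm).
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same length.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            // Accumulate in double to limit rounding error.
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            return 0;
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Values close to 1 indicate sentences with similar meaning; values near 0 indicate unrelated sentences.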