SymRank is a blazing-fast Python library for top-k cosine similarity ranking, designed for vector search, retrieval-augmented generation (RAG), and embedding-based matching.
Built with a Rust + SIMD backend, it offers the speed of native code with the ease of Python.
⚡ Fast: SIMD-accelerated cosine scoring with adaptive parallelism
🧠 Smart: Automatically selects serial or parallel mode based on workload
🔢 Top-K optimized: Efficient inlined heap selection (no full sort overhead)
🐍 Pythonic: Easy-to-use Python API
🦀 Powered by Rust: Safe, high-performance core engine
📉 Memory Efficient: Supports batching to improve speed and reduce memory footprint
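To illustrate what "heap selection (no full sort overhead)" means in practice, here is a minimal pure-Python sketch of top-k selection with a bounded min-heap. It is not SymRank's Rust implementation; the function name `top_k` and the sample scores are made up for this illustration.

```python
import heapq

def top_k(scored, k):
    """Return the k highest-scoring (id, score) pairs, best first.

    Hypothetical helper for illustration only. It keeps a min-heap of at
    most k entries, so the cost is O(n log k) instead of the O(n log n)
    needed to sort every candidate.
    """
    if k <= 0:
        return []
    heap = []  # min-heap of (score, id); heap[0] is the weakest kept entry
    for doc_id, score in scored:
        if len(heap) < k:
            heapq.heappush(heap, (score, doc_id))
        elif score > heap[0][0]:
            # The new score beats the weakest kept entry, so swap it in.
            heapq.heapreplace(heap, (score, doc_id))
    # Sort only the k survivors, highest score first.
    return [(doc_id, score) for score, doc_id in sorted(heap, reverse=True)]

scores = [("a", 0.12), ("b", 0.97), ("c", 0.55), ("d", 0.73)]
print(top_k(scores, k=2))  # [('b', 0.97), ('d', 0.73)]
```

Only the k retained entries are ever sorted, which is where the saving over a full sort comes from when k is much smaller than the number of candidates.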
Install SymRank with 'uv' or, alternatively, with 'pip':
uv pip install symrank
pip install symrank
import symrank as sr
query = [0.1, 0.2, 0.3, 0.4]
candidates = [
    ("doc_1", [0.1, 0.2, 0.3, 0.5]),
    ("doc_2", [0.9, 0.1, 0.2, 0.1]),
    ("doc_3", [0.0, 0.0, 0.0, 1.0]),
]
results = sr.cosine_similarity(query, candidates, k=2)
print(results)
Output
[{'id': 'doc_1', 'score': 0.9939991235733032}, {'id': 'doc_3', 'score': 0.7302967309951782}]
import symrank as sr
import numpy as np
query = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
candidates = [
    ("doc_1", np.array([0.1, 0.2, 0.3, 0.5], dtype=np.float32)),
    ("doc_2", np.array([0.9, 0.1, 0.2, 0.1], dtype=np.float32)),
    ("doc_3", np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)),
]
results = sr.cosine_similarity(query, candidates, k=2)
print(results)
Output
[{'id': 'doc_1', 'score': 0.9939991235733032}, {'id': 'doc_3', 'score': 0.7302967309951782}]
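As a sanity check, the scores in the examples above can be reproduced with a few lines of plain Python. This is a reference sketch only, not how SymRank computes them internally; the helper name `cosine` is hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|).

    Reference helper for illustration; it does not guard against
    zero-length vectors, which would cause a division by zero.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [0.1, 0.2, 0.3, 0.4]
candidates = [
    ("doc_1", [0.1, 0.2, 0.3, 0.5]),
    ("doc_2", [0.9, 0.1, 0.2, 0.1]),
    ("doc_3", [0.0, 0.0, 0.0, 1.0]),
]

# Score every candidate, then order by descending similarity.
reference = sorted(
    ((doc_id, cosine(query, vec)) for doc_id, vec in candidates),
    key=lambda pair: pair[1],
    reverse=True,
)
for doc_id, score in reference:
    print(doc_id, score)
```

The ranking should come out as doc_1, doc_3, doc_2, with the first two scores agreeing with the outputs above to several decimal places. Tiny differences in the trailing digits are expected, since this sketch uses Python's 64-bit floats.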
cosine_similarity(
    query_vector,       # List[float] or np.ndarray
    candidate_vectors,  # List[Tuple[str, List[float] or np.ndarray]]
    k=5,                # Number of top results to return
    batch_size=None     # Optional: set for memory-efficient batching
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| query_vector | list[float] or np.ndarray | required | The query vector you want to compare against the candidate vectors. |
| candidate_vectors | list[tuple[str, list[float] or np.ndarray]] | required | List of (id, vector) pairs. Each vector can be a list or NumPy array. |
| k | int | 5 | Number of top results to return, sorted by descending similarity. |
| batch_size | int or None | None | Optional batch size to reduce memory usage. If None, uses SIMD directly. |
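The idea behind a `batch_size` parameter can be sketched in plain Python: score the candidates one chunk at a time and carry only a running top-k between chunks. This is an assumption about the general technique, not a description of SymRank's internals; `cosine`, `batched`, and `top_k_batched` are hypothetical names.

```python
import heapq
import math
from itertools import islice

def cosine(a, b):
    """Reference cosine similarity (illustrative helper)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def batched(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def top_k_batched(query, candidates, k=5, batch_size=2):
    """Score candidates chunk by chunk, keeping only a running top-k."""
    best = []  # (score, id) pairs, never more than k between chunks
    for chunk in batched(candidates, batch_size):
        scored = [(cosine(query, vec), doc_id) for doc_id, vec in chunk]
        # Merge this chunk into the running result; nlargest sorts descending.
        best = heapq.nlargest(k, best + scored)
    return [{"id": doc_id, "score": score} for score, doc_id in best]

query = [0.1, 0.2, 0.3, 0.4]
candidates = [
    ("doc_1", [0.1, 0.2, 0.3, 0.5]),
    ("doc_2", [0.9, 0.1, 0.2, 0.1]),
    ("doc_3", [0.0, 0.0, 0.0, 1.0]),
]
print([r["id"] for r in top_k_batched(query, candidates, k=2, batch_size=2)])
# ['doc_1', 'doc_3']
```

In this sketch, at most `batch_size + k` scored entries exist at any moment, and the result does not depend on the batch size chosen.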
List of dictionaries with `id` and `score` (cosine similarity), sorted by descending similarity:
[{"id": "doc_42", "score": 0.8763}, {"id": "doc_17", "score": 0.8451}, ...]
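Because the result is a plain list of dictionaries, it can be post-processed with ordinary Python. The sketch below hard-codes the output from the earlier example instead of calling the library, and the `MIN_SCORE` threshold is an arbitrary value chosen for illustration.

```python
# Result list copied from the example output earlier in this README.
results = [
    {"id": "doc_1", "score": 0.9939991235733032},
    {"id": "doc_3", "score": 0.7302967309951782},
]

MIN_SCORE = 0.8  # hypothetical cut-off, tune for your own data

# Keep only the ids whose similarity clears the threshold.
strong = [r["id"] for r in results if r["score"] >= MIN_SCORE]

# Build an id -> score lookup for later use.
score_by_id = {r["id"]: r["score"] for r in results}

print(strong)  # ['doc_1']
```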
This project is licensed under the Apache License 2.0.