SymRank is a blazing-fast Python library for top-k cosine similarity ranking, designed for vector search, retrieval-augmented generation (RAG), and embedding-based matching.
Built with a Rust + SIMD backend, it offers the speed of native code with the ease of Python.
⚡ Fast: SIMD-accelerated cosine scoring with adaptive parallelism
🧠 Smart: Automatically selects serial or parallel mode based on workload
🔢 Top-K optimized: Efficient inlined heap selection with no full-sort overhead (see the sketch below)
🐍 Pythonic: Easy-to-use Python API
🦀 Powered by Rust: Safe, high-performance core engine
💾 Memory Efficient: Supports batching to balance speed and memory footprint
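The top-k selection can be pictured as a bounded heap rather than a full sort. Here is a minimal Python sketch of the idea (illustrative only, not SymRank's Rust internals):

```python
import heapq

def top_k(ids, scores, k=5):
    # Keep only the k best (id, score) pairs in a bounded heap:
    # O(N log k) instead of the O(N log N) cost of sorting every score.
    best = heapq.nlargest(k, zip(ids, scores), key=lambda pair: pair[1])
    return [{"id": i, "score": s} for i, s in best]
```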
Below are single-query cosine similarity benchmarks comparing SymRank to NumPy and scikit-learn across realistic re-ranking candidate sizes.
| Candidates (N) | SymRank matrix (ms) | NumPy normalized (ms) | sklearn (ms) | Fastest | SymRank speedup vs NumPy |
|---|---|---|---|---|---|
| 20 | 0.006 | 0.027 | 0.210 | SymRank | 4.50x |
| 50 | 0.017 | 0.050 | 0.266 | SymRank | 2.92x |
| 100 | 0.020 | 0.086 | 0.390 | SymRank | 4.18x |
| 500 | 0.169 | 0.393 | 1.843 | SymRank | 2.32x |
| 1,000 | 0.170 | 0.669 | 3.588 | SymRank | 3.95x |
| 5,000 | 0.748 | 5.261 | 32.196 | SymRank | 7.03x |
| 10,000 | 1.976 | 13.938 | 42.514 | SymRank | 7.05x |
- Cosine similarity top-k (k=5), embedding dimension 1536, float32.
- NumPy baseline uses candidates normalized once and query normalized per call.
- sklearn uses `sklearn.metrics.pairwise.cosine_similarity`.
- Times are mean milliseconds per query on Windows.
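For reference, a NumPy baseline along these lines might look like the following (a sketch, not the exact benchmark harness):

```python
import numpy as np

def numpy_topk(query, candidates_normalized, ids, k=5):
    # Candidates are normalized once up front; only the query is
    # normalized per call, as in the baseline described above.
    q = query / np.linalg.norm(query)
    scores = candidates_normalized @ q
    top = np.argpartition(scores, -k)[-k:]      # unordered top-k indices
    top = top[np.argsort(scores[top])[::-1]]    # sort just those k
    return [{"id": ids[i], "score": float(scores[i])} for i in top]

# One-time normalization of the candidate matrix C:
# candidates_normalized = C / np.linalg.norm(C, axis=1, keepdims=True)
```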
Performance on 10,000 real OpenAI-style embeddings streamed from the Hugging Face Hub.
| Method | Mean time (ms) | Relative time (vs SymRank matrix) |
|---|---|---|
| SymRank matrix | 1.800 | 1.0x |
| SymRank list | 22.753 | 12.64x |
| NumPy normalized | 44.665 | 24.81x |
| sklearn | 42.709 | 23.73x |
- Dataset: Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M
- k=5, float32, Windows
- Mean time per query
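For context, loading a slice of this dataset for such a benchmark could look roughly like this (a sketch using the `datasets` library; the embedding column name is an assumption, so check the dataset card):

```python
import itertools
import numpy as np
from datasets import load_dataset

# Stream the dataset instead of downloading all 1M rows.
ds = load_dataset(
    "Qdrant/dbpedia-entities-openai3-text-embedding-3-large-1536-1M",
    split="train",
    streaming=True,
)

EMBEDDING_COLUMN = "text-embedding-3-large-1536-embedding"  # assumed column name

rows = list(itertools.islice(ds, 10_000))
candidate_matrix = np.asarray([r[EMBEDDING_COLUMN] for r in rows], dtype=np.float32)
ids = [str(i) for i in range(len(rows))]
```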
You can install SymRank with `uv` or with `pip`:

```bash
uv pip install symrank
```

```bash
pip install symrank
```

SymRank provides two APIs optimized for different workflows.
Option 1, the matrix API (`cosine_similarity_matrix`), is best when:
- Candidate embeddings are already stored as a single 2D NumPy array
- Performance matters (about 10 to 14x faster for N=1,000 to 10,000 versus the list API)
- Running many queries against the same candidate set

```python
import numpy as np
from symrank import cosine_similarity_matrix
# Example data (dimension = 4 for readability)
query = np.array([1.0, 0.0, 0.0, 0.0], dtype=np.float32)
candidate_matrix = np.array(
    [
        [1.0, 0.0, 0.0, 0.0],  # identical to query
        [0.0, 1.0, 0.0, 0.0],  # orthogonal
        [0.5, 0.5, 0.0, 0.0],  # partially aligned
        [0.2, 0.1, 0.0, 0.0],  # weakly aligned
    ],
    dtype=np.float32,
)
ids = ["doc_a", "doc_b", "doc_c", "doc_d"]
results = cosine_similarity_matrix(query, candidate_matrix, ids, k=3)
print(results)
```

Output:

```python
[
    {"id": "doc_a", "score": 1.0},
    {"id": "doc_d", "score": 0.8944272},
    {"id": "doc_c", "score": 0.70710677}
]
```

Notes:
- Scores are cosine similarity (range -1 to 1, higher = more similar)
- Results are sorted by descending similarity
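To sanity-check a score by hand, the `doc_d` value above is just the normalized dot product:

```python
import numpy as np

q = np.array([1.0, 0.0, 0.0, 0.0], dtype=np.float32)
d = np.array([0.2, 0.1, 0.0, 0.0], dtype=np.float32)
# cos = (q · d) / (|q| * |d|) = 0.2 / sqrt(0.05) ≈ 0.8944272
print(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))
```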
Typical production usage (1536-dimensional embeddings):

```python
import numpy as np
from symrank import cosine_similarity_matrix
D = 1536
N = 10_000
query = np.random.rand(D).astype(np.float32)
candidate_matrix = np.random.rand(N, D).astype(np.float32)
ids = [f"doc_{i}" for i in range(N)]
top5 = cosine_similarity_matrix(query, candidate_matrix, ids, k=5)
for result in top5:
    print(f"{result['id']}: {result['score']:.4f}")
```

Optional batching for memory control:

```python
# Process 10k candidates in batches of 2000
results = cosine_similarity_matrix(
    query, candidate_matrix, ids, k=5, batch_size=2000
)
```

Option 2, the list API (`cosine_similarity`), is best when:
- Candidates come from mixed or streaming sources
- Vectors are naturally represented as (id, vector) pairs
- Simplicity is more important than maximum throughput
Basic example using Python lists:

```python
import symrank as sr
query = [0.1, 0.2, 0.3, 0.4]
candidates = [
    ("doc_1", [0.1, 0.2, 0.3, 0.5]),
    ("doc_2", [0.9, 0.1, 0.2, 0.1]),
    ("doc_3", [0.0, 0.0, 0.0, 1.0]),
]
results = sr.cosine_similarity(query, candidates, k=2)
print(results)
```

Output:

```python
[
    {"id": "doc_1", "score": 0.9939991235733032},
    {"id": "doc_3", "score": 0.7302967309951782}
]
```

Basic example using NumPy arrays:

```python
import symrank as sr
import numpy as np
query = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
candidates = [
    ("doc_1", np.array([0.1, 0.2, 0.3, 0.5], dtype=np.float32)),
    ("doc_2", np.array([0.9, 0.1, 0.2, 0.1], dtype=np.float32)),
    ("doc_3", np.array([0.0, 0.0, 0.0, 1.0], dtype=np.float32)),
]
results = sr.cosine_similarity(query, candidates, k=2)
print(results)
```

Output:

```python
[
    {"id": "doc_1", "score": 0.9939991235733032},
    {"id": "doc_3", "score": 0.7302967309951782}
]
```

Optional batching:

```python
results = sr.cosine_similarity(query, candidates, k=5, batch_size=1000)
```

Performance comparison of the two APIs:

| Dataset Size | Option 1 (matrix) | Option 2 (list) | Speedup |
|---|---|---|---|
| N=100 | 0.02 ms | 0.06 ms | 3.3x |
| N=1,000 | 0.18 ms | 2.28 ms | 12.7x |
| N=10,000 | 1.50 ms | 19.66 ms | 13.1x |
Benchmark: 1536-dimensional embeddings, k=5, Python 3.14, Windows; timings include Python-side overhead for each API.
Use `cosine_similarity_matrix` if:
- ✅ You have a pre-built NumPy matrix of candidates
- ✅ Performance is critical
- ✅ Processing many queries against the same corpus

Use `cosine_similarity` if:
- ✅ Building candidates on-the-fly
- ✅ Mixed vector input types (lists or NumPy arrays)
- ✅ Flexibility > raw speed
Both functions return the same format: a list of dicts sorted by descending similarity score.
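Because the return shape is identical, downstream code can stay agnostic about which API produced the results. For example (an illustrative helper, not part of SymRank):

```python
def format_hits(results):
    # Works identically on output from cosine_similarity or
    # cosine_similarity_matrix: a list of {"id", "score"} dicts.
    return [f"{r['id']} ({r['score']:.3f})" for r in results]
```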
```python
cosine_similarity(
    query_vector,       # List[float] or np.ndarray
    candidate_vectors,  # List[Tuple[str, List[float] or np.ndarray]]
    k=5,                # Number of top results to return
    batch_size=None,    # Optional: set for memory-efficient batching
)
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `query_vector` | `list[float]` or `np.ndarray` | required | The query vector you want to compare against the candidate vectors. |
| `candidate_vectors` | `list[tuple[str, list[float] or np.ndarray]]` | required | List of `(id, vector)` pairs. Each vector can be a list or NumPy array. |
| `k` | `int` | `5` | Number of top results to return, sorted by descending similarity. |
| `batch_size` | `int` or `None` | `None` | Optional batch size to reduce memory usage. If `None`, uses SIMD directly. |
List of dictionaries with `id` and `score` (cosine similarity), sorted by descending similarity:

```python
[{"id": "doc_42", "score": 0.8763}, {"id": "doc_17", "score": 0.8451}, ...]
```

This project is licensed under the Apache License 2.0.
