RLV Locator: add lightweight semantic search as third RRF signal

## Summary

Add a lightweight sentence embedding model (all-MiniLM-L6-v2, 22M params) as the third signal in the Reciprocal Rank Fusion (RRF) locator, catching semantic matches that BM25 and keywords miss.

## Problem

The current locator combines BM25 with keyword overlap. On the 1.3MB large-doc test, Q15 fails: "temnein" (a Greek word meaning "to cut") is in chunk 531, but BM25 picks chunk 553, which discusses "Stegocephalia" ("roof-headed"). The semantic connection between the query "what does temnein mean" and the chunk containing "temnein (to cut)" is obvious to a human but invisible to keyword matching.

## Proposed Solution

```python
# Three-signal RRF (currently two)
rrf[cid] = (1/(60+rank_keyword) +
            1/(60+rank_bm25) +
            1/(60+rank_semantic))  # NEW
```
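For reference, here is a minimal runnable sketch of the fusion step. It assumes 1-based ranks and that a chunk missing from a signal's list contributes nothing for that signal; neither is confirmed by the current locator code. The function name `rrf_fuse` and the toy rankings are hypothetical: only the chunk ids 531 and 553 come from the Q15 case, and the rank positions are made up for illustration.

```python
from collections import defaultdict

RRF_K = 60  # same constant as in the snippet above


def rrf_fuse(rankings, k=RRF_K):
    """Fuse ranked lists of chunk ids with Reciprocal Rank Fusion.

    `rankings` holds one best-first list of chunk ids per signal.
    Ranks are 1-based (assumption). A chunk absent from a list adds
    nothing for that signal (assumption).
    """
    scores = defaultdict(float)
    for ranked_ids in rankings:
        for rank, cid in enumerate(ranked_ids, start=1):
            scores[cid] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk id for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Made-up rankings, not measured data.
keyword = [531, 553, 12]
bm25 = [553, 12, 531]
semantic = [531, 40, 553]

two_signal = rrf_fuse([keyword, bm25])
three_signal = rrf_fuse([keyword, bm25, semantic])
```

In this toy setup the two-signal fusion ranks chunk 553 first, while adding the semantic list moves chunk 531 to the top. Whether the real Q15 ranks behave the same way still needs to be measured.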

### Embedding model selection

| Model | Params | CPU latency | Quality |
|-------|--------|------------|---------|
| all-MiniLM-L6-v2 | 22M | ~30ms/query | Good |
| BGE-small-en | 33M | ~50ms/query | Better |
| nomic-embed-text | 137M | ~200ms/query | Best |

MiniLM is recommended: roughly 30ms per query on CPU, with no GPU required.

### Pre-computation

Chunk embeddings are computed once during `quantcpp index` and stored alongside the KV caches. At query time, the only embedding cost is a single query embedding (~30ms with MiniLM).
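A rough sketch of the index-time and query-time split, in pure Python. The names `build_index`, `semantic_ranking`, and `toy_embed` are hypothetical, the index is a plain in-memory dict rather than the on-disk format used by `quantcpp index`, and the two-dimensional toy vectors stand in for real model output. In practice `embed` would wrap the chosen model (for example the `encode` method of a sentence-transformers model), which is not exercised here.

```python
import math


def normalize(vec):
    """Scale to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def build_index(chunks, embed):
    """Index time: embed every chunk once and keep the unit vectors."""
    return {cid: normalize(embed(text)) for cid, text in chunks.items()}


def semantic_ranking(query, index, embed):
    """Query time: one embedding call, then one dot product per chunk."""
    q = normalize(embed(query))
    sims = {cid: sum(a * b for a, b in zip(q, v)) for cid, v in index.items()}
    # Best match first; ties broken by chunk id for determinism.
    return sorted(sims, key=lambda cid: (-sims[cid], cid))


# Hypothetical stand-in for the embedding model: hand-picked 2-d vectors.
TOY_VECTORS = {
    "what does temnein mean": [0.9, 0.1],
    "temnein (to cut)": [0.8, 0.2],
    "Stegocephalia (roof-headed)": [0.1, 0.9],
}


def toy_embed(text):
    return TOY_VECTORS[text]


# Placeholder chunk texts keyed by the chunk ids from the Q15 case.
chunks = {531: "temnein (to cut)", 553: "Stegocephalia (roof-headed)"}
index = build_index(chunks, toy_embed)
ranking = semantic_ranking("what does temnein mean", index, toy_embed)
```

The resulting `ranking` is the best-first list that would feed `rank_semantic` in the fusion formula. Normalizing at index time keeps the per-query work to one embedding plus dot products.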

## Expected Impact

- Q15 (temnein): semantic similarity catches the correct chunk
- 19/20 → 20/20 on large-doc test
- General: better handling of paraphrased/synonym queries

## Priority: P2
