Commit 13dc631
feat: model-agnostic RLV prompts — works with Phi-3.5, Qwen3.5, Qwen3
Replaced Phi-3.5-specific prompt formats with universal prompts:
lookup.py:
- Removed rigid "ANSWER:/NONE" format requirement
- Natural prompt: "Answer using ONLY the document. If not found, say not found"
- Model-agnostic refusal detection: 15 patterns covering all model styles
- Flexible answer prefix stripping (ANSWER:, Answer:, A:, **)
verifier.py:
- Expanded refusal detection: "not found", "no answer", "[none]" added
Results:
Phi-3.5 Q8: 3/3 quick check PASS (no regression)
Qwen3.5-4B: 3/3 quick check PASS (was 0/3 with old prompts!)
Qwen3.5-4B full: 3/7 (server stability issues on multi-hop, not prompt)
Key insight: "best benchmark model" works fine when prompts are universal.
Previous 2/7 was a prompt mismatch, not a model limitation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>1 parent a64e8de commit 13dc631
2 files changed
+29
-20
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
31 | 31 | | |
32 | 32 | | |
33 | 33 | | |
34 | | - | |
| 34 | + | |
| 35 | + | |
| 36 | + | |
| 37 | + | |
35 | 38 | | |
36 | | - | |
37 | 39 | | |
38 | | - | |
39 | 40 | | |
40 | 41 | | |
41 | 42 | | |
42 | | - | |
43 | | - | |
44 | | - | |
| 43 | + | |
45 | 44 | | |
46 | | - | |
| 45 | + | |
47 | 46 | | |
48 | | - | |
49 | 47 | | |
50 | 48 | | |
51 | 49 | | |
52 | | - | |
53 | | - | |
| 50 | + | |
| 51 | + | |
54 | 52 | | |
55 | 53 | | |
56 | 54 | | |
| |||
155 | 153 | | |
156 | 154 | | |
157 | 155 | | |
158 | | - | |
159 | | - | |
160 | | - | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
161 | 166 | | |
162 | | - | |
163 | | - | |
164 | | - | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
165 | 172 | | |
166 | 173 | | |
167 | 174 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
155 | 155 | | |
156 | 156 | | |
157 | 157 | | |
| 158 | + | |
158 | 159 | | |
159 | | - | |
160 | | - | |
161 | | - | |
162 | | - | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
163 | 164 | | |
| 165 | + | |
164 | 166 | | |
165 | 167 | | |
166 | 168 | | |
| |||
0 commit comments