Pull requests: pinchbench/skill
Pull requests list
Add category-level score summary at end of benchmark run
#44 opened Mar 10, 2026 by ScuttleBot
fix: normalize agent IDs and remove bootstrap files for benchmark
#37 opened Mar 9, 2026 by zhuanghaoz
fix: Multiple bug fixes across benchmark pipeline
#31 opened Mar 8, 2026 by kaiaiagent
5 tasks
feat: Add --judge CLI argument and require full provider prefix for model IDs
#29 opened Mar 8, 2026 by yudistiraashadi
Add thinking-level benchmarking support (off/minimal/low/medium/high/xhigh/adaptive)
#12 opened Mar 7, 2026 by jb510