{"actor":"sjarmak","comment":null,"created_at":"2026-03-09T16:28:56Z","event_type":"claimed","id":36,"issue_id":"CodeScaleBench-25b.2","new_value":"{\"assignee\":\"sjarmak\",\"status\":\"in_progress\"}","old_value":"{\"id\":\"CodeScaleBench-25b.2\",\"title\":\"Define standard validation_result schema for canonical verifiers\",\"description\":\"Goal\\nDefine the common verifier output contract that all canonical tasks should converge on in addition to reward.txt.\\n\\nScope\\n- Specify required and optional fields.\\n- Distinguish continuous reward from pass/fail semantics.\\n- Cover both deterministic verifiers and answer.json-derived artifact verifiers.\\n- Ensure the schema can represent partial credit, verifier failures, and missing output cleanly.\\n\\nWhy\\nToday reward.txt is universal, but the richer semantics are inconsistent and often lost.\",\"acceptance_criteria\":\"1. A single validation_result schema is defined for canonical tasks. 2. The schema includes scorer_family, reward, pass_threshold, passed, sub_scores, output_contract, and failure/error context. 3. The schema is documented in reference docs and is implementable from shell/Python verifiers without ambiguity.\",\"status\":\"open\",\"priority\":1,\"issue_type\":\"task\",\"owner\":\"sjarmak@users.noreply.github.com\",\"created_at\":\"2026-03-09T16:05:19Z\",\"created_by\":\"sjarmak\",\"updated_at\":\"2026-03-09T16:05:19Z\"}"}