fix(review-pr): schema mismatch, chunk size, and timeout diagnostics#202
fix(review-pr): schema mismatch, chunk size, and timeout diagnostics#202derekmisler wants to merge 2 commits intodocker:mainfrom
Conversation
Three targeted fixes for the runaway-drafter timeout seen in docker/sailor PR #754 (https://github.com/docker/sailor/actions/runs/25577728034/job/75088929732). ## Fix 1 — Schema mismatch in pr-review.yaml (root cause of infinite loop) The root agent's transfer_task delegation message was not explicitly documenting the drafter's enforced structured_output schema field names. When the LLM generated the task string it used 'title'/'body' (natural GitHub review terminology) instead of the schema-required 'category'/'issue'/'details'/'in_diff'. Every emit attempt was rejected by schema validation, causing the drafter to loop indefinitely reading the same source files ~140 times each until SIGKILL at 1800 s. Changes: - Add a CRITICAL block in the delegation section documenting the exact drafter response schema with explicit warning never to use title/body - Update step 4d (aggregate) to reference the correct field names - Update step 5 (parse drafter response) to name 'issue' and 'details' explicitly - Update step 9 (build comments) to list the correct finding fields ## Fix 2 — Lower soft chunk target 1000 → 600 lines (review-pr/action.yml) Chunk 1 of sailor PR #754 was 1229 lines of dense Rust snapshot/restore code — at the edge of what a drafter can process in one turn, amplifying the schema-loop impact. Lowering the soft target produces smaller, more manageable chunks. The 2000-line hard cap is unchanged. ## Fix 3 — Better exit-124 diagnostics (review-pr/action.yml) When the agent timed out the PR received a generic 'PR Review Failed' comment and an unhelpful step summary. Users had no idea why or what to do next. Changes: - Add a dedicated elif branch for EXIT_CODE=124 before the generic non-zero handler - Post an actionable PR comment explaining the timeout and suggesting /review retry or splitting the PR into smaller pieces - Add a TIMEOUT_NOTE variable with exit code, 1800 s limit, and verbose log artifact reference, printed in the step summary after the STATUS line
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
All three fixes look correct and address the described root causes. The schema documentation addition in pr-review.yaml is accurate and will prevent the drafter loop. The 1000→600 line chunk-size reduction is straightforward. The timeout branch logic is sound. One minor cosmetic issue noted inline.
Inside a double-quoted bash string, \' is not a valid escape sequence and produces a literal backslash. The step summary was rendering (\'some-log-file.txt\') instead of the intended (some-log-file.txt) Drop the attempted single-quote wrapping and write () directly.
docker-agent
left a comment
There was a problem hiding this comment.
Assessment: 🟢 APPROVE
All three fixes in this PR are correct:
-
Schema mismatch documentation (): The added block correctly documents the drafter's enforced schema fields (, , , ) and explicitly warns against using /. The field names in steps 4d, 5, and 9 are consistently updated to match.
-
Chunk soft-target reduction (, 1000→600 lines): The logic change is applied consistently across the condition check and both log messages. No off-by-one issues; the hard cap (2000 lines) and chunk boundary logic are unchanged.
-
Timeout diagnostics (, exit code 124): The new
elif [ "$EXIT_CODE" = "124" ]branch is correctly positioned before the generic non-zero handler. Thegh apicall uses-f body="..."which automatically triggers POST pergh api's documented behaviour (no explicit-X POSTneeded — identical pattern to the pre-existing non-zero-exit error comment at line 838). TheTIMEOUT_NOTEmultiline string,printf '%s\n'usage, andif [ -n "$TIMEOUT_NOTE" ]guard are all correct.
Summary
Three targeted fixes for the runaway-drafter timeout seen in docker/sailor PR #754.
Failing run: https://github.com/docker/sailor/actions/runs/25577728034/job/75088929732
The root cause: the drafter sub-agent looped for 1799 s re-reading the same 4 files ~140× each because its enforced
structured_outputschema rejected every emit attempt, causing the model to keep doing more file reads instead of returning results.Fix 1 — Schema mismatch in
review-pr/agents/pr-review.yamlRoot cause of the infinite loop. The root agent's
transfer_taskdelegation message did not explicitly document the drafter's enforcedstructured_outputschema field names. When the LLM generated the task string it usedtitle/body(natural GitHub review terminology) instead of the schema-requiredcategory/issue/details/in_diff. Every emit attempt was rejected by schema validation, causing the drafter to loop indefinitely reading the same source files ~140× each until SIGKILL at 1800 s.Changes:
## CRITICAL: How to delegate to the draftersection documenting the exact drafter response schema, with an explicit warning never to usetitle/bodyissueanddetailsThe drafter's
structured_outputschema itself is unchanged — it is the contract.Fix 2 — Lower soft chunk target 1000 → 600 lines (
review-pr/action.yml)Chunk 1 of sailor PR #754 was 1229 lines of dense Rust snapshot/restore code — at the edge of what a drafter can process in one turn, amplifying the schema-loop impact. Lowering the soft target produces smaller, more manageable chunks.
Fix 3 — Better exit-124 diagnostics (
review-pr/action.yml)When the agent timed out, the PR received a generic "PR Review Failed" comment and an unhelpful step summary. Users had no idea why it failed or what to do.
Changes:
elif [ "$EXIT_CODE" = "124" ]branch before the generic non-zero exit handler/reviewretry or splitting the PRTIMEOUT_NOTE(exit code 124, 1800 s limit, verbose log artifact reference) printed in the step summaryTesting
pnpm build✅ (all bundles compiled successfully)pnpm lint— Biome and tsc pass; one pre-existingactionlintwarning aboutnode24runner intest-e2e.yml(existed onmainbefore this PR)pnpm test— pre-existing DNS sandbox failure (unrelated to these changes, also reproducible onmain)