fix(review-validation): accept legitimate citations beyond prompt truncation #91

Open
skavinski-lab wants to merge 1 commit into openclaw:main from skavinski-lab:fix/review-validation-line-range-full-file


@skavinski-lab

Problem

When reviewing source files larger than REVIEW_PROMPT_FILE_CHAR_LIMIT (24,000 chars), validateReviewOutput rejected findings whose evidence pointed at lines or quotes that exist in the actual file but happen to sit past the per-file truncation applied to the prompt.

Two related causes:

  1. assertLineRange counted against the truncated copy. A finding citing bot.py:2628-2648 in a real 5,654-line file was thrown out as "evidence line range exceeds file length: bot.py:2628-2648" because the truncated copy was only ~600 lines. Line ranges are factual claims about the on-disk file, not about what the reviewer happened to see, so they should be validated against the real file.

  2. The truncation limit was too aggressive for moderate monoliths. A 228KB Python source file was reduced to ~10% before ever reaching the reviewer, which silently dropped most findings about that file. Even with the validator fix, the reviewer needs to actually see the code to find issues in it.
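
To make the first cause concrete, here is a minimal sketch rather than the project's actual code. The helpers `truncateForPrompt`, `countLines`, and `rangeFits`, and the synthetic file, are hypothetical names introduced for illustration; only the 24,000-character limit, the 5,654-line length, and the cited range come from the description above.

```typescript
// Hypothetical illustration; helper names are not taken from the repository.
const OLD_LIMIT = 24_000; // previous REVIEW_PROMPT_FILE_CHAR_LIMIT

// Keep only the first `limit` characters, as a per-file prompt cutoff would.
function truncateForPrompt(contents: string, limit: number): string {
  return contents.length > limit ? contents.slice(0, limit) : contents;
}

function countLines(contents: string): number {
  return contents.split("\n").length;
}

// True when the 1-based inclusive range start..end lies inside `contents`.
function rangeFits(contents: string, start: number, end: number): boolean {
  return start >= 1 && start <= end && end <= countLines(contents);
}

// Synthetic stand-in for a large source file: 5,654 lines of 40 characters each.
const fullFile = Array.from({ length: 5654 }, (_, i) =>
  `line ${i + 1}`.padEnd(40, "."),
).join("\n");
const promptCopy = truncateForPrompt(fullFile, OLD_LIMIT);

// The same citation fails against the prompt copy and passes against the real file.
const fitsTruncated = rangeFits(promptCopy, 2628, 2648); // false
const fitsFull = rangeFits(fullFile, 2628, 2648); // true
```

With 41 characters per line including the newline, the truncated copy holds fewer than 600 lines, so any citation past that point fails the length check even though it is valid for the file on disk.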

Changes

  • src/review-validation.ts — split the file read. assertLineRange now compares against the full on-disk contents. assertQuote keeps using the truncated copy so the existing hallucinated-quote guard (rejects evidence that only exists beyond the truncated prompt text) still fires for quotes that the reviewer couldn't have seen.

  • src/prompt.ts — raise REVIEW_PROMPT_FILE_CHAR_LIMIT from 24,000 → 250,000. Lets typical large source files (modules, bots, service entrypoints) fit in one reviewer prompt without truncation. The hallucinated-quote guard still works above this size for genuinely huge files.

  • src/review-validation.test.ts — bump the "beyond truncation" fixture from 24K to 250K padding so the test still exercises the intended case (a quote that lives past the new truncation cutoff).
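
The split described in the first bullet could look roughly like the sketch below. This is an assumption-laden illustration, not the code in src/review-validation.ts: the `Evidence` shape, the `validateEvidence` wrapper, the function signatures, and the wording of the quote error are all hypothetical. Only the function names `assertLineRange` and `assertQuote`, the constant name and value, and the line-range error text appear in this PR.

```typescript
// Hypothetical sketch: names and signatures are assumptions, not the repository's code.
const REVIEW_PROMPT_FILE_CHAR_LIMIT = 250_000;

interface Evidence {
  path: string;
  startLine: number;
  endLine: number;
  quote?: string;
}

// Line ranges are claims about the on-disk file, so check them against the full contents.
function assertLineRange(evidence: Evidence, fullContents: string): void {
  const lineCount = fullContents.split("\n").length;
  if (evidence.endLine > lineCount) {
    throw new Error(
      `evidence line range exceeds file length: ${evidence.path}:${evidence.startLine}-${evidence.endLine}`,
    );
  }
}

// Quotes must appear in the text the reviewer was actually shown.
function assertQuote(evidence: Evidence, promptContents: string): void {
  if (evidence.quote !== undefined && !promptContents.includes(evidence.quote)) {
    // Assumed wording; the real error message is not shown in this PR.
    throw new Error(`evidence quote not found in prompt text: ${evidence.path}`);
  }
}

// One read of the file, two views: full contents for ranges, truncated copy for quotes.
function validateEvidence(evidence: Evidence, fullContents: string): void {
  const promptContents = fullContents.slice(0, REVIEW_PROMPT_FILE_CHAR_LIMIT);
  assertLineRange(evidence, fullContents);
  assertQuote(evidence, promptContents);
}
```

Under this sketch, a finding that cites a valid line range past the cutoff passes, while a finding that quotes text past the cutoff is still rejected, which is the behavior the updated "beyond truncation" fixture is meant to exercise.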

Real-world impact

Ran on a 228KB Python Discord bot daemon (bot.py, 5,654 lines, monolith with slash commands + rate limiters + scheduled tasks + AI integration). Before this patch, validation rejected every finding on the file because Codex correctly cited line numbers past the truncation cutoff. After this patch, all 5 stuck features reviewed cleanly and the run surfaced 19 additional findings on the previously-blocked big-file features (broken slash command, AI quota race, security issues, tests mutating production state, etc.).

Trade-off considered

REVIEW_PROMPT_FILE_CHAR_LIMIT could have been made configurable in ClawpatchConfig.review. I kept it as a constant for now to minimize surface area, and I'm happy to add config plumbing if preferred.
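
If config plumbing were wanted, one possible shape is sketched below. Everything here is hypothetical: the `promptFileCharLimit` field, the `resolvePromptFileCharLimit` helper, and the structure of `ClawpatchConfig.review` are assumptions for discussion, not part of this PR or the existing config type.

```typescript
// Hypothetical: `promptFileCharLimit` does not exist in the project today.
const DEFAULT_REVIEW_PROMPT_FILE_CHAR_LIMIT = 250_000;

interface ReviewConfig {
  promptFileCharLimit?: number;
}

interface ClawpatchConfig {
  review?: ReviewConfig;
}

// Fall back to the constant when the option is unset; reject nonsensical values early.
function resolvePromptFileCharLimit(config: ClawpatchConfig): number {
  const configured = config.review?.promptFileCharLimit;
  if (configured === undefined) {
    return DEFAULT_REVIEW_PROMPT_FILE_CHAR_LIMIT;
  }
  if (!Number.isInteger(configured) || configured <= 0) {
    throw new Error("review.promptFileCharLimit must be a positive integer");
  }
  return configured;
}
```

Keeping the default in one place would mean both the prompt builder and the quote check could call the same resolver, so the two views of the truncation cutoff cannot drift apart.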

…ncation

The reviewer was rejecting findings whose evidence pointed at lines or
quotes that exist in the actual file but happen to sit past the
per-file truncation applied to the prompt. Two causes:

1. `validateReviewOutput` counted line ranges against the *truncated*
   copy of the file. A finding citing `bot.py:2628-2648` in a real
   5654-line file was thrown out as "evidence line range exceeds file
   length" because the truncated copy was only ~600 lines.

2. The truncation limit (24,000 chars) was too aggressive for moderate
   monolith files. A 228KB source file was reduced to ~10% before
   ever reaching the reviewer, which silently dropped findings.

Changes:

- `review-validation.ts`: split the file read. `assertLineRange` now
  compares against the full on-disk contents (line ranges are factual
  checks against committed code, not against what the reviewer saw).
  `assertQuote` keeps using the truncated copy so the existing
  hallucinated-quote guard still fires.

- `prompt.ts`: raise `REVIEW_PROMPT_FILE_CHAR_LIMIT` from 24_000 to
  250_000. Lets typical large source files (modules, monolith bots,
  service entrypoints) fit in one reviewer prompt without truncation.

- `review-validation.test.ts`: bump the "beyond truncation" fixture
  from 24K to 250K so the test still exercises the intended case.

In a real review run on a 228KB Python bot daemon, this took the
unblocked feature count from 9/14 to 14/14 and surfaced 19 additional
findings on the previously-stuck big-file features.