feat(review): retry transient acpx failures + pass --prompt-retries #88
Open
coletebou wants to merge 1 commit into
Two-layer retry for transient acpx review failures.

Layer 1 (acpx): runAcpxJson now passes --prompt-retries <n> (default 1) so the underlying agent can recover from stream cuts and partial JSON internally without the whole feature failing. Override via CLAWPATCH_ACPX_PROMPT_RETRIES; 0 disables.

Layer 2 (clawpatch): reviewFeature wraps provider.review() in a single retry that fires only on ClawpatchError with code === "malformed-output". Override via CLAWPATCH_REVIEW_RETRIES; 0 disables. Deterministic failures (provider-auth, unsupported-provider, agent-refused, agent-cancelled) and provider-failure (already covered by layer 1) are not retried at this layer.

The retry is inside reviewFeature so the work-stealing loop in runReview does not double-claim the feature lock. Emits a "feature-retry" progress event between attempts.
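The two layers described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `ClawpatchError` class here is a stand-in for the project's own type, and `retryKnob`, `promptRetryArgs`, and `reviewWithRetry` are hypothetical names. Falling back to the default on unparseable env values is also an assumption.

```typescript
// Stand-in for the project's error type (assumption); only `code` matters here.
class ClawpatchError extends Error {
  readonly code: string;
  constructor(code: string, message: string) {
    super(message);
    this.name = "ClawpatchError";
    this.code = code;
  }
}

// Parse a non-negative integer knob from an env-style string.
// Assumption: missing or invalid values fall back to the default.
function retryKnob(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  return Number.isInteger(n) && n >= 0 ? n : fallback;
}

// Layer 1: extra acpx arguments; a value of 0 omits the flag entirely.
function promptRetryArgs(retries: number): string[] {
  return retries > 0 ? ["--prompt-retries", String(retries)] : [];
}

// Layer 2: retry only on ClawpatchError with code "malformed-output".
// Every other failure is rethrown immediately.
async function reviewWithRetry<T>(
  review: () => Promise<T>,
  retries: number,
  onRetry: (attempt: number) => void = () => {},
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await review();
    } catch (err) {
      const transient =
        err instanceof ClawpatchError && err.code === "malformed-output";
      if (!transient || attempt >= retries) throw err;
      onRetry(attempt + 1); // stands in for the "feature-retry" progress event
    }
  }
}
```

Keeping the loop inside a single function call mirrors the point made above: the caller sees one review attempt, so whatever holds the feature lock does not need to claim it a second time.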
Summary
reviewFeature has a single attempt: any provider.review() failure marks the feature errored, even when the cause is a transient stream cut. acpx itself supports --prompt-retries <n>, but clawpatch never sets it.

This PR adds two cheap retry layers, both opt-out via env var:

- runAcpxJson passes --prompt-retries <n> (default 1; env CLAWPATCH_ACPX_PROMPT_RETRIES; 0 omits the flag).
- runProviderReviewWithRetry wraps provider.review and retries ONCE on ClawpatchError && code === "malformed-output". Default 1; env CLAWPATCH_REVIEW_RETRIES.

Excluded codes (deterministic, won't retry):

- provider-auth
- unsupported-provider
- agent-refused
- agent-cancelled

The new upstream validateReviewOutput() step runs AFTER the retry succeeds; validation failures are NOT retried (they're per-finding bugs, not transient).

Why
Run 20260517T190759-3c9e9e had 37 malformed-output + 15 timeout errors. Most are recoverable on a single retry. With these layers landed, a conservative estimate is that ~50% of the transient errors clear without operator intervention.

Files
- src/provider.ts: acpxPromptRetries() helper + --prompt-retries arg in runAcpxJson; both exported via __testing
- src/app.ts: runProviderReviewWithRetry, reviewRetries() knob, feature-retry progress event between attempts
- src/provider.test.ts + src/app.test.ts: 26 new cases (5 + 4 + 7 + 4 + 6)

Validation
- pnpm format:check: clean
- pnpm typecheck: clean
- pnpm lint: clean
- pnpm build: clean
- pnpm test: 567 passed, 1 skipped

Notes
Worst-case latency on a stuck feature with both retry layers maxed: 2 acpx tries × 2 clawpatch tries × 3 min default timeout = 12 min. CLAWPATCH_ACPX_TIMEOUT_MS is the safety valve. With the default 10-job concurrency, the bound is acceptable.
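The latency bound in the note above can be reproduced with a small calculation. This is only a sketch: `worstCaseLatencyMs` is a hypothetical helper, and it assumes, as the arithmetic in the note implies, that the timeout applies to each acpx try separately.

```typescript
// Upper bound on wall-clock time for one stuck feature.
// Each retry count adds one try on top of the initial attempt.
// Assumption: the timeout applies per acpx try.
function worstCaseLatencyMs(
  acpxPromptRetries: number,
  reviewRetries: number,
  timeoutMs: number,
): number {
  const acpxTries = acpxPromptRetries + 1;
  const clawpatchTries = reviewRetries + 1;
  return acpxTries * clawpatchTries * timeoutMs;
}

const THREE_MINUTES_MS = 3 * 60 * 1000;

// Defaults from the PR: 1 prompt retry, 1 review retry, 3 min timeout.
const boundMs = worstCaseLatencyMs(1, 1, THREE_MINUTES_MS);
console.log(boundMs / 60000); // 12 (minutes)
```

Because the bound is a product, lowering the timeout scales it linearly, while each extra retry on either layer multiplies it.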