Dual-path retry: exponential backoff + rate-limit handling #251

Open
MichaelGHSeg wants to merge 2 commits into master from status-response-update
@MichaelGHSeg
Contributor

Summary

Rewrites LibCurl::flushBatch() with a structured dual-path retry system.

  • 429 + Retry-After header: sleeps for the specified duration (capped at rate_limit_retry_after_cap_s, default 300s) without consuming the retry budget. Bounded by max_rate_limit_duration_ms (default 12h).
  • Other retryable errors (5xx, 408, 410, 460): exponential backoff that counts against the retry budget (base 500ms, 2× multiplier, cap 60s). Bounded by retry_count and max_total_backoff_duration_ms (default 12h).
  • Non-retryable errors: discard immediately.
  • Adds X-Retry-Count header on retry attempts.
  • Fixes array_splice ordering in QueueConsumer::flush() — batch is removed from the queue before calling flushBatch(), which handles all retries internally. This prevents __destruct() from re-flushing batches that already exhausted their retry budget.
  • E2E CLI: error detection is now based only on enqueue()/flush() return values, not on the error_handler callback, which also fires for transient per-attempt errors. Wires maxRetries from the input config.
  • E2E: enables retry test suite.
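
The two retry paths described above can be sketched as a small decision function. This is an illustrative sketch only, not the code in this PR: the function names (backoffDelayMs, rateLimitDelayMs, isCountedRetryStatus, planRetry) and the returned array shape are hypothetical, and only the numeric defaults (500ms base, 2× multiplier, 60s cap, 300s Retry-After cap) come from the summary. The summary does not say how a 429 without a Retry-After header is handled; the sketch assumes it falls through to discard. The two 12h duration budgets are left out for brevity.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: delay before counted retry number $attempt (0-based),
// using the base, multiplier, and cap quoted in the summary.
function backoffDelayMs(int $attempt, int $baseMs = 500, int $capMs = 60000): int
{
    // Clamp the exponent so large attempt counts cannot overflow the integer.
    $exp = min(max($attempt, 0), 20);
    return min($capMs, $baseMs * (2 ** $exp));
}

// Hypothetical helper: sleep time for a 429 that carries Retry-After (in seconds).
function rateLimitDelayMs(int $retryAfterS, int $capS = 300): int
{
    return min(max($retryAfterS, 0), $capS) * 1000;
}

// Statuses the summary lists for the counted-backoff path.
function isCountedRetryStatus(int $status): bool
{
    return ($status >= 500 && $status <= 599)
        || in_array($status, [408, 410, 460], true);
}

// Hypothetical planner: what to do after a failed attempt.
// 'counted' says whether the sleep consumes one unit of the retry budget.
function planRetry(int $status, ?int $retryAfterS, int $retriesUsed, int $retryCount): array
{
    // Path 1: rate limited with an explicit wait; does not touch the retry budget.
    if ($status === 429 && $retryAfterS !== null) {
        return ['action' => 'sleep', 'delay_ms' => rateLimitDelayMs($retryAfterS), 'counted' => false];
    }
    // Path 2: counted exponential backoff while budget remains.
    if (isCountedRetryStatus($status) && $retriesUsed < $retryCount) {
        return ['action' => 'sleep', 'delay_ms' => backoffDelayMs($retriesUsed), 'counted' => true];
    }
    // Non-retryable status, or budget exhausted (assumption: both discard).
    return ['action' => 'discard', 'delay_ms' => 0, 'counted' => false];
}
```

With these defaults, counted attempts 0 through 7 yield delays of 500, 1000, 2000, 4000, 8000, 16000, 32000, and 60000 ms, so the 60s cap first applies on the eighth counted retry.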

Test plan

  • ./vendor/bin/phpunit --no-coverage passes
  • E2E basic,retry suites pass (48/48)

- LibCurl: dual-path retry loop (429+Retry-After vs counted exponential backoff), X-Retry-Count header on retries, retryable status classification, duration budgets
- QueueConsumer: add retry config properties (max_total_backoff_duration_ms, max_rate_limit_duration_ms, rate_limit_retry_after_cap_s, retry_count); fix queue splice bug (peek with array_slice, splice only on success); add isRetryable() and parseRetryAfter() helpers
- Socket: update DoPost signature for X-Retry-Count; use success range check (>= 200 && < 400)
- Make error_handler log-only; determine success from enqueue/flush return values to avoid false failures from transient retry errors
- Wire maxRetries from input config to retry_count option
- Remove duplicate "Flush failed" in error output
- Enable retry test suite in e2e-config
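
The commit notes mention isRetryable() and parseRetryAfter() helpers without showing them. As a rough idea of what such helpers might look like, here is a minimal sketch under stated assumptions: the names isRetryableStatus and parseRetryAfterSeconds and their signatures are hypothetical and are not taken from the PR. The status list combines the codes named in the summary, and the header parsing follows the two forms HTTP allows for Retry-After (a non-negative number of seconds, or an HTTP-date).

```php
<?php
declare(strict_types=1);

// Hypothetical classifier: statuses the PR summary treats as retryable
// (429, 5xx, 408, 410, 460). Everything else is discarded.
function isRetryableStatus(int $status): bool
{
    if ($status >= 500 && $status <= 599) {
        return true;
    }
    return in_array($status, [408, 410, 429, 460], true);
}

// Hypothetical parser: turns a Retry-After header value into whole seconds
// to wait, or null if the value is empty or unparseable.
// $now is passed in (Unix timestamp) so the function is easy to test.
function parseRetryAfterSeconds(string $value, int $now): ?int
{
    $value = trim($value);
    if ($value === '') {
        return null;
    }
    // Form 1: delta-seconds, e.g. "120".
    if (preg_match('/^\d+$/', $value) === 1) {
        return (int) $value;
    }
    // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    $timestamp = strtotime($value);
    if ($timestamp === false) {
        return null;
    }
    // A date in the past means "retry now", so never return a negative wait.
    return max(0, $timestamp - $now);
}
```

A caller would typically pass time() as $now and then apply the rate_limit_retry_after_cap_s cap to the result before sleeping.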