Commit cdfe334
fix(core): retry TASK_PROCESS_SIGSEGV under the user's retry policy
SIGSEGV was hard-classified as non-retriable in shouldRetryError on the
assumption that it's always a deterministic native crash. For Node
tasks that's not reliably true — many production SIGSEGVs are flaky:
- Native addon races (sharp, canvas, better-sqlite3, node-rdkafka,
  bcrypt, etc.): libuv thread-pool work stepping on V8 handles.
  A fresh process gets a different heap layout and thread schedule,
  so a retry often succeeds.
- JIT / GC interaction: a V8 TurboFan deopt or GC during a native
  callback. Timing-dependent.
- Near-OOM in native code: when RSS approaches the cgroup limit,
  native allocations fail and poorly written addons dereference
  NULL, producing a SIGSEGV instead of a clean OOM-kill. A fresh
  process with cleaner memory often succeeds.
- Host / hardware issues: bit flips, kernel quirks. A retry may land
  on a different host.
The codebase was already inconsistent here: shouldLookupRetrySettings
listed SIGSEGV as retry-config-aware, but the shouldRetryError gate
short-circuited fail_run before that branch could be reached. And we
already retry TASK_RUN_UNCAUGHT_EXCEPTION — clearly a user-code bug —
under the user's retry policy, so refusing to retry SIGSEGV was the
odd one out.
Flip TASK_PROCESS_SIGSEGV from the false branch to the true branch in
shouldRetryError. The existing retrying.ts pipeline then gates the
retry on lockedRetryConfig + maxAttempts — same path SIGTERM and
uncaught-exception already use. No new code paths; tasks without a
retry policy still fail fast.
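
The change described above can be pictured with a minimal, self-contained sketch. It is not the actual code in packages/core: the code names other than TASK_PROCESS_SIGSEGV and TASK_RUN_UNCAUGHT_EXCEPTION, the `decideAfterFailure` helper, the `RetryConfig` shape, and the 1-based attempt counting are all assumptions made for illustration.

```typescript
// Hypothetical, simplified model of the classification gate.
// Only TASK_PROCESS_SIGSEGV and TASK_RUN_UNCAUGHT_EXCEPTION are names taken
// from the commit message; the other code names are assumed.
type ErrorCode =
  | "TASK_PROCESS_SIGSEGV"
  | "TASK_PROCESS_SIGTERM"
  | "TASK_RUN_UNCAUGHT_EXCEPTION"
  | "TASK_PROCESS_OOM_KILLED";

// Assumed shape of the retry policy locked in for a run.
interface RetryConfig {
  maxAttempts: number;
}

type Decision = "retry" | "fail_run";

// First gate: is this error code eligible for a policy-driven retry at all?
function shouldRetryError(code: ErrorCode): boolean {
  switch (code) {
    case "TASK_PROCESS_SIGSEGV": // moved from the false branch to the true branch
    case "TASK_PROCESS_SIGTERM":
    case "TASK_RUN_UNCAUGHT_EXCEPTION":
      return true;
    case "TASK_PROCESS_OOM_KILLED":
      return false;
  }
}

// Second gate: an eligible error is retried only if a retry policy exists
// and attempts remain. `attempt` is assumed to be 1-based.
function decideAfterFailure(
  code: ErrorCode,
  attempt: number,
  lockedRetryConfig?: RetryConfig
): Decision {
  if (!shouldRetryError(code)) return "fail_run";
  if (!lockedRetryConfig) return "fail_run"; // no policy: still fail fast
  return attempt < lockedRetryConfig.maxAttempts ? "retry" : "fail_run";
}
```

In this model a SIGSEGV on attempt 1 with `maxAttempts: 3` is retried, while the same crash with no retry policy, or on the final attempt, still ends in fail_run.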
Tests added in packages/core/test/errors.test.ts lock down the new
classification alongside SIGTERM, SIGKILL_TIMEOUT, and the OOM codes
(still non-retriable here because OOM has its own machine-bump retry
path in retrying.ts that runs before shouldRetryError).
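
The ordering mentioned here (OOM handled by its own path before the shouldRetryError gate) might look roughly like the following sketch. Everything in it is hypothetical: the `resolveOutcome` function, the `FailureContext` fields, the OOM code name, and the rule that an OOM without a larger machine fails the run are assumptions, not the behavior of retrying.ts.

```typescript
// Hypothetical model of the evaluation order; not the real retrying.ts.
type RunOutcome = "retry_on_larger_machine" | "retry" | "fail_run";

interface FailureContext {
  code: string;
  attempt: number; // assumed 1-based
  maxAttempts?: number; // undefined when the task has no retry policy
  largerMachineAvailable: boolean; // assumed input to the machine-bump path
}

// Assumed code names, for illustration only.
const OOM_CODES: ReadonlySet<string> = new Set(["TASK_PROCESS_OOM_KILLED"]);
const POLICY_RETRIABLE: ReadonlySet<string> = new Set([
  "TASK_PROCESS_SIGSEGV",
  "TASK_PROCESS_SIGTERM",
  "TASK_RUN_UNCAUGHT_EXCEPTION",
]);

function resolveOutcome(ctx: FailureContext): RunOutcome {
  // 1. OOM codes take the machine-bump path first, so the classification
  //    gate below never has to treat them as retriable.
  if (OOM_CODES.has(ctx.code)) {
    return ctx.largerMachineAvailable ? "retry_on_larger_machine" : "fail_run";
  }
  // 2. Everything else goes through the classification gate.
  if (!POLICY_RETRIABLE.has(ctx.code)) return "fail_run";
  // 3. Retriable codes are still bounded by the user's retry policy.
  if (ctx.maxAttempts === undefined) return "fail_run";
  return ctx.attempt < ctx.maxAttempts ? "retry" : "fail_run";
}
```

Under these assumptions, keeping the OOM codes in the non-retriable set does not prevent OOM retries; it only means they are decided earlier, by a different rule.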
Closes TRI-9234.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent f7a2bc7 commit cdfe334
3 files changed
Lines changed: 41 additions & 2 deletions