Fix shutdown race: abort background tasks before closing durability by clockwork-labs-bot · Pull Request #4581 · clockworklabs/SpacetimeDB

clockwork-labs-bot · 2026-03-06T21:28:25Z

Summary

Fixes a race condition where the view_cleanup_task can panic with "durability actor vanished" during database shutdown, crashing the server on Windows.

Root Cause

The shutdown sequence in HostController::exit_module was:

1. module.exit().await
2. db.shutdown().await (closes the durability channel)
3. Host::drop (aborts background tasks: view cleanup, metrics)

The view_cleanup_task runs with_auto_commit() on a loop, which calls request_durability(). If the task fires between steps 2 and 3, request_durability() panics because the durability channel is already closed.
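A minimal sketch of this ordering problem, using only the Rust standard library. The `Durability` struct and `old_shutdown_order` function are hypothetical stand-ins, not SpacetimeDB code: a `std::sync::mpsc` channel plays the role of the durability channel, and the real `request_durability()` is assumed to behave like a send on a closed channel. The sketch returns an error where the real code panics.

```rust
use std::sync::mpsc::{channel, Sender};

/// Hypothetical stand-in for the handle that background tasks use
/// to reach the durability actor.
struct Durability {
    tx: Sender<u64>,
}

impl Durability {
    /// Sends a transaction offset to the actor. Returns an error if the
    /// receiving side is gone (the real code is reported to panic here).
    fn request_durability(&self, tx_offset: u64) -> Result<(), &'static str> {
        self.tx
            .send(tx_offset)
            .map_err(|_| "durability actor vanished")
    }
}

/// Replays the old order: the channel is closed first, and a background
/// task that has not been aborted yet then tries to use it.
fn old_shutdown_order() -> Result<(), &'static str> {
    let (tx, rx) = channel::<u64>();
    let durability = Durability { tx };

    // Step 2: db.shutdown() closes the durability channel.
    drop(rx);

    // The cleanup task fires before step 3 has aborted it.
    durability.request_durability(1)
}
```

Calling `old_shutdown_order()` yields `Err("durability actor vanished")`, because `Sender::send` fails once the receiver has been dropped.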

Fix

Abort all background tasks before calling db.shutdown(), so they cannot race with durability channel closure:

1. module.exit().await
2. Abort background tasks (view cleanup, disk metrics, tx metrics)
3. db.shutdown().await (closes the durability channel)

Host::drop still aborts the tasks as before; that second abort is a no-op because they are already aborted.
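The reordered sequence and the idempotent second abort can be sketched as follows. This is an illustration under assumptions, not the PR's diff: `BackgroundTask` and `fixed_shutdown_order` are hypothetical names, a standard-library thread with a stop flag stands in for the real async task, and `abort` here waits for the thread to finish, which is a stronger guarantee than merely requesting cancellation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical background task that periodically talks to the
/// durability channel until it is told to stop.
struct BackgroundTask {
    stop: Arc<AtomicBool>,
    handle: Option<JoinHandle<u32>>,
}

impl BackgroundTask {
    fn spawn(durability: Sender<u64>) -> Self {
        let stop = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&stop);
        let handle = thread::spawn(move || {
            let mut failed = 0u32; // sends that hit a closed channel
            let mut offset = 0u64;
            while !flag.load(Ordering::SeqCst) {
                if durability.send(offset).is_err() {
                    failed += 1;
                }
                offset += 1;
                thread::sleep(Duration::from_millis(1));
            }
            failed
        });
        Self { stop, handle: Some(handle) }
    }

    /// Stops the task and waits for it. The first call returns the number
    /// of failed sends; later calls find no handle and return None.
    fn abort(&mut self) -> Option<u32> {
        self.stop.store(true, Ordering::SeqCst);
        self.handle.take().map(|h| h.join().expect("task panicked"))
    }
}

impl Drop for BackgroundTask {
    fn drop(&mut self) {
        // Mirrors the abort in Host::drop: harmless if already aborted.
        let _ = self.abort();
    }
}

/// Replays the new order: abort the task, then close the channel.
fn fixed_shutdown_order() -> (Option<u32>, Option<u32>) {
    let (tx, rx) = channel::<u64>();
    let mut task = BackgroundTask::spawn(tx);
    thread::sleep(Duration::from_millis(5));

    let first = task.abort(); // step 2: task is stopped while the channel is open
    drop(rx);                 // step 3: db.shutdown() closes the channel
    let second = task.abort(); // the repeat abort from Host::drop
    (first, second)
}
```

Because the task is fully stopped before the receiver is dropped, it never observes a closed channel: `fixed_shutdown_order()` returns `(Some(0), None)`, where `None` reflects that the second abort had nothing left to do.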

Testing

This fixes flaky test_all_templates failures on Windows CI, such as:
https://github.com/clockworklabs/SpacetimeDB/actions/runs/22745918903/job/65969841716?pr=4376

The failure pattern: server panics at durability.rs:96 ("durability actor vanished"), server process dies, all subsequent template tests get connection refused.

The view_cleanup_task runs with_auto_commit() on a loop, which calls request_durability(). If db.shutdown() closes the durability channel before the task is aborted (in Host::drop), a request_durability() call panics with 'durability actor vanished'. On Windows, this can crash the server process. Fix: abort all background tasks (view_cleanup, disk_metrics, tx_metrics) before calling db.shutdown(), so they cannot race with durability channel closure. Fixes flaky test_all_templates failures on Windows CI.

bfops

Seems fine to me 🤷

kim

This is nonsense -- the panic doesn't crash the server, it only panics a tokio task. Like most of these bot analyses, the premise is just completely wrong.

That said, if we prefer the stack trace to go away for noisiness reasons, the right way to do that is to drop the host before shutting down the database.

It will not make the test failure go away.

bfops requested a review from kim March 6, 2026 22:16

bfops approved these changes Mar 6, 2026

kim requested changes Mar 7, 2026

clockwork-labs-bot wants to merge 1 commit into master from bot/fix-shutdown-task-race