
fix: flush replayed stream state and handle orphaned streams after hibernation#989

Merged
threepointone merged 1 commit into main from fix/hibernated-stream-state
Feb 26, 2026


@threepointone
Contributor

Problem

When a client reconnects to an active stream, replayChunks() sends all stored chunks from SQLite but the client UI never updates until the next live chunk arrives from the LLM. This creates a jarring UX where the user sees a blank assistant message despite the server having already produced content.

A second, more severe problem: when a Durable Object hibernates during an active stream, the ReadableStream reader from the LLM is lost permanently. On wake, the stream appears active in SQLite but no live chunks will ever arrive — a "dead stream" that leaves the client stuck with a loading indicator forever.

Fixes #896


Root Cause

Bug 1: Replay chunks not flushed to React state

The client-side useAgentChat hook skips flushActiveStreamToMessages() for replay chunks (by design — to avoid intermediate renders during the tight replay loop). However, after all replay chunks were sent, the server never signaled "replay is done but the stream is still live." The accumulated parts sat in activeStreamRef unflushed until the next live chunk arrived from the LLM.

Bug 2: Orphaned streams after hibernation

When a DO is evicted and later wakes, the constructor calls ResumableStream.restore() which reads the active stream from SQLite. But the ReadableStream reader that was consuming the LLM response is gone — it existed only in the previous instance's memory. The stream looks active but will never produce another chunk.


Solution

Server: replayComplete signal for live streams

After replayChunks() sends all stored chunks for a stream that is still live (has an active LLM reader), it now sends a final message with replayComplete: true. This tells the client: "I've sent everything I have stored — flush your accumulated state to the UI. More live chunks may follow."
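
As a rough illustration of the signal described above, here is a minimal sketch of the live-stream replay path. Only `done` and `replayComplete` are named in this PR; the `type` value, the `id`, `body`, and `replay` fields, and the `replayLiveStream` helper are assumptions made for the example, not the library's actual API.

```typescript
// Hypothetical, simplified wire shape. Only `done` and `replayComplete` are
// named in this PR; `type`, `id`, `body`, and `replay` are assumptions.
type ChatResponse = {
  type: "cf_agent_use_chat_response";
  id: string;
  body: string;
  done: boolean;
  replay?: boolean;
  replayComplete?: boolean;
};

// Replays stored chunks for a stream that is still live, then signals that
// the replay is finished without ending the stream.
function replayLiveStream(
  streamId: string,
  storedChunks: string[],
  send: (message: ChatResponse) => void
): void {
  for (const body of storedChunks) {
    send({
      type: "cf_agent_use_chat_response",
      id: streamId,
      body,
      done: false,
      replay: true
    });
  }
  // Final marker: everything stored has been sent, more live chunks may follow.
  send({
    type: "cf_agent_use_chat_response",
    id: streamId,
    body: "",
    done: false,
    replay: true,
    replayComplete: true
  });
}
```

The marker deliberately carries `done: false`, so a client can tell "replay finished" apart from "stream finished".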

Server: _isLive flag + orphaned stream detection

ResumableStream now tracks whether the active stream was start()-ed in the current instance (_isLive = true) or restored from SQLite after hibernation (_isLive = false). When replayChunks() detects an orphaned stream, it:

  1. Sends done: true to the client (stream is over)
  2. Marks the stream as completed in SQLite
  3. Returns the streamId to the caller

The caller (AIChatAgent) then reconstructs the partial assistant message from stored chunks using applyChunkToParts() and persists it via persistMessages(), so it survives further page refreshes.
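
The orphan detection above can be sketched with an in-memory `Map` standing in for SQLite. This is an illustrative model under stated assumptions, not the code in src/resumable-stream.ts: the class name, the `StreamRow` and `Outgoing` shapes, and the exact contents of the three branches (no active stream, live stream, orphaned stream) are guesses based on the description.

```typescript
type Outgoing = {
  id: string;
  body: string;
  done: boolean;
  replay: true;
  replayComplete?: true;
};
type StreamRow = { chunks: string[]; completed: boolean };

// Hypothetical stand-in for ResumableStream; the Map plays the role of SQLite.
class SketchResumableStream {
  private store: Map<string, StreamRow>;
  private activeStreamId: string | null = null;
  private live = false; // mirrors _isLive: stays false unless start() runs here

  constructor(store: Map<string, StreamRow>) {
    this.store = store;
    this.restore();
  }

  // Picks up an unfinished stream from storage but leaves `live` false.
  private restore(): void {
    this.store.forEach((row, id) => {
      if (!row.completed) this.activeStreamId = id;
    });
  }

  start(id: string): void {
    this.store.set(id, { chunks: [], completed: false });
    this.activeStreamId = id;
    this.live = true;
  }

  storeChunk(body: string): void {
    if (this.activeStreamId === null) return;
    this.store.get(this.activeStreamId)?.chunks.push(body);
  }

  hasActiveStream(): boolean {
    return this.activeStreamId !== null;
  }

  replayChunks(send: (message: Outgoing) => void): string | null {
    const id = this.activeStreamId;
    const row = id === null ? undefined : this.store.get(id);
    if (id === null || row === undefined) return null; // nothing to replay

    for (const body of row.chunks) {
      send({ id, body, done: false, replay: true });
    }
    if (this.live) {
      // Live stream: signal end of replay, keep the stream open.
      send({ id, body: "", done: false, replay: true, replayComplete: true });
      return null;
    }
    // Orphaned stream: end it, mark it completed, hand the id to the caller.
    send({ id, body: "", done: true, replay: true });
    row.completed = true;
    this.activeStreamId = null;
    return id;
  }
}
```

Constructing a second instance over the same store models a hibernation wake: `restore()` finds the unfinished stream, `live` stays false, and the first `replayChunks()` call returns the stream ID so the caller knows to persist the partial message. A second call returns null, which matches the double-ACK behavior listed under Edge Cases Handled.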

Client: flush on replayComplete and done

In react.tsx, the CF_AGENT_USE_CHAT_RESPONSE handler now:

  • On replayComplete: flushes activeStreamRef to React state but keeps it alive (live chunks continue appending)
  • On done during replay: flushes and nulls out activeStreamRef (stream is finalized)
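
The two client-side rules above can be modeled without React. The sketch below is a hedged approximation of the handler logic, not the code in react.tsx: `activeStreamRef`, `replayComplete`, and `done` come from this PR, while the `text` and `replay` fields, the `rendered` array (standing in for React state), and the `handleChatResponse` name are assumptions.

```typescript
type StreamMessage = {
  done: boolean;
  replay?: boolean;
  replayComplete?: boolean;
  text?: string; // assumed chunk payload
};
type ActiveStream = { parts: string[] };
type ClientState = {
  activeStreamRef: { current: ActiveStream | null };
  rendered: string[]; // stands in for React state
};

function handleChatResponse(state: ClientState, msg: StreamMessage): void {
  let active = state.activeStreamRef.current;
  if (msg.text !== undefined) {
    if (active === null) {
      active = { parts: [] };
      state.activeStreamRef.current = active;
    }
    active.parts.push(msg.text); // accumulate without rendering
  }
  if (active === null) return;

  if (msg.replayComplete) {
    state.rendered = [...active.parts]; // flush, keep the stream alive
    return;
  }
  if (msg.done) {
    state.rendered = [...active.parts]; // flush and finalize
    state.activeStreamRef.current = null;
    return;
  }
  if (!msg.replay) {
    state.rendered = [...active.parts]; // live chunks flush immediately
  }
}
```

Replay chunks still skip the flush, so the tight replay loop causes no intermediate renders. The only additions are the two flush points at the end of a replay.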

Edge Cases Handled

| Scenario | Behavior |
| --- | --- |
| Orphaned stream with zero chunks | Stream completed cleanly, no empty assistant message persisted |
| Orphaned stream with tool call parts | applyChunkToParts reconstructs both text and tool-invocation parts correctly |
| Double ACK after orphaned stream finalized | hasActiveStream() returns false on second ACK → no-op, no duplicate messages |
| start chunk missing (row-size guard dropped it) | Fallback message ID generated; message still persisted |
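
To make the zero-chunk and missing-start cases concrete, here is a small sketch of rebuilding a partial assistant message from stored chunks. It does not use the real applyChunkToParts; the chunk and part shapes, the `rebuildOrphanedMessage` name, and the injected `fallbackId` generator are all hypothetical.

```typescript
// Hypothetical stored-chunk and message-part shapes, for illustration only.
type StoredChunk =
  | { type: "start"; messageId: string }
  | { type: "text-delta"; delta: string }
  | { type: "tool-call"; toolName: string; input: unknown };
type Part =
  | { type: "text"; text: string }
  | { type: "tool-invocation"; toolName: string; input: unknown };
type AssistantMessage = { id: string; role: "assistant"; parts: Part[] };

function rebuildOrphanedMessage(
  chunks: StoredChunk[],
  fallbackId: () => string
): AssistantMessage | null {
  let id: string | null = null;
  const parts: Part[] = [];
  for (const chunk of chunks) {
    if (chunk.type === "start") {
      id = chunk.messageId;
    } else if (chunk.type === "text-delta") {
      // Merge consecutive text deltas into a single text part.
      const last = parts[parts.length - 1];
      if (last && last.type === "text") last.text += chunk.delta;
      else parts.push({ type: "text", text: chunk.delta });
    } else {
      parts.push({
        type: "tool-invocation",
        toolName: chunk.toolName,
        input: chunk.input
      });
    }
  }
  // Zero usable chunks: nothing to persist, so no empty assistant message.
  if (parts.length === 0) return null;
  // Missing start chunk: fall back to a generated ID.
  return { id: id ?? fallbackId(), role: "assistant", parts };
}
```

Returning null for an empty stream lets the caller complete the stream without persisting anything, and passing the ID generator in keeps the fallback path easy to test.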

Files Changed

| File | Change |
| --- | --- |
| src/types.ts | Add optional replayComplete field to CF_AGENT_USE_CHAT_RESPONSE |
| src/resumable-stream.ts | Add _isLive flag, isLive getter; rewrite replayChunks with three-way branch |
| src/index.ts | ACK handler uses replayChunks return value; new _persistOrphanedStream method |
| src/react.tsx | Flush on replayComplete (keep stream alive) and done during replay (finalize) |
| src/tests/worker.ts | Add testSimulateHibernationWake helper |
| src/tests/resumable-streaming.test.ts | 5 new server tests |
| src/react-tests/use-agent-chat.test.tsx | 3 new React hook tests |

Reviewer Notes

  • _isLive intentionally defaults to false: restore() is called in the constructor and leaves it false. Only start() sets it to true. This means any stream that exists in SQLite when the DO wakes is treated as orphaned unless start() was called in the current instance.

  • replayChunks return type changed from void to string | null — internal-only API, not in public exports. The returned streamId is used by the caller to know it should persist the orphaned message.

  • testSimulateHibernationWake replaces _resumableStream but not other instance fields (_streamCompletionPromise, _pendingResumeConnections). In a real hibernation wake the entire DO is reconstructed. This is sufficient for testing because those fields are only relevant for live streams, and the test uses fresh WebSocket connections.

  • The replayComplete message has done: false — this is intentional. The stream is still active; we're just signaling that stored chunks have been replayed. The client keeps activeStreamRef alive for subsequent live chunks.

  • All 237 server tests + 26 React tests pass.

@changeset-bot

changeset-bot bot commented Feb 25, 2026

🦋 Changeset detected

Latest commit: aec54c0

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @cloudflare/ai-chat | Patch |


@pkg-pr-new

pkg-pr-new bot commented Feb 25, 2026


npm i https://pkg.pr.new/cloudflare/agents@989
npm i https://pkg.pr.new/cloudflare/agents/@cloudflare/ai-chat@989
npm i https://pkg.pr.new/cloudflare/agents/@cloudflare/codemode@989
npm i https://pkg.pr.new/cloudflare/agents/hono-agents@989

commit: aec54c0

…bernation

- Send replayComplete signal after replaying stored chunks for live streams
  so the client flushes accumulated parts to React state immediately.
- Detect orphaned streams (restored from SQLite after hibernation with no
  live LLM reader) via _isLive flag on ResumableStream. On reconnect, send
  done:true, complete the stream, and persist the partial assistant message.
- Client: flush activeStreamRef on replayComplete (keeps stream alive) and
  on done during replay (finalizes orphaned streams).
- Add server tests for replayComplete, orphaned streams, edge cases (zero
  chunks, tool call parts, concurrent ACKs).
- Add React hook tests for client-side flush behavior.

Fixes #896
@threepointone threepointone force-pushed the fix/hibernated-stream-state branch from 38941e4 to aec54c0 Compare February 25, 2026 15:35
@threepointone threepointone merged commit 8404954 into main Feb 26, 2026
6 of 7 checks passed
@threepointone threepointone deleted the fix/hibernated-stream-state branch February 26, 2026 05:07
@github-actions github-actions bot mentioned this pull request Feb 26, 2026

Development

Successfully merging this pull request may close these issues.

Stream resumption loses reasoning/thinking state after page refresh
