Fix tool-call leak and add FlushSentinel support in LangGraph adapter #4787
keenranger wants to merge 4 commits into livekit:main
Conversation
Revert "…LLMAdapter (livekit#4511)"
This reverts commit d3529a7.
Re-implement stream_mode routing that was lost when PR livekit#4511 was reverted. The _run() loop now handles namespace stripping (for subgraphs=True) and mode-aware routing for both "messages" and "custom" stream modes, including multi-mode lists.

Replace _extract_message_chunk() (PR livekit#3112) with explicit routing per mode. The original function was designed for messages-mode tuple normalization and cannot generalize to custom mode's arbitrary payloads. Explicit namespace stripping plus per-mode dispatch is easier to understand and maintain:

- _send_message() extracts tokens from (message, metadata) tuples
- _send_custom() extracts text from StreamWriter payloads via _extract_custom_content()

Also fix _to_chat_chunk() calling msg.text() instead of msg.text (a property, not a method).
PR livekit#3933 added FlushSentinel support at the LLMNode/generation.py boundary, but _metrics_monitor_task crashes on .id access when FlushSentinel flows through _event_ch. Forward the sentinel directly in _send_custom and override _metrics_monitor_task to filter non-ChatChunk items.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2209cb33ac
```python
    skipped since ChatChunk only carries text.
    """
    if isinstance(data, FlushSentinel):
        self._event_ch.send_nowait(data)  # type: ignore[arg-type]
```
Preserve ChatChunk contract for emitted stream events
Sending `FlushSentinel` directly into `_event_ch` makes this `LLMStream` yield non-`ChatChunk` objects, but shared consumers like `LLMStream.to_str_iterable()`/`collect()` (livekit-agents/livekit/agents/llm/llm.py) and `FallbackLLMStream._run()` (livekit-agents/livekit/agents/llm/fallback_adapter.py) unconditionally access `.delta`, so a LangGraph custom stream that emits `FlushSentinel` will now raise `AttributeError` in those paths. This is a regression introduced by this change: previously, all emitted events were `ChatChunk` instances.
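For illustration, a paraphrase of the consumer pattern this refers to (not the library's exact code):

```python
from livekit.agents import llm

async def collect_text(stream: llm.LLMStream) -> str:
    # Paraphrase of the pattern in to_str_iterable()/collect(): every
    # event is assumed to be a ChatChunk carrying a `.delta`.
    text = ""
    async for chunk in stream:
        # A forwarded FlushSentinel lands here too; it has no `.delta`,
        # so this access raises AttributeError.
        if chunk.delta and chunk.delta.content:
            text += chunk.delta.content
    return text
```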
Sorry about the issue in #4768 -- the tool-call leak was caused by my earlier change in #4511.
Custom and multi stream mode support

Re-implement `stream_mode` routing for `"messages"`, `"custom"`, and multi-mode lists.

The previous approach in #4511 used `_extract_message_chunk()` (originally from #3112) to normalize all stream items into chat chunks. That function was designed for messages-mode tuple normalization and could not generalize to custom mode's arbitrary payloads -- it ended up leaking tool-call outputs into the LLM input stream (#4768).

This PR reverts #4511 and re-implements with explicit namespace stripping + per-mode dispatch (a sketch follows below):

- `_send_message()` handles `(message, metadata)` tuples from messages mode
- `_send_custom()` extracts text from StreamWriter payloads via `_extract_custom_content()`

Separating the handlers by mode is easier to understand and maintain than a single extraction function trying to cover both modes.
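A minimal sketch of the dispatch shape (the handler names come from this description; the tuple layouts are assumptions based on LangGraph's documented stream formats, not the exact diff):

```python
from typing import Any, Callable

async def route_stream(
    graph: Any,  # a compiled LangGraph graph
    inputs: dict,
    send_message: Callable[[Any, dict], None],
    send_custom: Callable[[Any], None],
) -> None:
    # With subgraphs=True plus a list of stream modes, LangGraph yields
    # (namespace, mode, payload) triples; the namespace is stripped here.
    async for _ns, mode, payload in graph.astream(
        inputs, stream_mode=["messages", "custom"], subgraphs=True
    ):
        if mode == "messages":
            message, metadata = payload  # messages mode: (chunk, metadata)
            send_message(message, metadata)
        elif mode == "custom":
            send_custom(payload)  # whatever the node passed to StreamWriter
```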
FlushSentinel support for custom stream mode

PR #3933 added `FlushSentinel` at the LLMNode/generation.py boundary to trigger immediate TTS playback. However, `FlushSentinel` cannot flow through `LLMStream._event_ch` (typed `Chan[ChatChunk]`) because `_metrics_monitor_task` crashes on `.id` access.

This PR forwards `FlushSentinel` directly through `_event_ch` in `_send_custom` and overrides `_metrics_monitor_task` in `LangGraphStream` to filter non-`ChatChunk` items before delegating to the base implementation (sketch below).

Fixes #4768
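A minimal sketch of the override, assuming the base class's private `_metrics_monitor_task(self, event_aiter)` signature from livekit-agents; treat the names and details as illustrative rather than the exact diff:

```python
from collections.abc import AsyncIterable
from typing import Any

from livekit.agents import llm

class LangGraphStream(llm.LLMStream):
    async def _metrics_monitor_task(self, event_aiter: AsyncIterable[Any]) -> None:
        async def chat_chunks_only() -> AsyncIterable[llm.ChatChunk]:
            async for ev in event_aiter:
                # Drop FlushSentinel (and anything else that is not a
                # ChatChunk) so the base metrics code never reads `.id`
                # off a sentinel.
                if isinstance(ev, llm.ChatChunk):
                    yield ev

        await super()._metrics_monitor_task(chat_chunks_only())
```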