
1/4 Replace realtime websocket transport with WebRTC #16805

Open
aibrahim-oai wants to merge 20 commits into main from realtime-webrtc-transport


@aibrahim-oai
Collaborator

@aibrahim-oai aibrahim-oai commented Apr 4, 2026

Stack position: 1/4

Stack:

This PR:

  • Move the model-side realtime transport from websocket to WebRTC.
  • Preserve the core-facing session/event API so UI changes can stack separately.
  • Drop the websocket e2e test that no longer matches the transport implementation.

aibrahim-oai and others added 10 commits April 4, 2026 12:43
Move the realtime model transport implementation to WebRTC while keeping the core session/event interface intact for the TUI layer.

Co-authored-by: Codex <noreply@openai.com>
Install cmake on Linux and macOS Bazel runners so audiopus_sys build scripts can build bundled Opus.

Co-authored-by: Codex <noreply@openai.com>
Link audiopus_sys against the Bazel opus module so remote Bazel builds do not depend on host cmake.

Co-authored-by: Codex <noreply@openai.com>
Teach the shared test helper to accept realtime /calls POSTs, answer SDP offers, and relay scripted events over a data channel while logging incoming RTP audio packets as synthetic append requests. Update the one stale handshake-path assertion to the /realtime/calls URL.

Co-authored-by: Codex <noreply@openai.com>
Ignore the workspace `opus` dependency in cargo-shear for core_test_support because Rust imports that package as `audiopus`.

Co-authored-by: Codex <noreply@openai.com>
Include nested response helper modules in the core_test_support Bazel target so the new realtime WebRTC test server source is visible to macOS Bazel and argument-comment-lint jobs.

Co-authored-by: Codex <noreply@openai.com>
Explicitly include first-level response helper modules in the core_test_support Bazel target so realtime_webrtc_server.rs is available to macOS builds and lints.

Co-authored-by: Codex <noreply@openai.com>
Use the workspace opus crate name directly and clone the request sender before moving it into the data-channel callback so the RTP track handler can still enqueue decoded packets.

Co-authored-by: Codex <noreply@openai.com>
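
The sender-cloning fix in the commit above follows a common Rust ownership pattern, sketched here with `std::sync::mpsc` instead of the PR's actual channel and callback types. The function name `make_handlers` and the `dc:` / `rtp:` message prefixes are hypothetical and exist only for illustration; the two closures stand in for the data-channel callback and the RTP track handler.

```rust
use std::sync::mpsc::{channel, Receiver};

/// Build two handlers that feed the same queue.
/// Illustrative stand-ins, not the PR's real callbacks.
pub fn make_handlers() -> (impl Fn(&str), impl Fn(&str), Receiver<String>) {
    let (tx, rx) = channel::<String>();

    // Clone *before* the first `move` closure: a `move` closure takes
    // ownership of every variable it captures, so without this clone the
    // second handler would have no sender left to use.
    let tx_for_data_channel = tx.clone();

    let on_data_channel_message = move |msg: &str| {
        // Ignore send errors: the receiver may already have been dropped.
        let _ = tx_for_data_channel.send(format!("dc:{msg}"));
    };

    // The original sender moves into the RTP handler.
    let on_rtp_packet = move |msg: &str| {
        let _ = tx.send(format!("rtp:{msg}"));
    };

    (on_data_channel_message, on_rtp_packet, rx)
}
```

Calling both handlers and then draining the receiver yields the messages in the order they were sent, which is how one queue can carry both scripted events and decoded audio packets.
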
Preserve item_id on v1 audio deltas and annotate opaque None arguments in calls URL tests so argument-comment lint passes.

Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
@aibrahim-oai aibrahim-oai changed the title Replace realtime websocket transport with WebRTC 1/4 Replace realtime websocket transport with WebRTC Apr 4, 2026
aibrahim-oai and others added 4 commits April 4, 2026 15:58
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
Co-authored-by: Codex <noreply@openai.com>
aibrahim-oai and others added 6 commits April 5, 2026 00:43
- switch realtime transport and test server to opus-rs
- drop native opus bazel and cmake plumbing

Co-authored-by: Codex <noreply@openai.com>
- use valid 24 kHz mono PCM audio in realtime tests
- keep websocket/WebRTC request sequencing aligned with transport behavior

Co-authored-by: Codex <noreply@openai.com>
- replace silent PCM realtime test fixtures with a deterministic tone
- avoid codec paths optimizing away audio in WebRTC test flows

Co-authored-by: Codex <noreply@openai.com>
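
A deterministic tone fixture like the one described in the commit above could look roughly like the following. This is a minimal sketch, not the PR's fixture: the function name `tone_pcm16`, the half-scale amplitude, and the 16-bit little-endian encoding are assumptions; only the 24 kHz mono format comes from the commit messages.

```rust
use std::f32::consts::TAU;

/// Sample rate named in the commit messages (24 kHz mono).
pub const SAMPLE_RATE_HZ: u32 = 24_000;

/// Generate `duration_ms` of a mono sine tone as 16-bit little-endian PCM.
/// The output depends only on the arguments, so repeated calls produce
/// identical bytes.
pub fn tone_pcm16(frequency_hz: f32, duration_ms: u32) -> Vec<u8> {
    let sample_count = (SAMPLE_RATE_HZ as u64 * duration_ms as u64 / 1000) as usize;
    let mut bytes = Vec::with_capacity(sample_count * 2);
    for n in 0..sample_count {
        let t = n as f32 / SAMPLE_RATE_HZ as f32;
        // Half of full scale (an arbitrary choice) leaves headroom and
        // avoids clipping.
        let sample = (TAU * frequency_hz * t).sin() * 0.5 * i16::MAX as f32;
        bytes.extend_from_slice(&(sample as i16).to_le_bytes());
    }
    bytes
}
```

For example, `tone_pcm16(440.0, 20)` produces 480 samples (960 bytes), a typical 20 ms frame at this sample rate. Unlike an all-zero buffer, a tone gives the encoder real signal, so a codec has no opportunity to collapse the frame as silence.
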
Stop the realtime test WebRTC server from hanging after the client closes.

Teach the scripted request loop to exit when the data channel or peer connection closes so compact-remote tests can unwind instead of timing out at shutdown.

Co-authored-by: Codex <noreply@openai.com>
Keep the first realtime audio frame from being dropped during WebRTC startup.

Also give the realtime test server a tiny flush window before it closes the data channel so the last scripted events land reliably.

Co-authored-by: Codex <noreply@openai.com>
Install the rustls crypto provider in the realtime WebRTC test server so the app-server realtime tests stop panicking, and update the core handshake assertion to match the current realtime calls URL.

Co-authored-by: Codex <noreply@openai.com>
@aibrahim-oai
Collaborator Author

@codex review

Contributor

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8207ff6ead

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

sample_rate: 24_000,
num_channels: REALTIME_AUDIO_CHANNELS,
samples_per_channel: Some(samples_per_channel as u32),
item_id: None,

[P1] Preserve output item IDs on WebRTC audio frames

Do not hardcode item_id to None when emitting RealtimeEvent::AudioOut from RTP. core/src/realtime_conversation.rs only tracks output audio for truncation when item_id is present; frames without it are ignored, so V2 speech-start interruption cannot compute audio_end_ms or send conversation.item.truncate reliably.
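
One way to act on this suggestion is to remember which output item the data channel most recently announced and stamp that ID onto each frame decoded from RTP. The sketch below is an assumption-laden illustration, not the PR's code: `AudioFrame` and `ActiveItemTracker` are hypothetical types, the channel count is fixed to mono, and how the real transport learns the active item ID is not shown in the review.

```rust
/// Hypothetical stand-in for the audio payload carried by an output event.
#[derive(Debug, Clone, PartialEq)]
pub struct AudioFrame {
    pub data: Vec<u8>,
    pub sample_rate: u32,
    pub num_channels: u16,
    pub samples_per_channel: Option<u32>,
    pub item_id: Option<String>,
}

/// Remembers the output item most recently announced on the data channel.
#[derive(Debug, Default)]
pub struct ActiveItemTracker {
    current: Option<String>,
}

impl ActiveItemTracker {
    /// Call when a data-channel event names the item now producing audio.
    pub fn on_item_started(&mut self, item_id: &str) {
        self.current = Some(item_id.to_owned());
    }

    /// Call when that item finishes, so later frames are not misattributed.
    /// A completion notice for some other item is ignored.
    pub fn on_item_done(&mut self, item_id: &str) {
        if self.current.as_deref() == Some(item_id) {
            self.current = None;
        }
    }

    /// Build a frame for decoded RTP audio, tagged with the active item if
    /// one is known, instead of always using `None`.
    pub fn frame(&self, data: Vec<u8>, samples_per_channel: u32) -> AudioFrame {
        AudioFrame {
            data,
            sample_rate: 24_000,
            num_channels: 1, // mono assumed for this sketch
            samples_per_channel: Some(samples_per_channel),
            item_id: self.current.clone(),
        }
    }
}
```

With frames tagged this way, the truncation bookkeeping the review mentions would see an `item_id` on RTP-derived audio, as long as the item-start event arrives before the first frame of that item.
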


Comment on lines +759 to +765
let packet = match track.read_rtp().await {
    Ok((packet, _)) => packet,
    Err(err) => {
        let _ = tx_message.send(Err(ApiError::Stream(format!(
            "failed to read realtime WebRTC audio packet: {err}"
        ))));
        return;

[P2] Handle RTP read errors as shutdown before surfacing stream errors

track.read_rtp() errors are always forwarded as fatal ApiError. On normal connection teardown, this can race with Closed notifications and make next_event() return an error instead of clean end-of-stream. That misclassifies graceful closes as failures and can trigger unnecessary error handling in callers.
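
A possible shape for this fix is to consult a shared "closing" flag before deciding how to report a read error. The sketch below assumes such a flag exists and is set by the close path before the track is torn down; `ReadOutcome` and `classify_read_error` are hypothetical names, and the error is passed as a string to keep the example free of WebRTC dependencies.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// How the RTP read loop should react to a failed read.
#[derive(Debug, PartialEq)]
pub enum ReadOutcome {
    /// The connection was already closing: end the stream quietly.
    CleanShutdown,
    /// A genuine failure that should be surfaced to the caller.
    Fatal(String),
}

/// Classify a read error using a flag the close path sets before teardown.
/// The flag and its wiring are assumptions for illustration.
pub fn classify_read_error(closing: &AtomicBool, err: &str) -> ReadOutcome {
    if closing.load(Ordering::Acquire) {
        ReadOutcome::CleanShutdown
    } else {
        ReadOutcome::Fatal(format!(
            "failed to read realtime WebRTC audio packet: {err}"
        ))
    }
}
```

In the read loop, `CleanShutdown` would map to returning without sending anything, while `Fatal` would map to the existing error send. This narrows the race described above, provided the close path sets the flag before the track starts failing.
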

