🐛 make sure we don't postpone batch indefinitely when upserting a view #4583
Conversation
This commit simplifies `FlushController` so we don't have a 'notify remove' step that can clear the duration timeout if the batch becomes empty. Instead, when we upsert a message, we call `notifyAfterAddMessage` with the bytes difference between the old and new message. This reuses the existing adjustment mechanism without adding API surface to `FlushController`. Backport from v7 (commit 0fd1595, originally part of #4462).

When a batch contained exactly one upserted message (e.g. a view event with no accompanying resource/action/error events), each subsequent upsert called `notifyAfterRemoveMessage`, which decremented `currentMessagesCount` to 0 and cancelled the duration_limit timeout. The next `push` rescheduled it from scratch. As long as the upsert cadence (e.g. throttled view updates) was shorter than `FLUSH_DURATION_LIMIT` (30s), the timer never fired and the batch was only flushed when something else triggered it (page exit, session expire, `messages_limit`, `bytes_limit`). When the page died before any of those fired, the view event was lost entirely - silently under-reporting view durations in RUM.
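The accounting described above can be sketched with a small illustrative model. This is not the SDK's actual code: the class, method bodies, and the `upsert` helper are simplified stand-ins for `FlushController` and the upsert path in `Batch`, chosen to show why reusing the byte-adjustment call for replacements never disturbs the duration timer.

```typescript
// Illustrative model (not the real SDK implementation).
// notifyBeforeAddMessage counts a brand-new message and arms the duration
// timer when the batch becomes non-empty; notifyAfterAddMessage only adjusts
// the byte count, which is why reusing it for upsert replacements never
// cancels or reschedules the timer.
class SketchFlushController {
  messagesCount = 0
  bytesCount = 0
  durationTimerArmed = false

  notifyBeforeAddMessage(estimatedBytesCount: number) {
    this.bytesCount += estimatedBytesCount
    this.messagesCount += 1
    if (this.messagesCount === 1) {
      // First message in the batch: start the duration_limit countdown.
      this.durationTimerArmed = true
    }
  }

  notifyAfterAddMessage(bytesCountDiff: number) {
    // Byte-count adjustment only: count and timer are untouched.
    this.bytesCount += bytesCountDiff
  }
}

const controller = new SketchFlushController()
const upsertBuffer: Record<string, string> = {}

function upsert(key: string, serialized: string) {
  const previous = upsertBuffer[key]
  upsertBuffer[key] = serialized
  if (previous === undefined) {
    controller.notifyBeforeAddMessage(serialized.length)
  } else {
    // Replacement: report only the size difference.
    controller.notifyAfterAddMessage(serialized.length - previous.length)
  }
}

upsert('view-1', 'view event v1')
upsert('view-1', 'view event v1, now with a longer payload')
```

After both calls the model holds one message whose byte count matches the latest serialized payload, and the timer armed by the first `upsert` is still armed.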
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c0c74ff488
```diff
   }
   upsertBuffer[key] = serializedMessage
-  flushController.notifyAfterAddMessage()
+  flushController.notifyAfterAddMessage(bytesDiff)
```
Keep page-exit upserts under the beacon limit
When an existing upserted view is replaced during `PAGE_MAY_EXIT` and the replacement pushes the batch over `RECOMMENDED_REQUEST_BYTES_LIMIT`, this call flushes only after the new view has been stored in `upsertBuffer`, so the near-full batch and final view update are sent as one over-limit `sendOnExit` payload. I checked `startRum.ts`, where page-exit preparation emits the final view update, and `httpRequest.ts` lines 109-124, where over-limit exit payloads skip `sendBeacon` and fall back to `fetch`; this regresses the near-full action + final view scenario documented in `test/e2e/scenario/transport.scenario.ts`, where the fetch can be cancelled on unload. Previously, the remove-then-`notifyBeforeAddMessage` path flushed the near-full batch before adding the replacement view, keeping both exit requests beacon-eligible.
Summary
- Addressed the review feedback by making upsert replacements run a pre-update bytes-limit check before the replacement is committed, via `flushController.notifyBeforeUpdateMessage(bytesDiff)` in `Batch.push`. This preserves the “flush-before-replacement” behavior needed to keep page-exit payloads beacon-friendly when a replacement would cross the recommended size limit. (`packages/core/src/transport/batch.ts` L32-L44)
- Added a new `FlushController` API, `notifyBeforeUpdateMessage`, that can trigger a bytes-limit flush for positive replacement diffs without incrementing the message count or touching duration-limit scheduling, so it does not reintroduce the indefinite-postpone issue fixed by the original PR. (`packages/core/src/transport/flushController.ts` L124-L156)
- Updated batch unit tests to assert the new update-path callback is used during same-key upsert replacement. (`packages/core/src/transport/batch.spec.ts` L71-L90)
- Added a flush controller unit test covering bytes-limit notification before message update/replacement. (`packages/core/src/transport/flushController.spec.ts` L144-L151)
- Extended the mock flush controller with the new method so transport tests remain type-accurate and stable. (`packages/core/test/emulate/mockFlushController.ts` L13-L27)
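A minimal sketch of the pre-update check described in the first bullet. Everything here is illustrative: the `BatchState` shape, the standalone `flush` helper, and the 16 KiB constant are assumptions standing in for the real `FlushController` internals, which carry more state.

```typescript
// Hedged sketch of the notifyBeforeUpdateMessage idea: before a replacement
// is committed, flush if a positive byte delta would cross the bytes limit,
// without ever touching the message count or duration-limit scheduling.
const BYTES_LIMIT = 16 * 1024 // illustrative stand-in for the recommended request bytes limit

interface BatchState {
  messagesCount: number
  bytesCount: number
  flushCount: number
}

function flush(state: BatchState) {
  state.flushCount += 1
  state.messagesCount = 0
  state.bytesCount = 0
}

function notifyBeforeUpdateMessage(state: BatchState, bytesCountDiff: number) {
  // Shrinking or equal-size replacements can never cross the limit,
  // and an empty batch has nothing worth flushing.
  if (bytesCountDiff > 0 && state.messagesCount > 0 && state.bytesCount + bytesCountDiff >= BYTES_LIMIT) {
    flush(state)
  }
}

// A batch sitting just under the limit is flushed before a growing
// replacement is committed, keeping both payloads beacon-sized.
const nearFull: BatchState = { messagesCount: 12, bytesCount: BYTES_LIMIT - 100, flushCount: 0 }
notifyBeforeUpdateMessage(nearFull, 500)
```

The check runs before the replacement is stored, which is what restores the “flush-before-replacement” ordering; note it deliberately does not re-register the replacement afterwards, which is the subject of the edge-case discussion further down.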
Git / PR actions
- Committed changes on the current branch with message: 🐛 preserve beacon-friendly flush behavior for upsert updates.
- Created a follow-up PR message via `make_pr` with the same title and a body summarizing the delta on top of the existing PR.
Testing
⚠️ `yarn test:unit --spec packages/core/src/transport/flushController.spec.ts --spec packages/core/src/transport/batch.spec.ts` (failed due to environment limitation: `ChromeHeadless` binary is not available).
Force-pushed from d7cc7d6 to 1dc0d8d
```ts
 * limit flush can happen before the replacement is committed.
 */
notifyBeforeUpdateMessage(estimatedMessageBytesCountDiff: number) {
  if (
```
I think there's an edge case here. If the byte diff pushes the batch over the limit, the flush resets counters to 0 and empties `upsertBuffer`, but then in `push()` the new message goes into the empty buffer with `currentMessagesCount` still at 0 and no timer scheduled. `notifyBeforeAddMessage` handles this correctly because it re-registers the message after the flush. I think this could happen in practice with a batch close to 16 KiB (a long stack error would do it). What do you think?
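For comparison, the safe add path the comment refers to can be sketched as follows. This is an illustrative, self-contained model, not the SDK's code; the constant and state shape are assumptions. The key point is that the incoming message's count and bytes are registered after the potential flush, so a flush can never leave buffered content behind with zeroed counters.

```typescript
// Illustrative sketch of why a "check, maybe flush, then re-register" add
// path is safe where a bare pre-update flush is not: the incoming message is
// always counted after the flush, so the batch cannot end up non-empty while
// its counters say it is empty.
const BYTES_LIMIT = 16 * 1024 // stand-in value

const state = { messagesCount: 0, bytesCount: 0, flushCount: 0 }

function flush() {
  state.flushCount += 1
  state.messagesCount = 0
  state.bytesCount = 0
}

function notifyBeforeAddMessage(estimatedBytesCount: number) {
  if (state.bytesCount + estimatedBytesCount >= BYTES_LIMIT) {
    flush()
  }
  // Re-register after the potential flush: the incoming message is always
  // reflected in the counters that drive timer scheduling.
  state.messagesCount += 1
  state.bytesCount += estimatedBytesCount
}

// Fill the batch close to the limit, then add a large message: the near-full
// batch is flushed, and the new message still ends up properly counted.
notifyBeforeAddMessage(BYTES_LIMIT - 10)
notifyBeforeAddMessage(1024)
```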
Force-pushed from 1dc0d8d to c0c74ff
Motivation
Backport of v7 commit 0fd15958a (originally squashed into #4462 on the v7 branch).

`FlushController` had a bug where the duration_limit timer could be repeatedly cancelled and rescheduled when a batch contained a single upserted message. Because `Batch.addOrUpdate` called `remove(key)` then `push(...)` on every upsert with the same key, `currentMessagesCount` dropped to `0` between the two calls, which triggered `cancelDurationLimitTimeout()`. The subsequent `push` then rescheduled the timer from scratch. As long as the upsert cadence (e.g. throttled view updates every 3s) was shorter than `FLUSH_DURATION_LIMIT` (30s), the timer never fired.

In practice this happened most often for view events being upserted with no accompanying resource/action/error events in the same batch — a common state for idle / backgrounded tabs and quickly-closed pages. The buffered view event was only flushed when something else triggered it (page exit, session expire, `messages_limit`, `bytes_limit`). When the page died before any of those ran (tab discarded, OS reclaim, crash, mobile background kill), the view event was lost entirely, silently under-reporting view durations in RUM.

Observed impact in our own dashboards after the v7 fix landed: total view count rose ~17.5% and `view.time_spent` p90/p95/p99 jumped sharply (p95 ~+114%) while p50/p75 were flat or slightly lower — the signature of recovering previously-dropped views at both tails of the distribution, not of inflating individual durations.

Changes

- Remove `notifyAfterRemoveMessage` from `FlushController` entirely.
- In `Batch.push`, when upserting a key that already has a buffered message, call `notifyAfterAddMessage(bytesDiff)` instead of remove-then-add. This adjusts only the byte count and never touches `currentMessagesCount` or the duration_limit timer.
- Update `mockFlushController` accordingly.

This is a straight cherry-pick from v7 — no behavioral divergence between the two branches in this area.
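The before/after timer behavior can be illustrated with a small fake-clock simulation. This is a sketch, not SDK code: the 30 s limit and 3 s cadence come from the description above, while the function and everything else in it are made up for illustration.

```typescript
// Simulates when the duration_limit timer first fires under a steady upsert
// cadence. rearmOnUpsert = true models the old remove-then-push behavior,
// where every upsert cancelled and rescheduled the 30 s timer; false models
// the fixed behavior, where only the first push arms it.
const FLUSH_DURATION_LIMIT = 30_000
const UPSERT_INTERVAL = 3_000 // e.g. throttled view updates

function simulateFirstFlush(rearmOnUpsert: boolean): number | undefined {
  let deadline: number | undefined
  // Upsert the same view every 3 s for a 2-minute window.
  for (let now = 0; now <= 120_000; now += UPSERT_INTERVAL) {
    if (deadline !== undefined && now >= deadline) {
      return deadline // the duration_limit timer fired
    }
    if (deadline === undefined || rearmOnUpsert) {
      deadline = now + FLUSH_DURATION_LIMIT
    }
  }
  return undefined // never fired within the window
}
```

With `rearmOnUpsert` true the deadline keeps moving ahead of the clock and the function returns `undefined` (the indefinite postpone); with it false, the timer armed by the first push fires at 30 000 ms.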
Test instructions
Checklist