Skip to content

fix: observable downloads in CDP-borrowed mode (PR #28 followup)#31

Open
NiceCode666 wants to merge 7 commits into
devfrom
fix/download-events-followups
Open

fix: observable downloads in CDP-borrowed mode (PR #28 followup)#31
NiceCode666 wants to merge 7 commits into
devfrom
fix/download-events-followups

Conversation

@NiceCode666
Copy link
Copy Markdown
Collaborator

@NiceCode666 NiceCode666 commented May 19, 2026

Summary

Completes the agent-observable downloads surface PR #28 introduced — extending it to CDP-borrowed mode, where it previously did not work. Along the way: bound DownloadManager's completion queue, revert one wrong attempt, and align all documentation.

Background

PR #28 made bridgic-browser wait-download / downloads work in non-CDP and CDP-owned modes by always creating a DownloadManager and putting completions on an asyncio.Queue. But the CDP-borrowed pipeline (Browser(cdp=...) against a user's running Chrome) routes downloads through CdpDownloadRenamer instead of Playwright's download event, and the renamer was not wired to DownloadManager — so the new SDK / CLI surface returned "not supported in CDP-borrowed mode" or timed out silently. This PR closes that gap.

Changes (7 commits)

eeb27c9 fix: tighten download-event surface after PR #28

  • Added explicit "not supported in CDP-borrowed mode" return strings to wait_for_next_download / get_downloaded_files_text (so callers could distinguish "no download" from "wrong mode") — later removed in aaa60e4 once the renamer-bridge made the mode actually work.
  • clear_history() now also drains _completed_queue (regression test added).
  • _download.py class docstring documents both wait APIs side-by-side.
  • CLAUDE.md tool/command counts 67 → 69 (PR fix: agent retry file download endlessly in exploration phase #28 missed four occurrences).

a44383b test(integration): adapt popup-follow test to PR #28

5dfd106 (reverted by 96910c3) — attempted page-scoped attach in CDP-borrowed mode

  • Tried to align _start()_switch_self_page_to via attach_to_page(self._page).
  • Empirically failed: Playwright still fires download events in CDP-borrowed mode (the assumption that "path moved out of artifactsDir suppresses the event" is wrong with setDownloadBehavior(allowAndName) on a page session), causing _handle_download.save_as to write 0-byte placeholders to the DM default dir while CdpDownloadRenamer correctly handled the real file.
  • The revert + the new approach below is the proper fix. The reverted commit is kept in history as a documented lesson.

f60b0ce feat(download): bounded queue + external-download API

  • _completed_queue becomes Queue(maxsize=256) with sliding-window eviction in the new _enqueue_completed() helper. Long-lived daemon sessions with many downloads but no consumer no longer accumulate DownloadedFile objects indefinitely; full history is still kept in _downloaded_files.
  • New DownloadManager.record_external_download(file) public API: lets a non-Playwright pipeline (CdpDownloadRenamer) inject a completion record into the unified surface (downloaded_files / _completed_queue + wakes any pending Future waiter).
  • Regression test for the eviction.

aaa60e4 feat(cdp-borrowed): pipe CdpDownloadRenamer completions into DownloadManager

  • CdpDownloadRenamer.__init__ gains on_completed: Callable[[DownloadedFile], None]; _Pending captures url; _finalize_rename builds a full DownloadedFile (filename, real path, byte count via final.stat(), file_type from extension, suggested_filename, url) and invokes the callback. Callback exceptions are swallowed so the rename pipeline cannot be corrupted.
  • Browser._start() constructs the renamer with on_completed=self._download_manager.record_external_download — the bridge between the two pipelines.
  • Removes the _is_cdp_borrowed short-circuits from Browser.wait_for_next_download / get_downloaded_files_text introduced by eeb27c9; both now serve the unified surface.
  • _switch_self_page_to no longer touches DM attach/detach (legacy migration tied to the reverted page-scoped attach).
  • Extracted Browser._format_file_size static helper (was duplicated inline).
  • Tests: dropped two "not supported" tests; added two CDP-borrowed-via-record_external_download happy-path tests, a non-CDP summary-string test, three CdpDownloadRenamer callback tests (populated DownloadedFile, no-callback default, callback exception isolation), and restored test_download_manager_NOT_attached_in_borrowed_context. Integration test rewritten as test_popup_follow_does_not_attach_download_manager.

9c2936d docs: align download docs with unified surface + popup limitation

  • README.md / README_zh.md / docs/API.md / CLAUDE.md / skills/.../cli-sdk-api-mapping.md all rewrite the "CDP-borrowed unsupported" wording into the unified-surface narrative.
  • docs/INTERNALS.md _switch_self_page_to section updated.
  • docs/KNOWN_LIMITATIONS.md new entry: "Popup-triggered Downloads in CDP-borrowed Mode Are Not Captured" — setDownloadBehavior only scopes to the page session it was sent on, so popup-triggered downloads in CDP-borrowed mode fall back to Chrome's native UX. Three workarounds documented.

Real-world verification (CDP-borrowed mode)

$ bridgic-browser open --cdp auto "https://github.com/bitsky-tech/bridgic-browser/releases/tag/v0.0.5"
$ bridgic-browser wait-download 30 &     # background
$ bridgic-browser open ".../bridgic_browser-0.0.5-py3-none-any.whl"
$ wait
Download complete: bridgic_browser-0.0.5-py3-none-any.whl — 242.6 KB
                   — /Users/.../bridgic-browser-3/...whl   ✅

$ bridgic-browser downloads
[1] bridgic_browser-0.0.5-py3-none-any.whl — 242.6 KB — /Users/.../...whl   ✅

Filesystem: only the real 242.6 KB file in CWD; no 0-byte placeholder in ~/Downloads (which was the symptom of the reverted 5dfd106).

Test plan

  • make test-quick1258 passed (dev baseline 1249 → +9 net new)
  • CLI_COMMAND_TO_TOOL_METHOD = 69, cli.commands = 69
  • Real-world CDP-borrowed download verification (above)
  • No 0-byte placeholders generated under the new flow
  • CI: full make test to confirm test_popup_follow_does_not_attach_download_manager integration test passes against a real Chrome

Follow-ups to PR #28's always-on DownloadManager + new wait/list APIs.

- _browser.py: wait_for_next_download / get_downloaded_files_text now
  return a mode-specific explanation in CDP-borrowed mode instead of
  blocking on a queue that will never receive a completion (DownloadManager
  is intentionally not attached in that mode — CdpDownloadRenamer handles
  downloads there). Without this guard the wait silently times out and the
  caller can't tell "no downloads" apart from "wrong mode".
- _download.py: documented the two coexisting wait APIs (one-shot Future
  via wait_for_download vs persistent FIFO queue via wait_for_next_download)
  in the DownloadManager class docstring so callers pick the right one.
- CLAUDE.md: bumped tool/command counts 67 → 69 (4 stale occurrences PR #28
  missed).
- tests/unit/test_download.py: regression test for clear_history() draining
  _completed_queue, covering the invariant introduced in the merge commit.
- tests/unit/test_browser.py: TestBrowserDownloadMethods covers the
  CDP-borrowed unsupported branch on both new methods plus the empty-session
  fallback path.

make test-quick: 1253 passed (was 1249, +4 new).
…28

PR #28 made `Browser._download_manager` always non-None (defaulting to
~/Downloads when downloads_path is unset). The original setup in
`test_download_manager_reattached_on_popup_follow` was gated on
`_download_manager is None` to inject a fresh manager AND attach it to
bridgic's page. After PR #28 the inject branch never runs, but production
code in CDP-borrowed mode intentionally does NOT attach DownloadManager at
_start() (downloads route through CdpDownloadRenamer there) — so
`_page_handlers` was empty and the bookkeeping assertion failed.

Rewrote the setup to be idempotent: keep the optional manager-creation
path, then always ensure `self._page` is attached. The test verifies the
*migration* logic in `_switch_self_page_to`, not the production decision
to skip the initial attach.

No production code change. Only this integration test was affected.
Align DownloadManager attach timing between _start() and _switch_self_page_to.

Before: _start() in CDP-borrowed mode intentionally skipped attaching the
DownloadManager ("Borrowed context: NOT attached"), but _switch_self_page_to
already did a detach-old + attach-new on popup follow, implicitly assuming
the old page was already attached. The two pieces of code disagreed: the
first popup follow's detach was a no-op and its attach was actually a
first-time attach, leaving the bridgic-initial-page's bookkeeping in
_page_handlers permanently empty.

After: _start() in CDP-borrowed mode now attaches the DownloadManager
page-scoped to bridgic's own tab (`self._page`). The registered handler
remains effectively dormant — Playwright doesn't fire `download` events in
CDP-borrowed mode because CdpDownloadRenamer routed the path out of
artifactsDir — but `_page_handlers` is now in sync with `self._page` from
the first moment, giving _switch_self_page_to a well-defined initial state.

Context (does NOT change):
- Owned-context CDP: still attach_to_context (all pages belong to bridgic).
- Borrowed-context: NOT attach_to_context (would hijack user tabs — privacy
  boundary). Only attach_to_page on bridgic's own tab.

Updated test_download_manager_NOT_attached_in_borrowed_context → renamed
to test_download_manager_page_scoped_attach_in_borrowed_context to reflect
the new invariant (attach_to_context NOT called, attach_to_page called
exactly once with bridgic's tab).

make test-quick: 1253 passed.
DownloadManager (`_download.py`) gains two capabilities that the rest of
PR #31's surface depends on:

1. **Sliding-window `_completed_queue`** — bounded at `maxsize=256` so a
   long-lived daemon session with many downloads but no `wait-download`
   consumer no longer accumulates `DownloadedFile` objects indefinitely.
   `_enqueue_completed()` is the single push point: it evicts the oldest
   entry on `QueueFull` so the queue stays a fresh window. Full history is
   still available via `_downloaded_files`. `_handle_download` switched
   to the new helper.

2. **`record_external_download()`** — public entry for non-Playwright
   pipelines (specifically `CdpDownloadRenamer` in CDP-borrowed mode, see
   the next commit) to inject a completion record. Mirrors the post-save
   bookkeeping of `_handle_download`: appends to `_downloaded_files`,
   calls `_enqueue_completed`, and wakes the oldest pending Future
   waiter. Lets `wait_for_next_download` / `downloaded_files` /
   `get_downloaded_files_text` work uniformly across all pipelines.

Class docstring also documents the two coexisting wait APIs
(`wait_for_download` one-shot Future bound to a trigger vs
`wait_for_next_download` persistent FIFO queue) so callers pick the
right one.

Tests: `test_completed_queue_evicts_oldest_when_full` covers the
sliding-window eviction behavior.
…Manager

PR #31 followup: surface CDP-borrowed downloads through the unified
DownloadManager APIs (`wait_for_next_download`, `get_downloaded_files_text`,
`downloaded_files`) instead of returning "not supported in CDP-borrowed
mode" placeholder messages. Before this change, agents using `--cdp auto`
saw the explanatory string and had no way to observe download completions
via the standard CLI / SDK surface.

**`CdpDownloadRenamer` (`_cdp_download_renamer.py`)**
- Optional `on_completed: Callable[[DownloadedFile], None]` constructor arg.
- `_Pending` gains a `url` field captured from `Browser.downloadWillBegin`
  so the callback can build a fully-populated `DownloadedFile`.
- `_finalize_rename()` calls the callback after a successful rename with
  filename, real path, byte count from `final.stat().st_size`, file_type
  from extension, and the original suggestedFilename. Callback exceptions
  are caught and warning-logged so a buggy consumer cannot corrupt the
  rename pipeline.

**`Browser` (`_browser.py`)**
- `_start()` constructs the renamer with
  `on_completed=self._download_manager.record_external_download`. CDP-borrowed
  completions now flow into DM and populate the unified surface.
- DM is still NOT attached to the borrowed context (would hijack user
  tabs) or page-scoped to bridgic's own tab (the latter was tried in the
  reverted commit and produced 0-byte placeholder files because
  Playwright still fires `download` events in CDP-borrowed mode even when
  `setDownloadBehavior(allowAndName)` routes the file away from
  `artifactsDir`). The comment block documents this so the lesson is not
  re-learned.
- `_switch_self_page_to` no longer touches DM attach/detach — the legacy
  CDP-borrowed migration block was tied to the reverted page-scoped
  attach and is no longer needed.
- `wait_for_next_download` / `get_downloaded_files_text` lose their
  `_is_cdp_borrowed` short-circuits — they now serve the unified surface.
- `_format_file_size()` extracted as a static helper, used by both
  methods (was duplicated inline).

**Tests**
- `tests/unit/test_browser.py::TestBrowserDownloadMethods` reworked:
  drop the two "CDP-borrowed unsupported" assertions; add
  `test_wait_for_next_download_cdp_borrowed_via_record_external`,
  `test_get_downloaded_files_text_cdp_borrowed_via_record_external`, and
  a non-CDP happy-path summary-string test.
- `tests/unit/test_browser.py::test_download_manager_NOT_attached_in_borrowed_context`
  restored: both `attach_to_context` and `attach_to_page` must NOT be
  called for the borrowed context.
- `tests/unit/test_cdp_download_renamer.py::TestOnCompletedCallback`: three
  tests for the new callback wiring (populated DownloadedFile delivered,
  no-callback default still renames, callback exception does not break
  the rename pipeline).
- `tests/integration/test_owned_pages.py::test_popup_follow_does_not_attach_download_manager`
  replaces the legacy popup-follow bookkeeping test (which assumed the
  reverted page-scoped attach). New invariant: `_page_handlers` stays
  empty before and after popup follow in CDP-borrowed mode.

Real-world verification: clicked the bridgic 0.0.5 release `.whl` from
`https://github.com/bitsky-tech/bridgic-browser/releases/tag/v0.0.5` in
`--cdp auto` mode. `wait-download 30` returned
`"Download complete: bridgic_browser-0.0.5-py3-none-any.whl — 242.6 KB
— <CWD>/...whl"`, `downloads` listed exactly one entry, and no 0-byte
placeholder appeared in `~/Downloads`.

`make test-quick`: 1258 passed.
Documentation pass to make all download-related references consistent
with the post-PR-#31 reality:

- **Unified surface across modes**: `wait_for_next_download` /
  `get_downloaded_files_text` / `downloaded_files` work the same in
  non-CDP, CDP-owned, and CDP-borrowed modes (CDP-borrowed entries flow
  in via `CdpDownloadRenamer` → `DownloadManager.record_external_download`).
  Removed the older "wait_for_next_download is unsupported in
  CDP-borrowed mode" wording from `README.md`, `README_zh.md`,
  `docs/API.md`, `CLAUDE.md`, and `skills/bridgic-browser/references/
  cli-sdk-api-mapping.md`. The action-bound `wait_for_download(page,
  action, timeout)` variant remains unsupported in CDP-borrowed mode
  (still depends on Playwright's `download` event firing on bridgic's
  tab) and is documented as such.

- **PR #28 corrections**: `CLAUDE.md` table row "If `downloads_path` is
  unset … files are lost" no longer matches the implementation (PR #28
  defaults DM to `~/Downloads`); rewritten to reflect the auto-default.
  Tool/command counts bumped 67 → 69 across CLAUDE.md's package-tree
  section.

- **`docs/INTERNALS.md` `_switch_self_page_to` section**: rewrote the
  legacy "no-op safety net" description. The DM migration is gone (see
  the previous commit) — INTERNALS now states the actual invariant: in
  CDP-borrowed mode DM is intentionally not attached anywhere, popup
  follow doesn't touch attach/detach, and popup-triggered downloads are
  a known limitation rather than a wiring bug.

- **`docs/API.md` reorganization**: section title "DownloadManager
  (non-CDP / CDP-owned modes)" → "DownloadManager (unified across all
  modes)" with an intro listing how each pipeline populates the same
  surface. `download_manager` and `downloaded_files` row descriptions
  updated.

- **`docs/KNOWN_LIMITATIONS.md` new entry**: "Popup-triggered Downloads
  in CDP-borrowed Mode Are Not Captured" with symptom, root cause
  (page-level `setDownloadBehavior` only governs the originating
  target — popup gets its own session), three workarounds, and a
  verification pointer to the new `test_popup_follow_does_not_attach_
  download_manager` integration test.

- **`_download.py` class docstring** (this commit only touches docs
  inside it, not code) documents the two coexisting wait APIs so
  future readers don't mistake them for aliases.

No code changes in this commit.
@NiceCode666 NiceCode666 changed the title fix: tighten download-event surface after PR #28 fix: observable downloads in CDP-borrowed mode (PR #28 followup) May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant