
Add ModCDP code mode to new evals #2087

Open: pirate wants to merge 1 commit into main from evals-modcdp-code-tool


pirate (Member) commented May 6, 2026

Summary

Adds a separate modcdp_code core tool surface for evals, alongside the existing cdp_code surface.
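As a rough illustration of what registering a second surface next to cdp_code can look like, here is a minimal sketch. Only the surface ids cdp_code and modcdp_code come from this PR; the type name CoreToolSurface, the CoreTool interface, the registry shape, and createTool are hypothetical and are not the actual evals code.

```typescript
// Surface ids taken from the PR description; everything else is hypothetical.
type CoreToolSurface = "cdp_code" | "modcdp_code";

// Hypothetical minimal tool shape used only for this sketch.
interface CoreTool {
  readonly surface: CoreToolSurface;
  describe(): string;
}

// Typing the registry as Record<CoreToolSurface, ...> makes the compiler
// reject a union member that has no registered factory.
const registry: Record<CoreToolSurface, () => CoreTool> = {
  cdp_code: () => ({ surface: "cdp_code", describe: () => "CDP code tool" }),
  modcdp_code: () => ({ surface: "modcdp_code", describe: () => "ModCDP code tool" }),
};

// Type guard for names that arrive as plain strings (for example from a CLI flag).
function isCoreToolSurface(value: string): value is CoreToolSurface {
  return Object.prototype.hasOwnProperty.call(registry, value);
}

// Hypothetical lookup: throws on unknown names instead of silently falling back.
function createTool(name: string): CoreTool {
  if (!isCoreToolSurface(name)) {
    throw new Error(`Unknown core tool surface: ${name}`);
  }
  return registry[name]();
}
```

Keying the registry by the union type is one way to keep the union and the registry from drifting apart when a new surface is added.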

Changes

  • Registers modcdp_code in the core tool surface union, registry, default startup profile resolver, CLI help, and architecture diagram.
  • Adds ModCdpCodeTool, backed by the built ModCDP JS client. The client is loaded from modcdp/dist/client/js/ModCDPClient.js under the same v4 checkout root that STAGEHAND_V4_SDK_PATH uses; setting MODCDP_CLIENT_PATH still overrides that location directly.
  • Reuses the existing CDP page/session implementation through CdpConnectionLike so the ModCDP client can drive the same core browser operations.
  • Wires Claude Code external-agent runs to modcdp_code, exposing a modcdp runtime object.
  • Documents the four ModCDP primitives in the text appended to the Claude Code system prompt: Mod.evaluate, Mod.addCustomCommand, Mod.addCustomEvent, and Mod.addMiddleware.
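The client lookup order described in the second bullet (explicit override first, then the built client under the v4 checkout root) can be sketched roughly as follows. This is an assumption-laden sketch, not the PR's code: the helper name resolveModCdpClientPath and its input shape are hypothetical, and because the description does not say how the checkout root is derived from STAGEHAND_V4_SDK_PATH, the root is taken as an input here.

```typescript
// Relative location of the built client inside the v4 checkout (from the PR description).
const MODCDP_CLIENT_RELATIVE_PATH = "modcdp/dist/client/js/ModCDPClient.js";

// Hypothetical input shape. overridePath stands in for the MODCDP_CLIENT_PATH value;
// checkoutRoot stands in for the v4 checkout root, however the real code derives it.
interface ModCdpClientPathInputs {
  overridePath?: string;
  checkoutRoot?: string;
}

// Hypothetical helper: an explicit override wins, otherwise the built client
// under the checkout root is used.
function resolveModCdpClientPath(inputs: ModCdpClientPathInputs): string {
  if (inputs.overridePath) {
    return inputs.overridePath;
  }
  if (!inputs.checkoutRoot) {
    throw new Error("Set MODCDP_CLIENT_PATH or provide a v4 checkout root");
  }
  // Plain string join keeps the sketch dependency-free; real code would
  // more likely use node:path. Trailing slashes on the root are dropped first.
  const root = inputs.checkoutRoot.replace(/\/+$/, "");
  return `${root}/${MODCDP_CLIENT_RELATIVE_PATH}`;
}
```

In use, a caller would pass whatever it read from the environment, for example `resolveModCdpClientPath({ overridePath: env.MODCDP_CLIENT_PATH, checkoutRoot })`, so the override never needs the checkout to exist.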

Validation

  • pnpm --filter @browserbasehq/stagehand-evals run typecheck
  • pnpm --filter @browserbasehq/stagehand-evals run test:unit

@changeset-bot

This comment was marked as resolved.

pirate force-pushed the evals-modcdp-code-tool branch from 4cc3f68 to 1d2d056 (May 6, 2026 10:46)
pirate changed the title from "[codex] Add ModCDP code eval tool surface" to "Add ModCDP code mode to new evals" (May 6, 2026)
pirate marked this pull request as ready for review (May 6, 2026 10:47)
pirate force-pushed the evals-modcdp-code-tool branch from 1d2d056 to 7a2c8eb (May 6, 2026 10:50)
pirate force-pushed the evals-modcdp-code-tool branch from 7a2c8eb to aa4040b (May 6, 2026 10:54)
cubic-dev-ai[bot]

This comment was marked as resolved.

