
feat: Add ExaSearchTool as a standalone built-in search tool #432

Open
X1F2Y3 wants to merge 1 commit into claude-code-best:main from X1F2Y3:feat/exa-search-tool


@X1F2Y3 X1F2Y3 commented May 7, 2026

  • Add ExaSearchTool as a separate built-in tool calling https://mcp.exa.ai
  • Register ExaSearchTool before WebSearchTool for higher priority
  • Compatible with third-party API proxy setups where WebSearchTool cloud search fails


Summary by CodeRabbit

  • New Features
    • Introduced Exa AI web search integration with enhanced result formatting and automatic source attribution. The new search tool provides structured output with formatted URLs and titles, includes mandatory "Sources:" citations linking to search results, supports configurable concurrency, and delivers improved visual presentation of search findings.

@coderabbitai
Contributor

coderabbitai Bot commented May 7, 2026

Walkthrough

This PR introduces an ExaSearchTool that integrates Exa's web search API via MCP protocol. It includes the tool definition with input/output schemas, MCP request handling, result parsing, UI rendering components, and registers the tool before WebSearchTool in the built-in tools list.
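
For readers unfamiliar with the MCP request shape mentioned in the walkthrough, here is a minimal sketch of a JSON-RPC 2.0 `tools/call` envelope. The interface names, the `buildSearchRequest` helper, and the tool name `web_search_exa` are assumptions for illustration; the PR's actual request code is not shown in this thread.

```typescript
// Hypothetical argument shape; only `query` and `numResults` are taken from the review.
interface ExaSearchArgs {
  query: string
  numResults?: number
}

// JSON-RPC 2.0 envelope for an MCP "tools/call" request.
interface JsonRpcToolCall {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: string; arguments: ExaSearchArgs }
}

// Builds the envelope. The tool name below is an assumption, not confirmed by the PR.
function buildSearchRequest(args: ExaSearchArgs, id = 1): JsonRpcToolCall {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'web_search_exa', arguments: args },
  }
}

// Serialized body that would be POSTed to the MCP endpoint.
const searchRequest = buildSearchRequest({ query: 'bun feature flags', numResults: 8 })
const payload = JSON.stringify(searchRequest)
```

Keeping the envelope construction in a pure function makes it easy to unit test without touching the network.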

Changes

ExaSearchTool Implementation

| Layer / File(s) | Summary |
|---|---|
| Data & Contract Definitions: packages/builtin-tools/src/tools/ExaSearchTool/ExaSearchTool.ts, packages/builtin-tools/src/tools/ExaSearchTool/UI.tsx, packages/builtin-tools/src/tools/ExaSearchTool/prompt.ts | Input schema validates query (min 2 chars) with optional parameters; output schema contains results array and duration. MCP request/response interfaces define the Exa endpoint contract. |
| Core Implementation (MCP Integration): packages/builtin-tools/src/tools/ExaSearchTool/ExaSearchTool.ts | Tool metadata, parseExaResults helper (markdown link or line-based URL extraction), call method (builds JSON-RPC request, fetches from Exa with timeout, parses data: event stream, formats results), and mapToolResultToToolResultBlockParam (formats output into readable text block). |
| UI & Result Rendering: packages/builtin-tools/src/tools/ExaSearchTool/UI.tsx | SearchResultSummary React component with compact/verbose rendering; exported helpers renderToolUseMessage, renderToolUseErrorMessage, renderToolResultMessage, and getToolUseSummary for UI state display. |
| Integration & Registration: packages/builtin-tools/src/index.ts, src/tools.ts | ExaSearchTool added to main barrel export; imported and registered in getAllBaseTools() before WebSearchTool. |
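
The summary describes parseExaResults as "markdown link or line-based URL extraction". A rough sketch of that two-pass idea follows. The name `parseResultsSketch`, the `SearchHit` type, and the choice to reuse the URL as the title in the fallback pass are all assumptions; the PR's real helper may behave differently.

```typescript
interface SearchHit {
  title: string
  url: string
}

// Hypothetical stand-in for the PR's parseExaResults helper.
function parseResultsSketch(text: string): SearchHit[] {
  const hits: SearchHit[] = []
  const seen = new Set<string>()

  // Pass 1: markdown links of the form [title](https://...)
  const linkPattern = /\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g
  let match: RegExpExecArray | null
  while ((match = linkPattern.exec(text)) !== null) {
    const title = match[1]
    const url = match[2]
    if (!seen.has(url)) {
      seen.add(url)
      hits.push({ title, url })
    }
  }
  if (hits.length > 0) return hits

  // Pass 2: no markdown links found, so scan each line for a bare URL.
  for (const line of text.split('\n')) {
    const found = line.match(/https?:\/\/\S+/)
    if (found && !seen.has(found[0])) {
      seen.add(found[0])
      // Assumption: with no title available, the URL doubles as the title.
      hits.push({ title: found[0], url: found[0] })
    }
  }
  return hits
}
```

Deduplicating by URL keeps the "Sources:" list short when the same page appears more than once in the response text.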

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A whisker-twitching search tool hops alive,
Exa's swift wings help it search and thrive,
With schemas set and results parsed just right,
UI blooms in verbose and compact light,
Registered neat before its web-search kin! 🔍

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title is in Chinese and describes adding ExaSearchTool as a separate built-in search tool, which directly matches the main objective of the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/builtin-tools/src/tools/ExaSearchTool/ExaSearchTool.ts`:
- Around line 16-33: The numeric fields numResults and contextMaxCharacters in
ExaSearchTool's schema currently accept any number; restrict and validate them
before external calls by updating the Zod schemas for numResults and
contextMaxCharacters (in ExaSearchTool.ts) to enforce sensible bounds and types
(e.g., make numResults an integer with .int().min(1).max(50) and a default of 8,
and make contextMaxCharacters an integer with .int().min(100).max(20000) or
similar), and keep them optional only if appropriate; this ensures invalid
values (zero/negative/NaN/Infinity) are rejected or normalized before making
external API calls.
- Around line 236-256: The SSE parsing loop in the for (const line of lines)
block currently JSON.parse's every 'data: ' line and returns on the first
content frame, which is brittle for non-JSON frames and multi-frame payloads;
update the logic to: for each line that startsWith('data: '), try/catch
JSON.parse to skip non-JSON frames, accumulate all valid data.result.content
texts (e.g., push contentText from each parsed McpSearchResponse into an array),
only compute results via parseExaResults after the stream completes or a
terminal marker is seen, and return a single aggregated response including
query, merged results, and durationSeconds computed from startTime; reference
parseExaResults, McpSearchResponse, input.query, and startTime to locate and
modify the code.

In `@src/tools.ts`:
- Around line 231-232: The ExaSearchTool is being included unconditionally in
the exported tool list; gate it behind a runtime feature flag by adding the bun
feature pattern and conditional inclusion: add "import { feature } from
'bun:bundle'" and call feature('EXA_SEARCH_TOOL'), then only push/include
ExaSearchTool into the tools list (where WebSearchTool and others are listed)
when the EXA_SEARCH_TOOL flag is enabled (use process.env or your feature-check
helper), leaving other tools like WebSearchTool unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cb35d2af-6426-4c90-a688-a10c1cd2c51f

📥 Commits

Reviewing files that changed from the base of the PR and between 4230f0f and d8813b8.

📒 Files selected for processing (5)
  • packages/builtin-tools/src/index.ts
  • packages/builtin-tools/src/tools/ExaSearchTool/ExaSearchTool.ts
  • packages/builtin-tools/src/tools/ExaSearchTool/UI.tsx
  • packages/builtin-tools/src/tools/ExaSearchTool/prompt.ts
  • src/tools.ts

Comment on lines +16 to +33
      .number()
      .optional()
      .describe('Number of search results to return (default: 8)'),
    livecrawl: z
      .enum(['fallback', 'preferred'])
      .optional()
      .describe(
        "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
      ),
    type: z
      .enum(['auto', 'fast', 'deep'])
      .optional()
      .describe(
        "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
      ),
    contextMaxCharacters: z
      .number()
      .optional()

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Constrain numeric input parameters before making external calls.

numResults and contextMaxCharacters accept any number currently (including zero/negative/infinite). That can send invalid requests and produce avoidable API failures.

Suggested schema constraints
     numResults: z
       .number()
+      .int()
+      .min(1)
+      .max(20)
       .optional()
       .describe('Number of search results to return (default: 8)'),
...
     contextMaxCharacters: z
       .number()
+      .int()
+      .min(1)
+      .max(100_000)
       .optional()
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
       .number()
+      .int()
+      .min(1)
+      .max(20)
       .optional()
       .describe('Number of search results to return (default: 8)'),
     livecrawl: z
       .enum(['fallback', 'preferred'])
       .optional()
       .describe(
         "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
       ),
     type: z
       .enum(['auto', 'fast', 'deep'])
       .optional()
       .describe(
         "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
       ),
     contextMaxCharacters: z
       .number()
+      .int()
+      .min(1)
+      .max(100_000)
       .optional()
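
To make the effect of those bounds concrete, here is a dependency-free sketch of the same checks written as a plain function rather than a Zod schema. The helper name `checkBoundedInt` is hypothetical; the limits (1 to 20 for numResults, default 8) are the ones quoted in the suggestion and the field description above.

```typescript
// Returns the validated value, the fallback when the value is omitted,
// or throws a RangeError when the value is out of bounds.
function checkBoundedInt(
  name: string,
  value: number | undefined,
  min: number,
  max: number,
  fallback?: number,
): number | undefined {
  if (value === undefined) return fallback
  // Number.isInteger is false for NaN, Infinity, and fractional values.
  if (!Number.isInteger(value) || value < min || value > max) {
    throw new RangeError(`${name} must be an integer in [${min}, ${max}], got ${value}`)
  }
  return value
}

// Usage with the bounds from the suggestion; the inputs are illustrative.
const numResults = checkBoundedInt('numResults', undefined, 1, 20, 8)
const contextMaxCharacters = checkBoundedInt('contextMaxCharacters', 5000, 1, 100_000)
```

Rejecting bad values before the request is built means the caller sees a local validation error instead of an opaque failure from the remote endpoint.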

Comment on lines +236 to +256
      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data: McpSearchResponse = JSON.parse(line.substring(6))
          if (
            data.result &&
            data.result.content &&
            data.result.content.length > 0
          ) {
            const contentText = data.result.content[0].text
            const results = parseExaResults(contentText)

            return {
              data: {
                query: input.query,
                results,
                durationSeconds: (performance.now() - startTime) / 1000,
              },
            }
          }
        }
      }

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Make SSE parsing resilient and avoid returning on the first frame.

Current logic JSON.parses every data: line and returns immediately from the first content payload. SSE streams commonly include non-JSON frames (e.g. done markers) and multi-frame payloads, so this can fail or return partial results.

Suggested hardening
-      for (const line of lines) {
+      let aggregatedText = ''
+      for (const line of lines) {
         if (line.startsWith('data: ')) {
-          const data: McpSearchResponse = JSON.parse(line.substring(6))
+          const raw = line.substring(6).trim()
+          if (!raw || raw === '[DONE]') continue
+          let data: McpSearchResponse | null = null
+          try {
+            data = JSON.parse(raw) as McpSearchResponse
+          } catch {
+            continue
+          }
           if (
-            data.result &&
-            data.result.content &&
-            data.result.content.length > 0
+            data?.result?.content &&
+            data.result.content.length > 0 &&
+            data.result.content[0]?.text
           ) {
-            const contentText = data.result.content[0].text
-            const results = parseExaResults(contentText)
-
-            return {
-              data: {
-                query: input.query,
-                results,
-                durationSeconds: (performance.now() - startTime) / 1000,
-              },
-            }
+            aggregatedText += `${data.result.content[0].text}\n`
           }
         }
       }
+      const results = parseExaResults(aggregatedText)
+      return {
+        data: {
+          query: input.query,
+          results,
+          durationSeconds: (performance.now() - startTime) / 1000,
+        },
+      }
-
-      return {
-        data: {
-          query: input.query,
-          results: [],
-          durationSeconds: (performance.now() - startTime) / 1000,
-        },
-      }
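
The hardening above can also be read as a standalone function, which is easier to test in isolation. The following is a sketch under assumptions: `collectSseText` and the `McpResponseSketch` shape are hypothetical names, and the `[DONE]` check simply mirrors the suggested diff rather than anything the Exa endpoint is confirmed to send.

```typescript
interface McpContentItem {
  type?: string
  text?: string
}

// Hypothetical response shape, reduced to the fields the review refers to.
interface McpResponseSketch {
  result?: { content?: McpContentItem[] }
}

// Collects the text of every content item across all JSON data frames.
function collectSseText(body: string): string {
  const parts: string[] = []
  for (const line of body.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue
    const raw = line.slice(5).trim()
    if (raw === '' || raw === '[DONE]') continue

    let parsed: McpResponseSketch | null
    try {
      parsed = JSON.parse(raw) as McpResponseSketch | null
    } catch {
      continue // skip frames that are not valid JSON
    }

    const content = parsed?.result?.content
    if (!Array.isArray(content)) continue
    for (const item of content) {
      if (typeof item?.text === 'string') parts.push(item.text)
    }
  }
  return parts.join('\n')
}

// Illustrative stream with one non-JSON frame and two content frames.
const sampleStream = [
  'event: message',
  'data: {"result":{"content":[{"type":"text","text":"first"}]}}',
  '',
  'data: not-json',
  'data: {"result":{"content":[{"type":"text","text":"second"}]}}',
  'data: [DONE]',
].join('\n')
const aggregated = collectSseText(sampleStream)
```

The aggregated string would then be passed to the result parser once, after the whole stream has been read, instead of returning from inside the loop.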

Comment thread src/tools.ts
Comment on lines +231 to 232
ExaSearchTool,
WebSearchTool,

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Gate ExaSearchTool behind a feature flag before default enablement.

ExaSearchTool is added unconditionally in the base tool list. For a newly introduced capability, this bypasses the repository’s required rollout pattern and removes safe fallback control if Exa MCP has incidents.

Suggested change
-    ExaSearchTool,
+    ...(feature('EXA_SEARCH_TOOL') ? [ExaSearchTool] : []),
     WebSearchTool,

As per coding guidelines, "New features must follow the pattern: keep import { feature } from 'bun:bundle' + feature('FLAG_NAME'), and control via environment variables or configuration at runtime; do not bypass feature flags with direct imports".
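
The conditional spread in the suggestion can be tried outside the repository with a stand-in flag checker. In this sketch, `makeFeature`, the `Tool` type, and the tool name strings are hypothetical; the repository itself would use the `feature` import from 'bun:bundle' quoted in the guideline, which is not reproduced here.

```typescript
type Tool = { name: string }

// Placeholder tool objects; the real tools carry schemas and call handlers.
const ExaSearchTool: Tool = { name: 'ExaSearch' }
const WebSearchTool: Tool = { name: 'WebSearch' }

// Stand-in for a feature-flag lookup, backed by a plain record of booleans.
function makeFeature(flags: Record<string, boolean>): (name: string) => boolean {
  return (name) => flags[name] === true
}

// Mirrors the suggested change: ExaSearchTool is listed first, but only
// when the flag is on; WebSearchTool is always present.
function getSearchTools(feature: (name: string) => boolean): Tool[] {
  return [
    ...(feature('EXA_SEARCH_TOOL') ? [ExaSearchTool] : []),
    WebSearchTool,
  ]
}

const withExa = getSearchTools(makeFeature({ EXA_SEARCH_TOOL: true })).map((t) => t.name)
const withoutExa = getSearchTools(makeFeature({})).map((t) => t.name)
```

With the flag off, the list is identical to the pre-PR behavior, which is the fallback the review comment asks for.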


@claude-code-best
Owner

@X1F2Y3 Isn't this a duplicate?

@X1F2Y3
Author

X1F2Y3 commented May 7, 2026

@claude-code-best The previous one was opened against the wrong branch; please go by this one.

@claude-code-best
Owner

Why add a separate tool instead of using MCP? There are already a lot of built-in tools, and we can't add any more. @X1F2Y3

2 participants