feat: Add ExaSearchTool as a standalone built-in search tool #432
- Add ExaSearchTool as a separate built-in tool calling https://mcp.exa.ai
- Register ExaSearchTool before WebSearchTool for higher priority
- Compatible with third-party API proxy setups where WebSearchTool cloud search fails
📝 Walkthrough
This PR introduces an ExaSearchTool that integrates Exa's web search API via MCP protocol. It includes the tool definition with input/output schemas, MCP request handling, result parsing, UI rendering components, and registers the tool before WebSearchTool in the built-in tools list.
Changes: ExaSearchTool Implementation
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/builtin-tools/src/tools/ExaSearchTool/ExaSearchTool.ts`:
- Around line 16-33: The numeric fields numResults and contextMaxCharacters in
ExaSearchTool's schema currently accept any number; restrict and validate them
before external calls by updating the Zod schemas for numResults and
contextMaxCharacters (in ExaSearchTool.ts) to enforce sensible bounds and types
(e.g., make numResults an integer with .int().min(1).max(50) and a default of 8,
and make contextMaxCharacters an integer with .int().min(100).max(20000) or
similar), and keep them optional only if appropriate; this ensures invalid
values (zero/negative/NaN/Infinity) are rejected or normalized before making
external API calls.
- Around line 236-256: The SSE parsing loop in the for (const line of lines)
block currently JSON.parse's every 'data: ' line and returns on the first
content frame, which is brittle for non-JSON frames and multi-frame payloads;
update the logic to: for each line that startsWith('data: '), try/catch
JSON.parse to skip non-JSON frames, accumulate all valid data.result.content
texts (e.g., push contentText from each parsed McpSearchResponse into an array),
only compute results via parseExaResults after the stream completes or a
terminal marker is seen, and return a single aggregated response including
query, merged results, and durationSeconds computed from startTime; reference
parseExaResults, McpSearchResponse, input.query, and startTime to locate and
modify the code.
In `@src/tools.ts`:
- Around line 231-232: The ExaSearchTool is being included unconditionally in
the exported tool list; gate it behind a runtime feature flag by adding the bun
feature pattern and conditional inclusion: add "import { feature } from
'bun:bundle'" and call feature('EXA_SEARCH'), then only push/include
ExaSearchTool into the tools list (where WebSearchTool and others are listed)
when the EXA_SEARCH flag is enabled (use process.env or your feature-check
helper), leaving other tools like WebSearchTool unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cb35d2af-6426-4c90-a688-a10c1cd2c51f
📒 Files selected for processing (5)
- packages/builtin-tools/src/index.ts
- packages/builtin-tools/src/tools/ExaSearchTool/ExaSearchTool.ts
- packages/builtin-tools/src/tools/ExaSearchTool/UI.tsx
- packages/builtin-tools/src/tools/ExaSearchTool/prompt.ts
- src/tools.ts
    .number()
    .optional()
    .describe('Number of search results to return (default: 8)'),
  livecrawl: z
    .enum(['fallback', 'preferred'])
    .optional()
    .describe(
      "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
    ),
  type: z
    .enum(['auto', 'fast', 'deep'])
    .optional()
    .describe(
      "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
    ),
  contextMaxCharacters: z
    .number()
    .optional()
Constrain numeric input parameters before making external calls.
numResults and contextMaxCharacters accept any number currently (including zero/negative/infinite). That can send invalid requests and produce avoidable API failures.
Suggested schema constraints
numResults: z
.number()
+ .int()
+ .min(1)
+ .max(20)
.optional()
.describe('Number of search results to return (default: 8)'),
...
contextMaxCharacters: z
.number()
+ .int()
+ .min(1)
+ .max(100_000)
.optional()
📝 Committable suggestion
    .number()
    .int()
    .min(1)
    .max(20)
    .optional()
    .describe('Number of search results to return (default: 8)'),
  livecrawl: z
    .enum(['fallback', 'preferred'])
    .optional()
    .describe(
      "Live crawl mode - 'fallback': use live crawling as backup if cached content unavailable, 'preferred': prioritize live crawling (default: 'fallback')",
    ),
  type: z
    .enum(['auto', 'fast', 'deep'])
    .optional()
    .describe(
      "Search type - 'auto': balanced search (default), 'fast': quick results, 'deep': comprehensive search",
    ),
  contextMaxCharacters: z
    .number()
    .int()
    .min(1)
    .max(100_000)
    .optional()
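To make the intent of the numeric constraints concrete, here is a minimal dependency-free sketch of what an integer bound like `.int().min(1).max(20)` is meant to accept and reject. The helper names `inIntRange` and `resolveNumResults` are hypothetical and not part of the PR; the bounds are taken from the suggested diff and the default of 8 from the field description.

```typescript
// Hypothetical helper: true only for finite integers within [lo, hi].
// Number.isInteger returns false for NaN, Infinity, and fractional values.
function inIntRange(value: number, lo: number, hi: number): boolean {
  return Number.isInteger(value) && value >= lo && value <= hi
}

// Hypothetical resolver: applies the documented default of 8 when the
// field is omitted, and rejects out-of-range input before any request is sent.
function resolveNumResults(value?: number): number {
  if (value === undefined) return 8
  if (!inIntRange(value, 1, 20)) {
    throw new RangeError(
      `numResults must be an integer in [1, 20], got ${value}`,
    )
  }
  return value
}
```

Rejecting the value at the schema boundary, rather than clamping it silently, keeps the failure visible to the caller instead of surfacing later as an API error.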
for (const line of lines) {
  if (line.startsWith('data: ')) {
    const data: McpSearchResponse = JSON.parse(line.substring(6))
    if (
      data.result &&
      data.result.content &&
      data.result.content.length > 0
    ) {
      const contentText = data.result.content[0].text
      const results = parseExaResults(contentText)

      return {
        data: {
          query: input.query,
          results,
          durationSeconds: (performance.now() - startTime) / 1000,
        },
      }
    }
  }
}
Make SSE parsing resilient and avoid returning on the first frame.
Current logic JSON.parses every data: line and returns immediately from the first content payload. SSE streams commonly include non-JSON frames (e.g. done markers) and multi-frame payloads, so this can fail or return partial results.
Suggested hardening
- for (const line of lines) {
+ let aggregatedText = ''
+ for (const line of lines) {
if (line.startsWith('data: ')) {
- const data: McpSearchResponse = JSON.parse(line.substring(6))
+ const raw = line.substring(6).trim()
+ if (!raw || raw === '[DONE]') continue
+ let data: McpSearchResponse | null = null
+ try {
+ data = JSON.parse(raw) as McpSearchResponse
+ } catch {
+ continue
+ }
if (
- data.result &&
- data.result.content &&
- data.result.content.length > 0
+ data?.result?.content &&
+ data.result.content.length > 0 &&
+ data.result.content[0]?.text
) {
- const contentText = data.result.content[0].text
- const results = parseExaResults(contentText)
-
- return {
- data: {
- query: input.query,
- results,
- durationSeconds: (performance.now() - startTime) / 1000,
- },
- }
+ aggregatedText += `${data.result.content[0].text}\n`
}
}
}
+ const results = parseExaResults(aggregatedText)
+ return {
+ data: {
+ query: input.query,
+ results,
+ durationSeconds: (performance.now() - startTime) / 1000,
+ },
+ }
-
- return {
- data: {
- query: input.query,
- results: [],
- durationSeconds: (performance.now() - startTime) / 1000,
- },
- }
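The tolerant parsing described in this comment can be isolated into a small pure function, which makes it easier to test without a network call. This is a sketch under stated assumptions: `McpFrame` is a reduced, hypothetical stand-in for `McpSearchResponse`, `collectSseText` is not a function in the PR, and whether the Exa MCP endpoint ever emits a `[DONE]` marker or multiple content frames is not confirmed by the source.

```typescript
// Hypothetical, reduced shape: only the fields the loop reads.
interface McpFrame {
  result?: { content?: Array<{ text?: string }> }
}

// Collects the text of every JSON `data:` frame in an SSE body.
// Non-JSON frames, empty frames, and a `[DONE]` marker are skipped
// instead of throwing.
function collectSseText(body: string): string[] {
  const texts: string[] = []
  for (const line of body.split(/\r?\n/)) {
    if (!line.startsWith('data:')) continue
    const raw = line.slice(5).trim()
    if (!raw || raw === '[DONE]') continue

    let frame: McpFrame | null
    try {
      frame = JSON.parse(raw) as McpFrame | null
    } catch {
      continue // not JSON, ignore this frame
    }

    const content = frame?.result?.content
    if (!Array.isArray(content)) continue
    for (const item of content) {
      if (typeof item?.text === 'string') texts.push(item.text)
    }
  }
  return texts
}
```

The caller would then join the collected texts and pass them to `parseExaResults` once, after the whole body has been read, rather than returning from inside the loop.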
ExaSearchTool,
WebSearchTool,
Gate ExaSearchTool behind a feature flag before default enablement.
ExaSearchTool is added unconditionally in the base tool list. For a newly introduced capability, this bypasses the repository’s required rollout pattern and removes safe fallback control if Exa MCP has incidents.
Suggested change
- ExaSearchTool,
+ ...(feature('EXA_SEARCH_TOOL') ? [ExaSearchTool] : []),
WebSearchTool,
As per coding guidelines, "New features must follow the pattern: keep import { feature } from 'bun:bundle' + feature('FLAG_NAME'), and control via environment variables or configuration at runtime; do not bypass feature flags with direct imports".
📝 Committable suggestion
...(feature('EXA_SEARCH_TOOL') ? [ExaSearchTool] : []),
WebSearchTool,
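The conditional-spread pattern from the suggested change can be illustrated in isolation. In this sketch the flag check is injected as a plain function so the example runs anywhere; `isEnabled` is a hypothetical stand-in for `feature` from `'bun:bundle'`, and the `Tool` type and tool objects are placeholders rather than the repository's real definitions.

```typescript
// Placeholder type and tool objects, for illustration only.
type Tool = { name: string }

const ExaSearchTool: Tool = { name: 'ExaSearch' }
const WebSearchTool: Tool = { name: 'WebSearch' }

// Builds the tool list. ExaSearchTool is included, ahead of WebSearchTool,
// only when the flag check returns true; WebSearchTool is always present.
function buildTools(isEnabled: (flag: string) => boolean): Tool[] {
  return [
    ...(isEnabled('EXA_SEARCH_TOOL') ? [ExaSearchTool] : []),
    WebSearchTool,
  ]
}
```

With the flag off, the list falls back to WebSearchTool alone, which is the fallback control the review comment asks for.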
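@X1F2Y3 Isn't this a duplicate?
@claude-code-best The previous one was opened against the wrong branch; this one is the one to go by.
Why add a separate tool instead of using MCP? There are already a lot of built-in tools, and we can't add more. @X1F2Y3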