## Problem
When an MCP tool returns a large response, the agent breaks in one of two ways:

- With the default `max_old_tool_call_tokens` budget, a single large result eats the entire budget, and all earlier tool results in the session are silently replaced with `[content truncated]`.
- With `max_old_tool_call_tokens: -1`, the full response stays in the session but overflows the model's context window. Auto-compaction can't help because the oversized result is in the most recent messages, so the agent gets stuck.

Worth noting: builtin tools (`shell`, `openapi`, `api`) each have a hardcoded 30 000-char per-result limit. MCP tools have no per-result limit at all.
## Workaround
My current workaround is a `tool_response_transform` hook backed by a shell script that saves large responses to disk and returns a compact pointer with a path the model can read later. With the `shell` toolset also enabled, the agent picks up the modified response and uses `cat` / `grep` / `jq` on the saved file to work with the full data as needed.
**agent.yaml**
```yaml
agents:
  root:
    toolsets:
      - type: shell
      - type: mcp
        remote:
          url: http://127.0.0.1:8012/sse
          transport_type: sse

hooks:
  tool_response_transform:
    - matcher: "*"
      hooks:
        - type: command
          command: /path/to/mcp_save_response.sh
```
**mcp_save_response.sh**
```bash
#!/usr/bin/env bash
# Offload oversized MCP tool responses to disk and hand the model a pointer.
THRESHOLD=5000               # responses longer than this get offloaded
OUTPUT_DIR="/tmp/mcp_responses"
PREVIEW_CHARS=3000           # how much of the response stays inline

input=$(cat)
tool_response=$(echo "$input" | jq -r '.tool_response // ""')
response_len=${#tool_response}

# Small responses pass through unchanged (no output = no transform)
if [ "$response_len" -le "$THRESHOLD" ]; then
  exit 0
fi

session_id=$(echo "$input" | jq -r '.session_id // "unknown"')
tool_use_id=$(echo "$input" | jq -r '.tool_use_id // "unknown"')
tool_name=$(echo "$input" | jq -r '.tool_name // "unknown"')

mkdir -p "$OUTPUT_DIR"
outfile="${OUTPUT_DIR}/${session_id}_${tool_use_id}.txt"
printf '%s' "$tool_response" > "$outfile"

preview="${tool_response:0:$PREVIEW_CHARS}"

# Replace the oversized result with a pointer plus a short inline preview
jq -n \
  --arg file "$outfile" \
  --arg len "$response_len" \
  --arg name "$tool_name" \
  --arg preview "$preview" \
  '{
    hook_specific_output: {
      updated_tool_response: "[\($name) response: \($len) chars, full output saved to \($file)]\n\nFirst \($preview | length) chars:\n\($preview)\n\n[Use shell tool to read: cat \($file)]"
    }
  }'
```
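The hook can be smoke-tested outside the agent by piping synthetic payloads into it. The sketch below is self-contained: it re-creates a condensed copy of the script (preview logic trimmed) in a temp directory, then feeds it one payload below and one above the 5000-char threshold; the JSON field names are the same ones the hook reads above.

```shell
#!/usr/bin/env bash
# Self-contained smoke test for the save-to-disk hook (condensed copy).
set -euo pipefail
workdir=$(mktemp -d)
cd "$workdir"

cat > mcp_save_response.sh <<'EOF'
#!/usr/bin/env bash
THRESHOLD=5000
OUTPUT_DIR="/tmp/mcp_responses"
input=$(cat)
tool_response=$(echo "$input" | jq -r '.tool_response // ""')
# Below threshold: print nothing, i.e. leave the response unchanged
if [ "${#tool_response}" -le "$THRESHOLD" ]; then exit 0; fi
session_id=$(echo "$input" | jq -r '.session_id // "unknown"')
tool_use_id=$(echo "$input" | jq -r '.tool_use_id // "unknown"')
mkdir -p "$OUTPUT_DIR"
outfile="${OUTPUT_DIR}/${session_id}_${tool_use_id}.txt"
printf '%s' "$tool_response" > "$outfile"
jq -n --arg file "$outfile" --arg len "${#tool_response}" \
  '{hook_specific_output: {updated_tool_response: "[\($len) chars saved to \($file)]"}}'
EOF
chmod +x mcp_save_response.sh

make_input() {  # $1 = payload length in chars
  jq -n --arg r "$(printf 'x%.0s' $(seq 1 "$1"))" \
    '{session_id: "s1", tool_use_id: "t1", tool_name: "demo", tool_response: $r}'
}

small_out=$(make_input 100 | ./mcp_save_response.sh)   # expect: empty (pass-through)
large_out=$(make_input 6000 | ./mcp_save_response.sh)  # expect: pointer JSON
echo "small transformed: ${small_out:-<no, passed through>}"
echo "$large_out" | jq -r '.hook_specific_output.updated_tool_response'
```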
It works but requires every agent author to write and wire this up manually.
## Suggestion
It would be great to have this built in, either as an agent configuration option or as a builtin hook, similar to how `redact_secrets` works.
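As a configuration option it might look something like the sketch below. This is purely illustrative: `max_result_chars` and `on_oversize` are made-up names for an option that does not exist today, shown only to indicate the shape of what I'm asking for.

```yaml
toolsets:
  - type: mcp
    remote:
      url: http://127.0.0.1:8012/sse
      transport_type: sse
    # Hypothetical keys: cap per-result size, spill full payload to disk,
    # and replace the result with a preview plus a file pointer.
    max_result_chars: 30000
    on_oversize: save_to_file
```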