Proposal: sensitiveHint annotation for tool outputs and a standardized secret reference content type #110

@jjmaxwell4

Summary

Tools that generate secrets (API keys, tokens, passwords) or return private data (messages, medical records, financial details) have no way to tell the browser "this output must not be forwarded to agent context." The security best practices doc says "never pass sensitive data to agents" and recommends references, but the core spec does not define sensitivity metadata or redemption semantics.

This proposal adds two things:

  1. A sensitiveHint in ToolAnnotations (and optionally per-field in outputSchema) so tools can declare that their output contains sensitive data
  2. A SecretReference content type so tools can return opaque references instead of raw secrets, with a standardized redemption contract

Problem

A tool that generates an API key today has two options, both bad:

// Option A: Return the secret directly. Agent sees it.
// A compromised agent (via malicious tools from other origins)
// can exfiltrate it.
execute: async () => {
  const key = await createApiKey();
  return { content: [{ type: "text", text: key }] };
}

// Option B: Follow the best-practices doc and use a "reference."
// But there's no standard for what a reference looks like,
// how the browser should treat it, or how the user retrieves
// the actual value.
execute: async () => {
  const key = await createApiKey();
  const ref = await storeSecureData(key);
  return { content: [{ type: "reference", id: ref.id }] };
}

Option A is simpler to implement. Option B is recommended in guidance, but reference is not a spec-defined content type. That creates interoperability risk and implementation variance across sites and agents.

The MCP server-side spec already has outputSchema on tool definitions (added in the 2025-06-18 revision and still present in the 2025-11-25 revision). In WebMCP, issue #9 includes a resolution to add outputSchema to the imperative API and investigate declarative mapping. This proposal builds on that direction by defining sensitivity semantics on top.

What this enables

The browser becomes the trust boundary for sensitive tool outputs. Instead of hoping developers follow a best-practices guide, the spec gives the browser enough information to:

  • Suppress sensitive fields from agent context deterministically
  • Expose a stable result shape for sensitive references
  • Make conformance testing possible for sensitive-output handling

Proposed changes

1. Add sensitiveHint to ToolAnnotations with enforceable semantics

The tool declares that its output may contain sensitive data.

dictionary ToolAnnotations {
  boolean readOnlyHint;
  boolean sensitiveHint;
};

When sensitiveHint is true, the user agent MUST NOT forward raw sensitive values to agent context. The user agent MUST provide either:

  • a redacted structured result, or
  • a secret_reference content item.

sensitiveHint is not advisory. It is a processing requirement so implementations can be tested.
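A minimal sketch of how a user agent might apply this rule, to make the requirement concrete. The function name prepareResultForAgent, the placeholder text, and the redacted flag are assumptions for illustration only; the proposal does not define them.

```javascript
// Hypothetical user-agent-side helper. None of these names come from the spec.
function prepareResultForAgent(tool, rawResult) {
  const sensitive = Boolean(tool.annotations && tool.annotations.sensitiveHint);
  if (!sensitive) {
    // No hint declared: forward the result unchanged.
    return rawResult;
  }
  // Hint declared: raw values never reach agent context.
  // Here the whole result is replaced by a redacted placeholder.
  return {
    content: [
      { type: "text", text: "[output withheld: tool declared sensitiveHint]" }
    ],
    redacted: true // assumed marker so the agent knows data was withheld
  };
}
```

Because the branch depends only on the annotation, a conformance test can assert that no byte of the raw result appears in what the agent receives.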

2. Adopt resolved outputSchema direction in ModelContextTool

Align with MCP's tool definition. outputSchema describes the shape of the tool's return value using JSON Schema:

dictionary ModelContextTool {
  required DOMString name;
  required DOMString description;
  object inputSchema;
  object outputSchema;
  required ToolExecuteCallback execute;
  ToolAnnotations annotations;
};

outputSchema is useful independent of sensitivity — it helps agents parse structured responses, enables validation, and improves developer documentation. But it also enables per-field sensitivity marking.

3. Per-field x-sensitive annotation in outputSchema

For tools where only some output fields are sensitive:

navigator.modelContext.registerTool({
  name: "generate_api_key",
  description: "Generate a new API key for the current user",
  inputSchema: {
    type: "object",
    properties: {
      name: { type: "string", description: "Label for the key" }
    }
  },
  outputSchema: {
    type: "object",
    properties: {
      id: { type: "string" },
      name: { type: "string" },
      secret: { type: "string", "x-sensitive": true }
    }
  },
  execute: async ({ name }) => {
    const result = await fetch("/api/keys", {
      method: "POST",
      body: JSON.stringify({ name }),
      credentials: "same-origin"
    });
    return await result.json();
    // Returns { id: "key_123", name: "production", secret: "plr_abc..." }
  }
});

The browser sees "x-sensitive": true on the secret field. It can:

  • Strip secret from the result before passing to the agent
  • Pass { id: "key_123", name: "production" } to the agent with an annotation that fields were redacted

The execute function returns the raw result. The user agent applies redaction before exposing output to agent context.
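The redaction step described above could look roughly like the following. This is a sketch under stated assumptions: redactBySchema and the redactedPaths list are hypothetical names, only object properties are walked (arrays and values without a schema pass through untouched), and a real implementation would need to decide how to handle those cases.

```javascript
// Hypothetical helper: drops every field whose schema carries "x-sensitive": true
// and records the dotted path of each dropped field.
function redactBySchema(schema, value) {
  const redactedPaths = [];

  function walk(s, v, path) {
    // Only plain objects with a schema are walked; everything else passes through.
    if (!s || typeof v !== "object" || v === null || Array.isArray(v)) {
      return v;
    }
    const props = s.properties || {};
    const out = {};
    for (const [key, child] of Object.entries(v)) {
      const childSchema = props[key];
      const childPath = path ? `${path}.${key}` : key;
      if (childSchema && childSchema["x-sensitive"] === true) {
        redactedPaths.push(childPath); // remember what was removed
        continue;                      // and omit it from the agent-visible copy
      }
      out[key] = walk(childSchema, child, childPath);
    }
    return out;
  }

  return { value: walk(schema, value, ""), redactedPaths };
}
```

Applied to the generate_api_key example, the agent-visible value keeps id and name, and redactedPaths lists "secret" so the agent can be told that a field was withheld.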

x-sensitive is an extension keyword. It follows the same extension pattern used by JSON Schema ecosystems (x-* custom members) and avoids introducing an unscoped keyword.

4. secret_reference content type (optional, for developer-managed redemption)

Some developers will want to manage secret storage and retrieval themselves, especially when the secret should never exist in the browser's memory at all (e.g., the backend stores it and the user retrieves it through a separate authenticated channel).

For this case, the proposal defines a standardized content type:

execute: async ({ name }) => {
  // Backend creates the key and stores it, returns only a reference
  const result = await fetch("/api/keys", {
    method: "POST",
    body: JSON.stringify({ name }),
    credentials: "same-origin"
  });
  const { id, name: keyName, redeemUrl } = await result.json();

  return {
    content: [
      { type: "text", text: `Created API key "${keyName}"` },
      {
        type: "secret_reference",
        id: id,
        label: "API Key",
        redeemUrl: redeemUrl,
        // The browser renders a "Reveal" button.
        // When clicked, the browser fetches redeemUrl
        // with the user's credentials and shows the result.
      }
    ]
  };
}

secret_reference properties:

  • id: Opaque identifier for the secret
  • label: Human-readable label for user-facing clients
  • redeemUrl: URL the user agent fetches with credentials: "same-origin"
  • ttl (optional): Seconds until reference expiration
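As an illustration of the property list above, a user agent might validate a secret_reference item before rendering any UI for it. The sketch below assumes that id, label, and redeemUrl are required (the proposal only marks ttl as optional) and that ttl is counted from the moment the item is received; normalizeSecretReference and expiresAt are hypothetical names.

```javascript
// Hypothetical validator for a secret_reference content item.
// nowMs is injectable so the expiry calculation is easy to test.
function normalizeSecretReference(item, nowMs = Date.now()) {
  if (!item || item.type !== "secret_reference") {
    throw new TypeError("not a secret_reference item");
  }
  for (const field of ["id", "label", "redeemUrl"]) {
    if (typeof item[field] !== "string" || item[field] === "") {
      throw new TypeError(`missing or invalid field: ${field}`);
    }
  }
  let expiresAt = null; // null means no ttl was supplied
  if (item.ttl !== undefined) {
    if (!Number.isFinite(item.ttl) || item.ttl <= 0) {
      throw new RangeError("ttl must be a positive number of seconds");
    }
    expiresAt = nowMs + item.ttl * 1000; // ttl is in seconds
  }
  return { id: item.id, label: item.label, redeemUrl: item.redeemUrl, expiresAt };
}
```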

4a. Redemption endpoint contract

To make secret_reference interoperable, define the redemption contract:

  • Method: GET only
  • Origin: redeemUrl MUST be same-origin with the page that registered the tool
  • Credentials: user agent MUST send cookies with credentials: "same-origin"
  • Success response: 200 with JSON body { "value": string }
  • Expired/used: 410 Gone
  • Auth failure: 401 or 403
  • Not found: 404
  • Rate limit: 429
  • Caching: Cache-Control: no-store

Because redemption is GET and same-origin, CSRF token plumbing is not required for the read path. Sites that implement non-idempotent redemption semantics should treat the endpoint as state-changing and apply their CSRF policy accordingly.
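The contract above can be sketched from the user agent's side as follows. This is one possible reading, not a definitive implementation: isSameOrigin, classifyRedeemStatus, redeemSecret, and the outcome strings are hypothetical names, and fetchImpl is injected so the sketch can run without a network.

```javascript
// True when redeemUrl (absolute or relative) resolves to the page's origin.
function isSameOrigin(redeemUrl, pageUrl) {
  return new URL(redeemUrl, pageUrl).origin === new URL(pageUrl).origin;
}

// Maps the status codes listed in the contract to outcome labels (labels are assumed).
function classifyRedeemStatus(status) {
  switch (status) {
    case 200: return "ok";
    case 401:
    case 403: return "auth_failure";
    case 404: return "not_found";
    case 410: return "expired_or_used";
    case 429: return "rate_limited";
    default: return "unexpected";
  }
}

// Redeems a reference: same-origin check, GET with credentials, then status mapping.
async function redeemSecret(redeemUrl, pageUrl, fetchImpl = fetch) {
  if (!isSameOrigin(redeemUrl, pageUrl)) {
    return { outcome: "rejected_cross_origin" }; // never send credentials elsewhere
  }
  const response = await fetchImpl(new URL(redeemUrl, pageUrl).href, {
    method: "GET",
    credentials: "same-origin"
  });
  const outcome = classifyRedeemStatus(response.status);
  if (outcome !== "ok") {
    return { outcome };
  }
  const body = await response.json();
  if (!body || typeof body.value !== "string") {
    return { outcome: "unexpected" }; // 200 without { "value": string }
  }
  return { outcome, value: body.value };
}
```

The returned value is meant for user-facing UI only; nothing in this path feeds agent context.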

4b. Agent exposure rules

  • User agents MUST NOT forward secret_reference payload fields (id, redeemUrl) into agent context.
  • User agents MAY expose non-sensitive text siblings in the same tool result.
  • User agents MAY provide any user-facing UI. UI details are intentionally non-normative in this proposal.
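One way to read these rules is as a partition of the content array: secret_reference items stay on the user-facing side, and their siblings may go to the agent. The sketch below assumes that split; splitContentForAgent, forAgent, and forUserOnly are hypothetical names.

```javascript
// Hypothetical partition of a tool result's content array.
// secret_reference items (including id and redeemUrl) never enter forAgent.
function splitContentForAgent(content) {
  const forAgent = [];
  const forUserOnly = [];
  for (const item of content) {
    if (item.type === "secret_reference") {
      forUserOnly.push(item);
    } else {
      forAgent.push(item);
    }
  }
  return { forAgent, forUserOnly };
}
```

With the result from section 4, forAgent would contain only the "Created API key" text item, while the reference is held back for whatever reveal UI the user agent chooses.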

Why this belongs in the spec, not in best practices

The security docs recommend references instead of raw data. Without spec primitives, each implementation defines its own format and behavior. That increases implementation variance and makes conformance testing difficult.

The browser already mediates access to camera, microphone, and clipboard. Tool-output sensitivity is the same class of platform boundary: page code produces data, and the user agent controls exposure.

This also keeps WebMCP aligned with server-side MCP, which already has outputSchema on tool definitions.

Related: #87, #96, webmachinelearning/awesome-webmcp#1, #102, #105, #106. Those issues focus on invocation identity and permissions. This proposal focuses on output handling after execution.

Open questions

  1. Should WebMCP formally register an extension vocabulary for x-sensitive, or keep it as an unregistered x-* convention in the first iteration?

  2. Should sensitiveHint apply only to imperative tools in v1, while declarative mapping is specified in a follow-up?

  3. Should the spec define a cap for ttl values to prevent long-lived references?

  4. For tools exposed through bridges (for example WebMCP-to-MCP adapters), should bridge implementations be required to preserve sensitiveHint/x-sensitive semantics?

Prior art

  • MCP outputSchema (2025-11-25 spec): Server-side MCP added outputSchema for structured tool results. No sensitive annotation, but the schema is the right place for it.
  • MCP audience annotation on content: MCP content items support annotations.audience with values like ["user"] or ["assistant"]. A ["user"] audience annotation already means "show to user, hide from model." sensitiveHint is the tool-level equivalent of this.
  • WebMCP issue #9 ("Should output also have a schema?") resolution: Add outputSchema to imperative WebMCP and investigate declarative association.
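For comparison with the audience annotation listed above, a client that treats audience as a filter might decide model visibility as in the sketch below. It assumes that an item with no audience is visible to the model and that the client chooses to enforce the annotation; isVisibleToModel is a hypothetical name.

```javascript
// Hypothetical check: should this MCP-style content item be shown to the model?
function isVisibleToModel(item) {
  const audience = item.annotations && item.annotations.audience;
  if (!Array.isArray(audience) || audience.length === 0) {
    return true; // assumed default: no audience restriction
  }
  return audience.includes("assistant");
}
```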

Implementation notes

We've built a version of this system in production. Our tool definitions use per-field sensitivity metadata in output schemas. Tool outputs are redacted before model exposure, and sensitive values are fetched via short-lived references.

The developer experience is good. Tool authors don't think about secret management: they return the raw result and the platform handles it. Moving this logic into the browser would be even better, since the browser can suppress the value as soon as the execute() callback returns, before it ever reaches agent context.

