API latency: /v0/servers reads consistently take 20–25s #1252

@rhinocap

Summary

/v0/servers queries on registry.modelcontextprotocol.io return successful responses but with extreme latency (20–25s for trivial queries returning a few KB of JSON). Default-timeout HTTP clients (and most CLI tools) treat these as failures, making the API effectively unusable without explicit long timeouts.

Repro (just now from California, 2026-05-04 ~11:00 PDT)

$ curl -sL --max-time 30 -w "\nHTTP %{http_code} | %{time_total}s | %{size_download}b\n" \
  "https://registry.modelcontextprotocol.io/v0/servers?search=raven&limit=5"
... [valid JSON list of 5 servers] ...
HTTP 200 | 24.903s | 3947b

$ curl -sL --max-time 30 -w "\nHTTP %{http_code} | %{time_total}s\n" \
  "https://registry.modelcontextprotocol.io/v0/servers/com.knowledge-raven%2Fmcp/versions/1.0.0"
... [valid server metadata] ...
HTTP 200 | 20.921s | 716b

$ curl -sL --max-time 30 -w "\nHTTP %{http_code} | %{time_total}s\n" \
  "https://registry.modelcontextprotocol.io/v0/servers?limit=3"
... [valid 3-server response] ...
HTTP 200 | 25.818s

By contrast, the static frontend at / returns in 394ms — same hostname, presumably a different cluster:

$ curl -sI --max-time 8 -w "\nHTTP %{http_code} | %{time_total}s\n" \
  "https://registry.modelcontextprotocol.io/"
HTTP/2 200
HTTP 200 | 0.394s

So this is API-side, not network/DNS.

Why this matters in practice

Most HTTP libraries default to 5–15 second timeouts. With API responses landing at 20–25s, anything calling the API with defaults will see what looks like a hard failure:

  • WebFetch / similar tooling I tried earlier today returned empty bodies and gave up
  • My initial curl --max-time 12 runs all returned HTTP 000 — I assumed the endpoints were broken
  • A first-time integrator doing fetch(...) with default 10s would conclude the API is down
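The workaround on the client side is an explicit, generous timeout. A minimal Python sketch, assuming the query shape from the repro above; the helper name and the 40s budget are my own choices, not anything documented by the registry:

```python
import json
import urllib.request

# Workaround sketch: query the registry with an explicit long timeout so that
# 20-25s responses surface as slow successes instead of client-side failures.
# The 40s default budget is an assumption sized from the latencies above.
def fetch_servers(base_url, search, limit=5, timeout=40.0):
    url = f"{base_url}/v0/servers?search={search}&limit={limit}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

The equivalent with curl is simply raising `--max-time` past the observed worst case, e.g. `--max-time 40`.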

For context: I just published ai.ravenmcp/raven-mcp (registered 2026-05-04T17:26:51Z). Confirming the publish landed required ~10 minutes of guessing the right query shapes because every short-timeout probe returned HTTP 000. The publish succeeded (mcp-publisher exited 0) but verification was needlessly hard.

Hypothesis

The latency profile (~20–25s near-uniform across read endpoints) looks more like a cold-start / per-request resource provisioning issue than slow query execution. A 3-server list response shouldn't be doing 25s of database work. Could be Cloud Run-style cold starts, a per-request DB connection setup, or a synchronous external dependency (e.g. validating against a static schema URL on every request).
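One cheap way to narrow this down from the client side is to split total latency into TCP-connect time and everything after: a fast connect followed by a long wait for the first byte points at server-side work (cold start, per-request setup) rather than the network. A probe sketch; the function and field names are illustrative:

```python
import http.client
import time

# Diagnostic sketch: time the TCP connect separately from the request/response
# phase. If connect_s is small but the remainder dominates, the delay is
# server-side, consistent with the cold-start hypothesis above.
def profile_request(host, path, port=443, use_tls=True, timeout=40.0):
    cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = cls(host, port, timeout=timeout)
    t0 = time.monotonic()
    conn.connect()
    t_connect = time.monotonic() - t0
    conn.request("GET", path)
    resp = conn.getresponse()
    resp.read()
    t_total = time.monotonic() - t0
    conn.close()
    return {
        "status": resp.status,
        "connect_s": t_connect,
        "after_connect_s": t_total - t_connect,
    }
```

curl exposes the same split via `-w '%{time_connect} / %{time_starttransfer}'`, which would be worth capturing in a follow-up run.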

Asks

  1. Latency budget on /v0/servers/* — 20–25s for sub-4KB responses is a UX cliff. Sub-1s would be a reasonable target for cached lookups; sub-5s for searches.
  2. If a fix isn't quick — document a recommended minimum client timeout in the API reference so integrators don't conclude the API is broken when it's just slow. (Currently nothing in /docs mentions timeouts.)
  3. Health endpoint — a fast /v0/health or similar that confirms "API is responsive" cheaply would let clients distinguish "API hung" from "your timeout is too short."
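Until a health endpoint exists, clients can approximate the distinction in ask 3 themselves by probing the fast static frontend and the API with different budgets. A rough triage sketch; the paths, thresholds, and return strings are all assumptions, not registry behavior:

```python
import socket
import urllib.error
import urllib.request

# Triage sketch, assuming the static frontend at "/" stays fast while
# /v0/servers is slow: probe both with different timeouts to distinguish
# "host unreachable", "API slow", and "API hung".
def diagnose(base_url, fast_timeout=5.0, slow_timeout=40.0):
    def ok(url, timeout):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200
        except (urllib.error.URLError, socket.timeout):
            return False

    if not ok(base_url + "/", fast_timeout):
        return "host unreachable"
    if ok(base_url + "/v0/servers?limit=1", fast_timeout):
        return "api healthy"
    if ok(base_url + "/v0/servers?limit=1", slow_timeout):
        return "api slow: raise client timeout"
    return "api hung"
```

The point of the ask stands, though: this burns up to two slow requests per check, whereas a real `/v0/health` would answer in milliseconds.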

Happy to share the mcp-publisher publish flow + verification timing data if useful for whoever picks this up.


(Filed by @rhinocap from ai.ravenmcp/raven-mcp publication context.)
