Tool Result Caching

Tool result caching stores upstream responses in memory and serves repeat calls directly from the cache until the entry's TTL expires. This reduces latency and upstream API load for read-heavy tool sets.

Configuration

Caching is configured per upstream. All tools in an upstream share the same cache store and settings unless overridden for an individual tool via an overlay.

upstreams:
  - name: myapi
    type: http
    tool_prefix: api
    base_url: https://api.example.com
    openapi:
      source: spec.yaml
    cache:
      enabled: true
      ttl: 60s          # how long to keep a cached result
      per_user: false   # if true, cache is keyed per authenticated user
      max_size: 1000    # maximum number of entries (LRU eviction)
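
The interaction between ttl and max_size can be illustrated with a minimal sketch. This is not the proxy's implementation; the TTLCache class, its method names, and the injectable clock are hypothetical, and it assumes that a lookup counts as a "use" for LRU purposes.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Hypothetical LRU cache whose entries also expire after a fixed TTL."""

    def __init__(self, ttl_seconds, max_size, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.max_size = max_size
        self.clock = clock  # injectable so expiry can be exercised without sleeping
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None  # miss: never stored, or already evicted
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the least recently used entry
```

With ttl_seconds=60 and max_size=1000 this mirrors the example configuration: an entry is served for up to 60 seconds, and once 1000 entries are stored, adding another evicts the one used least recently. The sketch returns None for a miss, so it cannot distinguish a miss from a cached None.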

Per-tool TTL override

Override TTL for individual operations using the x-mcp-cache-ttl extension in an overlay:

# overlay.yaml
- target: "$.paths[\"/ticker\"].get"
  update:
    x-mcp-cache-ttl: 5s       # high-frequency data: cache only briefly

- target: "$.paths[\"/instruments\"].get"
  update:
    x-mcp-cache-ttl: 1h       # slow-changing reference data: cache longer

Disabling cache for a tool

Set x-mcp-cache-ttl to 0 in an overlay to turn caching off for a single operation:

- target: "$.paths[\"/account/balance\"].get"
  update:
    x-mcp-cache-ttl: 0        # 0 disables caching for this operation
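
One way to picture how the upstream-level ttl and the per-operation x-mcp-cache-ttl combine is the sketch below. The function names parse_ttl and effective_ttl are hypothetical, and the accepted duration syntax (a number with an optional ms, s, m, or h suffix) is an assumption based only on the values shown in the examples above. It also assumes that a per-operation TTL has no effect when the upstream's cache is disabled.

```python
import re

# Assumed unit suffixes; only "s" and "h" appear in the examples above.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}


def parse_ttl(value):
    """Convert '5s', '1h', '250ms', or a bare number into seconds."""
    if isinstance(value, (int, float)):
        return float(value)
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)?", str(value).strip())
    if match is None:
        raise ValueError(f"unrecognised TTL: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNITS[unit or "s"]


def effective_ttl(upstream_cache, operation):
    """Return the TTL in seconds for one operation, or None if caching is off."""
    if not upstream_cache.get("enabled", False):
        return None
    # The per-operation extension wins over the upstream default when present.
    raw = operation.get("x-mcp-cache-ttl", upstream_cache.get("ttl", 0))
    ttl = parse_ttl(raw)
    return ttl if ttl > 0 else None  # 0 means "do not cache"
```

Under these assumptions, an operation without the extension inherits the upstream's 60s, /ticker resolves to 5 seconds, /instruments to 3600 seconds, and /account/balance to None (not cached).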

Per-user caching

When per_user is true, the cache key includes the authenticated user's identity (JWT sub claim or API key identity), so users never see each other's cached data. This requires inbound auth to be configured; see Authentication.
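
The effect of per_user on the cache key can be sketched as follows. The cache_key function and the exact key layout are hypothetical; the proxy's real key format is not documented here. The sketch assumes the key is derived from the tool name, the call arguments, and (only when per_user is on) the caller's identity.

```python
import hashlib
import json


def cache_key(tool_name, arguments, per_user=False, user_id=None):
    """Build a deterministic key; user_id stands in for the JWT sub or API key identity."""
    if per_user and user_id is None:
        # Without inbound auth there is no identity to key on.
        raise ValueError("per_user caching needs an authenticated identity")
    payload = {
        "tool": tool_name,
        "args": arguments,
        "user": user_id if per_user else None,  # ignored for shared caches
    }
    # sort_keys makes the key independent of argument ordering.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With per_user=False, two users calling the same tool with the same arguments produce the same key and share one entry. With per_user=True the keys differ, so each user gets a separate entry.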

Cache invalidation

Entries expire after their TTL. The entire cache for an upstream is flushed whenever the upstream's OpenAPI spec is refreshed (background spec refresh) or when the proxy hot-reloads its config.
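
The flush behaviour can be pictured as one store per upstream that is dropped as a whole. The CacheRegistry class below is a hypothetical sketch, not the proxy's code, and it assumes that a config hot-reload clears the stores of all upstreams rather than only those whose configuration changed.

```python
class CacheRegistry:
    """Hypothetical holder of one cache store per upstream."""

    def __init__(self):
        self._stores = {}  # upstream name -> {cache key: cached result}

    def store_for(self, upstream):
        """Return the store for an upstream, creating an empty one if needed."""
        return self._stores.setdefault(upstream, {})

    def flush(self, upstream):
        """Drop every entry for one upstream, e.g. after its spec is refreshed."""
        self._stores.pop(upstream, None)

    def flush_all(self):
        """Drop every entry for every upstream, e.g. after a config hot-reload."""
        self._stores.clear()
```

Flushing whole stores rather than individual entries keeps invalidation simple: after a spec refresh no cached result can outlive the tool definition that produced it.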

Metrics

Cache hits and misses are tracked by the OpenTelemetry counters mcp.cache.hits and mcp.cache.misses, labelled by tool name. See OpenTelemetry.

See also