feat: API key schema isolation — database-level tenant separation by salvormallow · Pull Request #855 · vectorize-io/hindsight

salvormallow · 2026-04-03T04:48:08Z

Summary

Adds ApiKeySchemaTenantExtension — a built-in tenant extension that maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants. Follows the same pattern as SupabaseTenantExtension but uses static API key mapping instead of JWT auth.

Threat model: prompt injection against AI agents

AI agents execute tool calls — including Hindsight recall, retain, and reflect — based on conversation content. A prompt injection delivered via chat message, email, or web search result can trick an agent into querying another tenant's memory banks.

Example attack:

Attacker sends a message to Agent A containing a crafted prompt injection
The injection tricks Agent A into calling hindsight recall --bank tenant-b-bank --query "private data"
Without schema isolation, this succeeds — both banks are in the same schema, and the single API key grants access to everything
Agent A returns Tenant B's private data in its response

Why application-layer bank filtering isn't enough:

RequestContext.allowed_bank_ids exists on the model but is not enforced by the engine. An OperationValidatorExtension could check it, but:

Requires configuring two extensions correctly (tenant + validator) — configuring only one gives a false sense of security
Fail-open: if allowed_bank_ids is None (the default), all access is granted
Internal operations skip tenant auth, so allowed_bank_ids is never set for background tasks
A single missed code path in the engine bypasses the check entirely
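
The fail-open default can be illustrated with a minimal sketch. Only the `allowed_bank_ids` field name comes from this PR; the `RequestContext` stand-in and the `is_bank_allowed` helper are hypothetical and are not taken from the Hindsight codebase.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestContext:
    # Stand-in for the real model; only allowed_bank_ids is named in the PR.
    allowed_bank_ids: Optional[List[str]] = None


def is_bank_allowed(ctx: RequestContext, bank_id: str) -> bool:
    """Hypothetical validator check that mirrors the fail-open default."""
    if ctx.allowed_bank_ids is None:
        # No restriction was ever set, so every bank is reachable.
        return True
    return bank_id in ctx.allowed_bank_ids


# A background task that never passed through tenant auth keeps the default.
background_ctx = RequestContext()
scoped_ctx = RequestContext(allowed_bank_ids=["tenant-a-bank"])
```

With these definitions, `is_bank_allowed(background_ctx, "tenant-b-bank")` is `True` even though nobody granted that access, which is the gap schema isolation is meant to close.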

Why schema isolation works:

The API key determines the PostgreSQL schema at authentication time, before any bank lookup or query executes. The SQL itself is scoped via fully-qualified table names. Even a fully compromised agent can only access banks within its assigned schema. Banks from other schemas don't exist in its view of the database.

Attacker → prompt injection → Agent A → hindsight recall --bank tenant_b_bank
                                              ↓
                                API key resolves to schema "tenant_a"
                                              ↓
                                tenant_b_bank doesn't exist in this schema
                                              ↓
                                empty results → attack fails
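
The flow above can be reproduced in a small, self-contained sketch. It uses SQLite's `ATTACH` as a stand-in for PostgreSQL schemas so that it runs without a server; the table layout, the bank names, and the `find_bank` helper are assumptions made for illustration, not the extension's actual queries.

```python
import sqlite3
from typing import List

# isolation_level=None keeps the connection in autocommit mode, so ATTACH is
# never issued inside an implicit transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)

# Each attached in-memory database plays the role of one tenant schema.
for schema, bank in [("tenant_a", "tenant_a_bank"), ("tenant_b", "tenant_b_bank")]:
    conn.execute(f"ATTACH DATABASE ':memory:' AS {schema}")
    conn.execute(f"CREATE TABLE {schema}.banks (bank_id TEXT PRIMARY KEY)")
    conn.execute(f"INSERT INTO {schema}.banks VALUES (?)", (bank,))


def find_bank(schema: str, bank_id: str) -> List[str]:
    """Look up a bank through a table name qualified by the caller's schema.

    The schema is assumed to come from authentication, never from the request
    payload, and to have been validated before it is interpolated.
    """
    rows = conn.execute(
        f"SELECT bank_id FROM {schema}.banks WHERE bank_id = ?", (bank_id,)
    ).fetchall()
    return [row[0] for row in rows]
```

A caller scoped to `tenant_a` that asks for `tenant_b_bank` gets an empty list, because the bank name supplied by the attacker is only ever a bound parameter and never changes which schema the query targets.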

How it works

Operator configures key-to-schema mapping via environment variable
Each request authenticates by API key → resolves to a dedicated PostgreSQL schema
All database operations are scoped to that schema
Schemas are auto-created with full table migrations on first access (same as SupabaseTenantExtension)
No separate validator extension needed — isolation is in the database
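
As a rough sketch of the authentication step, the key-to-schema resolution might look like the following. `TenantContext` and its `schema_name` field are named in this PR; the `authenticate` function, the `AuthenticationError` class, and the in-memory `KEY_TO_SCHEMA` dictionary are hypothetical and only illustrate the one-key-one-schema rule.

```python
from dataclasses import dataclass
from typing import Optional


class AuthenticationError(Exception):
    """Raised when a request has no API key or an unrecognised one."""


@dataclass(frozen=True)
class TenantContext:
    # The PR states that TenantContext carries a single schema_name.
    schema_name: str


# Hypothetical mapping; the extension derives the real one from the environment.
KEY_TO_SCHEMA = {"team_a_key": "team_a", "team_b_key": "team_b"}


def authenticate(api_key: Optional[str]) -> TenantContext:
    """Resolve an API key to exactly one schema, refusing anything unknown."""
    if not api_key:
        raise AuthenticationError("Missing API key")
    schema = KEY_TO_SCHEMA.get(api_key)
    if schema is None:
        raise AuthenticationError("Unknown API key")
    return TenantContext(schema_name=schema)
```

Because the lookup either returns one schema or raises, there is no code path in this sketch where an unrecognised key falls back to a shared schema.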

Configuration

HINDSIGHT_API_TENANT_EXTENSION=hindsight_api.extensions.builtin.bank_scoped_tenant:ApiKeySchemaTenantExtension
HINDSIGHT_API_TENANT_KEY_MAP=team_a_key:team_a;team_b_key:team_b

# Optional: prefix all schema names
HINDSIGHT_API_TENANT_SCHEMA_PREFIX=hs    # team_a becomes hs_team_a

# Optional: disable auth for MCP endpoints (falls back to default schema)
HINDSIGHT_API_TENANT_MCP_AUTH_DISABLED=true
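
To make the `key:schema;key:schema` format concrete, here is a minimal parsing sketch. The `parse_key_map` function is hypothetical; how the real extension treats whitespace, empty entries, or keys that contain a colon is not stated in this PR, so those choices are assumptions.

```python
from typing import Dict


def parse_key_map(raw: str, prefix: str = "") -> Dict[str, str]:
    """Parse 'key:schema;key:schema' into {key: schema}, applying an optional prefix."""
    mapping: Dict[str, str] = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # Assumption: tolerate a trailing ';' or blank entries.
        key, sep, schema = entry.partition(":")
        key, schema = key.strip(), schema.strip()
        if not sep or not key or not schema:
            raise ValueError(f"Malformed key map entry: {entry!r}")
        # A prefix of 'hs' turns 'team_a' into 'hs_team_a', as in the example above.
        mapping[key] = f"{prefix}_{schema}" if prefix else schema
    return mapping
```

For the configuration shown above, `parse_key_map("team_a_key:team_a;team_b_key:team_b", prefix="hs")` yields `team_a_key` mapped to `hs_team_a` and `team_b_key` mapped to `hs_team_b`.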

Design decisions

Opt-in, zero breaking changes. If HINDSIGHT_API_TENANT_EXTENSION is not set, Hindsight uses DefaultTenantExtension — identical to current behavior. Existing deployments are unaffected.

One key = one schema. Each API key maps to exactly one PostgreSQL schema. A single key cannot access multiple schemas. This is intentional: one key = one blast radius. The TenantContext returns a single schema_name, and the engine scopes all queries to it. Cross-schema queries are not possible without direct Postgres access.

Admin access. There is no "superuser key" that spans all schemas. Operators who need cross-tenant visibility should query Postgres directly or use separate keys per schema. This is a conscious trade-off: admin convenience vs. the guarantee that no single compromised key grants access to all tenants.

MCP auth disabled = default schema only. When HINDSIGHT_API_TENANT_MCP_AUTH_DISABLED=true, MCP requests fall back to the default schema (from HINDSIGHT_API_DATABASE_SCHEMA), not a tenant schema.

Schema name validation. Schema names must be valid Postgres identifiers (letters, digits, underscores). Hyphens, spaces, and names starting with digits are rejected at startup.
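
A validation rule of this kind could be sketched as follows. The regular expression and the `validate_schema_name` helper are assumptions that implement the rule as described (letters, digits, underscores, no leading digit); they are not copied from the extension.

```python
import re

# Letters, digits, and underscores only; the first character may not be a digit.
_SCHEMA_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_schema_name(name: str) -> str:
    """Return the name unchanged if it is acceptable, otherwise raise ValueError."""
    if not _SCHEMA_NAME.fullmatch(name):
        raise ValueError(f"Invalid schema name: {name!r}")
    return name
```

Rejecting bad names at startup matters because a schema name is interpolated into SQL as an identifier and cannot be passed as a bound parameter.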

Why not allowed_bank_ids + OperationValidatorExtension? See threat model above. Application-layer checks are defense-in-depth, not a security boundary. Schema isolation moves the enforcement into the database where it can't be bypassed by missed code paths.

Files changed

File	Description
`hindsight-api-slim/.../builtin/bank_scoped_tenant.py`	`ApiKeySchemaTenantExtension` (~170 lines)
`hindsight-api-slim/tests/test_bank_scoped.py`	20 unit tests + prompt injection defense tests

Test plan

Dashboard: multi-tenant support

User-facing behavior

When HINDSIGHT_API_TENANT_KEY_MAP is configured, a tenant selector dropdown appears in the top bar next to the bank selector. Selecting a tenant scopes all dashboard operations — bank listing, recall, reflect, documents, entities, configuration — to that tenant's schema. The selection persists across page navigations via localStorage.

When the key map is not set, the dashboard behaves identically to before — no tenant selector, single-key mode, zero breaking changes.

Architecture

The control plane never talks to the dataplane without tenant scoping. The design has three layers:

Server layer (hindsight-client.ts): A tenant-aware client factory replaces the old singleton. getClientForTenant(name) returns cached SDK clients configured with that tenant's API key. The key map is read from HINDSIGHT_API_TENANT_KEY_MAP — the same env var the API server uses, so operators configure it once. Unknown tenant names throw in multi-tenant mode — fail-closed, not fail-open.

API route layer (~35 routes): Every Next.js API route extracts ?tenant= from the query string, calls getClientForTenant(tenant), and uses the returned scoped client. The tenant param is consumed by the control plane and not forwarded to the dataplane — tenant identity is carried in the Authorization header, not the URL.

Browser layer (tenant-context.tsx → api.ts → bank-context.tsx): TenantProvider loads tenant names from /api/tenants on mount, restores the saved selection from localStorage, and calls client.setTenant(). The ControlPlaneClient singleton auto-appends ?tenant= to every fetchApi() call. BankProvider watches the current tenant and resets the bank list when it changes.
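
The fail-closed factory in the server layer can be sketched as below. The control plane itself is TypeScript; Python is used here only to keep the examples in one language. The `TenantClientFactory` class, the dictionary standing in for an SDK client, and the `Bearer` header format are all assumptions, not the contents of hindsight-client.ts.

```python
from typing import Dict


class UnknownTenantError(Exception):
    """Raised instead of silently falling back to some other tenant."""


class TenantClientFactory:
    def __init__(self, key_by_tenant: Dict[str, str]) -> None:
        self._keys = dict(key_by_tenant)
        self._cache: Dict[str, dict] = {}

    def get_client_for_tenant(self, name: str) -> dict:
        """Return a cached client for a known tenant; fail closed otherwise."""
        if name not in self._keys:
            raise UnknownTenantError(name)
        if name not in self._cache:
            # A plain dict stands in for an SDK client configured with the key.
            self._cache[name] = {
                "headers": {"Authorization": f"Bearer {self._keys[name]}"}
            }
        return self._cache[name]


# Hypothetical tenant names and keys.
factory = TenantClientFactory({"acme": "acme_key", "globex": "globex_key"})
```

Raising on an unknown name, rather than returning the first configured tenant, is the behaviour the commit history describes as the fix for silent cross-tenant leakage.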

Browser                          Server (Next.js)                    Dataplane
───────                          ────────────────                    ─────────
TenantProvider                   /api/tenants
  → client.setTenant("acme")      → getTenantNames()
                                   → returns ["acme","globex"]

BankProvider                     /api/banks?tenant=acme
  → client.listBanks()            → getClientForTenant("acme")
                                   → SDK call with acme API key
                                                                     → schema "acme"
                                                                     → SELECT * FROM acme.banks
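
The route-layer step in the diagram, where `?tenant=` is consumed by the control plane and not forwarded, can be sketched like this. The real routes are Next.js handlers written in TypeScript; the `split_tenant` helper below is a hypothetical Python rendition that uses only the standard library.

```python
from typing import Optional, Tuple
from urllib.parse import parse_qsl, urlencode, urlsplit


def split_tenant(url: str) -> Tuple[Optional[str], str]:
    """Return (tenant, url_without_tenant_param) for an incoming request URL."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # The first 'tenant' value wins; None means the param was absent.
    tenant = next((value for key, value in pairs if key == "tenant"), None)
    # Everything except 'tenant' is passed through to the dataplane unchanged.
    forwarded = urlencode([(key, value) for key, value in pairs if key != "tenant"])
    path = parts.path + (f"?{forwarded}" if forwarded else "")
    return tenant, path
```

The tenant value returned here would then select the scoped client, so tenant identity travels in the Authorization header while the forwarded URL carries no tenant information.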

Dashboard files changed

File	Description
`hindsight-control-plane/src/lib/hindsight-client.ts`	Tenant-aware client factory with per-tenant caching
`hindsight-control-plane/src/lib/tenant-context.tsx`	React context for tenant selection + localStorage persistence
`hindsight-control-plane/src/lib/bank-context.tsx`	Reset banks on tenant switch
`hindsight-control-plane/src/lib/api.ts`	Auto-append `?tenant=` to all API calls
`hindsight-control-plane/src/components/bank-selector.tsx`	Tenant selector dropdown (multi-tenant mode only)
`hindsight-control-plane/src/app/layout.tsx`	Wire TenantProvider above BankProvider
`hindsight-control-plane/src/app/api/tenants/route.ts`	New endpoint returning tenant names
`hindsight-control-plane/src/app/api/*/route.ts`	~35 routes: extract tenant, use scoped client
`.env.example`	Document dashboard config

Dashboard test plan

salvormallow · 2026-04-03T08:44:17Z

Dashboard caveat

When bank_scoped_tenant is active, the control plane dashboard shows no banks because it calls the dataplane API without an API key.

Root cause: hindsight-client.ts reads HINDSIGHT_CP_DATAPLANE_API_KEY at startup. Without it, every SDK call returns Authentication failed: Missing API key. The dashboard's /api/banks route catches this and returns {"error":"Failed to fetch banks from API"}.

Workaround: Set HINDSIGHT_CP_DATAPLANE_API_KEY to one of the tenant API keys from your HINDSIGHT_API_TENANT_KEY_MAP. The dashboard will show that tenant's banks. Example:

HINDSIGHT_CP_DATAPLANE_API_KEY=<one-of-your-tenant-keys>

Longer-term: The dashboard should support multi-tenant awareness — a tenant selector that switches which API key is used for dataplane calls. I'm working on a follow-up PR for this.

nicoloboschi

LGTM

salvormallow · 2026-04-07T15:58:45Z

CI Status

Fixed: verify-generated-files — pushed a ruff format/lint fix for bank_scoped_tenant.py (e014b91).

Expected fork failures: The remaining ~15 failing jobs (test-api, test-python-client, test-typescript-client, test-rust-cli, Build Docker (api-slim), test-pip-slim, test-embed, test-hindsight-all, test-doc-examples, etc.) all fail because fork PRs can't access repo secrets. CI uses vertexai + cohere providers, but HINDSIGHT_API_LLM_VERTEXAI_PROJECT_ID and HINDSIGHT_API_EMBEDDINGS_COHERE_API_KEY are empty for fork workflows.

These tests would need a trusted CI re-run from a maintainer to pass.

nicoloboschi · 2026-04-09T08:39:37Z

hey @salvormallow this feature is incomplete without the proper UI changes. can you make some minimal changes in the UI to switch tenant as you suggested?

salvormallow · 2026-04-10T15:55:25Z

> hey @salvormallow this feature is incomplete without the proper UI changes. can you make some minimal changes in the UI to switch tenant as you suggested?

Sounds good, I'll add it this weekend.

Adds ApiKeySchemaTenantExtension: maps API keys to isolated PostgreSQL schemas, providing database-level memory isolation between tenants.
Threat model: prompt injection against AI agents. Agents execute tool calls based on conversation content. A prompt injection can trick an agent into querying another tenant's banks. Schema isolation scopes all SQL to the authenticated schema; banks from other schemas don't exist.
Configuration:
HINDSIGHT_API_TENANT_EXTENSION=...bank_scoped_tenant:ApiKeySchemaTenantExtension
HINDSIGHT_API_TENANT_KEY_MAP=key_a:schema_a;key_b:schema_b
Follows the SupabaseTenantExtension pattern. Opt-in, zero breaking changes. Includes 20 tests.

Replaces singleton HINDSIGHT_CP_DATAPLANE_API_KEY with factory pattern. Supports HINDSIGHT_CP_TENANT_KEY_MAP=key:name;key:name for multi-tenant. Backwards-compatible: single key still works via default export.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Add tenant-aware BFF: all API routes read ?tenant= param and use getClientForTenant() instead of singleton lowLevelClient
- Add /api/tenants route for tenant discovery
- Add TenantContext + TenantProvider for client-side tenant state
- ControlPlaneClient.fetchApi() auto-appends ?tenant= to all requests
- Tenant selector dropdown in header (hidden in single-tenant mode)
- BankProvider re-fetches banks and resets selection on tenant change
- Backwards-compatible: HINDSIGHT_CP_DATAPLANE_API_KEY still works
- New env var: HINDSIGHT_CP_TENANT_KEY_MAP=key:name;key:name
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix uploadFiles() bypassing fetchApi and missing ?tenant= param
- Remove unused sdk import from list/route.ts
- Guard BankProvider loadBanks() until tenant is resolved (avoid double-load)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fix silent cross-tenant data leakage in getClientForTenant(): invalid tenant names now throw in multi-tenant mode instead of silently falling back to the first tenant.
Fix race condition in bank-context.tsx where rapid tenant switches could interleave bank list responses, showing banks from the wrong tenant. Uses a monotonic load ID to discard stale responses.
Add Playwright e2e test suite (18 tests) covering tenant discovery, switching, bank loading, navigation, and cross-tenant isolation. Includes an mTLS proxy for testing against prod deployments behind mutual TLS.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The control plane now falls back to HINDSIGHT_API_TENANT_KEY_MAP when HINDSIGHT_CP_TENANT_KEY_MAP is not set. Operators no longer need to duplicate API keys across two env vars; both the API and dashboard read from the same source.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Fix ruff import sort and formatting in test_bank_scoped.py
- Remove unused deprecated hindsightClient/lowLevelClient exports
- Strip tenant query param before forwarding to dataplane in audit-logs routes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove Playwright e2e tests, config, and mTLS proxy (manual-run only, not suitable for upstream CI)
- Remove HINDSIGHT_CP_TENANT_KEY_MAP; dashboard reads HINDSIGHT_API_TENANT_KEY_MAP directly (one key map for both)
- Update .env.example to document the consolidated config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

salvormallow · 2026-04-11T17:51:33Z

Dashboard changes complete

The multi-tenant dashboard UI is implemented, fully tested, and ready for review. PR description has been updated with the design, architecture, and test plan.

What's included:

Tenant selector dropdown (visible only in multi-tenant mode)
All ~35 API routes plumbed with tenant-scoped client selection
TenantProvider → BankProvider context hierarchy with localStorage persistence
Single env var: HINDSIGHT_API_TENANT_KEY_MAP — shared between API and dashboard, no duplication

Testing:

23 Playwright e2e tests passing against an isolated test instance
Cross-tenant isolation verified: bank creation in one tenant is not visible in the other

nicoloboschi approved these changes Apr 7, 2026

salvormallow and others added 8 commits April 11, 2026 10:07

style: fix ruff lint in bank_scoped_tenant.py

fd0ffa4

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

salvormallow force-pushed the feat/bank-scoped-access-control branch from d02b668 to b89b101 · April 11, 2026 17:24

salvormallow wants to merge 9 commits into vectorize-io:main from salvormallow:feat/bank-scoped-access-control

2 participants