perf: eliminate N+1 query patterns causing slow Heroku response times

## Problem

TMI on Heroku is significantly slower than local development:
- Login providers list: several seconds (sub-second locally)
- Threat model sub-entity lazy-loading: 10-20 seconds
- Threat model edit page appears to have missing sub-entities until they eventually load

**Root cause**: Excessive database round-trips. Locally each query is <1ms over Unix socket; on Heroku each crosses a network boundary at ~2-5ms per round-trip. The multiplier effect on N+1 query patterns makes this severe.

## Analysis

### Issue 1: N+1 Query Cascade on Threat Model Fetch (CRITICAL)

Every `ThreatModelStore.Get()` (`database_store_gorm.go:123`) calls `convertToAPIModel()` which issues separate queries for each sub-resource:

| Step | Queries | Location |
|------|---------|----------|
| Main TM + preloads (Owner, CreatedBy, SecurityReviewer) | 1+3 | `database_store_gorm.go:137` |
| `loadAuthorization()` — 1 for access entries, then 1 per entry for user/group lookup | 1 + N | `database_store_gorm.go:698-749` |
| `loadMetadata()` | 1 | `database_store_gorm.go:752` |
| `loadThreats()` — 1 for threats, then 1 `loadThreatMetadata()` per threat | 1 + N | `database_store_gorm.go:757-827` |
| `loadDiagramsDynamically()` — 1 for IDs, then `DiagramStore.Get()` per diagram (multiple preloads each) | 1 + N×M | `database_store_gorm.go:849-887` |

A threat model with 3 auth entries, 5 threats, and 2 diagrams = ~20-30 DB round-trips. At 3-5ms each on Heroku = 60-150ms per single GET.

### Issue 2: ThreatModelMiddleware Loads Full Model on Every Sub-Resource Request (CRITICAL)

`middleware.go:328` calls `ThreatModelStore.Get(id)` for every request to `/threat_models/{id}/*` — including sub-resource endpoints. The middleware only needs authorization data for access checks, but loads the entire threat model with all sub-resources.

When the client opens a threat model and lazy-loads 6 sub-entity types in parallel, the middleware alone triggers 6 full threat model loads = 120-180 DB round-trips before any handler code runs.

### Issue 3: List Endpoint Loads Full Models + Count Queries (HIGH)

`ListWithCounts()` (`database_store_gorm.go:297`):
1. Loads all matching threat models
2. Calls `convertToAPIModel()` for each (triggering the N+1 cascade)
3. Runs 6 additional COUNT queries per model (`database_store_gorm.go:371-376`)

For 10 threat models: ~200-300 DB queries to render the list page.

### Issue 4: Cold Start / Login Providers (MODERATE)

First request after dyno wake-up is slow: cold connection pool, SAML metadata parsing, no cache warming.

### Issue 5: No Caching of Authorization Data (MODERATE)

Cache service caches sub-resources but not threat model authorization data (the most frequently needed piece, checked by middleware on every request).

## Remediation Plan

### Phase 1 — Quick wins (major impact, lower effort)

1. **Lightweight authorization-only middleware query**: Replace `ThreatModelStore.Get()` in middleware with `GetAuthorization(id)` that only loads `threat_model_access` with a single JOIN. Eliminates full model load from middleware (80-90% reduction in middleware queries).

2. **Cache authorization data in Redis**: Cache auth list per threat model with short TTL (60-120s), invalidate on auth changes. Most sub-resource requests skip DB for auth entirely.

3. **Batch count queries in ListWithCounts**: Replace 6 individual COUNT queries per model with single query using `GROUP BY`:
   ```sql
   SELECT threat_model_id, COUNT(*) FROM threats WHERE threat_model_id IN (?) GROUP BY threat_model_id
   ```

### Phase 2 — Eliminate N+1 patterns (major impact, moderate effort)

4. **Batch user/group lookup in `loadAuthorization`**: Replace per-entry `WHERE internal_uuid = ?` with single `WHERE internal_uuid IN (?)`.

5. **Batch threat metadata loading**: Replace per-threat `loadThreatMetadata` with single query for all threats.

6. **Batch diagram loading**: Create `DiagramStore.GetBatch(ids)` instead of per-diagram `DiagramStore.Get()`.

### Phase 3 — Architectural improvements (moderate impact, higher effort)

7. **Separate List from Get models**: List endpoint should return lightweight projection (name, status, owner, counts) without loading threats, diagrams, or metadata.

8. **Response caching for GET /threat_models/{id}**: Cache full response in Redis with write-through invalidation.

9. **Connection pool tuning**: Consider reducing `maxOpenConns` to 5 for Heroku's limited connection budget.

### Expected Impact

Phases 1-2 should reduce typical page load times from 10-20s to 1-2s on Heroku by cutting DB round-trips from hundreds to single digits.

## Related Files

- `api/database_store_gorm.go` — Store implementation with N+1 patterns
- `api/middleware.go` — ThreatModelMiddleware full-model load
- `api/cache_service.go` — Cache service (missing auth caching)
- `auth/db/gorm.go` — Connection pool configuration

Step	Queries	Location
Main TM + preloads (Owner, CreatedBy, SecurityReviewer)	1+3	`database_store_gorm.go:137`
`loadAuthorization()` — 1 for access entries, then 1 per entry for user/group lookup	1 + N	`database_store_gorm.go:698-749`
`loadMetadata()`	1	`database_store_gorm.go:752`
`loadThreats()` — 1 for threats, then 1 `loadThreatMetadata()` per threat	1 + N	`database_store_gorm.go:757-827`
`loadDiagramsDynamically()` — 1 for IDs, then `DiagramStore.Get()` per diagram (multiple preloads each)	1 + N×M	`database_store_gorm.go:849-887`

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: eliminate N+1 query patterns causing slow Heroku response times #190

Problem

Analysis

Issue 1: N+1 Query Cascade on Threat Model Fetch (CRITICAL)

Issue 2: ThreatModelMiddleware Loads Full Model on Every Sub-Resource Request (CRITICAL)

Issue 3: List Endpoint Loads Full Models + Count Queries (HIGH)

Issue 4: Cold Start / Login Providers (MODERATE)

Issue 5: No Caching of Authorization Data (MODERATE)

Remediation Plan

Phase 1 — Quick wins (major impact, lower effort)

Phase 2 — Eliminate N+1 patterns (major impact, moderate effort)

Phase 3 — Architectural improvements (moderate impact, higher effort)

Expected Impact

Related Files

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

perf: eliminate N+1 query patterns causing slow Heroku response times #190

Description

Problem

Analysis

Issue 1: N+1 Query Cascade on Threat Model Fetch (CRITICAL)

Issue 2: ThreatModelMiddleware Loads Full Model on Every Sub-Resource Request (CRITICAL)

Issue 3: List Endpoint Loads Full Models + Count Queries (HIGH)

Issue 4: Cold Start / Login Providers (MODERATE)

Issue 5: No Caching of Authorization Data (MODERATE)

Remediation Plan

Phase 1 — Quick wins (major impact, lower effort)

Phase 2 — Eliminate N+1 patterns (major impact, moderate effort)

Phase 3 — Architectural improvements (moderate impact, higher effort)

Expected Impact

Related Files

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions