RFC0055 Identity-Aware Routing by rkoster · Pull Request #535 · cloudfoundry/routing-release

rkoster · 2026-03-05T16:26:52Z

RFC0055: Identity-Aware mTLS Routing

Implements Phase 1 (1a + 1b) of RFC0055: App-to-App mTLS Routing.

Tracking: cloudfoundry/community#1481
Acceptance Testing Guide: https://gist.github.com/rkoster/5b252b0edca606f10be2dbdcb81a796f

What This Does

Enables GoRouter to enforce mutual TLS and identity-based authorization on a per-domain basis. Apps calling routes on configured mTLS domains must present a valid Diego instance identity certificate. GoRouter extracts the caller's app/space/org identity and checks it against route policies before forwarding the request.

Phase 1a: mTLS Infrastructure

Per-domain TLS configuration via GetConfigForClient callback (SNI-based)
Domain-specific client certificate validation against configurable CA
Domain-aware XFCC header handling with two formats:
- raw: base64-encoded full certificate (~1.5KB)
- envoy: compact Hash=...;Subject="..." format (~250 bytes)
SNI/Host mismatch protection (prevents connection reuse attacks across domains)
BOSH job properties for router.domains

Phase 1b: Authorization

Identity extraction from Diego instance identity certificates (Subject DN OUs + SPIFFE URIs)
Pre-selection auth: validates mTLS domain, client cert presence, identity extraction
Post-selection auth: enforces route policies (scope and allowed_sources) against selected endpoint
Supports authorization at app, space, and org granularity
Default deny when no route policies are configured
RTR access logs emitted for denied requests (401/403)

Key Design Decisions

Two-layer authorization: Pre-selection (before endpoint is chosen) handles domain/cert/identity checks. Post-selection (after load balancer picks a backend) handles scope and route-policy checks against the specific endpoint's tags.
Feature is dormant by default: No behavior change unless router.domains is configured in the BOSH manifest and a shared domain with --enforce-access-rules is created.
No regression on existing traffic: Non-mTLS domains are completely unaffected.

Testing

Unit tests for all new handlers and config validation
Integration tests for end-to-end mTLS routing flows
BOSH template tests for configuration rendering
CI runs go fmt, go vet, staticcheck, ginkgo with --race

Configuration Example

# BOSH manifest (via ops-file)
router:
  domains:
    - name: "*.apps.identity"
      ca_certs: "((diego_instance_identity_ca.certificate))"
      forwarded_client_cert: sanitize_set
      xfcc_format: envoy

Related PRs

Component	PR	Status
cloud_controller_ng	cloudfoundry/cloud_controller_ng#4910	Open
capi-release	cloudfoundry/capi-release#625	Open
CLI	cloudfoundry/cli#3758	Draft

Merge Ordering

All PRs are independently safe to merge — the feature is dormant without the ops-file and domain creation. No strict ordering required. Recommend merging around the same time once all are approved.

AI Disclosure

This PR was developed with AI assistance. All code has been read and verified manually. Each error path, branch, and edge case has corresponding test coverage.

rkoster · 2026-04-16T12:43:52Z

Latest Update: RFC-Compliant Post-Selection Authorization

Implemented breaking change to replace pre-selection authorization with strict post-selection enforcement per RFC lines 475-517.

Key Changes (commit `cbf0695`)

Architecture:

✅ Composable PostSelectionHandler interface for middleware pipeline
✅ Separation of pre-selection checks (SNI, route lookup, identity) from post-selection authorization
✅ Immediate 403 on authorization failure (non-retriable, per RFC)
✅ Post-selection scope checking with :post-selection suffix in metrics

Implementation:

handlers/post_selection_pipeline.go - Infrastructure for composable checks
handlers/mtls_scope_auth.go - Org/space boundary enforcement
handlers/mtls_access_rules_auth.go - Access rules evaluation (cf:app:, cf:space:, etc.)
handlers/mtls_pre_auth.go - Pre-selection checks only
handlers/mtls_auth_error.go - Custom error type with Rule/Reason/HTTPStatus

Test Coverage:

+44 new tests (14 scope + 17 access rules + 13 pipeline)
+4 integration tests for shared route scenarios
All 393 tests passing

RFC Compliance

✅ Intermittent 403s - Expected for shared routes across scope boundaries (RFC-compliant)
✅ Error messages - Include "caller org X does not match selected backend org Y"
✅ Strict enforcement - Prevents unauthorized cross-scope access

Breaking Change

⚠️ This replaces the permissive pre-selection authorization entirely. No feature flag provided as this is a security improvement required by the RFC.

Deprecated:

handlers/mtls_authorization.go (old implementation with migration notes)
route/pool.go EndpointOrgIDs/SpaceIDs methods

Integration Test Results

All integration tests compile successfully. Shared route scenarios validate:

Intermittent 403s with scope=space (different spaces in same org)
Always succeed with scope=org (same org, different spaces)
Always fail with scope=org (different orgs)
Per-endpoint access rules with intermittent behavior

Ready for full integration test run and review.

rkoster · 2026-04-16T13:37:29Z

Refactoring: AuthError for Future Extensibility

Commit: 4ff64b9

Renamed MtlsAuthError to AuthError to prepare for future authentication methods beyond mTLS, such as SPIFFE JWT tokens.

Changes

✅ Renamed handlers/mtls_auth_error.go → handlers/auth_error.go
✅ Updated struct, constructor functions, and all references
✅ Changed error messages from "mTLS authorization denied" to "authorization denied"
✅ Updated all test files

Benefits

🔮 Future-proof: Ready for SPIFFE JWT token authentication
🏗️ Generic design: Error type not tied to specific auth mechanism
🧩 Reusable: Can be used across different authentication methods
✨ Clean: Better naming convention for authorization errors

No functional changes - pure refactoring for extensibility.

Devbox is used for local development environment setup but should not be tracked in the repository. The files remain in the working directory for developers who use devbox.

routing-api is a local submodule in src/code.cloudfoundry.org/routing-api, not an external dependency. It should not be listed in go.mod. This was incorrectly added during rebase conflict resolution.

- Export MakeEndpoint method for test access - Fix test calls to MakeEndpoint with correct parameters - Remove unnecessary fmt.Sprintf for string argument

This commit fixes all failing integration tests for the identity-aware routing feature by addressing three critical issues: 1. mTLS client certificate trust chain issue - Tests were creating instance identity certs with a different CA than the one configured in GoRouter's mTLS domain settings - Added CreateInstanceIdentityCertWithCA() helper that accepts an existing CA to ensure proper trust chain - Updated all test cases to use the shared mtlsDomainCA 2. Authorization errors returning HTTP 502 instead of 403 - Added custom ErrorHandler to httputil.ReverseProxy that checks for AuthError and returns the appropriate HTTP status code (403) - Previously all transport errors defaulted to 502 Bad Gateway 3. Per-endpoint access rules not working correctly - Authorization handler was checking pool-level access rules (first endpoint only) instead of the selected endpoint's rules - Changed to use endpoint.AccessRules to support different backends with different authorization requirements on the same route 4. Default-deny not enforced for routes without access rules - Changed enforcement logic to apply to all requests with CallerIdentity, regardless of AccessScope setting 5. SNI/Host header mismatch in test requests - Added newMtlsGetRequest() helper with custom DialTLSContext that connects to 127.0.0.1 while preserving hostname for TLS SNI - Updated all identity-aware routing tests to use this helper Test results: 20/20 integration tests passing, 17/17 unit tests passing

This prevents the subscriber's ClosedCB from firing log.Fatal when NATS is stopped first, which was causing the test process to exit prematurely and leading to port binding conflicts in parallel test runs. The cleanup order is now: 1. Terminate gorouter session 2. Stop NATS server 3. Clean up test files This matches the fix from upstream PR #555 (commit b2bf830) which resolved similar issues in router/router_test.go.

Apply comprehensive terminology rebranding from 'access rules' to 'route policies' across the gorouter codebase to align with Cloud Foundry's existing 'network policies' convention. This matches the terminology changes from GitHub PR #1438 commit be8d74c1. Key terminology changes: - Types: MtlsAccessRulesAuth → MtlsRoutePoliciesAuth - Functions: evaluateAccessRules() → evaluateRoutePolicies() - Functions: parseCommaSeparatedSelectors() → parseCommaSeparatedSources() - Constants: AccessScopeAny/Org/Space → RoutePolicyScopeAny/Org/Space - Struct fields: AccessScope → RoutePolicyScope, AccessRules → RoutePolicies - JSON tags: access_scope → route_policy_scope, access_rules → route_policy_sources - Error messages: route:no_access_rules → route:no_route_policies - Error messages: route:access_rules → route:route_policies - Comments: 'access rules' → 'route policies', 'selector' → 'source' Files modified: - Core: route/pool.go, mbus/subscriber.go - Handlers: mtls_route_policies_auth.go (renamed), mtls_pre_auth.go, mtls_scope_auth.go, proxy/proxy.go - Tests: All corresponding test files updated All unit tests passing (972 specs total): - handlers: 360/360 passed - route: 252/252 passed - mbus: 8/8 passed (subset) - proxy: 376/376 passed

Fix struct field alignment in test files to pass CI gofmt validation.

Complete the terminology rebrand by updating integration test helper functions to use RoutePolicyScope and RoutePolicySources instead of the old AccessScope and AccessRules field names. Fixes go vet error: unknown field AccessScope in struct literal

This commit fixes a critical bug where route policies were incorrectly enforced on ALL routes (including public routes and mTLS routes without --enforce-route-policies flag), causing 403 Forbidden errors. Root Cause: - MtlsRoutePoliciesAuth.Check() only checked CallerIdentity == nil to decide whether to skip enforcement - It did NOT check routePolicyScope (which indicates if enforcement is actually enabled for the domain) - This caused enforcement on routes where it should be skipped Fix: - Add routePolicyScope check at the beginning of Check() method - Only enforce route policies when routePolicyScope != "" (enforcement enabled via --enforce-route-policies flag on domain creation) - This mirrors the pattern already used in MtlsScopeAuth handler Impact: - Public routes (non-mTLS domains): No longer incorrectly rejected - mTLS routes WITHOUT --enforce-route-policies: No longer rejected - mTLS routes WITH --enforce-route-policies: Still correctly enforced Test Coverage: - Added regression test for the specific bug scenario: RoutePolicyScope empty + CallerIdentity present - This test would have caught the bug if it existed originally - All 18 MtlsRoutePoliciesAuth tests now pass - Test suite now covers all skip scenario combinations Files Changed: - src/code.cloudfoundry.org/gorouter/handlers/mtls_route_policies_auth.go - src/code.cloudfoundry.org/gorouter/handlers/mtls_route_policies_auth_test.go

HTTP clients that include explicit ports in URLs (e.g., https://app.example.com:443/) result in Go's http.Request.Host containing the port (app.example.com:443). Previously, GetMtlsDomainConfig() did not strip the port before matching against configured mTLS domains (e.g., *.apps.identity), causing: - Domain matching to fail for requests with explicit ports - No XFCC header added (fell back to default behavior) - Identity extraction failure in CallerIdentity - Pre-auth handler denying requests with 403 and reason "identity-extraction-failed" This particularly affected Java Spring Boot HTTP clients which construct URLs with explicit ports by default. Fix: Use net.SplitHostPort() to strip port before domain matching, ensuring consistent behavior regardless of whether clients include explicit ports. Added comprehensive unit tests covering: - Wildcard domain matching with/without ports - Exact domain matching with/without ports - IsMtlsDomain() function with/without ports - Negative test cases for non-mTLS domains

The test was incorrectly expecting 403 Forbidden when a route is registered on an mTLS domain without route policy enforcement enabled. The correct behavior is to allow the request through (200 OK) and let the backend handle authorization. Route policy enforcement is controlled by Cloud Controller via the RoutePolicyScope field. When RoutePolicyScope is empty (enforcement disabled), GoRouter allows authenticated requests through. Default-deny only applies when enforcement IS enabled but no policies are configured.

…f8f2)

- Fix domainMatches wildcard to only match single DNS label (security) - Improve SNI/Host mismatch security checks to prevent domain confusion attacks - Add AuthError.ClientMessage() to prevent information leakage in error responses - Add nil CallerIdentity checks in post-selection auth handlers (defense in depth) - Set AuthResult.Outcome in all auth success paths for proper observability - Add proper error handling for response writes in proxy error handler - Remove unnecessary blank type assertion - Add comprehensive unit tests for mtls_pre_auth handler

- Fix scope=any to populate AuthResult for access log consistency - Add AuthResult assertions to scope=any test case - Add test coverage for unknown RoutePolicyScope default case - Add empty GUID guard to cf:app: policy for consistency with cf:space:/cf:org:

- Fix wildcard domain matching inconsistency between GetMtlsDomainConfig and domainMatches to only match single-level subdomains - Add bounds check for empty Subject field in XFCC header parsing - Change RoutePool nil handling from silent bypass to explicit denial with error logging to prevent authorization bypass - Improve error messages by including malformed DN in error output - Add comprehensive test coverage for edge cases including multi-level subdomains, empty Subject strings, and nil RoutePolicies - Update existing tests to validate new security-focused behavior All tests pass (163 config, 390 handlers) and code passes go vet, gofmt, and staticcheck linting.

…line - Fix dead-code bug: skip internal error handler for *AuthError in proxy_round_tripper so ReverseProxy.ErrorHandler can write the 403 - Fix error leak: replace err.Error() with generic status text in fallback error handler to avoid exposing internal details - Extract handleReverseProxyError() as testable package-level function - Add unit tests for handleReverseProxyError (proxy_error_handler_test.go) - Add post-selection pipeline tests in proxy_round_tripper_test.go - Add Layer 0 security branch test in mtls_pre_auth_test.go

Add ERB template validation that raises a deployment error when xfcc_format is configured alongside forwarded_client_cert: always_forward on an mTLS domain. In always_forward mode the XFCC header is passed through untouched, so xfcc_format has no effect and the combination indicates operator misconfiguration. Add rspec coverage for the new validation and surrounding valid combinations (sanitize_set+envoy, always_forward alone, xfcc_format without explicit forwarded_client_cert).

This was referenced Mar 5, 2026

RFC0055 Identity-Aware Routing cloudfoundry/cloud_controller_ng#4910

Open

Update cloud_controller_ng for mTLS app-to-app routing (RFC draft) cloudfoundry/capi-release#625

Draft

RFC: Domain-Scoped mTLS for GoRouter cloudfoundry/community#1438

Merged

cf-foundation-community-automation Bot added this to Application Runtime Platform Working Group Mar 5, 2026

cf-foundation-community-automation Bot moved this to Inbox in Application Runtime Platform Working Group Mar 5, 2026

rkoster force-pushed the feature/app-to-app-mtls-routing branch from 46b4007 to 23b4a0c Compare April 3, 2026 15:10

rkoster force-pushed the feature/app-to-app-mtls-routing branch 3 times, most recently from 1f9b804 to 79271b7 Compare April 17, 2026 12:12

rkoster added ready-to-run and removed ready-to-run labels Apr 17, 2026

rkoster force-pushed the feature/app-to-app-mtls-routing branch from 5cc4170 to b875867 Compare April 20, 2026 09:18

rkoster added ready-to-run and removed ready-to-run labels Apr 20, 2026

rkoster added 20 commits May 4, 2026 09:47

chore: add devbox files to .gitignore

e0fbb44

Devbox is used for local development environment setup but should not be tracked in the repository. The files remain in the working directory for developers who use devbox.

fix: remove routing-api from go.mod dependencies

e2afc87

routing-api is a local submodule in src/code.cloudfoundry.org/routing-api, not an external dependency. It should not be listed in go.mod. This was incorrectly added during rebase conflict resolution.

chore: update go.sum and vendor/modules.txt after rebase

72a0528

fix: run gofmt on post_selection_pipeline_test.go

ca1d441

fix: run gofmt on all modified Go files

8a2b023

fix: resolve go vet and staticcheck issues

1ecc40f

- Export MakeEndpoint method for test access - Fix test calls to MakeEndpoint with correct parameters - Remove unnecessary fmt.Sprintf for string argument

fix: run gofmt on modified files

7aedf7e

chore: trigger CI after rebase on develop

c9401c6

chore: trigger CI for port conflict fix

3fc1ec8

fix: apply gofmt to test files for CI compliance

9e12b19

Fix struct field alignment in test files to pass CI gofmt validation.

fix: apply gofmt to mtls_route_policies_auth_test.go

024bccc

fix: sync vendor/modules.txt nats.go version to v1.51.0 to match go.mod

9965cc0

fix: update routing-api submodule to gorm v2 compatible commit (93579…

71d7f72

…f8f2)

rkoster force-pushed the feature/app-to-app-mtls-routing branch from d528bad to 71d7f72 Compare May 4, 2026 10:18

rkoster added ready-to-run and removed ready-to-run labels May 4, 2026

rkoster added 5 commits May 4, 2026 14:06

rkoster added ready-to-run and removed ready-to-run labels May 6, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC0055 Identity-Aware Routing#535

RFC0055 Identity-Aware Routing#535
rkoster wants to merge 59 commits intodevelopfrom
feature/app-to-app-mtls-routing

rkoster commented Mar 5, 2026 •

edited

Loading

Uh oh!

rkoster commented Apr 16, 2026

Uh oh!

rkoster commented Apr 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rkoster commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

RFC0055: Identity-Aware mTLS Routing

What This Does

Phase 1a: mTLS Infrastructure

Phase 1b: Authorization

Key Design Decisions

Testing

Configuration Example

Related PRs

Merge Ordering

AI Disclosure

Uh oh!

rkoster commented Apr 16, 2026

Latest Update: RFC-Compliant Post-Selection Authorization

Key Changes (commit cbf0695)

RFC Compliance

Breaking Change

Integration Test Results

Uh oh!

rkoster commented Apr 16, 2026

Refactoring: AuthError for Future Extensibility

Changes

Benefits

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

rkoster commented Mar 5, 2026 •

edited

Loading

Key Changes (commit `cbf0695`)