feat(gatewayapi): mount operator trust bundle on envoy-gateway and envoy-proxy#4796

Open
electricjesus wants to merge 2 commits into tigera:master from electricjesus:seth/gateway-trust-bundle
@electricjesus electricjesus commented May 11, 2026

Description

Adds public CA roots (extracted from the operator's UBI base) and the Calico CA to a tigera-ca-bundle ConfigMap in the tigera-gateway namespace, and mounts that ConfigMap at /etc/pki/tls/certs on the envoy-gateway controller Deployment and on every provisioned envoy-proxy pod (Deployment + DaemonSet variants). This reuses the existing certificatemanager.CreateTrustedBundleWithSystemRootCertificates() pattern already consumed by intrusion-detection, log-collector, authentication, and core.

Type: bug fix / enhancement (gateway-api render path)
Components affected: pkg/render/gatewayapi, pkg/controller/gatewayapi

Why

Envoy-gateway pulls wasm OCI images and may originate TLS to public upstreams (JWT/OIDC providers, tracing exporters, HTTPS clusters), but calico/base ships no public CA roots. Result: x509: certificate signed by unknown authority on every outbound TLS handshake from envoy-gateway / envoy-proxy.
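The failure mode described above can be reproduced outside the cluster with Go's standard library alone. The following is a minimal sketch, not code from this PR: newSelfSignedCA, verifyWithRoots, and isUnknownAuthority are hypothetical helper names, and a throwaway self-signed certificate stands in for a public upstream's issuer. It shows that verification against an empty trust store yields x509.UnknownAuthorityError, and that adding the issuer to the pool makes the same verification pass.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// newSelfSignedCA creates a throwaway self-signed CA certificate.
// It exists only for this illustration and is not part of the operator.
func newSelfSignedCA() (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "example-root"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// verifyWithRoots verifies cert against a pool holding exactly the given roots.
// Passing no roots models an image that ships without any CA bundle.
func verifyWithRoots(cert *x509.Certificate, roots ...*x509.Certificate) error {
	pool := x509.NewCertPool()
	for _, r := range roots {
		pool.AddCert(r)
	}
	_, err := cert.Verify(x509.VerifyOptions{Roots: pool})
	return err
}

// isUnknownAuthority reports whether err is the "signed by unknown authority" error.
func isUnknownAuthority(err error) bool {
	var ua x509.UnknownAuthorityError
	return errors.As(err, &ua)
}

func main() {
	ca, err := newSelfSignedCA()
	if err != nil {
		fmt.Println("generate CA:", err)
		return
	}
	fmt.Println("empty trust store, unknown authority:", isUnknownAuthority(verifyWithRoots(ca)))
	fmt.Println("issuer present in bundle, error:", verifyWithRoots(ca, ca))
}
```

Envoy itself is not a Go program, so this only illustrates the trust-store logic, not Envoy's TLS stack.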

Per @hjiawei in Slack:

The public root ca certs are mounted by tigera operator as needed. Normally, we don't bake them in the component image.

Confirmed: build/Dockerfile:17,23 copies /etc/pki from ubi9/ubi-minimal into the operator image, and pkg/tls/certificatemanagement/certificatebundle.go:249-273 reads it at runtime via getSystemCertificates() and packages it for downstream components. The gateway-api render path was the only major component not consuming this.
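For readers unfamiliar with how a system bundle is consumed at runtime, here is a rough sketch of reading a PEM bundle and parsing its certificates. It is not the body of getSystemCertificates(), which is not shown in this PR; parseBundle is a hypothetical helper, and the path /etc/pki/tls/certs/ca-bundle.crt is an assumption about where the bundle sits under the copied /etc/pki tree.

```go
package main

import (
	"crypto/x509"
	"encoding/pem"
	"errors"
	"fmt"
	"os"
)

// parseBundle walks a PEM bundle and returns every certificate it contains.
// Blocks that are not certificates are skipped; an empty result is an error.
func parseBundle(data []byte) ([]*x509.Certificate, error) {
	var certs []*x509.Certificate
	for {
		var block *pem.Block
		block, data = pem.Decode(data)
		if block == nil {
			break // no further PEM blocks
		}
		if block.Type != "CERTIFICATE" {
			continue
		}
		cert, err := x509.ParseCertificate(block.Bytes)
		if err != nil {
			return nil, fmt.Errorf("parse certificate: %w", err)
		}
		certs = append(certs, cert)
	}
	if len(certs) == 0 {
		return nil, errors.New("no certificates found in bundle")
	}
	return certs, nil
}

func main() {
	// Assumed location of the bundle; adjust for the image being inspected.
	const path = "/etc/pki/tls/certs/ca-bundle.crt"
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read bundle:", err)
		return
	}
	certs, err := parseBundle(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "parse bundle:", err)
		return
	}
	fmt.Printf("%s: %d certificates\n", path, len(certs))
}
```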

What this replaces

Supersedes the image-side workaround in tigera/calico-private#11876 (baking CA roots into envoy-gateway + envoy-proxy images directly). With this operator change, the image change is no longer needed. Same logic applies to the Istio sibling PR tigera/calico-private#11878.

Test plan

  • go build ./... (excluding pre-existing istio embed failure)
  • go vet ./...
  • go test ./pkg/render/gatewayapi/... — 20/20 specs (new positive test asserts ConfigMap + mount paths on both controller and proxy)
  • go test ./pkg/controller/gatewayapi/... — green (BeforeEach now seeds the operator CA secret, mirroring the compliance controller test pattern)
  • Manual verify on a Kind/OrbStack cluster: install operator with this branch, apply GatewayAPI CR, confirm tigera-ca-bundle exists in tigera-gateway, envoy-gateway and envoy-proxy pods have the volume mounted at /etc/pki/tls/certs, outbound TLS from envoy-gateway (wasm OCI fetch) succeeds without x509: unknown authority

Release Note

Mount the operator-managed trusted CA bundle (public roots + Calico CA) on envoy-gateway and provisioned envoy-proxy pods so outbound TLS to public upstreams (e.g. wasm OCI registries, OIDC providers) succeeds without `x509: certificate signed by unknown authority`.

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a new feature.
    • enterprise if this PR applies to Calico Enterprise only.

…voy-proxy

Envoy-gateway pulls wasm OCI images and may originate TLS to public
upstreams (JWT/OIDC providers, tracing exporters, HTTPS clusters), but
calico/base ships no public CA roots, leading to "x509: certificate
signed by unknown authority" on every outbound TLS handshake.

Operator already extracts the public CA bundle from its UBI base at
build time and exposes it via certificatemanagement.TrustedBundle --
the same pattern used by intrusion-detection, log-collector,
authentication, and core (calico-node, typha). The gateway-api render
path simply did not consume it.

Build a TrustedBundleWithSystemRootCertificates in the gateway-api
reconciler and mount the resulting tigera-ca-bundle ConfigMap on both
the envoy-gateway controller deployment and every provisioned envoy-
proxy pod (Deployment + DaemonSet variants) at /etc/pki/tls/certs --
the path Envoy reads via SSL_CERT_DIR.
@marvin-tigera marvin-tigera added this to the v1.43.0 milestone May 11, 2026
@electricjesus electricjesus marked this pull request as ready for review May 11, 2026 19:27
@electricjesus electricjesus requested a review from a team as a code owner May 11, 2026 19:27
The FV suite runs only the GatewayAPI controller via setupManagerNoControllers,
so nothing creates the tigera-operator/tigera-ca-private secret. The new
CreateTrustedBundleWithSystemRootCertificates path in the gatewayapi controller
now fails with 'CA secret does not exist yet and is not allowed for this call',
which blocks GatewayClass rendering and times out 4 specs after 10s.

Mirror the unit-test setup: call certificatemanager.Create with AllowCACreation
and create the resulting Secret in tigera-operator, tolerating AlreadyExists
across runs.
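The "tolerating AlreadyExists" step can be sketched as follows. This is an illustration under stated assumptions rather than the FV suite's code: fakeStore and errAlreadyExists are hypothetical stand-ins for the Kubernetes client and its AlreadyExists API error, so the example runs with the standard library only.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists stands in for the API server's AlreadyExists error.
var errAlreadyExists = errors.New("already exists")

// fakeStore is a hypothetical in-memory replacement for the cluster client.
type fakeStore struct {
	objects map[string]bool
}

// Create fails with errAlreadyExists if the key was created before.
func (s *fakeStore) Create(key string) error {
	if s.objects[key] {
		return fmt.Errorf("secret %q: %w", key, errAlreadyExists)
	}
	s.objects[key] = true
	return nil
}

// ensureCreated creates the object and treats "already exists" as success,
// so repeated test runs against the same cluster do not fail on setup.
func ensureCreated(s *fakeStore, key string) error {
	err := s.Create(key)
	if err != nil && !errors.Is(err, errAlreadyExists) {
		return err
	}
	return nil
}

func main() {
	store := &fakeStore{objects: map[string]bool{}}
	const key = "tigera-operator/tigera-ca-private"
	fmt.Println("first run:", ensureCreated(store, key))
	fmt.Println("second run:", ensureCreated(store, key))
}
```

Any other error is still returned, so a genuine setup failure is not masked.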
return append(volumes, v)
}

func appendVolumeMountsIfMissing(mounts []corev1.VolumeMount, toAdd []corev1.VolumeMount) []corev1.VolumeMount {
Member

Can we do without this func and simply append? The overrides would still be called after this is called anyway.
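To make the trade-off in this question concrete, here is a sketch of what an "append if missing" helper might do compared with a plain append. The body of appendVolumeMountsIfMissing is not visible in this excerpt, so the name-based deduplication below is a guess; volumeMount is a simplified stand-in for corev1.VolumeMount, and the volume name tigera-ca-bundle is assumed to match the ConfigMap name.

```go
package main

import "fmt"

// volumeMount is a simplified stand-in for corev1.VolumeMount.
type volumeMount struct {
	Name      string
	MountPath string
}

// appendMountsIfMissing appends each mount from toAdd unless a mount with
// the same name is already present. Deduplicating by name is an assumption.
func appendMountsIfMissing(mounts, toAdd []volumeMount) []volumeMount {
	seen := make(map[string]bool, len(mounts))
	for _, m := range mounts {
		seen[m.Name] = true
	}
	for _, m := range toAdd {
		if seen[m.Name] {
			continue
		}
		seen[m.Name] = true
		mounts = append(mounts, m)
	}
	return mounts
}

func main() {
	bundle := volumeMount{Name: "tigera-ca-bundle", MountPath: "/etc/pki/tls/certs"}
	existing := []volumeMount{bundle}

	// Guarded append: the duplicate is dropped, leaving 1 mount.
	fmt.Println(len(appendMountsIfMissing(existing, []volumeMount{bundle})))
	// Plain append: the duplicate is kept, leaving 2 mounts.
	fmt.Println(len(append(existing, bundle)))
}
```

If the render path only ever starts from a spec that cannot already contain the mount, the two variants behave identically and the plain append is the simpler choice.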

3 participants