From d3c12f649d98da51eb8b15df66e71bee4dedfa75 Mon Sep 17 00:00:00 2001 From: Michal Fupso Date: Thu, 19 Mar 2026 18:46:03 -0700 Subject: [PATCH 1/2] Add egw host ip docs --- .../egress/egress-gateway-host-ip.mdx | 647 ++++++++++++++++++ .../egress/egress-gateway-host-ip.mdx | 647 ++++++++++++++++++ sidebars-calico-cloud.js | 1 + sidebars-calico-enterprise.js | 1 + 4 files changed, 1296 insertions(+) create mode 100644 calico-cloud/networking/egress/egress-gateway-host-ip.mdx create mode 100644 calico-enterprise/networking/egress/egress-gateway-host-ip.mdx diff --git a/calico-cloud/networking/egress/egress-gateway-host-ip.mdx b/calico-cloud/networking/egress/egress-gateway-host-ip.mdx new file mode 100644 index 0000000000..7be8194933 --- /dev/null +++ b/calico-cloud/networking/egress/egress-gateway-host-ip.mdx @@ -0,0 +1,647 @@ +--- +description: Configure egress gateways to use the host node IP as the source address for traffic leaving the cluster. +--- + +# Configure egress gateways with Host IP support + +## Big picture + +Configure specific application traffic to exit the cluster through an egress gateway, using the +gateway's **host (node) IP** as the source address for traffic leaving the cluster. + +## Value + +When traffic from particular applications leaves the cluster to access an external destination, it +can be useful to control the source IP of that traffic. For example, there may be an additional +firewall around the cluster, whose purpose includes policing external accesses from the cluster, and +specifically that particular external destinations can only be accessed from authorized workloads +within the cluster. + +In this mode, outbound traffic passing through an egress gateway is source-NATed (SNAT) to the +**node IP of the host** where the egress gateway pod is running, rather than the gateway's own pod +IP. 
This is useful when external firewalls or services need to allowlist traffic based on a +stable set of known node IPs, or when pod IPs are not routable outside the cluster. + +By scheduling egress gateways to specific nodes and setting `natOutgoing: true` on the egress IP +pool, you ensure that all application traffic routed through those gateways exits the cluster with +the node IP of the gateway's host as the source address. Any number of application pods can have +their outbound connections multiplexed through a fixed small number of egress gateways, and all of +those outbound connections will appear to come from the gateway nodes' IPs. + +:::note + +The source port of an outbound flow through an egress gateway can generally _not_ be +preserved. Changing the source port is how Linux maps flows from many upstream IPs onto a single +downstream IP. + +::: + +Egress gateways with host IP support are particularly useful when you want all outbound traffic from +a particular application to leave the cluster through a particular node or nodes, and to appear as +traffic originating from those nodes' IPs. The gateways are scheduled to the desired nodes, and the +application pods/namespaces are configured to use those gateways. + +## Concepts + +### Egress gateway + +An egress gateway acts as a transit pod for the outbound application traffic that is configured to +use it. As traffic leaving the cluster passes through the egress gateway, its source IP is changed +before the traffic is forwarded on. + +### Source IP with host IP mode + +When an outbound application flow leaves the cluster through an egress gateway, the source IP +depends on whether `natOutgoing` is enabled on the egress gateway's [IP pool](../../reference/resources/ippool.mdx). + +- If the egress gateway's IP pool has `natOutgoing: true`, the flow's source IP is the **node (host) + IP** of the node where the egress gateway pod is running. This is the **host IP mode** described in + this guide. 
+- If `natOutgoing: false` (or unset), the flow's source IP is the egress gateway's **pod IP**. + +In host IP mode, external services and firewalls see connections arriving from the egress gateway's +node IP. This is useful when node IPs are stable and well-known, making them suitable for firewall +allowlisting. + +### Control the use of egress gateways + +If a cluster ascribes special meaning to traffic flowing through egress gateways, it will be +important to control when cluster users can configure their pods and namespaces to use them, so that +non-special pods cannot impersonate the special meaning. + +If namespaces in a cluster can only be provisioned by cluster admins, one option is to enable egress +gateway function only on a per-namespace basis. Then only cluster admins will be able to configure +any egress gateway usage. + +Otherwise -- if namespace provisioning is open to users in general, or if it's desirable for egress +gateway function to be enabled both per-namespace and per-pod -- a [Kubernetes admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) will be +needed. This is a task for each deployment to implement for itself, but possible approaches include +the following. + +1. Decide whether a given Namespace or Pod is permitted to use egress annotations at all, based on + other details of the Namespace or Pod definition. + +1. Evaluate egress annotation selectors to determine the egress gateways that they map to, and + decide whether that usage is acceptable. + +1. Impose the cluster's own bespoke scheme for a Namespace or Pod to identify the egress gateways + that it wants to use, less general than $[prodname]'s egress annotations. Then the + admission controller would police those bespoke annotations (which that cluster's users could + place on Namespace or Pod resources) and either reject the operation in hand, or allow it + through after adding the corresponding $[prodname] egress annotations.
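As an illustration of the admission controller approach, a validating webhook can be registered for Pod and Namespace writes so that every attempt to set egress annotations is reviewed before admission. The configuration below is a minimal sketch, not part of $[prodname]: the webhook name, service name, namespace, and path are all hypothetical, and the backend service that actually inspects the `egress.projectcalico.org/...` annotations must be implemented and deployed separately (TLS `caBundle` configuration is also omitted here).

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: egress-annotation-policy          # hypothetical name
webhooks:
  - name: egress-annotations.example.com  # hypothetical webhook; backend implemented by the cluster admin
    admissionReviewVersions: ["v1"]
    sideEffects: None
    failurePolicy: Fail                   # reject writes if the webhook is unavailable
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods", "namespaces"]
    clientConfig:
      service:
        namespace: egress-admission       # hypothetical namespace and service
        name: egress-admission
        path: /validate
```

With `failurePolicy: Fail`, no Pod or Namespace carrying egress annotations can slip through while the reviewing service is down; the trade-off is that an outage of the webhook blocks all Pod and Namespace writes it matches.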
+ +### Policy enforcement for flows via an egress gateway + +For an outbound connection from a client pod, via an egress gateway, to a destination outside the +cluster, there is more than one possible enforcement point for policy. + +The path of the traffic through policy is as follows: + +1. The packet leaves the client pod and passes through its egress policy. +2. The packet is encapsulated by the client pod's host and sent to the egress gateway. +3. The encapsulated packet is sent from the host to the egress gateway pod. +4. The egress gateway pod de-encapsulates the packet and sends the packet out again with its own address. +5. The packet leaves the egress gateway pod through its egress policy. + +To ensure correct operation (as of v3.15), the encapsulated traffic between host and egress gateway is auto-allowed by +$[prodname] and other ingress traffic is blocked. That means that there are effectively two places where +policy can be applied: + +1. on egress from the client pod +2. on egress from the egress gateway pod (see limitations below). + +The policy applied at (1) is the most powerful since it implicitly sees the original source of the traffic (by +virtue of being attached to that original source). It also sees the external destination of the traffic. + +Since an egress gateway will never originate its own traffic, one option is to rely on policy applied at (1) and +to allow all traffic at (2) (either by applying no policy or by applying an "allow all"). + +Alternatively, for maximum "defense in depth", applying policy at both (1) and (2) provides extra protection should +the policy at (1) be disabled or bypassed by an attacker. Policy at (2) has the following limitations: + +- [Domain-based policy](../../network-policy/domain-based-policy.mdx) is not supported at egress from egress + gateways. It will either fail to match the expected traffic, or it will work intermittently if the egress gateway + happens to be scheduled to the same node as its clients.
This is because any DNS lookup happens at the client pod. + By the time the policy reaches (2) the DNS information is lost and only the IP addresses of the traffic are available. + +- The traffic source will appear to be the egress gateway pod; the source information is lost in the address + translation that occurs inside the egress gateway pod. + +That means that policies at (2) will usually take the form of rules that match only on destination port and IP address, +either directly in the rule (via a CIDR match) or via a (non-domain based) NetworkSet. Matching on source has little +utility since the IP will always be the egress gateway and the port of translated traffic is not always preserved. + +:::note + +Since v3.15.0, $[prodname] also sends health probes to the egress gateway pods from the nodes where +their clients are located. In iptables mode, this traffic is auto-allowed at egress from the host and ingress +to the egress gateway. In eBPF mode, the probe traffic can be blocked by policy, so you must ensure that this traffic is allowed; this should be fixed in an upcoming +patch release.
+ +::: + +## Before you begin + +**Required** + +- Calico CNI +- Open port UDP 4790 on the host + +**Not Supported** + +- GKE +- Windows + +## How to + +- [Enable egress gateway support](#enable-egress-gateway-support) +- [Provision an egress IP pool](#provision-an-egress-ip-pool) +- [Deploy a group of egress gateways](#deploy-a-group-of-egress-gateways) +- [Configure iptables backend for egress gateways](#configure-iptables-backend-for-egress-gateways) +- [Affine a client pod to a specific node](#affine-a-client-pod-to-a-specific-node) +- [Configure namespaces and pods to use egress gateways](#configure-namespaces-and-pods-to-use-egress-gateways) +- [Optionally enable ECMP load balancing](#optionally-enable-ecmp-load-balancing) +- [Verify the feature operation](#verify-the-feature-operation) +- [Control the use of egress gateways](#control-the-use-of-egress-gateways) +- [Upgrade egress gateways](#upgrade-egress-gateways) + +### Enable egress gateway support + +In the default **FelixConfiguration**, set the `egressIPSupport` field to `EnabledPerNamespace` or +`EnabledPerNamespaceOrPerPod`, according to the level of support that you need in your cluster. For +support on a per-namespace basis only: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespace"}}' +``` + +Or for support both per-namespace and per-pod: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespaceOrPerPod"}}' +``` + +:::note + +- `egressIPSupport` must be the same on all cluster nodes, so you should set it only in the + `default` FelixConfiguration resource. +- The operator automatically enables the required policy sync API in the FelixConfiguration. + +::: + +### Provision an egress IP pool + +Provision a small IP Pool with `natOutgoing: true`.
This ensures that traffic exiting through egress +gateways using this pool is source-NATed to the host IP of the node running the gateway pod. + +```bash +kubectl apply -f - < + terminationGracePeriodSeconds: 0 +EOF +``` + +Replace `` with the hostname of the node where you want the egress gateway to +run. Traffic passing through this gateway will exit the cluster with this node's IP as the source +address. + +:::note + +When deploying egress gateway in a non-default namespace on OpenShift, the namespace needs to be set privileged by adding the following to the namespace: + +##### Label +``` +openshift.io/run-level: "0" +pod-security.kubernetes.io/enforce: privileged +pod-security.kubernetes.io/enforce-version: latest +``` +##### Annotation +``` +security.openshift.io/scc.podSecurityLabelSync: "false" +``` +::: + +Where: + +- It is advisable to have more than one egress gateway per group, so that the egress IP function continues if one of the gateways crashes or needs to be restarted. When there are multiple gateways in a group, outbound traffic from the applications using that group is load-balanced across the available gateways. The number of `replicas` specified must be less than or equal to the number of free IP addresses in the IP Pool. + +- IPPool can be specified either by its name (e.g. `-name: egress-ippool-1`) or by its CIDR (e.g. `-cidr: 10.10.10.0/31`). + +- The labels are arbitrary. You can choose whatever names and values are convenient for your cluster's Namespaces and Pods to refer to in their egress selectors. + + If labels are not specified, a default label `projectcalico.org/egw`:`name` will be added by the Tigera Operator. + +- icmpProbe may be used to specify the Probe IPs, ICMP interval and timeout in seconds. `ips` if set, the + egress gateway pod will probe each IP periodically using an ICMP ping. If all pings fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. 
`timeoutSeconds` controls the timeout before reporting non-ready if no probes succeed. + + ```yaml + icmpProbe: + ips: + - probeIP + - probeIP + timeoutSeconds: 20 + intervalSeconds: 10 + ``` + +- httpProbe may be used to specify the probe URLs, HTTP interval and timeout in seconds. If `urls` is set, the + egress gateway pod probes each external service periodically. If all probes fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. + `timeoutSeconds` controls the timeout before reporting non-ready if all probes are failing. + + ```yaml + httpProbe: + urls: + - probeURL + - probeURL + timeoutSeconds: 30 + intervalSeconds: 10 + ``` +- Refer to the [operator reference docs](../../reference/installation/api.mdx) for details about the egress gateway resource type. + +The health port `8080` is used by: + +- The Kubernetes `readinessProbe` to expose the status of the egress gateway pod (and any ICMP/HTTP + probes). +- Remote pods to check if the egress gateway is "ready". Only "ready" egress + gateways will be used for remote client traffic. This traffic is automatically allowed by $[prodname] and + no policy is required to allow it. $[prodname] only sends probes to egress gateway pods that have a named + "health" port. This ensures that during an upgrade, health probes are only sent to upgraded egress gateways. + +### Deploy on an RKE2 CIS-hardened cluster + +If you are deploying `egress-gateway` on an RKE2 CIS-hardened cluster, its `PodSecurityPolicies` restrict the `securityContext` and `volumes` required by egress gateway. When deploying using the egress gateway custom resource, the Tigera Operator sets up `PodSecurityPolicy`, `Role`, `RoleBinding` and associated `ServiceAccount`. + +### Configure iptables backend for egress gateways + +The Tigera Operator configures egress gateways to use the same iptables backend as `calico-node`.
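For example, to pin the backend explicitly rather than inherit whatever `calico-node` detected, you can patch the default FelixConfiguration. This is a sketch only: `NFT` is shown, and `Legacy` and `Auto` are the other accepted values for this field.

```bash
# Set the iptables backend used by calico-node (and hence egress gateways)
kubectl patch felixconfiguration default --type='merge' -p \
  '{"spec":{"iptablesBackend":"NFT"}}'
```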
+To modify the iptables backend for egress gateways, you must change the `iptablesBackend` field in the [Felix configuration](../../reference/resources/felixconfig.mdx). + +### Configure IP autodetection for dual-ToR clusters. + +If you plan to use Egress Gateways in a [dual-ToR cluster](../configuring/dual-tor.mdx), you must also adjust the $[nodecontainer] IP +auto-detection method to pick up the stable IP, for example using the `interface: lo` setting +(The default first-found setting skips over the lo interface). This can be configured via the +$[prodname] [Installation resource](../../reference/installation/api.mdx#nodeaddressautodetection). + +### Affine a client pod to a specific node + +In host IP mode, you may want to control which node your client pods run on to ensure deterministic +routing through a specific egress gateway. Use `nodeSelector` to schedule a client pod to a specific +node: + +```bash +kubectl apply -f - < + containers: + - name: alpine + image: alpine + command: ["/bin/sleep"] + args: ["infinity"] +EOF +``` + +Replace `` with the hostname of the desired node. When combined with an egress gateway +policy that uses `gatewayPreference: PreferNodeLocal`, the client pod will prefer to route traffic +through an egress gateway on the same node, ensuring the traffic exits with that node's IP. + +### Configure namespaces and pods to use egress gateways + +You can configure namespaces and pods to use an egress gateway by: +* annotating the namespace or pod +* applying an egress gateway policy to the namespace or pod. + +Using an egress gateway policy is more complicated, but it allows advanced use cases. + +#### Configure a namespace or pod to use an egress gateway (annotation method) + +In a $[prodname] deployment, the Kubernetes namespace and pod resources honor annotations that +tell that namespace or pod to use particular egress gateways. 
These annotations are selectors, and +their meaning is "the set of pods, anywhere in the cluster, that match those selectors". + +So, to configure all the pods in a namespace to use the egress gateways that are +labelled with `egress-code: red`, you would annotate that namespace like this: + +```bash +kubectl annotate ns egress.projectcalico.org/selector="egress-code == 'red'" +``` + +By default, that selector can only match egress gateways in the same namespace. To select gateways +in a different namespace, specify a `namespaceSelector` annotation as well, like this: + +```bash +kubectl annotate ns egress.projectcalico.org/namespaceSelector="projectcalico.org/name == 'default'" +``` + +Egress gateway annotations have the same [syntax and range of expressions](../../reference/resources/networkpolicy.mdx#selector) as the selector fields in +$[prodname] [network policy](../../reference/resources/networkpolicy.mdx#entityrule). + +To configure a specific Kubernetes Pod to use egress gateways, specify the same annotations when +creating the pod. For example: + +```bash +kubectl apply -f - < egress.projectcalico.org/egressGatewayPolicy="egw-policy1" +``` + +To configure a specific Kubernetes pod to use the same policy, specify the same annotations when +creating the pod. +For example: + +```bash +kubectl apply -f - < -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' +``` + +By way of a concrete example, you could use netcat to run a test server outside the cluster; for +example: + +```bash +docker run --net=host --privileged subfuzion/netcat -v -l -k -p 8089 +``` + +Then provision an egress IP Pool (with `natOutgoing: true`), and egress gateways, as above. + +Then deploy a pod, with egress annotations as above, and with any image that includes netcat, for example: + +```bash +kubectl apply -f - < -n -- nc 8089 ` should be the IP address of the netcat server. 
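While the test connection is open, you can also confirm from inside the cluster which node the traffic should be leaving from. This is a sketch under the assumptions used in the examples above: the egress gateways carry the label `egress-code: red` and run in the `default` namespace.

```bash
# Find the node hosting the "red" egress gateway pod...
kubectl get pods -n default -l egress-code=red -o wide

# ...then print that node's InternalIP, which is the source IP the
# external server should observe in host IP mode (natOutgoing: true).
kubectl get node "$(kubectl get pods -n default -l egress-code=red \
  -o jsonpath='{.items[0].spec.nodeName}')" \
  -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}'
```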
+ +Then, if you check the logs or output of the netcat server, you should see: + +``` +Connection from received +``` + +with `` being the **node IP** of the host where the egress gateway pod is running (not +the gateway's pod IP or the egress IP pool IP). + +## Upgrade egress gateways + +From v3.16, egress gateway deployments are managed by the Tigera Operator. + +- When upgrading from a pre-v3.16 release, no automatic upgrade will occur. To upgrade a pre-v3.16 egress gateway deployment, + create an equivalent EgressGateway resource with the same namespace and the same name as mentioned [above](#deploy-a-group-of-egress-gateways); + the operator will then take over management of the old Deployment resource, replacing it with the upgraded version. + +- Use `kubectl apply` to create the egress gateway resource. Tigera Operator will read the newly created resource and wait + for the other $[prodname] components to be upgraded. Once the other $[prodname] components are upgraded, Tigera Operator + will upgrade the existing egress gateway deployment with the new image. + +By default, upgrading egress gateways will sever any connections that are flowing through them. To minimise impact, +the egress gateway feature supports some advanced options that give feedback to affected pods. For more details see +the [egress gateway maintenance guide](egress-gateway-maintenance.mdx). + +## Additional resources + +Please see also: + +- The `egressIP...` fields of the [FelixConfiguration resource](../../reference/resources/felixconfig.mdx#spec). 
+- [Additional configuration for egress gateway maintenance](egress-gateway-maintenance.mdx) diff --git a/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx b/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx new file mode 100644 index 0000000000..7be8194933 --- /dev/null +++ b/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx @@ -0,0 +1,647 @@ +--- +description: Configure egress gateways to use the host node IP as the source address for traffic leaving the cluster. +--- + +# Configure egress gateways with Host IP support + +## Big picture + +Configure specific application traffic to exit the cluster through an egress gateway, using the +gateway's **host (node) IP** as the source address for traffic leaving the cluster. + +## Value + +When traffic from particular applications leaves the cluster to access an external destination, it +can be useful to control the source IP of that traffic. For example, there may be an additional +firewall around the cluster, whose purpose includes policing external accesses from the cluster, and +specifically that particular external destinations can only be accessed from authorized workloads +within the cluster. + +In this mode, outbound traffic passing through an egress gateway is source-NATed (SNAT) to the +**node IP of the host** where the egress gateway pod is running, rather than the gateway's own pod +IP. This is useful when external firewalls or services need to allowlist traffic based on a +stable set of known node IPs, or when pod IPs are not routable outside the cluster. + +By scheduling egress gateways to specific nodes and setting `natOutgoing: true` on the egress IP +pool, you ensure that all application traffic routed through those gateways exits the cluster with +the node IP of the gateway's host as the source address. 
Any number of application pods can have +their outbound connections multiplexed through a fixed small number of egress gateways, and all of +those outbound connections will appear to come from the gateway nodes' IPs. + +:::note + +The source port of an outbound flow through an egress gateway can generally _not_ be +preserved. Changing the source port is how Linux maps flows from many upstream IPs onto a single +downstream IP. + +::: + +Egress gateways with host IP support are particularly useful when you want all outbound traffic from +a particular application to leave the cluster through a particular node or nodes, and to appear as +traffic originating from those nodes' IPs. The gateways are scheduled to the desired nodes, and the +application pods/namespaces are configured to use those gateways. + +## Concepts + +### Egress gateway + +An egress gateway acts as a transit pod for the outbound application traffic that is configured to +use it. As traffic leaving the cluster passes through the egress gateway, its source IP is changed +before the traffic is forwarded on. + +### Source IP with host IP mode + +When an outbound application flow leaves the cluster through an egress gateway, the source IP +depends on whether `natOutgoing` is enabled on the egress gateway's [IP pool](../../reference/resources/ippool.mdx). + +- If the egress gateway's IP pool has `natOutgoing: true`, the flow's source IP is the **node (host) + IP** of the node where the egress gateway pod is running. This is the **host IP mode** described in + this guide. +- If `natOutgoing: false` (or unset), the flow's source IP is the egress gateway's **pod IP**. + +In host IP mode, external services and firewalls see connections arriving from the egress gateway's +node IP. This is useful when node IPs are stable and well-known, making them suitable for firewall +allowlisting. 
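Host IP mode is therefore selected purely by the pool definition, not by any setting on the gateway itself. A minimal egress pool for this mode might look like the following sketch (the name is illustrative; the CIDR, `blockSize`, and `nodeSelector` follow the provisioning example later in this guide):

```yaml
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: egress-ippool-1      # illustrative name
spec:
  cidr: 10.10.10.0/31        # a small pool reserved for egress gateways
  blockSize: 31              # allocate in blocks matching the tiny pool
  nodeSelector: "!all()"     # keep general workloads off these IPs
  natOutgoing: true          # host IP mode: SNAT to the gateway node's IP
```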
+ +### Control the use of egress gateways + +If a cluster ascribes special meaning to traffic flowing through egress gateways, it will be +important to control when cluster users can configure their pods and namespaces to use them, so that +non-special pods cannot impersonate the special meaning. + +If namespaces in a cluster can only be provisioned by cluster admins, one option is to enable egress +gateway function only on a per-namespace basis. Then only cluster admins will be able to configure +any egress gateway usage. + +Otherwise -- if namespace provisioning is open to users in general, or if it's desirable for egress +gateway function to be enabled both per-namespace and per-pod -- a [Kubernetes admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) will be +needed. This is a task for each deployment to implement for itself, but possible approaches include +the following. + +1. Decide whether a given Namespace or Pod is permitted to use egress annotations at all, based on + other details of the Namespace or Pod definition. + +1. Evaluate egress annotation selectors to determine the egress gateways that they map to, and + decide whether that usage is acceptable. + +1. Impose the cluster's own bespoke scheme for a Namespace or Pod to identify the egress gateways + that it wants to use, less general than $[prodname]'s egress annotations. Then the + admission controller would police those bespoke annotations (which that cluster's users could + place on Namespace or Pod resources) and either reject the operation in hand, or allow it + through after adding the corresponding $[prodname] egress annotations. + +### Policy enforcement for flows via an egress gateway + +For an outbound connection from a client pod, via an egress gateway, to a destination outside the +cluster, there is more than one possible enforcement point for policy. + +The path of the traffic through policy is as follows: + +1.
The packet leaves the client pod and passes through its egress policy. +2. The packet is encapsulated by the client pod's host and sent to the egress gateway. +3. The encapsulated packet is sent from the host to the egress gateway pod. +4. The egress gateway pod de-encapsulates the packet and sends the packet out again with its own address. +5. The packet leaves the egress gateway pod through its egress policy. + +To ensure correct operation (as of v3.15), the encapsulated traffic between host and egress gateway is auto-allowed by +$[prodname] and other ingress traffic is blocked. That means that there are effectively two places where +policy can be applied: + +1. on egress from the client pod +2. on egress from the egress gateway pod (see limitations below). + +The policy applied at (1) is the most powerful since it implicitly sees the original source of the traffic (by +virtue of being attached to that original source). It also sees the external destination of the traffic. + +Since an egress gateway will never originate its own traffic, one option is to rely on policy applied at (1) and +to allow all traffic at (2) (either by applying no policy or by applying an "allow all"). + +Alternatively, for maximum "defense in depth", applying policy at both (1) and (2) provides extra protection should +the policy at (1) be disabled or bypassed by an attacker. Policy at (2) has the following limitations: + +- [Domain-based policy](../../network-policy/domain-based-policy.mdx) is not supported at egress from egress + gateways. It will either fail to match the expected traffic, or it will work intermittently if the egress gateway + happens to be scheduled to the same node as its clients. This is because any DNS lookup happens at the client pod. + By the time the policy reaches (2) the DNS information is lost and only the IP addresses of the traffic are available.
+ +- The traffic source will appear to be the egress gateway pod; the source information is lost in the address + translation that occurs inside the egress gateway pod. + +That means that policies at (2) will usually take the form of rules that match only on destination port and IP address, +either directly in the rule (via a CIDR match) or via a (non-domain based) NetworkSet. Matching on source has little +utility since the IP will always be the egress gateway and the port of translated traffic is not always preserved. + +:::note + +Since v3.15.0, $[prodname] also sends health probes to the egress gateway pods from the nodes where +their clients are located. In iptables mode, this traffic is auto-allowed at egress from the host and ingress +to the egress gateway. In eBPF mode, the probe traffic can be blocked by policy, so you must ensure that this traffic is allowed; this should be fixed in an upcoming +patch release. + +::: + +## Before you begin + +**Required** + +- Calico CNI +- Open port UDP 4790 on the host + +**Not Supported** + +- GKE +- Windows + +## How to + +- [Enable egress gateway support](#enable-egress-gateway-support) +- [Provision an egress IP pool](#provision-an-egress-ip-pool) +- [Deploy a group of egress gateways](#deploy-a-group-of-egress-gateways) +- [Configure iptables backend for egress gateways](#configure-iptables-backend-for-egress-gateways) +- [Affine a client pod to a specific node](#affine-a-client-pod-to-a-specific-node) +- [Configure namespaces and pods to use egress gateways](#configure-namespaces-and-pods-to-use-egress-gateways) +- [Optionally enable ECMP load balancing](#optionally-enable-ecmp-load-balancing) +- [Verify the feature operation](#verify-the-feature-operation) +- [Control the use of egress gateways](#control-the-use-of-egress-gateways) +- [Upgrade egress gateways](#upgrade-egress-gateways) + +### Enable egress gateway support + +In the default **FelixConfiguration**, set the `egressIPSupport` field to
`EnabledPerNamespace` or +`EnabledPerNamespaceOrPerPod`, according to the level of support that you need in your cluster. For +support on a per-namespace basis only: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespace"}}' +``` + +Or for support both per-namespace and per-pod: + +```bash +kubectl patch felixconfiguration default --type='merge' -p \ + '{"spec":{"egressIPSupport":"EnabledPerNamespaceOrPerPod"}}' +``` + +:::note + +- `egressIPSupport` must be the same on all cluster nodes, so you should set them only in the + `default` FelixConfiguration resource. +- The operator automatically enables the required policy sync API in the FelixConfiguration. + +::: + +### Provision an egress IP pool + +Provision a small IP Pool with `natOutgoing: true`. This ensures that traffic exiting through egress +gateways using this pool is source-NATed to the host IP of the node running the gateway pod. + +```bash +kubectl apply -f - < + terminationGracePeriodSeconds: 0 +EOF +``` + +Replace `` with the hostname of the node where you want the egress gateway to +run. Traffic passing through this gateway will exit the cluster with this node's IP as the source +address. + +:::note + +When deploying egress gateway in a non-default namespace on OpenShift, the namespace needs to be set privileged by adding the following to the namespace: + +##### Label +``` +openshift.io/run-level: "0" +pod-security.kubernetes.io/enforce: privileged +pod-security.kubernetes.io/enforce-version: latest +``` +##### Annotation +``` +security.openshift.io/scc.podSecurityLabelSync: "false" +``` +::: + +Where: + +- It is advisable to have more than one egress gateway per group, so that the egress IP function continues if one of the gateways crashes or needs to be restarted. When there are multiple gateways in a group, outbound traffic from the applications using that group is load-balanced across the available gateways. 
The number of `replicas` specified must be less than or equal to the number of free IP addresses in the IP Pool. + +- IPPool can be specified either by its name (e.g. `-name: egress-ippool-1`) or by its CIDR (e.g. `-cidr: 10.10.10.0/31`). + +- The labels are arbitrary. You can choose whatever names and values are convenient for your cluster's Namespaces and Pods to refer to in their egress selectors. + + If labels are not specified, a default label `projectcalico.org/egw`:`name` will be added by the Tigera Operator. + +- icmpProbe may be used to specify the probe IPs, ICMP interval and timeout in seconds. If `ips` is set, the + egress gateway pod probes each IP periodically using an ICMP ping. If all pings fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. + `timeoutSeconds` controls the timeout before reporting non-ready if no probes succeed. + + ```yaml + icmpProbe: + ips: + - probeIP + - probeIP + timeoutSeconds: 20 + intervalSeconds: 10 + ``` + +- httpProbe may be used to specify the probe URLs, HTTP interval and timeout in seconds. If `urls` is set, the + egress gateway pod probes each external service periodically. If all probes fail then the egress + gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. + `timeoutSeconds` controls the timeout before reporting non-ready if all probes are failing. + + ```yaml + httpProbe: + urls: + - probeURL + - probeURL + timeoutSeconds: 30 + intervalSeconds: 10 + ``` +- Refer to the [operator reference docs](../../reference/installation/api.mdx) for details about the egress gateway resource type. + +The health port `8080` is used by: + +- The Kubernetes `readinessProbe` to expose the status of the egress gateway pod (and any ICMP/HTTP + probes). +- Remote pods to check if the egress gateway is "ready". Only "ready" egress + gateways will be used for remote client traffic.
This traffic is automatically allowed by $[prodname] and + no policy is required to allow it. $[prodname] only sends probes to egress gateway pods that have a named + "health" port. This ensures that during an upgrade, health probes are only sent to upgraded egress gateways. + +### Deploy on a RKE2 CIS-hardened cluster + +If you are deploying `egress-gateway` on a RKE2 CIS-hardened cluster, its `PodSecurityPolicies` restrict the `securityContext` and `volumes` required by egress gateway. When deploying using the egress gateway custom resource, the Tigera Operator sets up `PodSecurityPolicy`, `Role`, `RoleBinding` and associated `ServiceAccount`. + +### Configure iptables backend for egress gateways + +The Tigera Operator configures egress gateways to use the same iptables backend as `calico-node`. +To modify the iptables backend for egress gateways, you must change the `iptablesBackend` field in the [Felix configuration](../../reference/resources/felixconfig.mdx). + +### Configure IP autodetection for dual-ToR clusters. + +If you plan to use Egress Gateways in a [dual-ToR cluster](../configuring/dual-tor.mdx), you must also adjust the $[nodecontainer] IP +auto-detection method to pick up the stable IP, for example using the `interface: lo` setting +(The default first-found setting skips over the lo interface). This can be configured via the +$[prodname] [Installation resource](../../reference/installation/api.mdx#nodeaddressautodetection). + +### Affine a client pod to a specific node + +In host IP mode, you may want to control which node your client pods run on to ensure deterministic +routing through a specific egress gateway. Use `nodeSelector` to schedule a client pod to a specific +node: + +```bash +kubectl apply -f - < + containers: + - name: alpine + image: alpine + command: ["/bin/sleep"] + args: ["infinity"] +EOF +``` + +Replace `` with the hostname of the desired node. 
When combined with an egress gateway +policy that uses `gatewayPreference: PreferNodeLocal`, the client pod will prefer to route traffic +through an egress gateway on the same node, ensuring the traffic exits with that node's IP. + +### Configure namespaces and pods to use egress gateways + +You can configure namespaces and pods to use an egress gateway by: +* annotating the namespace or pod +* applying an egress gateway policy to the namespace or pod. + +Using an egress gateway policy is more complicated, but it allows advanced use cases. + +#### Configure a namespace or pod to use an egress gateway (annotation method) + +In a $[prodname] deployment, the Kubernetes namespace and pod resources honor annotations that +tell that namespace or pod to use particular egress gateways. These annotations are selectors, and +their meaning is "the set of pods, anywhere in the cluster, that match those selectors". + +So, to configure all the pods in a namespace to use the egress gateways that are +labelled with `egress-code: red`, you would annotate that namespace like this: + +```bash +kubectl annotate ns egress.projectcalico.org/selector="egress-code == 'red'" +``` + +By default, that selector can only match egress gateways in the same namespace. To select gateways +in a different namespace, specify a `namespaceSelector` annotation as well, like this: + +```bash +kubectl annotate ns egress.projectcalico.org/namespaceSelector="projectcalico.org/name == 'default'" +``` + +Egress gateway annotations have the same [syntax and range of expressions](../../reference/resources/networkpolicy.mdx#selector) as the selector fields in +$[prodname] [network policy](../../reference/resources/networkpolicy.mdx#entityrule). + +To configure a specific Kubernetes Pod to use egress gateways, specify the same annotations when +creating the pod. 
For example: + +```bash +kubectl apply -f - < egress.projectcalico.org/egressGatewayPolicy="egw-policy1" +``` + +To configure a specific Kubernetes pod to use the same policy, specify the same annotations when +creating the pod. +For example: + +```bash +kubectl apply -f - < -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' +``` + +By way of a concrete example, you could use netcat to run a test server outside the cluster; for +example: + +```bash +docker run --net=host --privileged subfuzion/netcat -v -l -k -p 8089 +``` + +Then provision an egress IP Pool (with `natOutgoing: true`), and egress gateways, as above. + +Then deploy a pod, with egress annotations as above, and with any image that includes netcat, for example: + +```bash +kubectl apply -f - < -n -- nc 8089 ` should be the IP address of the netcat server. + +Then, if you check the logs or output of the netcat server, you should see: + +``` +Connection from received +``` + +with `` being the **node IP** of the host where the egress gateway pod is running (not +the gateway's pod IP or the egress IP pool IP). + +## Upgrade egress gateways + +From v3.16, egress gateway deployments are managed by the Tigera Operator. + +- When upgrading from a pre-v3.16 release, no automatic upgrade will occur. To upgrade a pre-v3.16 egress gateway deployment, + create an equivalent EgressGateway resource with the same namespace and the same name as mentioned [above](#deploy-a-group-of-egress-gateways); + the operator will then take over management of the old Deployment resource, replacing it with the upgraded version. + +- Use `kubectl apply` to create the egress gateway resource. Tigera Operator will read the newly created resource and wait + for the other $[prodname] components to be upgraded. Once the other $[prodname] components are upgraded, Tigera Operator + will upgrade the existing egress gateway deployment with the new image. 
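+
+For reference, an equivalent EgressGateway resource might look like the following minimal sketch.
+The name, namespace, pool name, and label values are illustrative placeholders; the name and
+namespace must match your existing pre-v3.16 deployment:
+
+```yaml
+apiVersion: operator.tigera.io/v1
+kind: EgressGateway
+metadata:
+  name: egress-gateway-1 # same name as the old Deployment (placeholder)
+  namespace: default # same namespace as the old Deployment (placeholder)
+spec:
+  replicas: 1
+  ipPools:
+    # The pool can be referenced by name or by CIDR.
+    - name: egress-ippool-1
+  template:
+    metadata:
+      labels:
+        egress-code: red # label referenced by clients' egress selectors
+```
+
+Once applied with `kubectl apply`, the operator takes over management of the old Deployment as
+described above.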
+ +By default, upgrading egress gateways will sever any connections that are flowing through them. To minimise impact, +the egress gateway feature supports some advanced options that give feedback to affected pods. For more details see +the [egress gateway maintenance guide](egress-gateway-maintenance.mdx). + +## Additional resources + +Please see also: + +- The `egressIP...` fields of the [FelixConfiguration resource](../../reference/resources/felixconfig.mdx#spec). +- [Additional configuration for egress gateway maintenance](egress-gateway-maintenance.mdx) diff --git a/sidebars-calico-cloud.js b/sidebars-calico-cloud.js index eeb0e0105f..8eb1d11923 100644 --- a/sidebars-calico-cloud.js +++ b/sidebars-calico-cloud.js @@ -371,6 +371,7 @@ module.exports = { link: { type: 'doc', id: 'networking/egress/index' }, items: [ 'networking/egress/egress-gateway-on-prem', + 'networking/egress/egress-gateway-host-ip', 'networking/egress/egress-gateway-aws', 'networking/egress/egress-gateway-azure', 'networking/egress/egress-gateway-maintenance', diff --git a/sidebars-calico-enterprise.js b/sidebars-calico-enterprise.js index 1f4a792134..6296d3a831 100644 --- a/sidebars-calico-enterprise.js +++ b/sidebars-calico-enterprise.js @@ -191,6 +191,7 @@ module.exports = { link: { type: 'doc', id: 'networking/egress/index' }, items: [ 'networking/egress/egress-gateway-on-prem', + 'networking/egress/egress-gateway-host-ip', 'networking/egress/egress-gateway-azure', 'networking/egress/egress-gateway-aws', 'networking/egress/egress-gateway-maintenance', From b47c88a23e0e98f1c4e8cd9851309d459b17e1b1 Mon Sep 17 00:00:00 2001 From: Michal Fupso Date: Tue, 12 May 2026 15:16:10 -0700 Subject: [PATCH 2/2] Update host ip docs --- .../egress/egress-gateway-host-ip.mdx | 643 ++---------------- .../egress/egress-gateway-host-ip.mdx | 643 ++---------------- 2 files changed, 124 insertions(+), 1162 deletions(-) diff --git a/calico-cloud/networking/egress/egress-gateway-host-ip.mdx 
b/calico-cloud/networking/egress/egress-gateway-host-ip.mdx index 7be8194933..a63d87fb51 100644 --- a/calico-cloud/networking/egress/egress-gateway-host-ip.mdx +++ b/calico-cloud/networking/egress/egress-gateway-host-ip.mdx @@ -2,646 +2,127 @@ description: Configure egress gateways to use the host node IP as the source address for traffic leaving the cluster. --- -# Configure egress gateways with Host IP support +# Configure egress gateways with host IP support ## Big picture -Configure specific application traffic to exit the cluster through an egress gateway, using the -gateway's **host (node) IP** as the source address for traffic leaving the cluster. +Configure an existing egress gateway deployment so that traffic exiting through it appears to come +from the **host address** of the host running the gateway pod, rather than the gateway's pod IP. ## Value -When traffic from particular applications leaves the cluster to access an external destination, it -can be useful to control the source IP of that traffic. For example, there may be an additional -firewall around the cluster, whose purpose includes policing external accesses from the cluster, and -specifically that particular external destinations can only be accessed from authorized workloads -within the cluster. +External firewalls and services often allowlist traffic by source IP. When egress gateway pod IPs +are not routable outside the cluster, or are simply not convenient to manage, using the gateway's +**host address** as the source address gives external systems a stable, well-known set of IPs to +allowlist. -In this mode, outbound traffic passing through an egress gateway is source-NATed (SNAT) to the -**node IP of the host** where the egress gateway pod is running, rather than the gateway's own pod -IP. This is useful when external firewalls or services need to allowlist traffic based on a -stable set of known node IPs, or when pod IPs are not routable outside the cluster.
- -By scheduling egress gateways to specific nodes and setting `natOutgoing: true` on the egress IP -pool, you ensure that all application traffic routed through those gateways exits the cluster with -the node IP of the gateway's host as the source address. Any number of application pods can have -their outbound connections multiplexed through a fixed small number of egress gateways, and all of -those outbound connections will appear to come from the gateway nodes' IPs. +Any number of application pods can multiplex their outbound traffic through a small fixed set of +egress gateways, and all of those connections will appear to come from the gateways' host address. :::note -The source port of an outbound flow through an egress gateway can generally _not_ be -preserved. Changing the source port is how Linux maps flows from many upstream IPs onto a single -downstream IP. +Host IP mode is most commonly used with the [on-premises setup](egress-gateway-on-prem.mdx), where +the alternative source IP would be a non-routable pod IP. On [AWS](egress-gateway-aws.mdx) and +[Azure](egress-gateway-azure.mdx) setups, gateways already use native VPC/VNet IPs that are +routable on the underlying network - enabling host IP mode there replaces those native IPs with +the host address. ::: -Egress gateways with host IP support are particularly useful when you want all outbound traffic from -a particular application to leave the cluster through a particular node or nodes, and to appear as -traffic originating from those nodes' IPs. The gateways are scheduled to the desired nodes, and the -application pods/namespaces are configured to use those gateways. - ## Concepts -### Egress gateway - -An egress gateway acts as a transit pod for the outbound application traffic that is configured to -use it. As traffic leaving the cluster passes through the egress gateway, its source IP is changed -before the traffic is forwarded on. 
- -### Source IP with host IP mode - -When an outbound application flow leaves the cluster through an egress gateway, the source IP -depends on whether `natOutgoing` is enabled on the egress gateway's [IP pool](../../reference/resources/ippool.mdx). - -- If the egress gateway's IP pool has `natOutgoing: true`, the flow's source IP is the **node (host) - IP** of the node where the egress gateway pod is running. This is the **host IP mode** described in - this guide. -- If `natOutgoing: false` (or unset), the flow's source IP is the egress gateway's **pod IP**. - -In host IP mode, external services and firewalls see connections arriving from the egress gateway's -node IP. This is useful when node IPs are stable and well-known, making them suitable for firewall -allowlisting. - -### Control the use of egress gateways - -If a cluster ascribes special meaning to traffic flowing through egress gateways, it will be -important to control when cluster users can configure their pods and namespaces to use them, so that -non-special pods cannot impersonate the special meaning. - -If namespaces in a cluster can only be provisioned by cluster admins, one option is to enable egress -gateway function only on a per-namespace basis. Then only cluster admins will be able to configure -any egress gateway usage. - -Otherwise -- if namespace provisioning is open to users in general, or if it's desirable for egress -gateway function to be enabled both per-namespace and per-pod -- a [Kubernetes admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) - will be -needed. This is a task for each deployment to implement for itself, but possible approaches include -the following. - -1. Decide whether a given Namespace or Pod is permitted to use egress annotations at all, based on - other details of the Namespace or Pod definition. - -1. 
Evaluate egress annotation selectors to determine the egress gateways that they map to, and - decide whether that usage is acceptable. - -1. Impose the cluster's own bespoke scheme for a Namespace or Pod to identify the egress gateways - that it wants to use, less general than $[prodname]'s egress annotations. Then the - admission controller would police those bespoke annotations (that that cluster's users could - place on Namespace or Pod resources) and either reject the operation in hand, or allow it - through after adding the corresponding $[prodname] egress annotations. - -### Policy enforcement for flows via an egress gateway - -For an outbound connection from a client pod, via an egress gateway, to a destination outside the -cluster, there is more than one possible enforcement point for policy: - -The path of the traffic through policy is as follows: - -1. Packet leaves the client pod and passes through its egress policy. -2. The packet is encapsulated by the client pod's host and sent to the egress gateway -3. The encapsulated packet is sent from the host to the egress gateway pod. -4. The egress gateway pod de-encapsulates the packet and sends the packet out again with its own address. -5. The packet leaves the egress gateway pod through its egress policy. - -To ensure correct operation, (as of v3.15) the encapsulated traffic between host and egress gateway is auto-allowed by -$[prodname] and other ingress traffic is blocked. That means that there are effectively two places where -policy can be applied: - -1. on egress from the client pod -2. on egress from the egress gateway pod (see limitations below). - -The policy applied at (1) is the most powerful since it implicitly sees the original source of the traffic (by -virtue of being attached to that original source). It also sees the external destination of the traffic. 
- -Since an egress gateway will never originate its own traffic, one option is to rely on policy applied at (1) and -to allow all traffic to at (2) (either by applying no policy or by applying an "allow all"). +### Source IP and `natOutgoing` -Alternatively, for maximum "defense in depth" applying policy at both (1) and (2) provides extra protection should -the policy at (1) be disabled or bypassed by an attacker. Policy at (2) has the following limitations: +When an outbound application flow leaves the cluster through an egress gateway, the source IP seen +by external services depends on the `natOutgoing` setting of the egress gateway's +[IP pool](../../reference/resources/ippool.mdx): -- [Domain-based policy](../../network-policy/domain-based-policy.mdx) is not supported at egress from egress - gateways. It will either fail to match the expected traffic, or it will work intermittently if the egress gateway - happens to be scheduled to the same node as its clients. This is because any DNS lookup happens at the client pod. - By the time the policy reaches (2) the DNS information is lost and only the IP addresses of the traffic are available. - -- The traffic source will appear to be the egress gateway pod, the source information is lost in the address - translation that occurs inside the egress gateway pod. - -That means that policies at (2) will usually take the form of rules that match only on destination port and IP address, -either directly in the rule (via a CIDR match) or via a (non-domain based) NetworkSet. Matching on source has little -utility since the IP will always be the egress gateway and the port of translated traffic is not always preserved. - -:::note - -Since v3.15.0, $[prodname] also sends health probes to the egress gateway pods from the nodes where -their clients are located. In iptables mode, this traffic is auto-allowed at egress from the host and ingress -to the egress gateway. 
In eBPF mode, the probe traffic can be blocked by policy, so you must ensure that this traffic allowed; this should be fixed in an upcoming -patch release. - -::: +- `natOutgoing: false` - the flow's source IP is the egress gateway's **pod IP**. This + is the default for all egress gateway setup guides. +- `natOutgoing: true` - the flow's source IP is the **host address** of the node where the egress + gateway pod is running. ## Before you begin -**Required** - -- Calico CNI -- Open port UDP 4790 on the host - -**Not Supported** - -- GKE -- Windows +These instructions require a functioning egress gateway deployment. +For setup, see [our egress gateway guides](index.mdx). ## How to -- [Enable egress gateway support](#enable-egress-gateway-support) -- [Provision an egress IP pool](#provision-an-egress-ip-pool) -- [Deploy a group of egress gateways](#deploy-a-group-of-egress-gateways) -- [Configure iptables backend for egress gateways](#configure-iptables-backend-for-egress-gateways) -- [Affine a client pod to a specific node](#affine-a-client-pod-to-a-specific-node) -- [Configure namespaces and pods to use egress gateways](#configure-namespaces-and-pods-to-use-egress-gateways) -- [Optionally enable ECMP load balancing](#optionally-enable-ecmp-load-balancing) -- [Verify the feature operation](#verify-the-feature-operation) -- [Control the use of egress gateways](#control-the-use-of-egress-gateways) -- [Upgrade egress gateways](#upgrade-egress-gateways) - -### Enable egress gateway support +- [Enable natOutgoing on the egress IP pool](#enable-natoutgoing-on-the-egress-ip-pool) +- [Pin egress gateways to specific nodes](#pin-egress-gateways-to-specific-nodes) +- [Affine client pods to a specific node](#affine-client-pods-to-a-specific-node) +- [Verify the source IP](#verify-the-source-ip) -In the default **FelixConfiguration**, set the `egressIPSupport` field to `EnabledPerNamespace` or -`EnabledPerNamespaceOrPerPod`, according to the level of support that you need in
your cluster. For -support on a per-namespace basis only: +### Enable natOutgoing on the egress IP pool -```bash -kubectl patch felixconfiguration default --type='merge' -p \ - '{"spec":{"egressIPSupport":"EnabledPerNamespace"}}' -``` - -Or for support both per-namespace and per-pod: +Set `natOutgoing: true` on the IP pool used by your egress gateways: ```bash -kubectl patch felixconfiguration default --type='merge' -p \ - '{"spec":{"egressIPSupport":"EnabledPerNamespaceOrPerPod"}}' +kubectl patch ippool egress-ippool-1 --type='merge' -p '{"spec":{"natOutgoing":true}}' ``` -:::note - -- `egressIPSupport` must be the same on all cluster nodes, so you should set them only in the - `default` FelixConfiguration resource. -- The operator automatically enables the required policy sync API in the FelixConfiguration. +Outbound traffic leaving the cluster through a gateway in this pool will now be SNAT'd to the node +IP of the gateway's host, instead of the gateway's pod IP. -::: +### Pin egress gateways to specific nodes -### Provision an egress IP pool - -Provision a small IP Pool with `natOutgoing: true`. This ensures that traffic exiting through egress -gateways using this pool is source-NATed to the host IP of the node running the gateway pod. +The source IP that external services see depends on which node the gateway pod is +scheduled to. To make this deterministic, set a `nodeSelector` on the gateway template: ```bash -kubectl apply -f - <"}}}}}' ``` -Where: - -- `natOutgoing: true` is required for host IP mode. This causes traffic leaving the cluster through - an egress gateway to be SNAT'd to the **node IP** of the gateway's host, instead of the gateway's - pod IP. - -- It is best to set the `blockSize` to 32 so that each block contains only a single IP address: - - - Scheduling a single egress gateway to a node causes the node to claim a whole block. 
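+
+If you prefer to manage the gateway declaratively rather than patching it, the same pinning can be
+expressed in the EgressGateway resource itself. A minimal sketch, assuming a gateway named
+`egress-gateway-1` in the `default` namespace that uses the `egress-ippool-1` pool (all names are
+illustrative):
+
+```yaml
+apiVersion: operator.tigera.io/v1
+kind: EgressGateway
+metadata:
+  name: egress-gateway-1
+  namespace: default
+spec:
+  replicas: 1
+  ipPools:
+    - name: egress-ippool-1
+  template:
+    spec:
+      # Pin the gateway pod to one node; that node's host address becomes
+      # the source IP for traffic exiting through this gateway.
+      nodeSelector:
+        kubernetes.io/hostname: my-egress-node # illustrative hostname
+```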
The other IPs in the block -are wasted unless a second egress gateway (with the same pool configuration) is scheduled to the same node. +Traffic passing through this gateway will exit the cluster with that node's IP as the +source address. - - Empty /32 blocks can always be reclaimed from other nodes if the pool runs out of blocks. This ensures that an - egress gateway can always be scheduled if there are free IPs in the pool. +Without pinning, the source IP will still be a node IP, but it could be that of any node the +gateway happens to land on. - Setting `strictAffinity` to `false` in the [IPAM configuration](../../reference/resources/ipamconfig) also prevents the - above problems by allowing nodes to "borrow" IPs from other nodes' blocks. However, using /32 blocks: +### Affine client pods to a specific node - - Avoids a dependency on a setting that is shared with other IP pools. +This step is optional. If you want a particular client's traffic to deterministically exit through +a particular node's IP, schedule the client to the same node as a gateway and apply an +[EgressGatewayPolicy](egress-gateway-on-prem.mdx#configure-a-namespace-or-pod-to-use-an-egress-gateway-egress-gateway-policy-method) +with `gatewayPreference: PreferNodeLocal`. The client will then prefer the gateway on its own node, +ensuring traffic exits with that node's IP. - - Results in simpler, uniform route advertisements (rather than a mix of block size routes and /32 routes). +For example, to pin a workload to a specific node, add a `nodeSelector` to its pod spec: - - Results in less route churn.
- -- Set `ipipMode` or `vxlanMode` to `Always` if the pod network has [IPIP or VXLAN](../configuring/vxlan-ipip.mdx) enabled. - - :::note - - This setting is not specific to egress gateway. In some cases where nodes happen to be in the same subnet, setting the value to `Never`will work the same as `Always`. It all depends on the hop from the client node to the egress gateway node. For example, if the client nodes are in the same AWS subnet, and you are using `Always` because some of the nodes are in different subnets, then `Never` will work for the egress IP Pool when the client and gateway nodes are in the same subnet. - - ::: - -### Deploy a group of egress gateways - -Use an egress gateway custom resource to deploy a group of egress gateways, using the egress IP Pool. -Because we are using host IP mode, you should schedule the egress gateway to a specific node using -`nodeSelector` so that you know which node IP will be used as the source address for egress traffic. - -```bash -kubectl apply -f - < - terminationGracePeriodSeconds: 0 -EOF -``` - -Replace `` with the hostname of the node where you want the egress gateway to -run. Traffic passing through this gateway will exit the cluster with this node's IP as the source -address. - -:::note - -When deploying egress gateway in a non-default namespace on OpenShift, the namespace needs to be set privileged by adding the following to the namespace: - -##### Label -``` -openshift.io/run-level: "0" -pod-security.kubernetes.io/enforce: privileged -pod-security.kubernetes.io/enforce-version: latest -``` -##### Annotation -``` -security.openshift.io/scc.podSecurityLabelSync: "false" -``` -::: - -Where: - -- It is advisable to have more than one egress gateway per group, so that the egress IP function continues if one of the gateways crashes or needs to be restarted. When there are multiple gateways in a group, outbound traffic from the applications using that group is load-balanced across the available gateways. 
The number of `replicas` specified must be less than or equal to the number of free IP addresses in the IP Pool. - -- IPPool can be specified either by its name (e.g. `-name: egress-ippool-1`) or by its CIDR (e.g. `-cidr: 10.10.10.0/31`). - -- The labels are arbitrary. You can choose whatever names and values are convenient for your cluster's Namespaces and Pods to refer to in their egress selectors. - - If labels are not specified, a default label `projectcalico.org/egw`:`name` will be added by the Tigera Operator. - -- icmpProbe may be used to specify the Probe IPs, ICMP interval and timeout in seconds. `ips` if set, the - egress gateway pod will probe each IP periodically using an ICMP ping. If all pings fail then the egress - gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. - `timeoutSeconds` controls the timeout before reporting non-ready if no probes succeed. - - ```yaml - icmpProbe: - ips: - - probeIP - - probeIP - timeoutSeconds: 20 - intervalSeconds: 10 - ``` - -- httpProbe may be used to specify the Probe URLs, HTTP interval and timeout in seconds. `urls` if set, the - egress gateway pod will probe each external service periodically. If all probes fail then the egress - gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. - `timeoutSeconds` controls the timeout before reporting non-ready if all probes are failing. - - ```yaml - httpProbe: - urls: - - probeURL - - probeURL - timeoutSeconds: 30 - intervalSeconds: 10 - ``` -- Please refer to the [operator reference docs](../../reference/installation/api.mdx) for details about the egress gateway resource type. - -The health port `8080` is used by: - -- The Kubernetes `readinessProbe` to expose the status of the egress gateway pod (and any ICMP/HTTP - probes). -- Remote pods to check if the egress gateway is "ready". Only "ready" egress - gateways will be used for remote client traffic. 
This traffic is automatically allowed by $[prodname] and - no policy is required to allow it. $[prodname] only sends probes to egress gateway pods that have a named - "health" port. This ensures that during an upgrade, health probes are only sent to upgraded egress gateways. - -### Deploy on a RKE2 CIS-hardened cluster - -If you are deploying `egress-gateway` on a RKE2 CIS-hardened cluster, its `PodSecurityPolicies` restrict the `securityContext` and `volumes` required by egress gateway. When deploying using the egress gateway custom resource, the Tigera Operator sets up `PodSecurityPolicy`, `Role`, `RoleBinding` and associated `ServiceAccount`. - -### Configure iptables backend for egress gateways - -The Tigera Operator configures egress gateways to use the same iptables backend as `calico-node`. -To modify the iptables backend for egress gateways, you must change the `iptablesBackend` field in the [Felix configuration](../../reference/resources/felixconfig.mdx). - -### Configure IP autodetection for dual-ToR clusters. - -If you plan to use Egress Gateways in a [dual-ToR cluster](../configuring/dual-tor.mdx), you must also adjust the $[nodecontainer] IP -auto-detection method to pick up the stable IP, for example using the `interface: lo` setting -(The default first-found setting skips over the lo interface). This can be configured via the -$[prodname] [Installation resource](../../reference/installation/api.mdx#nodeaddressautodetection). - -### Affine a client pod to a specific node - -In host IP mode, you may want to control which node your client pods run on to ensure deterministic -routing through a specific egress gateway. Use `nodeSelector` to schedule a client pod to a specific -node: - -```bash -kubectl apply -f - < - containers: - - name: alpine - image: alpine - command: ["/bin/sleep"] - args: ["infinity"] -EOF ``` -Replace `` with the hostname of the desired node. 
When combined with an egress gateway -policy that uses `gatewayPreference: PreferNodeLocal`, the client pod will prefer to route traffic -through an egress gateway on the same node, ensuring the traffic exits with that node's IP. - -### Configure namespaces and pods to use egress gateways +### Verify the source IP -You can configure namespaces and pods to use an egress gateway by: -* annotating the namespace or pod -* applying an egress gateway policy to the namespace or pod. - -Using an egress gateway policy is more complicated, but it allows advanced use cases. - -#### Configure a namespace or pod to use an egress gateway (annotation method) - -In a $[prodname] deployment, the Kubernetes namespace and pod resources honor annotations that -tell that namespace or pod to use particular egress gateways. These annotations are selectors, and -their meaning is "the set of pods, anywhere in the cluster, that match those selectors". - -So, to configure all the pods in a namespace to use the egress gateways that are -labelled with `egress-code: red`, you would annotate that namespace like this: - -```bash -kubectl annotate ns egress.projectcalico.org/selector="egress-code == 'red'" -``` - -By default, that selector can only match egress gateways in the same namespace. To select gateways -in a different namespace, specify a `namespaceSelector` annotation as well, like this: - -```bash -kubectl annotate ns egress.projectcalico.org/namespaceSelector="projectcalico.org/name == 'default'" -``` - -Egress gateway annotations have the same [syntax and range of expressions](../../reference/resources/networkpolicy.mdx#selector) as the selector fields in -$[prodname] [network policy](../../reference/resources/networkpolicy.mdx#entityrule). - -To configure a specific Kubernetes Pod to use egress gateways, specify the same annotations when -creating the pod. 
For example: - -```bash -kubectl apply -f - < egress.projectcalico.org/egressGatewayPolicy="egw-policy1" -``` - -To configure a specific Kubernetes pod to use the same policy, specify the same annotations when -creating the pod. -For example: - -```bash -kubectl apply -f - < -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' ``` -By way of a concrete example, you could use netcat to run a test server outside the cluster; for -example: - -```bash -docker run --net=host --privileged subfuzion/netcat -v -l -k -p 8089 -``` +Initiate an outbound connection from one of your client pods to a server outside the cluster, and +observe the source IP on the server. With host IP mode, it should match the node's InternalIP +above -- not the egress gateway's pod IP or any IP from the egress IP pool. -Then provision an egress IP Pool (with `natOutgoing: true`), and egress gateways, as above. - -Then deploy a pod, with egress annotations as above, and with any image that includes netcat, for example: - -```bash -kubectl apply -f - < -n -- nc 8089 ` should be the IP address of the netcat server. - -Then, if you check the logs or output of the netcat server, you should see: - -``` -Connection from received -``` - -with `` being the **node IP** of the host where the egress gateway pod is running (not -the gateway's pod IP or the egress IP pool IP). - -## Upgrade egress gateways - -From v3.16, egress gateway deployments are managed by the Tigera Operator. - -- When upgrading from a pre-v3.16 release, no automatic upgrade will occur. To upgrade a pre-v3.16 egress gateway deployment, - create an equivalent EgressGateway resource with the same namespace and the same name as mentioned [above](#deploy-a-group-of-egress-gateways); - the operator will then take over management of the old Deployment resource, replacing it with the upgraded version. +:::note -- Use `kubectl apply` to create the egress gateway resource. 
Tigera Operator will read the newly created resource and wait - for the other $[prodname] components to be upgraded. Once the other $[prodname] components are upgraded, Tigera Operator - will upgrade the existing egress gateway deployment with the new image. +For return traffic to reach the gateway, the external server must know how to route to the egress +gateway's node IP. -By default, upgrading egress gateways will sever any connections that are flowing through them. To minimise impact, -the egress gateway feature supports some advanced options that give feedback to affected pods. For more details see -the [egress gateway maintenance guide](egress-gateway-maintenance.mdx). +::: ## Additional resources -Please see also: - -- The `egressIP...` fields of the [FelixConfiguration resource](../../reference/resources/felixconfig.mdx#spec). -- [Additional configuration for egress gateway maintenance](egress-gateway-maintenance.mdx) +- [Egress gateway maintenance](egress-gateway-maintenance.mdx) +- [FelixConfiguration `egressIP...` fields](../../reference/resources/felixconfig.mdx#spec) diff --git a/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx b/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx index 7be8194933..a63d87fb51 100644 --- a/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx +++ b/calico-enterprise/networking/egress/egress-gateway-host-ip.mdx @@ -2,646 +2,127 @@ description: Configure egress gateways to use the host node IP as the source address for traffic leaving the cluster. --- -# Configure egress gateways with Host IP support +# Configure egress gateways with host IP support ## Big picture -Configure specific application traffic to exit the cluster through an egress gateway, using the -gateway's **host (node) IP** as the source address for traffic leaving the cluster. 
+Configure an existing egress gateway deployment so that traffic exiting through it appears to come
+from the **host address** of the host running the gateway pod, rather than the gateway's pod IP.

## Value

-When traffic from particular applications leaves the cluster to access an external destination, it
-can be useful to control the source IP of that traffic. For example, there may be an additional
-firewall around the cluster, whose purpose includes policing external accesses from the cluster, and
-specifically that particular external destinations can only be accessed from authorized workloads
-within the cluster.
+External firewalls and services often allowlist traffic by source IP. When egress gateway pod IPs
+are not routable outside the cluster, or are simply not convenient to manage, using the gateway's
+**host address** as the source address gives external systems a stable, well-known set of IPs to
+allowlist.

-In this mode, outbound traffic passing through an egress gateway is source-NATed (SNAT) to the
-**node IP of the host** where the egress gateway pod is running, rather than the gateway's own pod
-IP. This is useful when external firewalls or services need to allowlist traffic based on a
-stable set of known node IPs, or when pod IPs are not routable outside the cluster.
-
-By scheduling egress gateways to specific nodes and setting `natOutgoing: true` on the egress IP
-pool, you ensure that all application traffic routed through those gateways exits the cluster with
-the node IP of the gateway's host as the source address. Any number of application pods can have
-their outbound connections multiplexed through a fixed small number of egress gateways, and all of
-those outbound connections will appear to come from the gateway nodes' IPs.
+Any number of application pods can multiplex their outbound traffic through a small fixed set of
+egress gateways, and all of those connections will appear to come from the gateways' host addresses.
:::note -The source port of an outbound flow through an egress gateway can generally _not_ be -preserved. Changing the source port is how Linux maps flows from many upstream IPs onto a single -downstream IP. +Host IP mode is most commonly used with the [on-premises setup](egress-gateway-on-prem.mdx), where +the alternative source IP would be a non-routable pod IP. On [AWS](egress-gateway-aws.mdx) and +[Azure](egress-gateway-azure.mdx) setups, gateways already use native VPC/VNet IPs that are +routable on the underlying network - enabling host IP mode there replaces those native IPs with +the host address. ::: -Egress gateways with host IP support are particularly useful when you want all outbound traffic from -a particular application to leave the cluster through a particular node or nodes, and to appear as -traffic originating from those nodes' IPs. The gateways are scheduled to the desired nodes, and the -application pods/namespaces are configured to use those gateways. - ## Concepts -### Egress gateway - -An egress gateway acts as a transit pod for the outbound application traffic that is configured to -use it. As traffic leaving the cluster passes through the egress gateway, its source IP is changed -before the traffic is forwarded on. - -### Source IP with host IP mode - -When an outbound application flow leaves the cluster through an egress gateway, the source IP -depends on whether `natOutgoing` is enabled on the egress gateway's [IP pool](../../reference/resources/ippool.mdx). - -- If the egress gateway's IP pool has `natOutgoing: true`, the flow's source IP is the **node (host) - IP** of the node where the egress gateway pod is running. This is the **host IP mode** described in - this guide. -- If `natOutgoing: false` (or unset), the flow's source IP is the egress gateway's **pod IP**. - -In host IP mode, external services and firewalls see connections arriving from the egress gateway's -node IP. 
This is useful when node IPs are stable and well-known, making them suitable for firewall -allowlisting. - -### Control the use of egress gateways - -If a cluster ascribes special meaning to traffic flowing through egress gateways, it will be -important to control when cluster users can configure their pods and namespaces to use them, so that -non-special pods cannot impersonate the special meaning. - -If namespaces in a cluster can only be provisioned by cluster admins, one option is to enable egress -gateway function only on a per-namespace basis. Then only cluster admins will be able to configure -any egress gateway usage. - -Otherwise -- if namespace provisioning is open to users in general, or if it's desirable for egress -gateway function to be enabled both per-namespace and per-pod -- a [Kubernetes admission controller](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) - will be -needed. This is a task for each deployment to implement for itself, but possible approaches include -the following. - -1. Decide whether a given Namespace or Pod is permitted to use egress annotations at all, based on - other details of the Namespace or Pod definition. - -1. Evaluate egress annotation selectors to determine the egress gateways that they map to, and - decide whether that usage is acceptable. - -1. Impose the cluster's own bespoke scheme for a Namespace or Pod to identify the egress gateways - that it wants to use, less general than $[prodname]'s egress annotations. Then the - admission controller would police those bespoke annotations (that that cluster's users could - place on Namespace or Pod resources) and either reject the operation in hand, or allow it - through after adding the corresponding $[prodname] egress annotations. 
- -### Policy enforcement for flows via an egress gateway - -For an outbound connection from a client pod, via an egress gateway, to a destination outside the -cluster, there is more than one possible enforcement point for policy: - -The path of the traffic through policy is as follows: - -1. Packet leaves the client pod and passes through its egress policy. -2. The packet is encapsulated by the client pod's host and sent to the egress gateway -3. The encapsulated packet is sent from the host to the egress gateway pod. -4. The egress gateway pod de-encapsulates the packet and sends the packet out again with its own address. -5. The packet leaves the egress gateway pod through its egress policy. - -To ensure correct operation, (as of v3.15) the encapsulated traffic between host and egress gateway is auto-allowed by -$[prodname] and other ingress traffic is blocked. That means that there are effectively two places where -policy can be applied: - -1. on egress from the client pod -2. on egress from the egress gateway pod (see limitations below). - -The policy applied at (1) is the most powerful since it implicitly sees the original source of the traffic (by -virtue of being attached to that original source). It also sees the external destination of the traffic. - -Since an egress gateway will never originate its own traffic, one option is to rely on policy applied at (1) and -to allow all traffic to at (2) (either by applying no policy or by applying an "allow all"). +### Source IP and `natOutgoing` -Alternatively, for maximum "defense in depth" applying policy at both (1) and (2) provides extra protection should -the policy at (1) be disabled or bypassed by an attacker. 
Policy at (2) has the following limitations: +When an outbound application flow leaves the cluster through an egress gateway, the source IP seen +by external services depends on the `natOutgoing` setting of the egress gateway's +[IP pool](../../reference/resources/ippool.mdx): -- [Domain-based policy](../../network-policy/domain-based-policy.mdx) is not supported at egress from egress - gateways. It will either fail to match the expected traffic, or it will work intermittently if the egress gateway - happens to be scheduled to the same node as its clients. This is because any DNS lookup happens at the client pod. - By the time the policy reaches (2) the DNS information is lost and only the IP addresses of the traffic are available. - -- The traffic source will appear to be the egress gateway pod, the source information is lost in the address - translation that occurs inside the egress gateway pod. - -That means that policies at (2) will usually take the form of rules that match only on destination port and IP address, -either directly in the rule (via a CIDR match) or via a (non-domain based) NetworkSet. Matching on source has little -utility since the IP will always be the egress gateway and the port of translated traffic is not always preserved. - -:::note - -Since v3.15.0, $[prodname] also sends health probes to the egress gateway pods from the nodes where -their clients are located. In iptables mode, this traffic is auto-allowed at egress from the host and ingress -to the egress gateway. In eBPF mode, the probe traffic can be blocked by policy, so you must ensure that this traffic allowed; this should be fixed in an upcoming -patch release. - -::: +- `natOutgoing: false` - the flow's source IP is the egress gateway's **pod IP**. This + is the default for all egress gateway setup guides. +- `natOutgoing: true` - the flow's source IP is the **host address** of the node where the egress + gateway pod is running. 
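
The effect of `natOutgoing` on the observed source IP can be sketched as a tiny shell helper. This is illustrative only: the function name and the example pod/node IPs are hypothetical, not part of any Calico API.

```bash
#!/usr/bin/env bash
# Which source IP an external service observes for a flow that leaves the
# cluster through an egress gateway, depending on the egress IP pool's
# natOutgoing setting.
observed_source_ip() {
  local nat_outgoing="$1" gateway_pod_ip="$2" gateway_node_ip="$3"
  if [ "$nat_outgoing" = "true" ]; then
    # Host IP mode: traffic is SNAT'd to the gateway node's IP.
    echo "$gateway_node_ip"
  else
    # Default: traffic leaves with the egress gateway's pod IP.
    echo "$gateway_pod_ip"
  fi
}

observed_source_ip false 10.10.10.1 172.16.1.5   # prints the pod IP, 10.10.10.1
observed_source_ip true  10.10.10.1 172.16.1.5   # prints the node IP, 172.16.1.5
```
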
## Before you begin

-**Required**
-
-- Calico CNI
-- Open port UDP 4790 on the host
-
-**Not Supported**
-
-- GKE
-- Windows
+These instructions require a functioning egress gateway deployment.
+For setup, see [our egress gateway guides](index.mdx).

## How to

-- [Enable egress gateway support](#enable-egress-gateway-support)
-- [Provision an egress IP pool](#provision-an-egress-ip-pool)
-- [Deploy a group of egress gateways](#deploy-a-group-of-egress-gateways)
-- [Configure iptables backend for egress gateways](#configure-iptables-backend-for-egress-gateways)
-- [Affine a client pod to a specific node](#affine-a-client-pod-to-a-specific-node)
-- [Configure namespaces and pods to use egress gateways](#configure-namespaces-and-pods-to-use-egress-gateways)
-- [Optionally enable ECMP load balancing](#optionally-enable-ecmp-load-balancing)
-- [Verify the feature operation](#verify-the-feature-operation)
-- [Control the use of egress gateways](#control-the-use-of-egress-gateways)
-- [Upgrade egress gateways](#upgrade-egress-gateways)
-
-### Enable egress gateway support
+- [Enable natOutgoing on the egress IP pool](#enable-natoutgoing-on-the-egress-ip-pool)
+- [Pin egress gateways to specific nodes](#pin-egress-gateways-to-specific-nodes)
+- [Affine client pods to a specific node](#affine-client-pods-to-a-specific-node)
+- [Verify the source IP](#verify-the-source-ip)

-In the default **FelixConfiguration**, set the `egressIPSupport` field to `EnabledPerNamespace` or
-`EnabledPerNamespaceOrPerPod`, according to the level of support that you need in your cluster.
For
-support on a per-namespace basis only:
+### Enable natOutgoing on the egress IP pool

-```bash
-kubectl patch felixconfiguration default --type='merge' -p \
-    '{"spec":{"egressIPSupport":"EnabledPerNamespace"}}'
-```
-
-Or for support both per-namespace and per-pod:
+Set `natOutgoing: true` on the IP pool used by your egress gateways:

```bash
-kubectl patch felixconfiguration default --type='merge' -p \
-    '{"spec":{"egressIPSupport":"EnabledPerNamespaceOrPerPod"}}'
+kubectl patch ippool egress-ippool-1 --type='merge' -p '{"spec":{"natOutgoing":true}}'
```

-:::note
-
-- `egressIPSupport` must be the same on all cluster nodes, so you should set them only in the
-  `default` FelixConfiguration resource.
-- The operator automatically enables the required policy sync API in the FelixConfiguration.
+Outbound traffic leaving the cluster through a gateway in this pool will now be SNAT'd to the node
+IP of the gateway's host, instead of the gateway's pod IP.

-:::
+### Pin egress gateways to specific nodes

-### Provision an egress IP pool
-
-Provision a small IP Pool with `natOutgoing: true`. This ensures that traffic exiting through egress
-gateways using this pool is source-NATed to the host IP of the node running the gateway pod.
+The source IP that external services see depends on which node the gateway pod is
+scheduled to. To make this deterministic, set a `nodeSelector` on the gateway template:

```bash
-kubectl apply -f - <<EOF
-apiVersion: projectcalico.org/v3
-kind: IPPool
-metadata:
-  name: egress-ippool-1
-spec:
-  cidr: 10.10.10.0/31
-  blockSize: 32
-  nodeSelector: "!all()"
-  natOutgoing: true
-EOF
+kubectl patch egressgateway egress-gateway --type='merge' -p \
+  '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"<NODE_NAME>"}}}}}'
```

-Where:
-
-- `natOutgoing: true` is required for host IP mode. This causes traffic leaving the cluster through
-  an egress gateway to be SNAT'd to the **node IP** of the gateway's host, instead of the gateway's
-  pod IP.
-
-- It is best to set the `blockSize` to 32 so that each block contains only a single IP address:
-
-  - Scheduling a single egress gateway to a node causes the node to claim a whole block.
The other IPs in the block
-are wasted unless a second egress gateway (with the same pool configuration) is scheduled to the same node.
+Traffic passing through this gateway will exit the cluster with that node's IP as the
+source address.

-  - Empty /32 blocks can always be reclaimed from other nodes if the pool runs out of blocks. This ensures that an
-    egress gateway can always be scheduled if there are free IPs in the pool.
+Without pinning, the source IP will still be a node IP, but it could be the IP of whichever node
+the gateway happens to land on.

-  Setting `strictAffinity` to `false` in the [IPAM configuration](../../reference/resources/ipamconfig) also prevents the
-  above problems by allowing nodes to "borrow" IPs from other nodes' blocks. However, using /32 blocks:
+### Affine client pods to a specific node

-  - Avoids a dependency on a setting that is shared with other IP pools.
+This step is optional. If you want a particular client's traffic to deterministically exit through
+a particular node's IP, schedule the client to the same node as a gateway and apply an
+[EgressGatewayPolicy](egress-gateway-on-prem.mdx#configure-a-namespace-or-pod-to-use-an-egress-gateway-egress-gateway-policy-method)
+with `gatewayPreference: PreferNodeLocal`. The client will then prefer the gateway on its own node,
+ensuring traffic exits with that node's IP.

-  - Results in simpler, uniform route advertisements (rather than a mix of block size routes and /32 routes).
+For example, to pin a workload to a specific node, add a `nodeSelector` to its pod spec:

-  - Results in less route churn.
-
-- `nodeSelector: "!all()"` is recommended so that this egress IP pool is not accidentally used for cluster pods in general. Specifying this `nodeSelector` means that the IP pool is only used for pods that explicitly identify it in their `cni.projectcalico.org/ipv4pools` annotation.
-
-- Set `ipipMode` or `vxlanMode` to `Always` if the pod network has [IPIP or VXLAN](../configuring/vxlan-ipip.mdx) enabled.
-
-  :::note
-
-  This setting is not specific to egress gateway. In some cases where nodes happen to be in the same subnet, setting the value to `Never` will work the same as `Always`. It all depends on the hop from the client node to the egress gateway node. For example, if the client nodes are in the same AWS subnet, and you are using `Always` because some of the nodes are in different subnets, then `Never` will work for the egress IP Pool when the client and gateway nodes are in the same subnet.
-
-  :::
-
-### Deploy a group of egress gateways
-
-Use an egress gateway custom resource to deploy a group of egress gateways, using the egress IP Pool.
-Because we are using host IP mode, you should schedule the egress gateway to a specific node using
-`nodeSelector` so that you know which node IP will be used as the source address for egress traffic.
-
-```bash
-kubectl apply -f - <<EOF
-apiVersion: operator.tigera.io/v1
-kind: EgressGateway
-metadata:
-  name: egress-gateway
-  namespace: default
-spec:
-  logSeverity: "Info"
-  replicas: 1
-  ipPools:
-  - name: egress-ippool-1
-  template:
-    metadata:
-      labels:
-        egress-code: red
-    spec:
-      nodeSelector:
-        kubernetes.io/hostname: <node_hostname>
-      terminationGracePeriodSeconds: 0
-EOF
-```
-
-Replace `<node_hostname>` with the hostname of the node where you want the egress gateway to
-run. Traffic passing through this gateway will exit the cluster with this node's IP as the source
-address.
-
-:::note
-
-When deploying egress gateway in a non-default namespace on OpenShift, the namespace needs to be set privileged by adding the following to the namespace:
-
-##### Label
-```
-openshift.io/run-level: "0"
-pod-security.kubernetes.io/enforce: privileged
-pod-security.kubernetes.io/enforce-version: latest
-```
-##### Annotation
-```
-security.openshift.io/scc.podSecurityLabelSync: "false"
-```
-:::
-
-Where:
-
-- It is advisable to have more than one egress gateway per group, so that the egress IP function continues if one of the gateways crashes or needs to be restarted. When there are multiple gateways in a group, outbound traffic from the applications using that group is load-balanced across the available gateways.
The number of `replicas` specified must be less than or equal to the number of free IP addresses in the IP Pool. - -- IPPool can be specified either by its name (e.g. `-name: egress-ippool-1`) or by its CIDR (e.g. `-cidr: 10.10.10.0/31`). - -- The labels are arbitrary. You can choose whatever names and values are convenient for your cluster's Namespaces and Pods to refer to in their egress selectors. - - If labels are not specified, a default label `projectcalico.org/egw`:`name` will be added by the Tigera Operator. - -- icmpProbe may be used to specify the Probe IPs, ICMP interval and timeout in seconds. `ips` if set, the - egress gateway pod will probe each IP periodically using an ICMP ping. If all pings fail then the egress - gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. - `timeoutSeconds` controls the timeout before reporting non-ready if no probes succeed. - - ```yaml - icmpProbe: - ips: - - probeIP - - probeIP - timeoutSeconds: 20 - intervalSeconds: 10 - ``` - -- httpProbe may be used to specify the Probe URLs, HTTP interval and timeout in seconds. `urls` if set, the - egress gateway pod will probe each external service periodically. If all probes fail then the egress - gateway will report non-ready via its health port. `intervalSeconds` controls the interval between probes. - `timeoutSeconds` controls the timeout before reporting non-ready if all probes are failing. - - ```yaml - httpProbe: - urls: - - probeURL - - probeURL - timeoutSeconds: 30 - intervalSeconds: 10 - ``` -- Please refer to the [operator reference docs](../../reference/installation/api.mdx) for details about the egress gateway resource type. - -The health port `8080` is used by: - -- The Kubernetes `readinessProbe` to expose the status of the egress gateway pod (and any ICMP/HTTP - probes). -- Remote pods to check if the egress gateway is "ready". Only "ready" egress - gateways will be used for remote client traffic. 
This traffic is automatically allowed by $[prodname] and
-  no policy is required to allow it. $[prodname] only sends probes to egress gateway pods that have a named
-  "health" port. This ensures that during an upgrade, health probes are only sent to upgraded egress gateways.
-
-### Deploy on a RKE2 CIS-hardened cluster
-
-If you are deploying `egress-gateway` on a RKE2 CIS-hardened cluster, its `PodSecurityPolicies` restrict the `securityContext` and `volumes` required by egress gateway. When deploying using the egress gateway custom resource, the Tigera Operator sets up `PodSecurityPolicy`, `Role`, `RoleBinding` and associated `ServiceAccount`.
-
-### Configure iptables backend for egress gateways
-
-The Tigera Operator configures egress gateways to use the same iptables backend as `calico-node`.
-To modify the iptables backend for egress gateways, you must change the `iptablesBackend` field in the [Felix configuration](../../reference/resources/felixconfig.mdx).
-
-### Configure IP autodetection for dual-ToR clusters.
-
-If you plan to use Egress Gateways in a [dual-ToR cluster](../configuring/dual-tor.mdx), you must also adjust the $[nodecontainer] IP
-auto-detection method to pick up the stable IP, for example using the `interface: lo` setting
-(The default first-found setting skips over the lo interface). This can be configured via the
-$[prodname] [Installation resource](../../reference/installation/api.mdx#nodeaddressautodetection).
-
-### Affine a client pod to a specific node
-
-In host IP mode, you may want to control which node your client pods run on to ensure deterministic
-routing through a specific egress gateway. Use `nodeSelector` to schedule a client pod to a specific
-node:
-
-```bash
-kubectl apply -f - <<EOF
-apiVersion: v1
-kind: Pod
-metadata:
-  name: my-client
-spec:
-  nodeSelector:
-    kubernetes.io/hostname: <node_hostname>
-  containers:
-  - name: alpine
-    image: alpine
-    command: ["/bin/sleep"]
-    args: ["infinity"]
-EOF
-```
+```bash
+kubectl apply -f - <<EOF
+apiVersion: v1
+kind: Pod
+metadata:
+  name: my-client
+spec:
+  nodeSelector:
+    kubernetes.io/hostname: <NODE_NAME>
+  containers:
+  - name: alpine
+    image: alpine
+    command: ["/bin/sleep"]
+    args: ["infinity"]
+EOF
+```

-Replace `<node_hostname>` with the hostname of the desired node.
When combined with an egress gateway -policy that uses `gatewayPreference: PreferNodeLocal`, the client pod will prefer to route traffic -through an egress gateway on the same node, ensuring the traffic exits with that node's IP. - -### Configure namespaces and pods to use egress gateways +### Verify the source IP -You can configure namespaces and pods to use an egress gateway by: -* annotating the namespace or pod -* applying an egress gateway policy to the namespace or pod. - -Using an egress gateway policy is more complicated, but it allows advanced use cases. - -#### Configure a namespace or pod to use an egress gateway (annotation method) - -In a $[prodname] deployment, the Kubernetes namespace and pod resources honor annotations that -tell that namespace or pod to use particular egress gateways. These annotations are selectors, and -their meaning is "the set of pods, anywhere in the cluster, that match those selectors". - -So, to configure all the pods in a namespace to use the egress gateways that are -labelled with `egress-code: red`, you would annotate that namespace like this: - -```bash -kubectl annotate ns egress.projectcalico.org/selector="egress-code == 'red'" -``` - -By default, that selector can only match egress gateways in the same namespace. To select gateways -in a different namespace, specify a `namespaceSelector` annotation as well, like this: - -```bash -kubectl annotate ns egress.projectcalico.org/namespaceSelector="projectcalico.org/name == 'default'" -``` - -Egress gateway annotations have the same [syntax and range of expressions](../../reference/resources/networkpolicy.mdx#selector) as the selector fields in -$[prodname] [network policy](../../reference/resources/networkpolicy.mdx#entityrule). - -To configure a specific Kubernetes Pod to use egress gateways, specify the same annotations when -creating the pod. 
For example: - -```bash -kubectl apply -f - < egress.projectcalico.org/egressGatewayPolicy="egw-policy1" -``` - -To configure a specific Kubernetes pod to use the same policy, specify the same annotations when -creating the pod. -For example: - -```bash -kubectl apply -f - < -o jsonpath='{.status.addresses[?(@.type=="InternalIP")].address}' ``` -By way of a concrete example, you could use netcat to run a test server outside the cluster; for -example: - -```bash -docker run --net=host --privileged subfuzion/netcat -v -l -k -p 8089 -``` +Initiate an outbound connection from one of your client pods to a server outside the cluster, and +observe the source IP on the server. With host IP mode, it should match the node's InternalIP +above -- not the egress gateway's pod IP or any IP from the egress IP pool. -Then provision an egress IP Pool (with `natOutgoing: true`), and egress gateways, as above. - -Then deploy a pod, with egress annotations as above, and with any image that includes netcat, for example: - -```bash -kubectl apply -f - < -n -- nc 8089 ` should be the IP address of the netcat server. - -Then, if you check the logs or output of the netcat server, you should see: - -``` -Connection from received -``` - -with `` being the **node IP** of the host where the egress gateway pod is running (not -the gateway's pod IP or the egress IP pool IP). - -## Upgrade egress gateways - -From v3.16, egress gateway deployments are managed by the Tigera Operator. - -- When upgrading from a pre-v3.16 release, no automatic upgrade will occur. To upgrade a pre-v3.16 egress gateway deployment, - create an equivalent EgressGateway resource with the same namespace and the same name as mentioned [above](#deploy-a-group-of-egress-gateways); - the operator will then take over management of the old Deployment resource, replacing it with the upgraded version. +:::note -- Use `kubectl apply` to create the egress gateway resource. 
Tigera Operator will read the newly created resource and wait - for the other $[prodname] components to be upgraded. Once the other $[prodname] components are upgraded, Tigera Operator - will upgrade the existing egress gateway deployment with the new image. +For return traffic to reach the gateway, the external server must know how to route to the egress +gateway's node IP. -By default, upgrading egress gateways will sever any connections that are flowing through them. To minimise impact, -the egress gateway feature supports some advanced options that give feedback to affected pods. For more details see -the [egress gateway maintenance guide](egress-gateway-maintenance.mdx). +::: ## Additional resources -Please see also: - -- The `egressIP...` fields of the [FelixConfiguration resource](../../reference/resources/felixconfig.mdx#spec). -- [Additional configuration for egress gateway maintenance](egress-gateway-maintenance.mdx) +- [Egress gateway maintenance](egress-gateway-maintenance.mdx) +- [FelixConfiguration `egressIP...` fields](../../reference/resources/felixconfig.mdx#spec)
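
Merge patches like the ones passed to `kubectl patch` in this guide are easy to mis-quote when edited inline. As a convenience, the node-pinning patch document can be generated by a small helper. This is a hypothetical sketch (the helper name is not part of any Calico tooling), assuming the `spec.template.spec.nodeSelector` field path of the EgressGateway pod template:

```bash
#!/usr/bin/env bash
# Hypothetical helper: build the merge-patch JSON that pins an egress
# gateway deployment to one node via spec.template.spec.nodeSelector.
node_pin_patch() {
  local hostname="$1"
  printf '{"spec":{"template":{"spec":{"nodeSelector":{"kubernetes.io/hostname":"%s"}}}}}' "$hostname"
}

# Usage (assumes an EgressGateway named "egress-gateway" exists in the cluster):
#   kubectl patch egressgateway egress-gateway --type=merge -p "$(node_pin_patch node-1)"
node_pin_patch node-1
```
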