Preserving Real Client IPs Across an Edge → Production Envoy Chain

This document describes how to preserve the original client IP across an edge → tunnel → production proxy chain when using Envoy Gateway with mergeGateways, mixed HTTPS / SSH traffic, and a netbird overlay between sites. The same principles apply to any setup with an L4 NAT in the path (AWS NLB, kube-proxy, HAProxy, etc.) — netbird is just one example.

:::note Why Gitea as the worked example?
Gitea is convenient here because it exposes both HTTP (web UI, API) and SSH (git operations) on separate listeners, which lets a single document cover both mental models side by side: XFF carrying the client IP after L7 termination, and PPv2 carrying it across pure-L4 hops to a non-HTTP application.

The patterns aren't Gitea-specific. Every app you'd put behind this gateway falls into one of the two paths: pure-HTTP services (a web app, an API, a registry) follow the HTTPS path; pure-TCP services (Postgres, Redis, MQTT, an SMTP relay) follow the SSH-style chain. Substitute your own backend and the policies don't change.
:::

Architecture

              ┌──────────────────────────────────────┐
              │            Edge cluster              │
              │   ┌─────────────┐                    │
  client ───► │   │ edge envoy  │── outbound ─┐      │
              │   │ (L4 only)   │             │      │
              │   └─────────────┘             │      │
              │              ┌────────────────▼─────┐│
              │              │ netbird sidecar (WG) ││
              │              └──────────┬───────────┘│
              └─────────────────────────┼────────────┘
                                        │  encrypted WG
              ┌─────────────────────────┼────────────┐
              │     Production cluster  │            │
              │              ┌──────────▼─────────┐  │
              │              │ netbird peer (NAT) │  │
              │              └──────────┬─────────┘  │
              │              ┌──────────▼─────────┐  │
              │              │ production envoy   │  │
              │              │  TLS terminate +   │  │
              │              │  HTTPRoute / TCP   │  │
              │              └────┬──────────┬────┘  │
              │                   │          │       │
              │           ┌───────▼────┐ ┌───▼─────┐ │
              │           │  HTTP      │ │  Gitea  │ │
              │           │  backends  │ │ rootless│ │
              │           │ (cleartext │ │ SSH:2222│ │
              │           │ in-cluster)│ └─────────┘ │
              │           └────────────┘             │
              └──────────────────────────────────────┘

Routes:

HTTPS: TLSRoute (SNI passthrough) at the edge, HTTPRoute at production. The production envoy terminates TLS using a wildcard certificate; the edge envoy never sees cleartext or holds any cert. Once production has terminated, traffic flows in cleartext to in-cluster backends with X-Forwarded-For carrying the client IP.
SSH: TCPRoute end to end, terminating at Gitea's built-in SSH server (rootless image, listens on 2222).

This split — passthrough on the edge, termination in production — keeps the wildcard cert isolated to the trusted zone and gives production envoy full L7 visibility for header-based filtering, geoip, etc.

Why the client IP disappears

The netbird peer in production acts as a routing gateway. When it forwards a packet out of the WireGuard tunnel into the prod network, it rewrites the IP-layer source to its own address. This is normal masquerade behavior — anything not in the WireGuard mesh address space gets SNATed.

By the time a connection reaches the production envoy, the IP-layer source is the netbird peer's address. The real client's IP, and even the edge envoy's IP, have been overwritten and are unrecoverable from the IP layer.

X-Forwarded-For solves this for HTTP, but XFF lives inside the application protocol: some proxy in the chain has to terminate TLS and parse HTTP before it can set the header. Here the edge envoy does TLSRoute passthrough, so on the HTTPS path there is no place for XFF until production envoy terminates TLS. On the SSH path there is no place for it at all, because SSH has no application-layer headers.

The solution: PROXY protocol v2

PPv2 is a small binary prefix prepended to the TCP stream before any TLS handshake or application data. It carries the original source/destination IP and port. Because it lives in the TCP payload, NAT can't touch it — netbird forwards the bytes verbatim. The production envoy reads the prefix at the connection layer, knows the real source, and from there can either re-emit PPv2 on egress (for SSH) or fold the IP into XFF on egress (for HTTP after TLS termination).

The wire format on each PPv2-wrapped hop looks like:

[PPv2 header (28 bytes for TCP over IPv4, more with TLVs)] [TLS handshake or SSH banner ...]
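
To make the prefix concrete, here is a minimal Python sketch that builds a PPv2 header for a proxied TCP-over-IPv4 connection, following the layout in the HAProxy PROXY protocol spec. The function name and the example addresses are illustrative assumptions; this is not how Envoy constructs the header internally.

```python
import socket
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def build_ppv2_tcp4(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Build a PPv2 header for a proxied TCP/IPv4 connection (no TLVs)."""
    ver_cmd = 0x21    # version 2 in the high nibble, PROXY command in the low nibble
    fam_proto = 0x11  # AF_INET in the high nibble, STREAM in the low nibble
    addresses = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    fixed = PP2_SIGNATURE + struct.pack("!BBH", ver_cmd, fam_proto, len(addresses))
    return fixed + addresses


# Hypothetical client and destination addresses.
header = build_ppv2_tcp4("203.0.113.42", 51234, "198.51.100.7", 443)
print(len(header))           # 28
print(header[:12].hex(" "))  # 0d 0a 0d 0a 00 0d 0a 51 55 49 54 0a
```

The 16 fixed bytes (signature, version/command, family/protocol, length) plus the 12-byte IPv4 address block give the 28 bytes shown in the wire format.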

Two knobs in Envoy Gateway

PPv2 has two independent directions, configured by two different policies:

ClientTrafficPolicy.proxyProtocol (ingress). The listener accepts PPv2 on incoming connections and uses the header's source address as the logical source.
BackendTrafficPolicy.proxyProtocol (egress). The cluster prepends PPv2 when proxying to a specific backend, carrying the source address Envoy currently holds for the downstream connection (the PPv2-supplied one, if the listener accepted PPv2).

These are independent. You enable each one separately on the side of the connection it applies to. A given proxy can do one, the other, both, or neither.
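
As a rough illustration of what the ingress side has to do, the Python sketch below strips a PPv2 header off the front of a byte stream and returns the source it carried. It handles only the PROXY command over TCP/IPv4; the function name and sample addresses are assumptions, and Envoy's actual listener filter covers more cases (v1, IPv6, LOCAL, TLVs).

```python
import ipaddress
import struct

PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_ppv2(stream: bytes):
    """Split a TCP payload into ((src_ip, src_port), remaining_bytes)."""
    if len(stream) < 16 or stream[:12] != PP2_SIGNATURE:
        raise ValueError("no PPv2 signature")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", stream[12:16])
    if ver_cmd != 0x21 or fam_proto != 0x11:
        raise ValueError("only PROXY over TCP/IPv4 is handled in this sketch")
    if length < 12 or len(stream) < 16 + length:
        raise ValueError("truncated header")
    src, _dst, src_port, _dst_port = struct.unpack("!4s4sHH", stream[16:28])
    # Anything between the address block and 16 + length is TLV data; skip it.
    return (str(ipaddress.IPv4Address(src)), src_port), stream[16 + length:]


# Hypothetical bytes as a receiver on the SSH path might see them.
sample = (
    PP2_SIGNATURE
    + bytes([0x21, 0x11, 0x00, 0x0C])
    + bytes([203, 0, 113, 42])
    + bytes([198, 51, 100, 7])
    + struct.pack("!HH", 51234, 2222)
    + b"SSH-2.0-OpenSSH_9.6\r\n"
)
client, rest = parse_ppv2(sample)
print(client)  # ('203.0.113.42', 51234)
print(rest)    # b'SSH-2.0-OpenSSH_9.6\r\n'
```

The egress side is the mirror image: it writes a header like this once, before the first payload byte, and then relays the stream untouched.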

Deprecation note: on ClientTrafficPolicy, spec.enableProxyProtocol: true is deprecated as of Envoy Gateway v1.5. Use spec.proxyProtocol: { optional: false } instead. Setting optional: true selects a permissive mode that accepts both PPv2 and bare connections, which is useful when one listener serves both proxy-aware peers and direct browser clients.

Mental model: HTTPS vs SSH

This is the part that's worth internalizing before touching any YAML.

HTTPS path (passthrough at edge, terminate at production)

client ─[1]─► edge ─[2]─► production envoy ─[3]─► HTTP backend
                          (terminates TLS)

Hop	Mechanism	Why
1	nothing extra	Direct TCP connection; the source IP is intact at the IP layer
2	PPv2 (egress edge → ingress production)	Production sees netbird IP otherwise
3	XFF in HTTP	Production envoy terminates TLS and adds it natively

Once TLS is terminated at production envoy, that envoy is a normal HTTP-aware proxy. It reads the real IP from the PPv2 prefix it just consumed, then writes that IP into the X-Forwarded-For header on every HTTP request it forwards to a backend. From there, XFF carries the IP to the application. One PPv2 hop is enough on the HTTPS path.
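
The hand-off from PPv2 to XFF can be pictured with a small Python sketch. It assumes a plain dict of lowercase header names and a hypothetical helper name; Envoy's real header handling has more rules (trusted hop counts, exact separator formatting), so treat this as a model of the idea only.

```python
def forward_headers(request_headers: dict, client_ip: str) -> dict:
    """Return a copy of the headers with client_ip appended to X-Forwarded-For."""
    headers = dict(request_headers)
    existing = headers.get("x-forwarded-for")
    # Append rather than overwrite: earlier entries may come from the client itself.
    headers["x-forwarded-for"] = f"{existing}, {client_ip}" if existing else client_ip
    return headers


# client_ip is the address recovered from the PPv2 prefix on this connection.
print(forward_headers({"host": "gitea.example.com"}, "203.0.113.42"))
# {'host': 'gitea.example.com', 'x-forwarded-for': '203.0.113.42'}
print(forward_headers({"x-forwarded-for": "192.0.2.1"}, "203.0.113.42")["x-forwarded-for"])
# 192.0.2.1, 203.0.113.42
```

Because a client can send its own X-Forwarded-For, the entry appended by production envoy is the one a backend should rely on, and only when the request arrives from a proxy it trusts.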

SSH path (TCPRoute end to end)

client ─[1]─► edge ─[2]─► production ─[3]─► Gitea SSH (terminates SSH)

Hop	Mechanism	Why
1	nothing extra	Direct TCP
2	PPv2 (edge → production)	Same NAT problem as HTTPS
3	PPv2 (production → Gitea)	SSH has no application-layer header for client IP, ever

The asymmetry: SSH needs PPv2 on every L4 hop, all the way to its terminator. HTTPS needs PPv2 only on the L4 hops before TLS termination; once production envoy terminates HTTPS, XFF takes over.

For this topology, that means:

One ingress ClientTrafficPolicy on production envoy (covers both paths — PPv2 acceptance happens before route dispatch).
One egress BackendTrafficPolicy on production for the SSH TCPRoute (no equivalent needed for the HTTP backends — XFF handles it).
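
The SSH chain can be modelled in a few lines of Python to see why every L4 hop needs both halves of the PPv2 pairing. All addresses and names below are hypothetical, and the model assumes strict mode (a receiver that expects PPv2 rejects a bare connection, and vice versa).

```python
from dataclasses import dataclass

# Hypothetical addresses, for illustration only.
CLIENT_IP = "203.0.113.42"
NETBIRD_PEER_IP = "10.200.0.9"
PROD_ENVOY_IP = "10.100.3.17"


@dataclass
class Hop:
    name: str
    peer_ip: str        # IP-layer source address the receiver sees
    sends_ppv2: bool    # sender prepends a PPv2 header
    accepts_ppv2: bool  # receiver requires and parses a PPv2 header


def ip_seen_at_end(client_ip: str, hops) -> str:
    """Return the client IP the final receiver ends up with."""
    logical = client_ip
    for hop in hops:
        if hop.sends_ppv2 != hop.accepts_ppv2:
            raise ConnectionError(f"PPv2 mismatch on {hop.name}")
        if not hop.sends_ppv2:
            logical = hop.peer_ip  # nothing but the IP-layer source to go on
        # With PPv2 on both ends, the header carries `logical` across unchanged.
    return logical


def ssh_chain(gitea_hop_ppv2: bool):
    return [
        Hop("client -> edge", CLIENT_IP, False, False),
        Hop("edge -> production", NETBIRD_PEER_IP, True, True),
        Hop("production -> gitea", PROD_ENVOY_IP, gitea_hop_ppv2, gitea_hop_ppv2),
    ]


print(ip_seen_at_end(CLIENT_IP, ssh_chain(True)))   # 203.0.113.42
print(ip_seen_at_end(CLIENT_IP, ssh_chain(False)))  # 10.100.3.17
```

On the HTTPS path the third hop is not modelled here, because XFF rather than PPv2 carries the address once production envoy has terminated TLS.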

Step-by-step configuration

1. Edge cluster: emit PPv2 outbound

One BackendTrafficPolicy per route kind, targeting the routes that traverse netbird to production.

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata: { name: send-ppv2-https }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: TLSRoute, name: to-production-https }
  proxyProtocol: { version: V2 }
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata: { name: send-ppv2-ssh }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: TCPRoute, name: to-production-ssh }
  proxyProtocol: { version: V2 }

2. Production: HTTPS listener with TLS termination

The Gateway holds the wildcard cert and terminates TLS on an HTTPS listener (tls.mode: Terminate). HTTPRoutes attach to this listener for L7 dispatch.

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata: { name: production-eg }
spec:
  gatewayClassName: eg
  listeners:
    - name: https
      port: 443
      protocol: HTTPS
      hostname: "*.example.com"
      tls:
        mode: Terminate
        certificateRefs:
          - { kind: Secret, name: wildcard-example-com }
      allowedRoutes: { kinds: [{ kind: HTTPRoute }] }
    - name: gitea-ssh
      port: 2222
      protocol: TCP
      allowedRoutes: { kinds: [{ kind: TCPRoute }] }

3. Production: accept PPv2 on the Gateway

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata: { name: accept-ppv2 }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: Gateway, name: production-eg }
  proxyProtocol:
    optional: false   # strict mode: every connection must speak PPv2

With mergeGateways: true, all Gateways sharing the GatewayClass become listeners on one Envoy. If any listener on the merged Envoy serves direct browser clients (no upstream proxy adding PPv2), don't enable PPv2 globally — those connections will be reset at L4. Either:

Scope the policy with sectionName to specific listeners, or
Use optional: true to accept both, at a small detection cost.

PPv2 acceptance happens at the connection layer, before TLS termination and before route dispatch, so a single policy on the Gateway covers both the HTTPS and TCPRoute listeners.
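
The difference between strict and permissive mode comes down to what happens when the first bytes of a connection are not a PPv2 signature. The Python sketch below models that decision; the function name and return strings are made up for illustration, and it ignores PROXY protocol v1.

```python
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def classify(first_bytes: bytes, optional: bool) -> str:
    """Decide what a listener does with a new connection's first bytes."""
    if first_bytes.startswith(PP2_SIGNATURE):
        return "parse PPv2, then continue"
    if optional:
        return "treat as bare connection"
    return "reset"


tls_client_hello = bytes([0x16, 0x03, 0x01, 0x02, 0x00])  # how a browser's TLS handshake begins
proxied = PP2_SIGNATURE + bytes([0x21, 0x11, 0x00, 0x0C])

print(classify(proxied, optional=False))           # parse PPv2, then continue
print(classify(tls_client_hello, optional=False))  # reset
print(classify(tls_client_hello, optional=True))   # treat as bare connection
```

In permissive mode the source address for a bare connection is simply the IP-layer peer, so any path that can reach the listener directly bypasses the PPv2-derived identity.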

4. Production: emit PPv2 to Gitea SSH

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata: { name: send-ppv2-gitea }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: TCPRoute, name: gitea-ssh }
  proxyProtocol: { version: V2 }

This is the policy that's easy to forget — without it, Gitea sees the production envoy's pod IP because the egress hop carries no PPv2. The ClientTrafficPolicy only handles the ingress side. No equivalent policy is needed for the HTTP backends — production envoy terminates TLS, parses HTTP, and adds XFF on outgoing HTTP requests automatically.

5. Gitea: read PPv2 on the SSH listener (rootless, env vars)

Gitea has two separate PROXY protocol toggles. USE_PROXY_PROTOCOL only affects the HTTP listener; SSH has its own SSH_SERVER_USE_PROXY_PROTOCOL. Since Gitea's HTTP is reached via production envoy (which already adds XFF), turn off the HTTP wrapper and use XFF natively. SSH is the only listener that needs PPv2.

env:
  - { name: GITEA__server__USE_PROXY_PROTOCOL,               value: "false" }
  - { name: GITEA__server__SSH_SERVER_USE_PROXY_PROTOCOL,    value: "true" }
  - { name: GITEA__server__PROXY_PROTOCOL_HEADER_TIMEOUT,    value: "5s" }
  - { name: GITEA__server__PROXY_PROTOCOL_TRUSTED_ADDRESSES, value: "10.100.0.0/16" }
  - { name: GITEA__security__REVERSE_PROXY_TRUSTED_PROXIES,  value: "10.100.0.0/16" }
  - { name: GITEA__server__START_SSH_SERVER,                 value: "true" }

Notes:

The format is GITEA__<section>__<KEY>: section lowercase, key as in app.ini, double underscores.
PROXY_PROTOCOL_HEADER_TIMEOUT and PROXY_PROTOCOL_TRUSTED_ADDRESSES are shared parsing knobs: they apply to whichever listener has the wrapper enabled (SSH only here). PROXY_PROTOCOL_TLS_BRIDGING is omitted because it only affects the HTTP wrapper.
TRUSTED_ADDRESSES is critical for SSH safety. Without it, anyone reachable on the SSH port could spoof a source IP via a forged PPv2 header. Pin to the production envoy's pod CIDR.
REVERSE_PROXY_TRUSTED_PROXIES tells Gitea which sources are trusted to set XFF; this is how the real client IP reaches Gitea on the HTTP side. Unlike the other keys, it lives in the [security] section of app.ini, hence GITEA__security__ in the env var name. Same CIDR as above.
START_SSH_SERVER=true is required: Gitea's PPv2 support only wraps Gitea's own SSH listener. If you delegate to host sshd, OpenSSH doesn't speak PPv2 and you'd need a wrapper like mmproxy or socat.
Restart the pod after adding env vars.
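
The trust check described in the notes can be sketched in Python with the standard ipaddress module. This is a model of the behaviour the notes describe, not Gitea's code; the function name is hypothetical and the CIDR is the one used in the env vars above.

```python
import ipaddress

# Production envoy's pod CIDR, as configured above.
TRUSTED = [ipaddress.ip_network("10.100.0.0/16")]


def effective_client_ip(peer_ip: str, ppv2_src_ip):
    """Honor the PPv2 source only when the TCP peer is a trusted proxy."""
    peer = ipaddress.ip_address(peer_ip)
    if ppv2_src_ip is not None and any(peer in net for net in TRUSTED):
        return ppv2_src_ip
    return peer_ip


# Header sent by production envoy: accepted.
print(effective_client_ip("10.100.3.17", "203.0.113.42"))  # 203.0.113.42
# Forged header from an untrusted peer: ignored, the peer address is used.
print(effective_client_ip("192.0.2.50", "203.0.113.42"))   # 192.0.2.50
```

The second case is also why a CIDR that accidentally excludes the envoy pods makes the logs fall back to the envoy pod IP even though PPv2 bytes are arriving.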

6. (Optional) HTTP healthcheck note

With USE_PROXY_PROTOCOL=false (as set in step 5), kubelet httpGet probes work normally, with no extra change. If you ever flip the HTTP wrapper back on, switch the probe to tcpSocket: { port: 3000 } so kubelet doesn't get rejected for not speaking PPv2.

Verification

Status conditions on each policy:

kubectl get clienttrafficpolicy,backendtrafficpolicy -A -o yaml | \
  yq '.items[] | {name: .metadata.name, conds: .status.ancestors[].conditions}'

Look for Accepted: True. TargetNotFound means a typo in targetRefs.name or wrong namespace.

Live config dump: the receiver-side filter and the sender-side transport socket are visible in Envoy's admin endpoint:

# receiver: proxy_protocol listener filter present?
kubectl exec -n envoy-gateway-system <production-envoy-pod> -c envoy -- \
  curl -s localhost:19000/config_dump | \
  jq '.. | .listener_filters? // empty | map(.name)'

# sender: upstream_proxy_protocol transport socket present?
kubectl exec -n envoy-gateway-system <production-envoy-pod> -c envoy -- \
  curl -s localhost:19000/config_dump | \
  jq '.. | select(.transport_socket?.name? == "envoy.transport_sockets.upstream_proxy_protocol")'

Bytes on the wire: the most direct check. From inside the receiving pod (production envoy or Gitea):

tcpdump -i eth0 -X -s 200 'tcp port 2222 and tcp[tcpflags] & tcp-push != 0' | head -40

A successful hop starts with the 12-byte PPv2 signature, visible in the -X hex dump as:

0d 0a 0d 0a 00 0d 0a 51 55 49 54 0a   ...QUIT.

If those bytes aren't there, the previous hop isn't emitting PPv2.

End-to-end IP check: trigger a connection from a known external IP and verify it appears in application logs:

# Note your public IP first
curl -s ifconfig.me; echo
# → e.g. 203.0.113.42

# SSH (use a fake user so auth fails fast and a log line is produced; set -p to the port your edge exposes)
ssh -p 22 -T -o StrictHostKeyChecking=no -o BatchMode=yes \
    not-a-real-user@gitea.example.com

# Watch Gitea logs
kubectl logs -n <gitea-ns> deploy/gitea -f --tail=20 | \
    grep -E 'Failed connection|Failed authentication'

The IP that appears tells you exactly which hop is intact:

What you see in logs	What it means
`203.0.113.42` (your real IP)	All hops working end-to-end
Production envoy pod IP	Production → Gitea hop missing PPv2 (`BackendTrafficPolicy` or `SSH_SERVER_USE_PROXY_PROTOCOL`)
Netbird peer IP	Edge → production hop missing PPv2 (`BackendTrafficPolicy` on edge or `ClientTrafficPolicy` on production)
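
When scanning many log lines, the table can be applied mechanically. The Python sketch below pulls the first IPv4 address out of a line and classifies it; the log line format, the netbird peer address, and the function name are all assumptions to be replaced with your own values.

```python
import ipaddress
import re

# Assumed ranges: the envoy pod CIDR from step 5 and a hypothetical netbird peer address.
PROD_POD_CIDR = ipaddress.ip_network("10.100.0.0/16")
NETBIRD_PEER = ipaddress.ip_network("10.200.0.9/32")

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def diagnose(log_line: str) -> str:
    """Map the first IPv4 address in a log line to the hop it implicates."""
    match = IPV4.search(log_line)
    if match is None:
        return "no IPv4 address in line"
    ip = ipaddress.ip_address(match.group())
    if ip in PROD_POD_CIDR:
        return "production -> Gitea hop is missing PPv2"
    if ip in NETBIRD_PEER:
        return "edge -> production hop is missing PPv2"
    return f"{ip} looks like a real client address"


# Hypothetical log lines.
print(diagnose("Failed authentication attempt from 203.0.113.42:51234"))
print(diagnose("Failed authentication attempt from 10.100.3.17:40112"))
```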

Common pitfalls

Browser hits a strict PPv2 listener. The connection is reset at L4 with no error page. Symptom: ERR_CONNECTION_RESET in the browser, WRONG_VERSION_NUMBER from the TLS inspector in Envoy logs. Use optional: true or split listeners.
mergeGateways listener coherence. Two Gateways on the same merged Envoy can't have conflicting PPv2 settings on the same port. Status will say something like "ClientTrafficPolicy is being applied to multiple http listeners on the same port".
HTTPS works, SSH doesn't. Almost always means the egress BackendTrafficPolicy for the SSH route is missing or didn't apply, or SSH_SERVER_USE_PROXY_PROTOCOL isn't set on Gitea. XFF papers over the gap on HTTP, but SSH has no fallback.
Confusing the two Gitea flags. USE_PROXY_PROTOCOL is HTTP-only. SSH needs SSH_SERVER_USE_PROXY_PROTOCOL separately. Setting only the first leaves the SSH server parsing the PPv2 bytes as an SSH version string, and the handshake fails with a version-string overflow.
Gitea logs still show envoy's pod IP after enabling everything. Either bytes aren't arriving (sender side broken), or START_SSH_SERVER=false (host sshd in use), or TRUSTED_ADDRESSES excludes the envoy pod CIDR (Gitea reads the header but ignores it as untrusted).
HTTP healthcheck rejected with Unexpected proxy header: [71 69 84 ...]. The bytes decode to "GET ..." — kubelet's HTTP probe hit a listener that requires PPv2. Either disable USE_PROXY_PROTOCOL (recommended; XFF covers the IP visibility) or switch the probe to tcpSocket.
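
Decoding the byte list in an "Unexpected proxy header" message is a quick way to tell what kind of client hit the listener. A minimal Python sketch, assuming the message keeps the bracketed decimal format shown above (the bytes after 71 69 84 in the example are invented for illustration):

```python
import re


def decode_header_bytes(message: str) -> str:
    """Turn the bracketed decimal byte list in a log message into text."""
    bracketed = message[message.index("["):]
    numbers = re.findall(r"\d+", bracketed)
    return bytes(int(n) for n in numbers).decode("ascii", errors="replace")


line = "Unexpected proxy header: [71 69 84 32 47 32 72 84 84 80]"
print(decode_header_bytes(line))  # GET / HTTP
```

Text starting with GET points at an HTTP probe or a direct browser request; SSH- would indicate a bare SSH client reaching a PPv2-only listener.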

Security note

Production envoy holds the wildcard certificate and terminates TLS — that's where any L7 trust boundary lives. The edge envoy is intentionally L4-only and never sees cleartext or holds private keys, which keeps the cert material isolated to the trusted zone.

mTLS between the two envoys isn't a natural fit for this topology. The bytes between edge and production are the original client's TLS handshake plus the PPv2 prefix; the edge isn't terminating anything, so there's no second TLS layer to mutually authenticate on. Proxy-to-proxy confidentiality and identity in this setup are delegated to netbird/WireGuard: anything that can reach production envoy's listener has already passed WireGuard authentication. If you ever need cryptographic identity at the application layer between the proxies, the path is to wrap the inter-proxy hop in an additional TLS tunnel — at the cost of double-encryption and more cert plumbing.

Reference

Envoy Gateway: ClientTrafficPolicy
Envoy Gateway: BackendTrafficPolicy
PROXY Protocol spec (HAProxy)
Gitea config cheat sheet — [server] section, especially USE_PROXY_PROTOCOL vs SSH_SERVER_USE_PROXY_PROTOCOL
