Preserving Real Client IPs Across an Edge → Production Envoy Chain
This document describes how to preserve the original client IP across an edge → tunnel → production proxy chain when using Envoy Gateway with mergeGateways, mixed HTTPS / SSH traffic, and a netbird overlay between sites. The same principles apply to any setup with an L4 NAT in the path (AWS NLB, kube-proxy, HAProxy, etc.) — netbird is just one example.
:::note Why Gitea as the worked example?
Gitea is convenient here because it exposes both HTTP (web UI, API) and SSH (git operations) on separate listeners, which lets a single document cover both mental models side by side: XFF carrying the client IP after L7 termination, and PPv2 carrying it across pure-L4 hops to a non-HTTP application. The patterns aren't Gitea-specific. Every app you'd put behind this gateway falls into one of the two paths: pure-HTTP services (a web app, an API, a registry) follow the HTTPS path; pure-TCP services (Postgres, Redis, MQTT, an SMTP relay) follow the SSH-style chain. Substitute your own backend and the policies don't change.
:::
Architecture
            ┌──────────────────────────────────────┐
            │ Edge cluster                         │
            │   ┌─────────────┐                    │
client ───► │   │ edge envoy  │── outbound ─┐      │
            │   │ (L4 only)   │             │      │
            │   └─────────────┘             │      │
            │              ┌────────────────▼─────┐│
            │              │ netbird sidecar (WG) ││
            │              └──────────┬───────────┘│
            └─────────────────────────┼────────────┘
                                      │ encrypted WG
            ┌─────────────────────────┼────────────┐
            │ Production cluster      │            │
            │              ┌──────────▼─────────┐  │
            │              │ netbird peer (NAT) │  │
            │              └──────────┬─────────┘  │
            │              ┌──────────▼─────────┐  │
            │              │ production envoy   │  │
            │              │ TLS terminate +    │  │
            │              │ HTTPRoute / TCP    │  │
            │              └────┬──────────┬────┘  │
            │                   │          │       │
            │           ┌───────▼────┐ ┌───▼─────┐ │
            │           │ HTTP       │ │ Gitea   │ │
            │           │ backends   │ │ rootless│ │
            │           │ (cleartext │ │ SSH:2222│ │
            │           │ in-cluster)│ └─────────┘ │
            │           └────────────┘             │
            └──────────────────────────────────────┘
Routes:
- HTTPS: TLSRoute (SNI passthrough) at the edge, HTTPRoute at production. The production envoy terminates TLS using a wildcard certificate; the edge envoy never sees cleartext or holds any cert. Once production has terminated, traffic flows in cleartext to in-cluster backends with X-Forwarded-For carrying the client IP.
- SSH: TCPRoute end to end, terminating at Gitea's built-in SSH server (rootless image, listens on 2222).
This split — passthrough on the edge, termination in production — keeps the wildcard cert isolated to the trusted zone and gives production envoy full L7 visibility for header-based filtering, geoip, etc.
Why the client IP disappears
The netbird peer in production acts as a routing gateway. When it forwards a packet out of the WireGuard tunnel into the prod network, it rewrites the IP-layer source to its own address. This is normal masquerade behavior — anything not in the WireGuard mesh address space gets SNATed.
By the time a connection reaches the production envoy, the IP-layer source is the netbird peer's address. The real client's IP, and even the edge envoy's IP, have been overwritten and are unrecoverable from the IP layer.
X-Forwarded-For solves this for HTTP, but XFF lives inside the application protocol. It requires some proxy in the chain to terminate TLS and parse HTTP. With the edge envoy doing TLSRoute passthrough and the SSH path having no application-layer headers ever, there's no place for XFF until production envoy terminates TLS — or at all, in the SSH case.
The solution: PROXY protocol v2
PPv2 is a small binary prefix prepended to the TCP stream before any TLS handshake or application data. It carries the original source/destination IP and port. Because it lives in the TCP payload, NAT can't touch it — netbird forwards the bytes verbatim. The production envoy reads the prefix at the connection layer, knows the real source, and from there can either re-emit PPv2 on egress (for SSH) or fold the IP into XFF on egress (for HTTP after TLS termination).
The wire format on each PPv2-wrapped hop looks like:
[PPv2 header (~28 bytes IPv4)] [TLS handshake or SSH banner ...]
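The 28-byte figure for IPv4 can be checked by building a header by hand. The following is a minimal sketch based on the PROXY protocol v2 layout (12-byte signature, version/command byte, family/protocol byte, 2-byte length, then the address block), assuming no TLV extensions. The function name and the addresses are hypothetical and chosen only for illustration.

```python
import socket
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def build_ppv2_header(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> bytes:
    """Build a PPv2 header for a proxied TCP-over-IPv4 connection (no TLVs)."""
    addresses = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    # 0x21 = version 2 + PROXY command; 0x11 = AF_INET + STREAM (TCP over IPv4).
    fixed = PP2_SIGNATURE + struct.pack("!BBH", 0x21, 0x11, len(addresses))
    return fixed + addresses


# Hypothetical client and destination addresses, for illustration only.
header = build_ppv2_header("203.0.113.42", 51234, "198.51.100.10", 443)
print(len(header))           # 28
print(header[:12].hex(" "))  # 0d 0a 0d 0a 00 0d 0a 51 55 49 54 0a
```

Everything after these 28 bytes is the untouched client stream, which is why a NAT hop that rewrites only IP-layer addresses leaves the header intact.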
Two knobs in Envoy Gateway
PPv2 has two independent directions, configured by two different policies:
- ClientTrafficPolicy.proxyProtocol (ingress): the listener accepts PPv2 on incoming connections and uses the header's IP as the logical source.
- BackendTrafficPolicy.proxyProtocol (egress): the cluster prepends PPv2 when proxying to a specific backend, carrying whatever IP envoy currently sees.
These are independent. You enable each one separately on the side of the connection it applies to. A given proxy can do one, the other, both, or neither.
Deprecation note: spec.enableProxyProtocol: true is deprecated as of Envoy Gateway v1.5. Use spec.proxyProtocol: { optional: false } instead. Setting optional: true selects a permissive mode that accepts both PPv2 and bare connections, useful when one listener serves both proxy-aware peers and direct browser clients.
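The permissive mode is possible because a PPv2 header is unambiguous: no TLS ClientHello or SSH banner begins with the 12-byte PPv2 signature. The sketch below illustrates that detection idea only; it is not Envoy's implementation, and the function name and sample bytes are hypothetical.

```python
# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def starts_with_ppv2(first_bytes: bytes) -> bool:
    """Return True if the first bytes of a connection carry the PPv2 signature."""
    return first_bytes[:12] == PP2_SIGNATURE


# A proxy-aware peer: signature first, then the rest of the header.
from_proxy = PP2_SIGNATURE + b"\x21\x11\x00\x0c"
# A direct browser client: the stream opens with a TLS handshake record (0x16).
from_browser = b"\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x03\x00"

print(starts_with_ppv2(from_proxy))    # True
print(starts_with_ppv2(from_browser))  # False
```

A strict listener would close the second connection; a permissive one would fall back to the IP-layer source address for it.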
Mental model: HTTPS vs SSH
This is the part that's worth internalizing before touching any YAML.
HTTPS path (passthrough at edge, terminate at production)
client ─[1]─► edge ─[2]─► production envoy ─[3]─► HTTP backend
(terminates TLS)
| Hop | Mechanism | Why |
|---|---|---|
| 1 | nothing extra | Direct TCP, source IP is intact at TCP level |
| 2 | PPv2 (egress edge → ingress production) | Production sees netbird IP otherwise |
| 3 | XFF in HTTP | Production envoy terminates TLS and adds it natively |
Once TLS is terminated at production envoy, that envoy is a normal HTTP-aware proxy. It reads the real IP from the PPv2 prefix it just consumed, then writes that IP into the X-Forwarded-For header on every HTTP request it forwards to a backend. From there, XFF carries the IP to the application. One PPv2 hop is enough on the HTTPS path.
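The hand-off from PPv2 to XFF can be modelled in a few lines. This is a simplified sketch of the usual append convention for X-Forwarded-For, not Envoy's code; Envoy's real handling depends on its own configuration, and the function name is hypothetical.

```python
from typing import Optional


def append_xff(existing: Optional[str], client_ip: str) -> str:
    """Append client_ip to an X-Forwarded-For value, creating it if absent."""
    if not existing:
        return client_ip
    return f"{existing}, {client_ip}"


# No XFF on the incoming request: the header is created.
print(append_xff(None, "203.0.113.42"))            # 203.0.113.42
# The client sent its own XFF: the address learned from PPv2 is appended last.
print(append_xff("198.51.100.7", "203.0.113.42"))  # 198.51.100.7, 203.0.113.42
```

Because the proxy's entry is always the rightmost one, a backend that trusts only entries written by known proxies is not misled by a value the client supplied itself.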
SSH path (TCPRoute end to end)
client ─[1]─► edge ─[2]─► production ─[3]─► Gitea SSH (terminates SSH)
| Hop | Mechanism | Why |
|---|---|---|
| 1 | nothing extra | Direct TCP |
| 2 | PPv2 (edge → production) | Same NAT problem as HTTPS |
| 3 | PPv2 (production → Gitea) | SSH has no application-layer header for client IP, ever |
The asymmetry: SSH needs PPv2 on every L4 hop, all the way to its terminator. HTTPS needs PPv2 only on the L4 hops before TLS termination; once production envoy terminates HTTPS, XFF takes over.
For this topology, that means:
- One ingress ClientTrafficPolicy on production envoy (covers both paths, because PPv2 acceptance happens before route dispatch).
- One egress BackendTrafficPolicy on production for the SSH TCPRoute (no equivalent is needed for the HTTP backends, since XFF handles it).
Step-by-step configuration
1. Edge cluster: emit PPv2 outbound
One BackendTrafficPolicy per route kind, targeting the routes that traverse netbird to production.
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata: { name: send-ppv2-https }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: TLSRoute, name: to-production-https }
  proxyProtocol: { version: V2 }
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata: { name: send-ppv2-ssh }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: TCPRoute, name: to-production-ssh }
  proxyProtocol: { version: V2 }
2. Production: HTTPS listener with TLS termination
The Gateway holds the wildcard cert and terminates TLS on an HTTPS listener (tls.mode: Terminate). HTTPRoutes attach to this listener for L7 dispatch.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata: { name: production-eg }
spec:
  gatewayClassName: eg
  listeners:
    - name: https
      port: 443
      protocol: HTTPS
      hostname: "*.example.com"
      tls:
        mode: Terminate
        certificateRefs:
          - { kind: Secret, name: wildcard-example-com }
      allowedRoutes: { kinds: [{ kind: HTTPRoute }] }
    - name: gitea-ssh
      port: 2222
      protocol: TCP
      allowedRoutes: { kinds: [{ kind: TCPRoute }] }
3. Production: accept PPv2 on the Gateway
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: ClientTrafficPolicy
metadata: { name: accept-ppv2 }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: Gateway, name: production-eg }
  proxyProtocol:
    optional: false # strict mode: every connection must speak PPv2
With mergeGateways: true, all Gateways sharing the GatewayClass become listeners on one Envoy. If any listener on the merged Envoy serves direct browser clients (no upstream proxy adding PPv2), don't enable PPv2 globally, because those connections will be reset at L4. Either:
- Scope the policy with sectionName to specific listeners, or
- Use optional: true to accept both, at a small detection cost.
PPv2 acceptance happens at the connection layer, before TLS termination and before route dispatch, so a single policy on the Gateway covers both the HTTPS and TCPRoute listeners.
4. Production: emit PPv2 to Gitea SSH
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata: { name: send-ppv2-gitea }
spec:
  targetRefs:
    - { group: gateway.networking.k8s.io, kind: TCPRoute, name: gitea-ssh }
  proxyProtocol: { version: V2 }
This is the policy that's easy to forget — without it, Gitea sees the production envoy's pod IP because the egress hop carries no PPv2. The ClientTrafficPolicy only handles the ingress side. No equivalent policy is needed for the HTTP backends — production envoy terminates TLS, parses HTTP, and adds XFF on outgoing HTTP requests automatically.
5. Gitea: read PPv2 on the SSH listener (rootless, env vars)
Gitea has two separate PROXY protocol toggles. USE_PROXY_PROTOCOL only affects the HTTP listener; SSH has its own SSH_SERVER_USE_PROXY_PROTOCOL. Since Gitea's HTTP is reached via production envoy (which already adds XFF), turn off the HTTP wrapper and use XFF natively. SSH is the only listener that needs PPv2.
env:
- { name: GITEA__server__USE_PROXY_PROTOCOL, value: "false" }
- { name: GITEA__server__SSH_SERVER_USE_PROXY_PROTOCOL, value: "true" }
- { name: GITEA__server__PROXY_PROTOCOL_HEADER_TIMEOUT, value: "5s" }
- { name: GITEA__server__PROXY_PROTOCOL_TRUSTED_ADDRESSES, value: "10.100.0.0/16" }
- { name: GITEA__security__REVERSE_PROXY_TRUSTED_PROXIES, value: "10.100.0.0/16" } # this key lives in [security], not [server]
- { name: GITEA__server__START_SSH_SERVER, value: "true" }
Notes:
- The format is GITEA__<section>__<KEY>: section lowercase, key as in app.ini, double underscores.
- PROXY_PROTOCOL_HEADER_TIMEOUT and PROXY_PROTOCOL_TRUSTED_ADDRESSES are shared parsing knobs: they apply to whichever listener has the wrapper enabled (SSH only here).
- PROXY_PROTOCOL_TLS_BRIDGING is dropped because it only affects the HTTP wrapper.
- TRUSTED_ADDRESSES is critical for SSH safety. Without it, anyone reachable on the SSH port could spoof a source IP via a forged PPv2 header. Pin to the production envoy's pod CIDR.
- REVERSE_PROXY_TRUSTED_PROXIES tells Gitea which sources are trusted to set XFF; this is how the real client IP reaches Gitea on the HTTP side. Same CIDR as above.
- START_SSH_SERVER=true is required: Gitea's PPv2 support only wraps Gitea's own SSH listener. If you delegate to host sshd, OpenSSH doesn't speak PPv2 and you'd need a wrapper like mmproxy or socat.
- Restart the pod after adding env vars.
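The trust rule behind the two CIDR settings can be sketched as follows: honour the address claimed in a PPv2 header (or in XFF) only when the TCP peer that delivered it sits inside a trusted range. This illustrates the idea only and is not Gitea's implementation; the function names and the peer addresses are hypothetical, and the CIDR is the example value from the env block above.

```python
import ipaddress
from typing import Sequence


def is_trusted_sender(peer_ip: str, trusted_cidrs: Sequence[str]) -> bool:
    """Return True if the directly connected peer is inside a trusted CIDR."""
    peer = ipaddress.ip_address(peer_ip)
    return any(peer in ipaddress.ip_network(cidr) for cidr in trusted_cidrs)


def effective_client_ip(peer_ip: str, claimed_ip: str, trusted_cidrs: Sequence[str]) -> str:
    """Use the claimed client IP only when the sender is trusted to state it."""
    if is_trusted_sender(peer_ip, trusted_cidrs):
        return claimed_ip
    return peer_ip


TRUSTED = ["10.100.0.0/16"]  # production envoy pod CIDR, as in the env block above

# Header delivered by a production envoy pod: the claim is honoured.
print(effective_client_ip("10.100.3.7", "203.0.113.42", TRUSTED))  # 203.0.113.42
# Header delivered by an unknown peer: the claim is ignored.
print(effective_client_ip("192.0.2.9", "203.0.113.42", TRUSTED))   # 192.0.2.9
```

The second case is also what a too-narrow CIDR looks like in practice: the header arrives intact, but the logged address stays the sender's own.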
6. (Optional) HTTP healthcheck note
With USE_PROXY_PROTOCOL=false (option above), kubelet httpGet probes work normally — no extra change. If you ever flip the HTTP wrapper back on, switch the probe to tcpSocket: { port: 3000 } so kubelet doesn't get rejected for not speaking PPv2.
Verification
Status conditions on each policy:
kubectl get clienttrafficpolicy,backendtrafficpolicy -A -o yaml | \
yq '.items[] | {name: .metadata.name, conds: .status.ancestors[].conditions}'
Look for Accepted: True. TargetNotFound means a typo in targetRefs.name or wrong namespace.
Live config dump: the receiver-side filter and the sender-side transport socket are visible in Envoy's admin endpoint:
# receiver: proxy_protocol listener filter present?
kubectl exec -n envoy-gateway-system <production-envoy-pod> -c envoy -- \
curl -s localhost:19000/config_dump | \
jq '.. | .listener_filters? // empty | map(.name)'
# sender: upstream_proxy_protocol transport socket present?
kubectl exec -n envoy-gateway-system <production-envoy-pod> -c envoy -- \
curl -s localhost:19000/config_dump | \
jq '.. | select(.transport_socket?.name? == "envoy.transport_sockets.upstream_proxy_protocol")'
Bytes on the wire: the most direct check. From inside the receiving pod (production envoy or Gitea):
tcpdump -i eth0 -X -s 200 'tcp port 2222 and tcp[tcpflags] & tcp-push != 0' | head -40
A successful hop starts with the 12-byte PPv2 signature, visible in the -X hex dump as:
0d 0a 0d 0a 00 0d 0a 51 55 49 54 0a ...QUIT.
If those bytes aren't there, the previous hop isn't emitting PPv2.
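When the signature is present, the bytes that follow it can be decoded to confirm which address the previous hop actually sent. The sketch below parses a TCP-over-IPv4 PPv2 header according to the PROXY protocol v2 layout. The function name is hypothetical, and the captured bytes are a hand-built example rather than a real capture.

```python
import socket
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_ppv2_ipv4(data: bytes):
    """Return (src_ip, src_port, dst_ip, dst_port), or None if data does not
    start with a PPv2 PROXY header for TCP over IPv4."""
    if len(data) < 28 or not data.startswith(PP2_SIGNATURE):
        return None
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    # 0x21 = version 2 + PROXY command; 0x11 = AF_INET + STREAM.
    if ver_cmd != 0x21 or fam_proto != 0x11 or length < 12:
        return None
    src, dst, src_port, dst_port = struct.unpack("!4s4sHH", data[16:28])
    return socket.inet_ntoa(src), src_port, socket.inet_ntoa(dst), dst_port


# Hand-built example bytes, as they might be copied out of a hex dump.
captured = bytes.fromhex(
    "0d0a0d0a000d0a515549540a"  # signature
    "2111000c"                  # v2 PROXY, TCP over IPv4, 12 address bytes
    "cb00712a" "c633640a"       # src 203.0.113.42, dst 198.51.100.10
    "c82208ae"                  # src port 51234, dst port 2222
)

print(parse_ppv2_ipv4(captured))
# ('203.0.113.42', 51234, '198.51.100.10', 2222)
print(parse_ppv2_ipv4(b"SSH-2.0-OpenSSH_9.6\r\n"))  # None: a bare SSH banner
```

If the decoded source is the netbird peer or an envoy pod rather than the client, the header is being generated one hop too late.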
End-to-end IP check: trigger a connection from a known external IP and verify it appears in application logs:
# Note your public IP first
curl -s ifconfig.me; echo
# → e.g. 203.0.113.42
# SSH (use a fake user so auth fails fast and a log line is produced)
ssh -p 22 -T -o StrictHostKeyChecking=no -o BatchMode=yes \
not-a-real-user@gitea.example.com
# Watch Gitea logs
kubectl logs -n <gitea-ns> deploy/gitea -f --tail=20 | \
grep -E 'Failed connection|Failed authentication'
The IP that appears tells you exactly which hop is intact:
| What you see in logs | What it means |
|---|---|
| 203.0.113.42 (your real IP) | All hops working end-to-end |
| Production envoy pod IP | Production → Gitea hop missing PPv2 (BackendTrafficPolicy or SSH_SERVER_USE_PROXY_PROTOCOL) |
| Netbird peer IP | Edge → production hop missing PPv2 (BackendTrafficPolicy on edge or ClientTrafficPolicy on production) |
Common pitfalls
- Browser hits a strict PPv2 listener. The connection is reset at L4 with no error page. Symptom: ERR_CONNECTION_RESET in the browser, WRONG_VERSION_NUMBER from the TLS inspector in Envoy logs. Use optional: true or split listeners.
- mergeGateways listener coherence. Two Gateways on the same merged Envoy can't have conflicting PPv2 settings on the same port. Status will say something like "ClientTrafficPolicy is being applied to multiple http listeners on the same port".
- HTTPS works, SSH doesn't. Almost always means the egress BackendTrafficPolicy for the SSH route is missing or didn't apply, or SSH_SERVER_USE_PROXY_PROTOCOL isn't set on Gitea. XFF papers over the gap on HTTP, but SSH has no fallback.
- Confusing the two Gitea flags. USE_PROXY_PROTOCOL is HTTP-only. SSH needs SSH_SERVER_USE_PROXY_PROTOCOL separately. Setting only the first leaves SSH reading PPv2 bytes as the SSH version string and overflowing.
- Gitea logs still show envoy's pod IP after enabling everything. Either bytes aren't arriving (sender side broken), or START_SSH_SERVER=false (host sshd in use), or TRUSTED_ADDRESSES excludes the envoy pod CIDR (Gitea reads the header but ignores it as untrusted).
- HTTP healthcheck rejected with Unexpected proxy header: [71 69 84 ...]. The bytes decode to "GET ...": kubelet's HTTP probe hit a listener that requires PPv2. Either disable USE_PROXY_PROTOCOL (recommended; XFF covers the IP visibility) or switch the probe to tcpSocket.
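The decimal byte list in that last error message is quick to decode when you are unsure what actually hit the listener. A minimal sketch, using the three bytes quoted above and the opening bytes of the PPv2 signature for contrast:

```python
# Decimal byte values as they appear in the log message.
reported = [71, 69, 84]
print(bytes(reported).decode("ascii"))  # GET

# What a PPv2-speaking client would have sent first instead.
expected_start = b"\r\n\r\n\x00\r\nQUIT\n"
print(list(expected_start[:3]))  # [13, 10, 13]
```

Seeing printable ASCII such as GET or SSH- at the start of the rejected bytes means the client spoke its native protocol directly, with no PPv2 prefix in front.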
Security note
Production envoy holds the wildcard certificate and terminates TLS — that's where any L7 trust boundary lives. The edge envoy is intentionally L4-only and never sees cleartext or holds private keys, which keeps the cert material isolated to the trusted zone.
mTLS between the two envoys isn't a natural fit for this topology. The bytes between edge and production are the original client's TLS handshake plus the PPv2 prefix; the edge isn't terminating anything, so there's no second TLS layer to mutually authenticate on. Proxy-to-proxy confidentiality and identity in this setup are delegated to netbird/WireGuard: anything that can reach production envoy's listener has already passed WireGuard authentication. If you ever need cryptographic identity at the application layer between the proxies, the path is to wrap the inter-proxy hop in an additional TLS tunnel — at the cost of double-encryption and more cert plumbing.
Reference
- Envoy Gateway: ClientTrafficPolicy
- Envoy Gateway: BackendTrafficPolicy
- PROXY Protocol spec (HAProxy)
- Gitea config cheat sheet: [server] section, especially USE_PROXY_PROTOCOL vs SSH_SERVER_USE_PROXY_PROTOCOL