Talos production cluster

The production cluster is the primary Kubernetes cluster of the homelab. It runs the bulk of workloads: media (Jellyfin, Immich, Tube Archivist), productivity (Nextcloud, Outline, Paperless), DevOps (Gitea, n8n, Keycloak), and observability.

This page is a top-down overview that links into every layer.

Stack

┌──────────────────────────────┐
│  Apps (45+)                  │ ← docs/apps
└──────────────┬───────────────┘
               │ scheduled by
┌──────────────▼───────────────┐
│  Platform controllers        │ ← docs/platform
│  Cilium · Longhorn · CNPG    │
│  Envoy · k8up · Kyverno · …  │
└──────────────┬───────────────┘
               │ runs on
┌──────────────▼───────────────┐
│  Kubernetes (kubeadm-less)   │
│  provided by Talos           │
└──────────────┬───────────────┘
               │ runs on
┌──────────────▼───────────────┐
│  Talos Linux (OS)            │ ← docs/foundation/talos
└──────────────┬───────────────┘
               │ inside VMs on
┌──────────────▼───────────────┐
│  Proxmox VE (3 nodes)        │ ← docs/foundation/proxmox
└──────────────┬───────────────┘
               │ on
┌──────────────▼───────────────┐
│  Intel NUC 13 Pro × 3        │ ← docs/hardware/talos
└──────────────────────────────┘

Layers at a glance

| Layer | What | Doc |
| --- | --- | --- |
| Hardware | Intel NUC 13 Pro × 3 (i5-1340P, 64 GB RAM, 2 TB NVMe each) | hardware/talos |
| Hypervisor | Proxmox VE (3-node cluster proxmox1/2/3); 3 control-plane VMs + 3 worker VMs + 3 NetBird LXCs | foundation/proxmox |
| Cluster OS | Talos Linux, configured by Talhelper | foundation/talos |
| GitOps | Flux pulling from Gitea, decrypting secrets via SOPS | foundation/flux |
| Network | Cilium (kube-proxy replacement, WireGuard, L2 announcements) | platform/cilium |
| Storage | Longhorn (cluster-native block) + NFS to TrueNAS (bulk media) | platform/longhorn · hardware/nas |
| Ingress | Envoy Gateway with PROXY-protocol-v2 from the edge cluster | platform/envoy-gateway · topics/envoy-gateway-proxy-protocol-v2 |
| Secrets | External Secrets + SOPS | platform/external-secrets · operations/sops |
| Backups | k8up Schedule → Restic → Hetzner S3 (warm) → Synology NAS (hot) → encrypted WD drives (cold) | operations/ |

Cluster machines

| Role | Name | Host | Notes |
| --- | --- | --- | --- |
| Control plane | talos-cp-01 | proxmox1 | etcd member |
| Control plane | talos-cp-02 | proxmox2 | etcd member |
| Control plane | talos-cp-03 | proxmox3 | etcd member |
| Worker | talos-worker-01 | proxmox1 | extra NICs into VLAN 104+105, GPU passthrough |
| Worker | talos-worker-02 | proxmox2 | extra NICs into VLAN 104+105, GPU passthrough |
| Worker | talos-worker-03 | proxmox3 | extra NICs into VLAN 104+105, GPU passthrough |
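
Because each Proxmox host carries one control-plane VM and one worker VM, taking a host down for maintenance affects exactly two Kubernetes nodes. The sketch below encodes that mapping as inferred from the table and prints (without running) the drain commands for one host. The function names nodes_on_host and drain_plan are hypothetical helpers, not part of this repository, and the drain flags are a common choice rather than this cluster's documented procedure.

```shell
#!/bin/sh
# Hypothetical helper: list the Talos nodes hosted on a given Proxmox host.
# Assumes the naming pattern from the table: proxmoxN hosts
# talos-cp-0N and talos-worker-0N.
nodes_on_host() {
  case "$1" in
    proxmox[1-3])
      printf 'talos-cp-0%s\ntalos-worker-0%s\n' "${1#proxmox}" "${1#proxmox}"
      ;;
    *)
      echo "unknown host: $1" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: print the drain commands for every node on a host.
# Nothing is executed; review the output before running it.
drain_plan() {
  nodes=$(nodes_on_host "$1") || return 1
  for node in $nodes; do
    echo "kubectl drain $node --ignore-daemonsets --delete-emptydir-data"
  done
}

drain_plan proxmox2
```

Printing the plan instead of executing it keeps the sketch safe to try; with three etcd members, draining one host at a time leaves two members up, which still forms a quorum.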

Networking summary

| Plane | What |
| --- | --- |
| Pod CIDR | 10.100.0.0/16 |
| Service CIDR | default (10.96.0.0/12) |
| Node mgmt | VLAN 100 (192.168.100.0/24); control planes + workers |
| Storage | VLAN 104 (192.168.104.0/24); workers only, for NFS to TrueNAS |
| Public | VLAN 105 (192.168.105.0/24); workers only, for ingress |
| Inter-site | NetBird mesh; the production network exposes those VLANs to peers |

Full network map: Fabric overview · NetBird · UniFi.
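
When several address ranges coexist like this, a quick overlap check is a useful sanity test. The following is a minimal POSIX shell sketch, assuming a 64-bit shell arithmetic implementation; ip_to_int, cidr_range and cidrs_overlap are hypothetical helper names introduced here for illustration only.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# The subshell body keeps the IFS change and positional parameters local.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print "first last" (as integers) for a CIDR such as 10.100.0.0/16.
cidr_range() (
  base=$(ip_to_int "${1%/*}")
  len=${1#*/}
  mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
  start=$(( base & mask ))
  end=$(( start | (4294967295 ^ mask) ))
  echo "$start $end"
)

# Succeed (exit 0) when the two CIDRs share at least one address.
cidrs_overlap() (
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
)

# Values taken from the table above.
pod=10.100.0.0/16
svc=10.96.0.0/12
if cidrs_overlap "$pod" "$svc"; then
  echo "overlap: $pod and $svc"
else
  echo "disjoint: $pod and $svc"
fi
```

With the values as listed, the check reports an overlap: 10.96.0.0/12 spans 10.96.0.0 through 10.111.255.255, which contains 10.100.0.0/16. It may be worth confirming the effective Service CIDR against the rendered Talos machine config.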

What runs here

  • All persistent apps with stateful PVCs (Longhorn-backed).
  • Anything GPU-accelerated — the worker nodes carry the Intel iGPU passthrough.
  • The control plane for Flux's primary GitRepository and the SOPS keys.

The edge cluster handles only public-facing ingress + sidecar workloads; everything stateful lives here.

Lifecycle commands

# Render & apply Talos config
talhelper genconfig
talhelper gencommand apply | sh

# Upgrade
talhelper gencommand upgrade --extra-flags "--preserve" | sh

# Reconcile Flux
flux reconcile kustomization flux-system

# Inspect
kubectl get nodes -o wide
kubectl get pods -A | grep -v Running | grep -v Completed
flux get all -A
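
The grep-based pod check above matches the words Running and Completed anywhere on the line, and it does not flag pods that are Running but not ready. A slightly stricter filter can be sketched with awk, assuming the default column layout of kubectl get pods -A (NAMESPACE, NAME, READY, STATUS, ...). The function name unhealthy_pods is a hypothetical helper, not an existing command.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods -A` output on stdin and print
# the header plus every pod that is neither Completed nor fully ready.
unhealthy_pods() {
  awk '
    NR == 1 { print; next }        # keep the header row
    $4 == "Completed" { next }     # finished Jobs are expected
    {
      split($3, ready, "/")        # READY column looks like 1/1
      if ($4 != "Running" || ready[1] != ready[2]) print
    }
  '
}

# Usage (assumes kubectl is configured for this cluster):
#   kubectl get pods -A | unhealthy_pods
```

Matching on the STATUS column rather than the whole line avoids hiding a pod whose name happens to contain one of the filtered words, and comparing the two halves of READY also surfaces pods that are stuck failing their readiness probes.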

See also