Monitoring
Prometheus-compatible metrics stack with VictoriaMetrics, Grafana, Loki, and Fluent Bit.
About
This namespace deploys the full observability stack for the homelab cluster. It combines VictoriaMetrics for metrics storage, Grafana for dashboards, Loki for log aggregation, and Fluent Bit as the log collector DaemonSet. Self-hosting this stack avoids cloud observability costs while providing full access to all cluster telemetry.
AlternativeTo
Cloud Hosted
| Tool | Open Source | Free Tier | Monthly Cost |
|---|---|---|---|
| Grafana Cloud | Yes | Limited | From $19/mo |
| Datadog | No | No | From $15/host |
| New Relic | No | Limited | Pay-as-you-go |
Installation
Architecture
- HelmReleases: `victoria-stack` (VictoriaMetrics operator, vmsingle, vmstack + Grafana), `loki`, `fluent-bit`
- DaemonSet: Fluent Bit log collector on all nodes, mounts `/var/log`
- Additional: `grafana-to-ntfy` Deployment proxies Grafana alerts to ntfy; OpenTelemetry Collector
- Storage: Longhorn-encrypted PVCs for VictoriaMetrics and Loki; S3/MinIO for Loki chunk storage
- Networking: HTTPRoutes for Grafana, VictoriaMetrics UI, and Loki
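As a minimal sketch of the Networking item, an HTTPRoute for Grafana could look roughly like the following. The Gateway name and namespace, the hostname, and the backend Service name and port are placeholder assumptions, not values taken from this repository.

```yaml
# Sketch only: every name below is a placeholder assumption.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: grafana
  namespace: monitoring            # assumed namespace name
spec:
  parentRefs:
    - name: example-gateway        # hypothetical Gateway
      namespace: example-gateway-ns
  hostnames:
    - grafana.example.com          # placeholder hostname
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: grafana            # hypothetical Service name
          port: 80                 # assumed Service port
```

The VictoriaMetrics UI and Loki routes would presumably follow the same shape with their own hostnames and backend Services.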
Security
- Fluent Bit runs as `runAsUser: 0` (requires node log access)
- `grafana-to-ntfy` runs as `runAsUser: 1000`, `runAsNonRoot: true`, `readOnlyRootFilesystem: true`, capabilities dropped
- All secrets SOPS-encrypted with age
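For illustration, the two security postures described here map onto container-level `securityContext` fragments along these lines. Only the fields named in this section come from the README; dropping `ALL` capabilities is an assumption, since the text only says capabilities are dropped.

```yaml
# Container-level securityContext fragments (sketch, not the deployed manifests).

# Fluent Bit: runs as root so it can read node logs under /var/log.
securityContext:
  runAsUser: 0
---
# grafana-to-ntfy: unprivileged user with a read-only root filesystem.
securityContext:
  runAsUser: 1000
  runAsNonRoot: true
  readOnlyRootFilesystem: true
  capabilities:
    drop:
      - ALL   # assumption: the README does not list which capabilities are dropped
```

`readOnlyRootFilesystem` and `capabilities` are only valid on the container, not on the pod-level `securityContext`, which is why both fragments are shown at container level.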
Updates
Managed by Renovate. grafana-to-ntfy and otelcol images are digest-pinned.
Data Management
- PVCs: Longhorn-encrypted PVCs for VictoriaMetrics (`vmsingle`, `vlsingle`) and Loki data
- S3: MinIO / Loki chunk storage for long-term log retention
- Backups: No k8up schedule present. Data durability via Longhorn replication.
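A rough sketch of what a claim against an encrypted Longhorn StorageClass looks like is shown below. The claim name, StorageClass name, and size are hypothetical; in practice the Helm charts and the VictoriaMetrics operator most likely generate these claims from their own values rather than from a standalone manifest.

```yaml
# Sketch only: names and size are placeholders, not taken from this repository.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: loki-data                        # hypothetical claim name
  namespace: monitoring                  # assumed namespace name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn-encrypted   # hypothetical StorageClass name
  resources:
    requests:
      storage: 10Gi                      # placeholder size
```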
User Management
Grafana OIDC is configured via `GF_AUTH_GENERIC_OAUTH_*` env vars loaded from a SOPS-encrypted secret. Users are authenticated via the cluster's OIDC provider.
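To make the mechanism concrete, the sketch below shows the decrypted shape such a secret might take and how a container could load it. The secret name, client ID, and all URLs are placeholder assumptions; the variable names follow Grafana's `GF_<SECTION>_<KEY>` convention for the `[auth.generic_oauth]` settings.

```yaml
# Decrypted shape of a hypothetical Secret; in Git it would be SOPS-encrypted.
apiVersion: v1
kind: Secret
metadata:
  name: grafana-oidc                 # hypothetical name
  namespace: monitoring              # assumed namespace name
stringData:
  GF_AUTH_GENERIC_OAUTH_ENABLED: "true"
  GF_AUTH_GENERIC_OAUTH_CLIENT_ID: grafana             # placeholder
  GF_AUTH_GENERIC_OAUTH_CLIENT_SECRET: change-me       # placeholder
  GF_AUTH_GENERIC_OAUTH_SCOPES: openid profile email
  GF_AUTH_GENERIC_OAUTH_AUTH_URL: https://idp.example.com/authorize   # placeholder
  GF_AUTH_GENERIC_OAUTH_TOKEN_URL: https://idp.example.com/token      # placeholder
  GF_AUTH_GENERIC_OAUTH_API_URL: https://idp.example.com/userinfo     # placeholder
---
# Fragment of the Grafana container spec: load every key above as an env var.
envFrom:
  - secretRef:
      name: grafana-oidc
```

Whether the deployed chart uses `envFrom` or individual `valueFrom.secretKeyRef` entries is not stated here; both end with the same variables in the Grafana process environment.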
Configuration Management
- Helm chart values in ConfigMaps for victoria-stack, Loki, and Fluent Bit
- Grafana OIDC credentials, SMTP config, and ntfy auth from SOPS-encrypted secrets
- `ntfy-auth` secret used by `grafana-to-ntfy` for push notification delivery
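As a sketch of how chart values held in a ConfigMap can be wired into a release, the fragment below assumes the HelmReleases are Flux `HelmRelease` objects, which this README implies but does not state. The HelmRepository name, ConfigMap name, and interval are hypothetical, and older Flux versions use a `v2beta2` API version instead of `v2`.

```yaml
# Sketch only: assumes Flux helm-controller; names are placeholders.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: monitoring              # assumed namespace name
spec:
  interval: 1h                       # placeholder reconcile interval
  chart:
    spec:
      chart: loki
      sourceRef:
        kind: HelmRepository
        name: grafana                # hypothetical HelmRepository
  valuesFrom:
    - kind: ConfigMap
      name: loki-values              # hypothetical ConfigMap holding the values
      valuesKey: values.yaml         # key inside the ConfigMap
```

Keeping values in a ConfigMap separates non-secret chart configuration from the SOPS-encrypted secrets, which can be referenced the same way with `kind: Secret`.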
Administration
Usage
Access Grafana to view cluster dashboards, query metrics with PromQL/MetricsQL, and browse logs via Loki. Alerts configured in Grafana are forwarded to ntfy via the grafana-to-ntfy proxy service. Fluent Bit collects container logs from all nodes automatically.
Metadata
- Image: `kittyandrew/grafana-to-ntfy:latest@sha256:68244c5041b80dcb852822dd001d5c4c76d66dab6093ba1596656d565d53090b`