Monitoring with Prometheus & Grafana


For the upcoming WTF-Model release on Steam, I decided I needed some form of observability and monitoring for my development infrastructure.
The setup I built gives me full visibility across my network, devices, web services, and GitHub CI/CD automation, which is a critical requirement during the WTF-Model’s deployment and review cycles.

The setup provides real-time (and color-coded) insight into:

  • Internet connectivity and latency
  • System resource usage across devices
  • Website and API health
  • GitHub Actions runner availability for builds and deployments

Everything runs in a self-hosted Docker Compose environment powered by Prometheus and Grafana.

Below you can see the final result: the dashboard displayed in a three-quarter split view alongside my daily calendar on the iPad mini next to my main workstation, serving as a real-time status and information center.

Grafana dashboard displayed on iPad mini beside workstation for real-time monitoring

Architecture

I have this deployed inside a dedicated Docker network, with containers mapped to a consecutive port range (18000–18xxx) for a clean, predictable layout. I chose Docker Compose for its simplicity and efficiency in single-host environments. I spent an hour trying to run this with Helm on K8s, but it was clearly overkill, so I went with the less complex approach.

Secrets to access devices and scrape data are injected via a .env file, keeping the setup portable and secure.
I automated deployment and teardown via functionally identical Windows and Linux scripts:

  • bootstrap.sh / bootstrap.ps1 – builds images, initializes volumes, and launches containers
  • cleanup.sh / cleanup.ps1 – stops services, removes containers, and clears caches

Running container overview and dual deploy script output for Linux and Windows environments
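
To illustrate the .env-based secret injection described above, here is a minimal sketch of how a Python exporter could read its credentials at startup. It assumes Compose passes the .env values into the container as environment variables; the variable names FRITZ_USERNAME and FRITZ_PASSWORD are hypothetical, since the actual keys are not shown in this post.

```python
import os

# Hypothetical variable names; the real .env keys are not shown in this post.
REQUIRED_VARS = ("FRITZ_USERNAME", "FRITZ_PASSWORD")


def load_credentials(environ=None):
    """Return the required settings, or raise if any are missing or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing fast like this makes a forgotten .env entry show up as a clear startup error rather than as silently missing metrics.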

Exporters and Metrics

The monitoring relies on a combination of Prometheus exporters, both standard ones and custom ones written in Python, each containerized and integrated through docker-compose.yml.

Among them:

  • A customized FRITZ!Box exporter providing DSL, PPPoE, and internet connectivity status.
  • A Pi-hole exporter for DNS-level ad-blocking metrics.
  • The Blackbox exporter, performing ICMP and HTTP checks across LAN devices and websites via custom HTML health endpoints.
  • A Python-based IP exporter that determines the public IP, VPN latency, and geolocation using multiple providers (ipapi, ipwhois, ifconfig, ipsb, ipinfo).
  • Windows and Linux exporters, installed on node.lan, imac.lan, server.lan, and nuc.lan, tracking CPU, RAM, disk, and GPU utilization.
  • A GitHub exporter, monitoring self-hosted GitHub Actions runners used for WTF-Model deployments, indicating online, idle, or busy states.
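
As a rough illustration of the multi-provider idea behind the IP exporter, the sketch below tries a list of lookup functions in order and renders the first successful answer in the Prometheus text exposition format. It is not the actual exporter code: the metric names public_ip_up and public_ip_info are hypothetical, and the real HTTP calls to each provider are left out and represented by plain callables.

```python
from typing import Callable, Iterable, Optional, Tuple

# A provider is a (name, lookup) pair; lookup() returns the public IP as text.
Provider = Tuple[str, Callable[[], str]]


def first_public_ip(providers: Iterable[Provider]) -> Tuple[Optional[str], Optional[str]]:
    """Try each provider in order; return (provider_name, ip) of the first success."""
    for name, lookup in providers:
        try:
            ip = lookup().strip()
        except Exception:
            continue  # provider down or rate-limited: fall through to the next one
        if ip:
            return name, ip
    return None, None


def render_metrics(provider: Optional[str], ip: Optional[str]) -> str:
    """Render the lookup result in Prometheus text exposition format."""
    lines = [
        "# HELP public_ip_up 1 if any provider returned a public IP, else 0.",
        "# TYPE public_ip_up gauge",
        f"public_ip_up {1 if ip else 0}",
    ]
    if ip:
        lines += [
            "# HELP public_ip_info Current public IP and the provider that reported it.",
            "# TYPE public_ip_info gauge",
            f'public_ip_info{{ip="{ip}",provider="{provider}"}} 1',
        ]
    return "\n".join(lines) + "\n"
```

Querying several providers in sequence means a single outage or rate limit does not blank the public-IP panel; the provider label also shows which service answered.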

Prometheus scrapes all of these sources at tuned intervals to keep updates near real time without wasting resources:
ICMP probes run every second, lightweight metrics every few seconds, and FRITZ!Box data only every 10 seconds because of device API limitations.
I believe this balance gives fast, continuous visibility without adding unnecessary load.
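
To put those intervals into numbers, here is a small back-of-the-envelope calculation of how many scrapes each target receives per day. The 1-second and 10-second values come from the text above; the 5-second value is only an assumed stand-in for "every few seconds", and the job names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Intervals in seconds. 1 and 10 come from the text; 5 is an assumed
# stand-in for "every few seconds".
SCRAPE_INTERVALS = {"icmp": 1, "lightweight": 5, "fritzbox": 10}


def scrapes_per_day(interval_seconds: int) -> int:
    """Number of scrapes one target receives per day at a fixed interval."""
    return SECONDS_PER_DAY // interval_seconds


for job, interval in SCRAPE_INTERVALS.items():
    print(f"{job}: {scrapes_per_day(interval)} scrapes per target per day")
```

At these settings a 1-second ICMP probe is scraped 86,400 times per day per target, while the FRITZ!Box is queried 8,640 times, a tenth of that rate.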

Prometheus and Docker Compose configuration in VSCode showing exporter setup

Dashboard Layout

I organized the Grafana dashboard into two main columns for quick, high-level insights:

Left Column – Connection

  • FRITZ!Box DSL / PPPoE / Internet states
  • VPN connection, public IP, and latency
  • WAN traffic rates and daily totals
  • DNS latency (internal vs external)
  • Website health checks and Pi-hole statistics

Right Column – Devices

  • CPU, RAM, Disk, and GPU metrics for each major device
  • Online status for core network components
  • GitHub Actions runner states used for WTF-Model builds and deployments

Annotated Grafana dashboard layout highlighting connection and device monitoring sections
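
For the runner states shown on the dashboard, a sketch of how the classification could work is given below. It assumes the exporter reads the status and busy fields that GitHub's REST API reports for self-hosted runners; the function names are hypothetical and this is not the actual exporter code.

```python
def runner_state(runner: dict) -> str:
    """Map one runner object to 'offline', 'busy', or 'idle'."""
    if runner.get("status") != "online":
        return "offline"
    return "busy" if runner.get("busy") else "idle"


def count_states(runners: list) -> dict:
    """Count runners per state, e.g. to expose one gauge per state."""
    counts = {"offline": 0, "idle": 0, "busy": 0}
    for runner in runners:
        counts[runner_state(runner)] += 1
    return counts
```

An online runner is then either idle or busy, so a color-coded panel can tell "ready for a build" apart from "currently building" and "unreachable".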

I believe that the result is a compact yet comprehensive monitoring environment which provides immediate visibility into all systems supporting the WTF-Model and my daily workflow.
