
Building a 20-Node Raspberry Pi Kubernetes Cluster

A complete engineering guide to designing and constructing a high-density distributed systems laboratory. Covers hardware selection, power budget analysis, and custom thermal-management software.

Raspberry Pi • Kubernetes • Talos Linux • Hardware Engineering • Rust • Networking

Motivation: Beyond Cloud Abstractions

Managed Kubernetes services (EKS, GKE) are designed to hide the complexities of underlying infrastructure. While beneficial for production velocity, this abstraction layer obscures the fundamental challenges of distributed systems engineering: network partitioning, partial failure modes, and physical resource contention.

To gain deep operational expertise, I architected a physical cluster that exposes these primitives. The goal was to build a system that forces confrontation with edge computing constraints—strictly limited memory, ARM64 architecture, and unreliable storage mediums—within a controlled environment.

The primary design constraint was environmental: the cluster functions as office infrastructure, requiring near-silent operation and a total power envelope of roughly 400 watts, while providing sufficient density to simulate complex failure scenarios.

Engineering Constraints

Acoustic Profile

Must operate below 30 dB (library quiet) for office compatibility, which precludes standard 1U server fans.

Power Envelope

Total draw capped at ~400W to minimize thermal load and operational cost.

Node Density

At least 10 nodes required to effectively test distributed consensus and failure recovery.

Hardware Architecture

Compute: Raspberry Pi CM4

The Compute Module 4 (8GB RAM) was selected for its form factor and PCIe connectivity. Unlike standard Raspberry Pis, the CM4 exposes a single PCIe Gen 2 lane, enabling direct NVMe storage attachment. This eliminates the USB bottleneck that typically plagues single-board computer clusters.

Interconnect: PoE+ Architecture

Power and networking are consolidated using a 24-port UniFi Pro PoE switch. This drastically simplifies cable management and allows for remote power cycling of hung nodes via the switch management API—a critical feature for headless fleet management.

Storage: Distributed NVMe

Each node hosts a 1TB NVMe SSD. Aggregated via distributed block storage software, these drives form a high-performance, replicated storage tier capable of sustaining database I/O (PostgreSQL WAL writes) that would saturate an SD card interface.
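To sanity-check that claim on any given drive, a short fio run approximating WAL-style flushes (small sequential writes with a sync after each) is enough to separate SD cards from NVMe. The invocation below is a generic illustration with a placeholder path, not a measurement from this build.

# Hypothetical WAL-style flush benchmark; point --filename at a scratch file
# on the device under test. Results are illustrative only.
fio --name=wal-sim --filename=/var/lib/test/fio.dat \
    --rw=write --bs=8k --size=512m \
    --ioengine=psync --fdatasync=1 \
    --runtime=30 --time_based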

Physical Implementation

The entire cluster is contained within 4.33U of standard 19-inch rack space. The layout prioritizes thermal airflow and cable density.

Physical Rack Layout (4.33U Total)

Office environment • ~400W compute power • Silent operation

Network Layer - 2U
  • UniFi Dream Machine Pro (1U): router, gateway, controller, IDS/IPS
  • USW Pro 24 PoE (1U): 24-port PoE+ switch, powers all 20 Pi nodes

Management Layer - 1.33U
  • Racknex mount with 3x Intel NUC (Proxmox hosts)
  • Ubuntu VM jumpbox: kubectl, talosctl, GitOps, CI/CD

Compute Layer - 1U
  • Compute Blade carriers hosting Pi nodes 1-20
  • CM4 modules: ARM64, 8GB RAM each
  • Storage: 1TB NVMe per node
  • Power: PoE+ per node
  • Cooling: Noctua fans + Rust control daemon

Software Stack

Kubernetes on Talos Linux orchestrating the following services:

MLOps Services

  • RAG Pipelines
  • Vector Databases (pgvector)
  • Embedding Generation
  • Model Serving
  • LLM Orchestration

Data Layer

  • PostgreSQL (HA)
  • Redis (Caching/PubSub)
  • MinIO (S3-compatible)
  • Distributed Block Storage
  • 20TB Total NVMe

Observability

  • Prometheus (Metrics)
  • Grafana (Dashboards)
  • Distributed Tracing
  • AlertManager
  • Custom Thermal Monitor

Network Architecture

  • UniFi Dream Machine Pro (routing/gateway)
  • USW Pro 24 PoE (24-port switch)
  • Single cable per Pi (PoE+ power + data)
  • Kubernetes CNI for pod networking
  • Service mesh & network policies

Power & Thermal

  • ~400W total power consumption
  • PoE+ budget management (~25W/port)
  • Custom Rust thermal controller
  • Noctua fan curves (0-100% PWM)
  • Silent operation in office environment

Operating System: Talos Linux

I standardized on Talos Linux, an immutable OS built from scratch for Kubernetes. Talos eliminates the traditional Linux package manager and shell, treating the operating system as an ephemeral layer configured solely via API.

This architectural choice enforces “Infrastructure as Code” discipline. There is no SSH to “fix” a node; configurations must be applied declaratively. Upgrades are atomic image swaps (A/B partitioning), ensuring that failed updates automatically roll back.

# Declarative Bootstrap Sequence
talosctl gen config homelab https://controlplane:6443
talosctl apply-config --insecure --nodes 192.168.1.10 --file controlplane.yaml
talosctl bootstrap --nodes 192.168.1.10
# Batch provisioning workers (one apply per node)
for node in 192.168.1.{11..30}; do
  talosctl apply-config --insecure --nodes "${node}" --file worker.yaml
done
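Day-two operations follow the same API-driven pattern. The sketch below shows what a node upgrade looks like; the installer image tag and Kubernetes version are placeholders for whatever release is being rolled out.

# Atomic A/B upgrade of a single node (image tag is a placeholder)
talosctl upgrade --nodes 192.168.1.11 --image ghcr.io/siderolabs/installer:v1.8.0
# Kubernetes components are upgraded separately (version is a placeholder)
talosctl upgrade-k8s --to 1.30.0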

Custom Thermal Control Plane

Packing 20 nodes into 1U creates a significant thermal challenge, and stock fan-control solutions rely on userspace tools that are incompatible with Talos's restrictive environment.

I developed a lightweight Rust daemon to interface directly with the I2C bus on the carrier boards. This daemon runs as a DaemonSet on every node, autonomously managing fan PWM duty cycles based on local thermal telemetry.

src/thermal_controller.rs
fn calculate_fan_speed(temp: f32) -> u8 {
  match temp {
    t if t < 40.0 => 0,    // Silent
    t if t < 50.0 => 30,   // Low
    t if t < 60.0 => 50,   // Medium
    t if t < 70.0 => 75,   // High
    _ => 100,               // Max
  }
}

The control loop implements hysteresis to prevent fan oscillation (“hunting”) and holds the SoC temperature between 50 and 60°C, balancing thermal headroom against acoustic comfort.
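The sketch below shows the shape of that hysteresis logic, building on the calculate_fan_speed tiers above; the 3°C step-down margin is an assumed value, not the daemon's actual constant.

// Step up immediately when a hotter tier is reached, but only step back
// down once the temperature is comfortably below the boundary it crossed.
const STEP_DOWN_MARGIN_C: f32 = 3.0; // assumed margin

struct FanState {
  duty: u8, // last commanded PWM duty cycle (0-100)
}

impl FanState {
  fn update(&mut self, temp: f32) -> u8 {
    let target = calculate_fan_speed(temp);
    if target > self.duty {
      // Hotter tier: respond immediately for thermal safety.
      self.duty = target;
    } else if calculate_fan_speed(temp + STEP_DOWN_MARGIN_C) < self.duty {
      // Cooler and clear of the boundary by the margin: back off.
      self.duty = target;
    }
    self.duty
  }
}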

Distributed Storage Strategy

Reliable stateful workloads on unreliable hardware require robust replication. The storage layer uses a distributed block storage provider that implements 3-way synchronous replication.

Storage classes are configured to prefer local volume access (reading from the replica on the same node) to minimize network traversals, while synchronous writes ensure that a node failure results in zero data loss. This setup successfully survived multiple “pull the plug” tests during commissioning.
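As a concrete illustration, if the block-storage layer were Longhorn, the behaviour described above would map onto a StorageClass roughly like the sketch below (an illustration, not this cluster's actual manifest).

# Illustrative StorageClass, assuming Longhorn as the block-storage provider
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: replicated-nvme
provisioner: driver.longhorn.io
allowVolumeExpansion: true
parameters:
  numberOfReplicas: "3"        # synchronous 3-way replication
  dataLocality: "best-effort"  # prefer a replica on the node running the pod
EOF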

Engineering Challenges

The ARM64 Tax

While ARM support is improving, the ecosystem is not fully mature. I frequently encountered container images lacking `linux/arm64` manifests, which necessitated building a custom CI pipeline to cross-compile and re-package upstream dependencies, a valuable exercise in supply-chain management.
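A representative step from such a pipeline is sketched below; the builder name, registry path, and tag are placeholders. docker buildx cross-builds both architectures and pushes a single multi-arch manifest so the cluster can pull a native arm64 image.

# Cross-build and push a multi-arch image (names and tag are placeholders)
docker buildx create --name crossbuilder --use
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag registry.internal/mirror/upstream-tool:1.2.3 \
  --push .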

Memory Pressure

Operating within an 8GB-per-node constraint requires strict QoS enforcement. I learned to work with the OOM killer's behavior, setting distinct `requests` and `limits` so that critical system components (CNI, storage) are prioritized over application workloads during resource contention.
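As an illustration of that split (names and values below are placeholders, not this cluster's manifests): an application pod declared with limits above its requests lands in the Burstable QoS class and is reclaimed under memory pressure before Guaranteed system pods whose requests equal their limits.

# Burstable application pod: first in line for reclamation under pressure
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: demo-worker
spec:
  containers:
    - name: worker
      image: registry.internal/demo-worker:latest
      resources:
        requests:
          memory: "256Mi"
          cpu: "250m"
        limits:
          memory: "512Mi"   # limits > requests => Burstable QoS
          cpu: "1"
EOF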

Conclusion

Constructing this cluster provided insights into distributed systems that are inaccessible through cloud consoles. It transformed theoretical knowledge of consensus algorithms, network latency, and failure domains into practical operational experience.

For engineers seeking to master Kubernetes, the physical constraints of a homelab offer a rigorous training ground. The friction encountered—hardware compatibility, thermal limits, network debugging—is not a bug, but the primary feature.

Bill of Materials

Component                            Est. Cost
20x Raspberry Pi CM4 (8GB)           ~$1,600
20x 1TB NVMe drives                  ~$1,200
Compute Blade carrier boards (1U)    ~$800
3x Intel NUC + Racknex mount         ~$1,500
UniFi Dream Machine Pro              ~$400
USW Pro 24 PoE switch                ~$500
Total Infrastructure                 ~$6,000