Ch6 — Scale 🟪

“We’ll scale when we need to,” they said. Budo nodded, and quietly bookmarked the postmortem template.

Status: outline. Lab scaffolding in labs/ch06-scale/.

The problem

Requests and limits are set once at service birth and never revisited; HPAs target CPU on memory-bound services. Capacity review is quarterly archaeology. Make it a weekly agent run.

What you’ll build

budo capacity --window 7d pulls saturation and utilization history per workload and looks for four things: throttling, OOM risk, over-provisioning, and HPA misconfiguration. It then emits YAML diffs and opens a PR with its reasoning in the description. Agent output as a reviewable code change: the safest mutation channel there is.
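The four finding types can be sketched as plain threshold checks over one workload's window. This is a minimal illustration, not the lab's implementation: the `WorkloadWindow` shape, the `find_issues` name, and every threshold are assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import Optional

# All thresholds below are illustrative assumptions, not values from the lab.
THROTTLE_RATIO_MAX = 0.25   # mean fraction of throttled CPU periods tolerated
MEM_UTIL_OOM = 0.90         # p95 memory / limit above this counts as OOM risk
CPU_UTIL_IDLE = 0.20        # p95 CPU / request below this counts as over-provisioned


@dataclass
class WorkloadWindow:
    """Hypothetical container for one workload's metric history."""
    name: str
    cpu_request_cores: float
    cpu_usage_cores: list[float]
    throttled_ratio: list[float]
    mem_limit_bytes: float
    mem_usage_bytes: list[float]
    hpa_metric: Optional[str] = None  # e.g. "cpu" or "memory"; None if no HPA


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: always returns an observed sample."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def find_issues(w: WorkloadWindow) -> list[str]:
    findings = []
    cpu_util = percentile(w.cpu_usage_cores, 95) / w.cpu_request_cores
    mem_util = percentile(w.mem_usage_bytes, 95) / w.mem_limit_bytes
    if sum(w.throttled_ratio) / len(w.throttled_ratio) > THROTTLE_RATIO_MAX:
        findings.append("throttling")
    if mem_util > MEM_UTIL_OOM:
        findings.append("oom-risk")
    if cpu_util < CPU_UTIL_IDLE:
        findings.append("over-provisioned")
    # Crude memory-bound test (an assumption): HPA scales on CPU, but memory
    # is the resource that is actually close to its ceiling.
    if w.hpa_metric == "cpu" and mem_util > 0.80 and cpu_util < 0.50:
        findings.append("hpa-misconfigured")
    return findings


cart = WorkloadWindow("cartservice", 0.2, [0.19, 0.2, 0.2, 0.2],
                      [0.4, 0.5, 0.6, 0.5], 256e6, [100e6, 110e6, 105e6, 108e6])
rec = WorkloadWindow("recommendationservice", 2.0, [0.15, 0.2, 0.18, 0.2],
                     [0.0, 0.0, 0.0, 0.0], 512e6, [100e6] * 4)
front = WorkloadWindow("frontend", 1.0, [0.3, 0.3, 0.35, 0.3],
                       [0.0, 0.0, 0.0, 0.0], 512e6,
                       [420e6, 430e6, 440e6, 425e6], hpa_metric="cpu")
```

Deterministic checks like these are a reasonable floor under the agent: the model supplies the reasoning for the PR description, while the numbers behind each finding stay reproducible.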

Key concepts introduced

  • Agents that write config: diff-not-file outputs, validation in code (kubectl apply --dry-run=server) before proposing
  • Eval harness for recommendations: replay historical metric windows, score recommendations against what actually happened (the throttling you induce with the load generator). First taste of agent evals as regression tests.
  • Scheduled agents: cron-driven, unattended — what changes when no human watches the run?
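As a rough sketch of the diff-not-file idea with validation before proposing: the agent produces a proposed manifest, code turns it into a unified diff, and the server-side dry run named above gates the proposal. The function names and the manifest path are hypothetical; `difflib.unified_diff` and `kubectl apply --dry-run=server -f -` are the only real interfaces used.

```python
import difflib
import subprocess


def propose_diff(current_yaml: str, proposed_yaml: str, path: str) -> str:
    """Return a unified diff between the current and proposed manifest text."""
    return "".join(difflib.unified_diff(
        current_yaml.splitlines(keepends=True),
        proposed_yaml.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    ))


def server_dry_run(manifest_yaml: str) -> bool:
    """Ask the API server to validate the manifest without persisting it.

    Requires kubectl and cluster access, so it is defined here but not called.
    """
    result = subprocess.run(
        ["kubectl", "apply", "--dry-run=server", "-f", "-"],
        input=manifest_yaml, text=True, capture_output=True,
    )
    return result.returncode == 0


# Illustrative fragment only; a real run would diff the full Deployment.
CURRENT_MANIFEST = """\
resources:
  requests:
    cpu: 100m
  limits:
    cpu: 200m
"""
PROPOSED_MANIFEST = CURRENT_MANIFEST.replace("cpu: 200m", "cpu: 500m")

# Hypothetical repository path for the manifest.
diff = propose_diff(CURRENT_MANIFEST, PROPOSED_MANIFEST,
                    "cartservice/deployment.yaml")
print(diff)
```

Emitting a diff rather than a whole file keeps the review surface small: a reviewer sees one changed line, not a regenerated manifest whose untouched fields they must re-verify.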

The scenario

Load generator cranked unevenly: cartservice starved (throttled), recommendationservice 10x over-provisioned, frontend HPA targeting the wrong metric. Three findings hidden in real data.

Break it

Feed it a window containing a one-off load test. Does it size the fleet for an event that never recurs? Anomaly-vs-trend discrimination is the hardening.
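One possible hardening, sketched under assumptions: flag high outliers with a median/MAD threshold, then drop them from sizing only if they cover a small share of the window. A spike that recurs often enough is treated as part of the normal pattern and kept. The function name, the multiplier `k`, the 5% scale floor, and the 5% anomaly budget are all illustrative choices.

```python
from statistics import median


def sizing_samples(samples: list[float], k: float = 5.0,
                   max_anomaly_fraction: float = 0.05) -> list[float]:
    """Return the samples that should drive sizing.

    High outliers are excluded only when they are rare within the window;
    frequent spikes are assumed to be recurring load and are kept.
    """
    med = median(samples)
    mad = median([abs(x - med) for x in samples])
    # Floor the scale so a perfectly flat series still gets a usable threshold.
    threshold = med + k * max(mad, 0.05 * med)
    outliers = [x for x in samples if x > threshold]
    if len(outliers) / len(samples) <= max_anomaly_fraction:
        return [x for x in samples if x <= threshold]
    return list(samples)


# Hourly CPU samples over 7 days, with one 4-hour load test appended.
one_off = [0.20, 0.22, 0.21, 0.19] * 42 + [2.0] * 4
# The same peak, but recurring for 4 hours every day.
recurring = ([0.2] * 20 + [2.0] * 4) * 7
```

With these inputs the one-off window sizes against a peak of 0.22 cores, while the recurring window keeps its 2.0-core peak. A fraction-based rule is deliberately blunt; it cannot tell a single long incident from a trend, which is the kind of gap the break-it exercise is meant to expose.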

Belt test

All 3 seeded findings surfaced with correct YAML; dry-run validation passes; the eval harness scores ≥ baseline on 3 replayed windows.
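The "scores ≥ baseline" check can be pictured as a small regression test. The sketch below assumes findings are `(workload, kind)` pairs and scores each replayed window with F1 against what was actually observed; the window data, both recommenders, and the choice of F1 as the metric are hypothetical.

```python
from typing import Callable


def f1(predicted: set, observed: set) -> float:
    """F1 between recommended findings and what actually happened."""
    if not predicted and not observed:
        return 1.0  # correctly staying silent counts as a perfect score
    true_positives = len(predicted & observed)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(observed)
    return 2 * precision * recall / (precision + recall)


def evaluate(recommender: Callable[[dict], set], windows: list[dict]) -> float:
    """Mean F1 of a recommender across replayed windows."""
    scores = [f1(recommender(w["metrics"]), w["observed"]) for w in windows]
    return sum(scores) / len(scores)


# Hypothetical replay data: per-workload throttle ratio plus observed outcome.
windows = [
    {"metrics": {"cartservice": 0.6, "frontend": 0.0},
     "observed": {("cartservice", "throttling")}},
    {"metrics": {"cartservice": 0.1, "frontend": 0.0},
     "observed": set()},
    {"metrics": {"cartservice": 0.3, "frontend": 0.4},
     "observed": {("frontend", "throttling")}},
]


def baseline(metrics: dict) -> set:
    """Baseline that never recommends anything."""
    return set()


def candidate(metrics: dict) -> set:
    """Stand-in for the agent: flag any workload throttled above 25%."""
    return {(name, "throttling") for name, ratio in metrics.items() if ratio > 0.25}


candidate_score = evaluate(candidate, windows)
baseline_score = evaluate(baseline, windows)
assert candidate_score >= baseline_score  # the regression gate
```

Run in CI, the final assertion is what turns the eval into a regression test: a prompt or model change that drops the score below baseline fails the build instead of shipping.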