High Availability Is a Habit, Not a Checkbox
Fifteen years of data centers taught me one thing: redundancy isn't a diagram, it's a discipline. Here's the short list of things I still get wrong.
Everybody draws the diagram with two of everything. Two firewalls, two switches, two power feeds, two of those little cloud icons nobody has ever opened. The diagram is fine. The diagram has never been the problem.
The problem is the maintenance window at 2 a.m. when somebody trusts the diagram instead of the runbook. The problem is the secondary path that hasn't carried real traffic in eighteen months and quietly stopped working in March. The problem is the alert that fires into a Slack channel nobody is in anymore.
High availability isn't a topology. It's a set of small, boring habits — failovers you actually test, configs you actually diff, dashboards you actually read. The expensive gear is the easy part. The discipline is the part that decides whether you sleep through the night.
I still get this wrong. I get it wrong less than I used to.