Availability Patterns
High availability means designing a system so that it stays operational even when individual components fail.
Measuring Availability
The Nines
| Availability | Downtime/Year | Downtime/Month | Downtime/Week |
|---|---|---|---|
| 99% (two 9s) | 3.65 days | 7.31 hours | 1.68 hours |
| 99.9% (three 9s) | 8.77 hours | 43.83 minutes | 10.08 minutes |
| 99.99% (four 9s) | 52.60 minutes | 4.38 minutes | 1.01 minutes |
| 99.999% (five 9s) | 5.26 minutes | 26.30 seconds | 6.05 seconds |
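The figures in the table can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name `allowed_downtime` and the constant `HOURS_PER_YEAR` are made up for this example, and it assumes an average year of 365.25 days, which is what the table's numbers appear to use.

```python
from datetime import timedelta

# Assumption: an average year of 365.25 days, which matches the table above.
HOURS_PER_YEAR = 365.25 * 24


def allowed_downtime(availability_pct: float,
                     period_hours: float = HOURS_PER_YEAR) -> timedelta:
    """Return the downtime budget implied by an availability percentage."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = 1 - availability_pct / 100
    return timedelta(hours=period_hours * unavailable_fraction)


if __name__ == "__main__":
    for pct in (99, 99.9, 99.99, 99.999):
        per_year = allowed_downtime(pct)
        per_week = allowed_downtime(pct, period_hours=7 * 24)
        print(f"{pct}%: {per_year} per year, {per_week} per week")
```

Passing a different `period_hours` (for example `HOURS_PER_YEAR / 12` for a month) gives the budget for any other window.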
Calculating Availability
Serial components (both must work):
Availability = A1 x A2
Example: Web server (99.9%) -> Database (99.9%)
Total = 0.999 x 0.999 = 0.998001 ≈ 99.8%
Parallel components (either one suffices, assuming failures are independent):
Availability = 1 - (1 - A1) x (1 - A2)
Example: Two web servers, each 99.9%
Total = 1 - (0.001 x 0.001) = 99.9999%
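The two formulas compose, so a whole architecture can be evaluated by nesting them. The following is a minimal sketch; the helper names `serial` and `parallel` are chosen for this example, and the parallel case assumes that component failures are independent.

```python
from math import prod


def serial(*availabilities: float) -> float:
    """All components must work: multiply the availabilities."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """Any one component suffices: 1 minus the chance that all fail together.

    Assumes failures are independent of each other.
    """
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    print(f"web -> db:            {serial(0.999, 0.999):.6f}")    # about 0.998001
    print(f"two web servers:      {parallel(0.999, 0.999):.6f}")  # about 0.999999
    # Redundant web tier in front of a single database.
    print(f"(web || web) -> db:   {serial(parallel(0.999, 0.999), 0.999):.6f}")
```

The last line shows why redundancy in one tier is not enough: the single database still caps the total at roughly 99.9%.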
Redundancy Patterns
Active-Passive (Failover)
One server is active and handles requests; a standby takes over when it fails.
Operation:
- Primary handles all traffic
- Standby monitors primary via heartbeat
- On failure detection, standby becomes active
- DNS or a virtual IP (VIP) is repointed to the new active server
| Advantages | Disadvantages |
|---|---|
| Simple to implement | Standby resources idle |
| Clear failover path | Failover time (seconds to minutes) |
| Works for stateful services | Data sync complexity |
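The heartbeat-and-promote sequence for active-passive failover can be sketched as a small state machine. This is a simplified illustration, not a production design: the `Standby` class, its `timeout_s` parameter, and the explicit `now` argument are all hypothetical, and the actual DNS or VIP switch is reduced to a comment.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Standby:
    """Hypothetical standby node that promotes itself when heartbeats stop."""

    timeout_s: float = 3.0
    last_heartbeat: float = field(default_factory=time.monotonic)
    active: bool = False

    def on_heartbeat(self, now: Optional[float] = None) -> None:
        """Record a heartbeat received from the primary."""
        self.last_heartbeat = time.monotonic() if now is None else now

    def check(self, now: Optional[float] = None) -> bool:
        """Promote to active if the primary has been silent for too long."""
        now = time.monotonic() if now is None else now
        if not self.active and now - self.last_heartbeat > self.timeout_s:
            self.active = True
            # A real system would repoint the VIP or DNS record here.
        return self.active


if __name__ == "__main__":
    standby = Standby(timeout_s=3.0, last_heartbeat=0.0)
    standby.on_heartbeat(now=2.5)
    print(standby.check(now=5.0))  # primary still considered alive
    print(standby.check(now=6.0))  # silent for 3.5 s, standby takes over
```

The timeout is the main tuning knob: a short value shortens failover time but raises the risk of promoting the standby while the primary is merely slow.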
Active-Active
Multiple servers handle traffic simultaneously.
Operation:
- All servers handle traffic
- Load balancer distributes requests
- If one fails, others absorb traffic
- No explicit failover required
| Advantages | Disadvantages |
|---|---|
| Full resource utilization | More complex (state sync) |
| No failover delay | Works best with stateless services or externalized state |
| Better capacity | Load balancer is potential SPOF |
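To illustrate how the remaining servers absorb traffic without an explicit failover step, here is a minimal round-robin balancer that skips unhealthy servers. It is a sketch under stated assumptions: `RoundRobinBalancer` and its methods are invented for this example, health status is set manually rather than by real health checks, and the balancer itself is a single instance.

```python
class RoundRobinBalancer:
    """Hypothetical balancer that rotates across servers, skipping unhealthy ones."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0

    def mark_down(self, server) -> None:
        """Stop routing to a server, e.g. after a failed health check."""
        self.healthy.discard(server)

    def mark_up(self, server) -> None:
        """Resume routing to a known server once it has recovered."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
    print([lb.pick() for _ in range(3)])
    lb.mark_down("app-2")
    print([lb.pick() for _ in range(4)])  # app-2 no longer appears
```

Because every request simply goes to the next healthy server, a failure reduces capacity but needs no promotion step. The balancer remains a single point of failure unless it is made redundant as well.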