Load Balancing
Load balancing distributes incoming traffic across multiple servers to improve throughput, reduce latency, and increase availability. This section covers load balancing algorithms, types, architecture patterns, and related concepts.
Purpose
| Purpose | Description |
|---|---|
| Increased capacity | Multiple servers handle more traffic than a single server |
| High availability | Traffic routes away from failed servers automatically |
| Performance optimization | Requests route to the server best able to handle them |
Load Balancing Algorithms
Round Robin
Requests distribute sequentially across servers in rotation.
| Aspect | Description |
|---|---|
| Advantages | Simple implementation, even distribution |
| Disadvantages | Assumes all servers have equal capacity and all requests have equal cost |
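The rotation can be sketched in a few lines. This is a minimal illustration, not a production balancer; the server names are hypothetical.

```python
from itertools import cycle

def round_robin(servers):
    """Return an iterator that yields servers in a fixed, repeating order."""
    return cycle(servers)

# Hypothetical server names, used only for illustration.
picker = round_robin(["app-1", "app-2", "app-3"])
first_five = [next(picker) for _ in range(5)]
print(first_five)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```

Each server is chosen regardless of how busy it is, which is why the approach assumes equal capacity and equal request cost.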
Weighted Round Robin
Servers receive requests proportionally to their assigned weights.
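One possible way to honor weights is the "smooth" weighted round-robin variant sketched below, which interleaves heavy and light servers instead of sending bursts to one server. The choice of variant, the function name, and the weights are assumptions made for illustration.

```python
from collections import Counter

def smooth_weighted_round_robin(weights, n):
    """Pick n servers so that each is chosen in proportion to its weight."""
    current = {server: 0 for server in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every server gains its weight; the current leader is chosen
        # and then penalized by the total weight.
        for server, weight in weights.items():
            current[server] += weight
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks

# Hypothetical weights: "a" has five times the capacity of "b" or "c".
picks = smooth_weighted_round_robin({"a": 5, "b": 1, "c": 1}, 7)
print(Counter(picks))
```

Over one full cycle of 7 requests, "a" receives 5 and the other two servers receive 1 each.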
Least Connections
Requests route to the server with the fewest active connections.
Appropriate for: Workloads where request duration varies significantly.
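A minimal sketch of the selection step, assuming the balancer keeps a per-server count of open connections (the dictionary and server names are hypothetical):

```python
def pick_least_connections(active):
    """Return the server with the fewest active connections."""
    return min(active, key=active.get)

# Hypothetical connection counts per server.
active = {"app-1": 12, "app-2": 3, "app-3": 7}
chosen = pick_least_connections(active)
active[chosen] += 1  # the new request opens one more connection
print(chosen, active[chosen])  # app-2 4
```

A real balancer would also decrement the count when a connection closes.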
IP Hash
A hash of the client IP address determines server selection. As long as the set of servers does not change, the same client IP routes to the same server.
server_index = hash(client_ip) % num_servers
Appropriate for: Scenarios requiring client affinity without session management overhead.
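The formula above can be sketched as follows. Python's built-in `hash()` is randomized per process for strings, so this sketch assumes a stable digest from `hashlib` instead; the function name and the example address are illustrative.

```python
import hashlib

def server_index(client_ip: str, num_servers: int) -> int:
    """Map a client IP to a server index using a stable hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then reduce modulo the pool size.
    return int.from_bytes(digest[:8], "big") % num_servers

# 203.0.113.7 is a documentation-range address used only as an example.
idx = server_index("203.0.113.7", 4)
print(idx == server_index("203.0.113.7", 4))  # True
```

Because the index depends on `num_servers`, adding or removing a server remaps many clients; consistent hashing is a common way to limit that effect.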
Least Response Time
Requests route to the server with the best combination of low response time and few active connections.
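How the two signals are combined varies by implementation. The sketch below assumes a simple score of average response time multiplied by (active connections + 1); that formula, the function name, and the sample numbers are assumptions for illustration only.

```python
def pick_least_response_time(stats):
    """stats maps server -> (average response time in ms, active connections)."""
    def score(server):
        avg_ms, connections = stats[server]
        # Assumed scoring rule: lower latency and fewer connections both lower the score.
        return avg_ms * (connections + 1)
    return min(stats, key=score)

# Hypothetical measurements.
stats = {"app-1": (120.0, 2), "app-2": (80.0, 5), "app-3": (90.0, 1)}
best = pick_least_response_time(stats)
print(best)  # app-3
```

Here app-2 has the lowest latency but the most connections, so the less loaded app-3 wins.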
Types of Load Balancers
Layer 4 (Transport Layer)
Layer 4 load balancers operate at the TCP/UDP level. They route based on IP addresses and ports without inspecting the application payload.
| Aspect | Description |
|---|---|
| Performance | High throughput, low latency |
| Routing capability | Cannot route based on URL, headers, or cookies |
| Examples | AWS NLB, HAProxy (L4 mode) |
Layer 7 (Application Layer)
Layer 7 load balancers operate at the application level, typically HTTP. They can inspect URLs, headers, cookies, and request content.
| Aspect | Description |
|---|---|
| Routing capability | Route /api/* to backend servers, /static/* to CDN; route based on cookies for A/B testing |
| Performance | Higher overhead than Layer 4 |
| Examples | AWS ALB, Nginx, HAProxy (L7 mode) |
Layer 7 load balancers are more common in modern applications due to their routing flexibility.
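The path-based routing from the table above can be sketched as a prefix match. The pool names and the default pool are hypothetical.

```python
# Ordered list of (path prefix, target pool); names are illustrative.
ROUTES = [
    ("/api/", "backend-pool"),
    ("/static/", "cdn"),
]

def route(path: str, default: str = "web-pool") -> str:
    """Return the pool for the first matching prefix, or the default pool."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return default

print(route("/api/users"))       # backend-pool
print(route("/static/app.css"))  # cdn
print(route("/"))                # web-pool
```

A Layer 4 balancer cannot make this decision because it never sees the request path.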
Architecture Patterns
Single Load Balancer
All traffic passes through one load balancer, which makes it a single point of failure. If it fails, all traffic stops.
Redundant Load Balancers
Two load balancers share a virtual IP address. If the active load balancer fails, the standby assumes the virtual IP. Clients connect to the same IP throughout failover.
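The failover decision can be modeled as a small state machine. This is a simplified sketch that assumes the standby takes over after a fixed number of missed heartbeats; real deployments typically rely on a protocol such as VRRP, and the class, names, and threshold here are hypothetical.

```python
class FailoverPair:
    """Tracks which of two load balancers currently holds the virtual IP."""

    def __init__(self, active, standby, max_missed=3):
        self.active = active
        self.standby = standby
        self.max_missed = max_missed
        self.missed = 0

    def tick(self, heartbeat_received: bool) -> str:
        """Process one heartbeat interval and return the current VIP holder."""
        if heartbeat_received:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                # Standby assumes the virtual IP; the roles swap.
                self.active, self.standby = self.standby, self.active
                self.missed = 0
        return self.active

pair = FailoverPair("lb-a", "lb-b")
holders = [pair.tick(ok) for ok in [True, False, False, False, True]]
print(holders)  # ['lb-a', 'lb-a', 'lb-a', 'lb-b', 'lb-b']
```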
Global Load Balancing (DNS)
DNS-based routing directs users to the nearest regional load balancer. A user in Tokyo routes to servers in Tokyo; a user in New York routes to servers in US East.
Health Checks
Health checks determine server availability.
Active Health Checks
The load balancer periodically sends requests to each server to verify availability.
health_check:
  path: /health
  interval: 10s
  timeout: 5s
  unhealthy_threshold: 3
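The threshold logic in this configuration can be sketched without any networking. The class below is illustrative; it assumes a single successful probe restores a server to healthy, which the configuration above does not specify.

```python
from dataclasses import dataclass

@dataclass
class ActiveHealthState:
    unhealthy_threshold: int = 3   # matches unhealthy_threshold in the config
    consecutive_failures: int = 0
    healthy: bool = True

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the current health status."""
        if probe_ok:
            # Assumption: one success resets the counter and restores health.
            self.consecutive_failures = 0
            self.healthy = True
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.unhealthy_threshold:
                self.healthy = False
        return self.healthy

state = ActiveHealthState()
results = [state.record(ok) for ok in [False, False, True, False, False, False]]
print(results)  # [True, True, True, True, True, False]
```

With a 10s interval and a threshold of 3, a server that stops responding is marked unhealthy roughly 30 seconds after its first failed probe.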
Passive Health Checks
The load balancer monitors actual traffic responses. Servers returning excessive errors are marked unhealthy.
5xx errors --> Mark unhealthy after threshold
Passive health checks detect issues that active checks may miss, such as servers returning errors only for certain types of requests.
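One way to implement the threshold is a sliding window over recent response codes. The window size, the error-ratio limit, and the class name below are assumptions for illustration.

```python
from collections import deque

class PassiveHealthMonitor:
    """Marks a server unhealthy when too many recent responses are 5xx."""

    def __init__(self, window=10, max_error_ratio=0.5):
        self.statuses = deque(maxlen=window)  # keeps only the most recent responses
        self.max_error_ratio = max_error_ratio

    def observe(self, status_code: int) -> None:
        self.statuses.append(status_code)

    def is_healthy(self) -> bool:
        if not self.statuses:
            return True
        errors = sum(1 for s in self.statuses if 500 <= s <= 599)
        return errors / len(self.statuses) <= self.max_error_ratio

monitor = PassiveHealthMonitor(window=4, max_error_ratio=0.5)
for code in (200, 500, 200):
    monitor.observe(code)
healthy_before = monitor.is_healthy()   # 1 of 3 responses failed
for code in (503, 502):
    monitor.observe(code)
healthy_after = monitor.is_healthy()    # 3 of the last 4 responses failed
print(healthy_before, healthy_after)    # True False
```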
Session Persistence
Session persistence routes requests from the same user to the same server.
Sticky Sessions
The load balancer tracks which server handled a user's initial request and routes subsequent requests to the same server.
Disadvantages:
- Uneven load distribution (some servers may receive more long-duration sessions)
- Session loss if the server fails
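A minimal sketch of the tracking table, which also shows the failure case: when a pinned server is removed, the user is reassigned and any state held on the old server is gone. The class and its round-robin fallback are hypothetical.

```python
class StickyRouter:
    """Pins each session ID to the server that handled its first request."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.assignments = {}
        self._next = 0

    def route(self, session_id: str) -> str:
        server = self.assignments.get(session_id)
        if server is None or server not in self.servers:
            # New session, or the pinned server is gone: assign in rotation.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.assignments[session_id] = server
        return server

    def remove_server(self, server: str) -> None:
        self.servers.remove(server)

router = StickyRouter(["app-1", "app-2"])
first = router.route("alice")
second = router.route("bob")
repeat = router.route("alice")
router.remove_server("app-1")
after_failure = router.route("alice")
print(first, second, repeat, after_failure)  # app-1 app-2 app-1 app-2
```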
Stateless Alternatives
Stateless session management eliminates the need for session persistence.
| Approach | Description |
|---|---|
| External session store | Store sessions in Redis; any server can access them |
| Stateless tokens | Use JWTs that contain session data |
| Client storage | Store appropriate state in the browser |
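The idea behind stateless tokens can be sketched with the standard library: session data travels with the request and a signature lets any server verify it. This is not a JWT implementation, only an illustration of the principle; the secret, function names, and token layout are assumptions.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key shared by all backend servers

def issue_token(session: dict) -> str:
    """Encode session data and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(session, sort_keys=True).encode("utf-8")
    ).decode("ascii")
    signature = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).hexdigest()
    return f"{payload}.{signature}"

def verify_token(token: str):
    """Return the session data if the signature is valid, otherwise None."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))

token = issue_token({"user": "alice", "cart_items": 2})
print(verify_token(token))  # {'cart_items': 2, 'user': 'alice'}
```

Any server holding the key can validate the token, so no server needs to remember the session.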
SSL Termination
SSL termination (more accurately TLS termination, since TLS has replaced SSL) decrypts HTTPS traffic at the load balancer and forwards unencrypted HTTP to backend servers.
Client --> HTTPS --> Load Balancer --> HTTP --> Servers
| Aspect | Description |
|---|---|
| Certificate management | Certificates managed in one location |
| Server resources | Backend servers do not perform encryption/decryption |
| Routing capability | Load balancer can inspect decrypted traffic for routing decisions |
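Note: Traffic between the load balancer and the servers is unencrypted. This is often considered acceptable when that traffic stays within a private network (VPC); where compliance or security requirements demand encryption in transit, re-encrypt traffic to the backends.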
Design Considerations
| Topic | Considerations |
|---|---|
| Load balancer failure | Redundant load balancers, DNS failover, health checks |
| Load balancer scaling | Multiple load balancers behind DNS, auto-scaling, cloud-managed solutions |
| Session management | Prefer stateless design; use external session stores when state is required |
| Layer selection | Layer 4 for raw performance; Layer 7 for routing flexibility; often both |