Load Balancing

Load balancing distributes incoming traffic across multiple servers to improve throughput, reduce latency, and increase availability. This section covers load balancing algorithms, types, architecture patterns, and related concepts.

Purpose

| Purpose | Description |
| --- | --- |
| Increased capacity | Multiple servers handle more traffic than a single server |
| High availability | Traffic routes away from failed servers automatically |
| Performance optimization | Requests route to the server best able to handle them |

Load Balancing Algorithms

Round Robin

Requests distribute sequentially across servers in rotation.

| Aspect | Description |
| --- | --- |
| Advantages | Simple implementation, even distribution |
| Disadvantages | Assumes all servers have equal capacity and all requests have equal cost |
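
The rotation described above can be sketched in a few lines. This is a minimal illustration, not a production balancer; the class name `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call advances the rotation by one position.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_server() for _ in range(5)]
print(picks)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```

Every server receives the same share of requests regardless of how busy it is, which is why the approach works best when servers and requests are roughly uniform.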

Weighted Round Robin

Servers receive requests proportionally to their assigned weights.
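
One simple way to picture weighting is to repeat each server in the rotation as many times as its weight. The sketch below assumes small positive integer weights; the function name `weighted_rotation` and the server names are hypothetical.

```python
from itertools import cycle


def weighted_rotation(weights):
    """Build an endless rotation in which each server appears `weight` times per cycle."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    if not expanded:
        raise ValueError("at least one server with a positive weight is required")
    return cycle(expanded)


rotation = weighted_rotation({"large": 3, "small": 1})
picks = [next(rotation) for _ in range(8)]
print(picks.count("large"), picks.count("small"))  # 6 2
```

This naive expansion sends consecutive requests to the same server in bursts. Implementations that interleave picks more evenly exist, but the long-run proportions are the same.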

Least Connections

Requests route to the server with the fewest active connections.

Appropriate for: Workloads where request duration varies significantly.
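
A rough sketch of the selection logic follows, assuming the balancer is told when each request starts and finishes. The class and method names (`LeastConnectionsBalancer`, `acquire`, `release`) are hypothetical.

```python
class LeastConnectionsBalancer:
    """Tracks active connections per server and picks the least loaded one."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties are broken by the order the servers were listed in.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when the request on `server` completes.
        self.active[server] -= 1


lb = LeastConnectionsBalancer(["app-1", "app-2"])
first = lb.acquire()
second = lb.acquire()
lb.release(second)  # the request on app-2 finishes quickly
third = lb.acquire()
print(first, second, third)  # app-1 app-2 app-2
```

Because app-1 is still busy with a long request, the third request goes to app-2 instead of following a fixed rotation.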

IP Hash

A hash of the client IP address determines server selection. The same client IP consistently routes to the same server.

server_index = hash(client_ip) % num_servers

Appropriate for: Scenarios requiring client affinity without session management overhead.
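
The formula above can be sketched as follows. The example uses a SHA-256 digest rather than Python's built-in `hash()`, because `hash()` on strings is salted per process and would not give the same answer across restarts. The function name `pick_server` and the addresses are illustrative assumptions.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to a server using a hash that is stable across processes."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer, then apply the modulo from the formula.
    server_index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[server_index]


servers = ["app-1", "app-2", "app-3"]
same = pick_server("203.0.113.7", servers) == pick_server("203.0.113.7", servers)
print(same)  # True
```

Note that with plain modulo hashing, adding or removing a server changes `num_servers` and therefore remaps many clients to different servers.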

Least Response Time

Requests route to the server with the best combination of low response time and few active connections.
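
How the two signals are combined varies between implementations. The sketch below assumes one possible scoring rule, average response time multiplied by the number of active connections plus one; that formula, the `ServerStats` type, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerStats:
    name: str
    avg_response_ms: float
    active_connections: int


def pick_fastest(stats):
    """Pick the server with the lowest score; lower means faster and less busy."""
    # Assumed score: latency scaled by queue depth (+1 so idle servers still compare by latency).
    return min(stats, key=lambda s: s.avg_response_ms * (s.active_connections + 1))


stats = [
    ServerStats("app-1", avg_response_ms=40.0, active_connections=9),
    ServerStats("app-2", avg_response_ms=60.0, active_connections=2),
]
print(pick_fastest(stats).name)  # app-2
```

Here app-1 answers faster on average but is much busier, so its score (400) is worse than that of app-2 (180).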

Types of Load Balancers

Layer 4 (Transport Layer)

Layer 4 load balancers operate at the TCP/UDP level. They route based on IP addresses and ports without inspecting packet contents.

| Aspect | Description |
| --- | --- |
| Performance | High throughput, low latency |
| Routing capability | Cannot route based on URL, headers, or cookies |
| Examples | AWS NLB, HAProxy (L4 mode) |

Layer 7 (Application Layer)

Layer 7 load balancers operate at the HTTP level. They can inspect URLs, headers, cookies, and request content.

| Aspect | Description |
| --- | --- |
| Routing capability | Route /api/* to backend servers, /static/* to CDN; route based on cookies for A/B testing |
| Performance | Higher overhead than Layer 4 |
| Examples | AWS ALB, Nginx, HAProxy (L7 mode) |

Layer 7 load balancers are more common in modern applications due to their routing flexibility.
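
To make the routing flexibility concrete, here is a small sketch of content-based routing. The pool names, the `experiment` cookie, and the rule order are assumptions chosen for illustration; real load balancers express these rules in their own configuration syntax.

```python
ROUTES = [
    ("/api/", "backend-pool"),
    ("/static/", "cdn"),
]
DEFAULT_POOL = "web-pool"
EXPERIMENT_POOL = "web-pool-b"  # hypothetical pool serving the B variant


def route(path, cookies=None):
    """Choose a target pool from the request path, then from cookies."""
    cookies = cookies or {}
    # Path prefix rules are checked first, in order.
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    # Remaining traffic is split by cookie for A/B testing.
    if cookies.get("experiment") == "b":
        return EXPERIMENT_POOL
    return DEFAULT_POOL


print(route("/api/users"))               # backend-pool
print(route("/static/app.css"))          # cdn
print(route("/", {"experiment": "b"}))   # web-pool-b
```

None of these decisions are possible at Layer 4, because the path and cookies are only visible after the HTTP request has been parsed.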

Architecture Patterns

Single Load Balancer

The load balancer is a single point of failure. If it fails, all traffic stops.

Redundant Load Balancers

Two load balancers share a virtual IP address. If the active load balancer fails, the standby assumes the virtual IP. Clients connect to the same IP throughout failover.

Global Load Balancing (DNS)

DNS-based routing directs users to the nearest regional load balancer. A user in Tokyo routes to servers in Tokyo; a user in New York routes to servers in US East.
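
The "nearest region" decision can be sketched as a distance comparison. This example assumes the user's approximate coordinates are already known; in practice a DNS service typically infers location from the resolver or client address. The region names and coordinates are illustrative assumptions.

```python
import math

# Hypothetical regions with approximate (latitude, longitude) in degrees.
REGIONS = {
    "tokyo": (35.68, 139.69),
    "us-east": (38.95, -77.45),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_region(user_location):
    """Return the name of the region closest to the user."""
    return min(REGIONS, key=lambda name: haversine_km(user_location, REGIONS[name]))


print(nearest_region((35.0, 135.8)))    # tokyo
print(nearest_region((40.71, -74.01)))  # us-east
```

Geographic distance is only a proxy for network latency, so some services route on measured latency instead.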

Health Checks

Health checks determine server availability.

Active Health Checks

The load balancer periodically sends requests to each server to verify availability.

health_check:
  path: /health
  interval: 10s
  timeout: 5s
  unhealthy_threshold: 3
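
The sketch below shows one way the `unhealthy_threshold` setting could be interpreted: a server is marked unhealthy after that many consecutive failed probes. The probe is passed in as a function so the example runs without a network; `ActiveHealthChecker` is a hypothetical name, and restoring a server after a single success is a simplifying assumption.

```python
class ActiveHealthChecker:
    """Marks a server unhealthy after N consecutive failed probes."""

    def __init__(self, probe, unhealthy_threshold=3):
        # probe: callable(server) -> bool, standing in for a request to /health
        self.probe = probe
        self.unhealthy_threshold = unhealthy_threshold
        self.failures = {}
        self.unhealthy = set()

    def check(self, server):
        """Run one probe; intended to be called once per interval for each server."""
        if self.probe(server):
            # Assumption: one successful probe restores the server.
            self.failures[server] = 0
            self.unhealthy.discard(server)
        else:
            self.failures[server] = self.failures.get(server, 0) + 1
            if self.failures[server] >= self.unhealthy_threshold:
                self.unhealthy.add(server)
        return server not in self.unhealthy


# Simulated probe results: three failures followed by a success.
results = iter([False, False, False, True])
checker = ActiveHealthChecker(probe=lambda server: next(results))
states = [checker.check("app-1") for _ in range(4)]
print(states)  # [True, True, False, True]
```

Requiring several consecutive failures avoids removing a server because of a single slow or dropped probe.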

Passive Health Checks

The load balancer monitors actual traffic responses. Servers returning excessive errors are marked unhealthy.

5xx errors --> Mark unhealthy after threshold

Passive health checks detect issues that active checks may miss, such as servers returning errors only for certain types of requests.
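
A passive check can be approximated by counting 5xx responses over a sliding window of recent requests. The window size, the error limit, and the name `PassiveHealthMonitor` are assumptions made for this sketch.

```python
from collections import defaultdict, deque


class PassiveHealthMonitor:
    """Tracks recent responses per server and flags servers with too many 5xx errors."""

    def __init__(self, window=10, max_errors=5):
        self.max_errors = max_errors
        # Each server keeps only its last `window` results (True means a 5xx error).
        self.recent = defaultdict(lambda: deque(maxlen=window))

    def record(self, server, status_code):
        self.recent[server].append(500 <= status_code <= 599)

    def is_healthy(self, server):
        return sum(self.recent[server]) < self.max_errors


monitor = PassiveHealthMonitor(window=4, max_errors=2)
for status in (200, 503, 200, 500):
    monitor.record("app-1", status)
print(monitor.is_healthy("app-1"))  # False
```

As older responses fall out of the window, a server that stops returning errors becomes healthy again without any dedicated probe.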

Session Persistence

Session persistence routes requests from the same user to the same server.

Sticky Sessions

The load balancer tracks which server handled a user's initial request and routes subsequent requests to the same server.

Disadvantages:

  • Uneven load distribution (some servers may receive more long-duration sessions)
  • Session loss if the server fails
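
The tracking described above amounts to a lookup table from session to server. In this sketch the session ID is assumed to come from a cookie, and new sessions are assigned by simple rotation; `StickyBalancer` and the identifiers are hypothetical.

```python
from itertools import cycle


class StickyBalancer:
    """Pins each session to the server that handled its first request."""

    def __init__(self, servers):
        self._rotation = cycle(list(servers))
        self._assignments = {}  # session ID -> server

    def route(self, session_id):
        # First request from a session: assign the next server in rotation.
        if session_id not in self._assignments:
            self._assignments[session_id] = next(self._rotation)
        return self._assignments[session_id]


lb = StickyBalancer(["app-1", "app-2"])
routed = [lb.route("alice"), lb.route("bob"), lb.route("alice")]
print(routed)  # ['app-1', 'app-2', 'app-1']
```

The table itself is state held by the load balancer, and any session data on a server is lost if that server fails, which matches the disadvantages listed above.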

Stateless Alternatives

Stateless session management eliminates the need for session persistence.

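| Approach | Description |
| --- | --- |
| External session store | Store sessions in a shared store such as Redis; any server can access them |
| Stateless tokens | Use signed tokens such as JWTs that carry the session data |
| Client storage | Store non-sensitive state in the browser, for example in cookies or local storage |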
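
The stateless token idea can be illustrated with a signed payload that any server can verify using a shared secret. This is a simplified stand-in built on the standard library, not a real JWT implementation; the secret, the token layout, and the function names are assumptions.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical; a real deployment would load this from secure configuration


def issue_token(session):
    """Encode session data and append an HMAC signature."""
    payload = base64.urlsafe_b64encode(json.dumps(session, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def read_token(token):
    """Verify the signature and return the session data; any server with SECRET can do this."""
    payload, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user_id": 42})
print(read_token(token))  # {'user_id': 42}
```

Because the token carries its own data and proof of integrity, requests can go to any server and no sticky routing is needed. The payload is signed but not encrypted, so it should not contain secrets.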

SSL Termination

SSL termination (more accurately TLS termination, since TLS has replaced SSL) decrypts HTTPS traffic at the load balancer and forwards unencrypted HTTP to the backend servers.

Client --> HTTPS --> Load Balancer --> HTTP --> Servers

| Aspect | Description |
| --- | --- |
| Certificate management | Certificates managed in one location |
| Server resources | Backend servers do not perform encryption/decryption |
| Routing capability | Load balancer can inspect decrypted traffic for routing decisions |

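Note: Traffic between the load balancer and the servers is unencrypted. This is generally considered acceptable when that traffic stays within a private network (VPC), although some security or compliance requirements call for re-encrypting it.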

Design Considerations

| Topic | Considerations |
| --- | --- |
| Load balancer failure | Redundant load balancers, DNS failover, health checks |
| Load balancer scaling | Multiple load balancers behind DNS, auto-scaling, cloud-managed solutions |
| Session management | Prefer stateless design; use external session stores when state is required |
| Layer selection | Layer 4 for raw performance; Layer 7 for routing flexibility; often both |