Load Balancing

Load balancing distributes incoming traffic across multiple servers to improve throughput, reduce latency, and increase availability. This section covers load balancing algorithms, types, architecture patterns, and related concepts.

Purpose

Purpose	Description
Increased capacity	Multiple servers handle more traffic than a single server
High availability	Traffic routes away from failed servers automatically
Performance optimization	Requests route to the server best able to handle them

Load Balancing Algorithms

Round Robin

Requests distribute sequentially across servers in rotation.

Aspect	Description
Advantages	Simple implementation, even distribution
Disadvantages	Assumes all servers have equal capacity and all requests have equal cost
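
The rotation can be illustrated with a minimal sketch. The class name `RoundRobinBalancer` and the server names are hypothetical, chosen only for this example; a production load balancer would also account for server health.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call advances the rotation by one position.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.next_server() for _ in range(5)]
print(picks)  # ['app-1', 'app-2', 'app-3', 'app-1', 'app-2']
```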

Weighted Round Robin

Servers receive requests proportionally to their assigned weights.
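
One way to honor weights is the "smooth" weighted round robin variant sketched below, which interleaves picks rather than sending bursts to the heaviest server. This is an illustrative choice, not the only approach; the class name and the weights are assumptions made for the example.

```python
class WeightedRoundRobin:
    """Smooth weighted round robin over servers with integer weights."""

    def __init__(self, weights):
        # weights: dict mapping server name -> positive integer weight
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty dict of positive integers")
        self._weights = dict(weights)
        self._current = {server: 0 for server in weights}
        self._total = sum(weights.values())

    def next_server(self):
        # Every server earns credit equal to its weight on each round.
        for server, weight in self._weights.items():
            self._current[server] += weight
        # The server with the most credit is chosen and pays back the total.
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


wrr = WeightedRoundRobin({"big": 3, "small": 1})
picks = [wrr.next_server() for _ in range(8)]
```

Over any window of 4 picks (the sum of the weights), `big` is selected three times and `small` once.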

Least Connections

Requests route to the server with the fewest active connections.

Appropriate for: Workloads where request duration varies significantly.
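
A minimal sketch of the bookkeeping involved, assuming the balancer is told when each connection opens and closes. The `acquire`/`release` method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Routes each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self._active, key=self._active.get)
        self._active[server] += 1
        return server

    def release(self, server):
        # Called when a connection finishes.
        if self._active[server] > 0:
            self._active[server] -= 1


lb = LeastConnectionsBalancer(["app-1", "app-2"])
first = lb.acquire()    # both idle, so the first server is chosen
second = lb.acquire()   # app-1 is busy, so app-2 is chosen
lb.release(second)      # app-2 finishes quickly
third = lb.acquire()    # app-2 is idle again while app-1 is still busy
```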

IP Hash

A hash of the client IP address determines server selection. The same client IP consistently routes to the same server.

server_index = hash(client_ip) % num_servers

Appropriate for: Scenarios requiring client affinity without session management overhead.
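
The formula above can be sketched in Python as follows. The example uses SHA-256 rather than Python's built-in `hash()`, because the built-in is salted per process for strings and would not give the same answer across restarts or across multiple load balancer instances. The function name and addresses are assumptions for illustration.

```python
import hashlib


def pick_server(client_ip, servers):
    """Map a client IP to a server using a stable hash."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["app-1", "app-2", "app-3"]
a = pick_server("203.0.113.7", servers)
b = pick_server("203.0.113.7", servers)
print(a == b)  # True: the same IP always maps to the same server
```

Note that with plain modulo hashing, adding or removing a server changes `num_servers` and remaps most clients to different servers.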

Least Response Time

Requests route to the server that combines the lowest recent response time with the fewest active connections. How the two metrics are weighed varies by implementation.
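
A sketch of one possible selection rule, under stated assumptions: response time is tracked as an exponentially weighted moving average, and active connections only break ties. Both choices, along with the class name and the `alpha` parameter, are illustrative rather than taken from any specific product.

```python
class LeastResponseTimeBalancer:
    """Prefers the server with the lowest average latency, then fewest connections."""

    def __init__(self, servers, alpha=0.2):
        self._alpha = alpha  # weight given to the newest latency sample
        self._latency = {server: None for server in servers}
        self._active = {server: 0 for server in servers}

    def acquire(self):
        # Servers with no measurement yet count as 0.0, so they get tried first.
        server = min(
            self._latency,
            key=lambda s: (self._latency[s] or 0.0, self._active[s]),
        )
        self._active[server] += 1
        return server

    def record(self, server, seconds):
        # Update the moving average and mark the connection as finished.
        old = self._latency[server]
        if old is None:
            self._latency[server] = seconds
        else:
            self._latency[server] = (1 - self._alpha) * old + self._alpha * seconds
        self._active[server] = max(0, self._active[server] - 1)


lb = LeastResponseTimeBalancer(["fast", "slow"])
s1 = lb.acquire()
lb.record(s1, 0.05)   # "fast" answered in 50 ms
s2 = lb.acquire()     # "slow" has no measurement yet, so it is tried
lb.record(s2, 0.5)    # "slow" answered in 500 ms
s3 = lb.acquire()     # "fast" now has the lower average
```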

Types of Load Balancers

Layer 4 (Transport Layer)

Layer 4 load balancers operate at the TCP/UDP level. They route based on IP addresses and ports without inspecting the application payload.

Aspect	Description
Performance	High throughput, low latency
Routing capability	Cannot route based on URL, headers, or cookies
Examples	AWS NLB, HAProxy (L4 mode)

Layer 7 (Application Layer)

Layer 7 load balancers operate at the HTTP level. They can inspect URLs, headers, cookies, and request content.

Aspect	Description
Routing capability	Route `/api/` to backend servers, `/static/` to CDN; route based on cookies for A/B testing
Performance	Higher overhead than Layer 4
Examples	AWS ALB, Nginx, HAProxy (L7 mode)

Layer 7 load balancers are a common choice for web applications because of this routing flexibility.
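
The kind of decision a Layer 7 load balancer makes can be sketched as a function of the request path and cookies. The pool names and the `experiment` cookie are hypothetical, chosen to mirror the routing examples in the table above.

```python
def choose_pool(path, cookies):
    """Pick a backend pool from request attributes visible at Layer 7."""
    if path.startswith("/static/"):
        return "cdn"
    if path.startswith("/api/"):
        # Cookie-based split for A/B testing.
        if cookies.get("experiment") == "b":
            return "backend-b"
        return "backend-a"
    return "default"


print(choose_pool("/static/logo.png", {}))             # cdn
print(choose_pool("/api/users", {"experiment": "b"}))  # backend-b
print(choose_pool("/api/users", {}))                   # backend-a
```

A Layer 4 load balancer cannot make this decision, because the path and cookies are not visible at the TCP/UDP level.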

Architecture Patterns

Single Load Balancer

The load balancer is a single point of failure. If it fails, all traffic stops.

Redundant Load Balancers

Two load balancers share a virtual IP address. If the active load balancer fails, the standby assumes the virtual IP. Clients connect to the same IP throughout failover.
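
The failover decision can be sketched as a heartbeat timeout. This is a simplified model with hypothetical names (`FailoverPair`, `lb-active`, `lb-standby`); real deployments commonly rely on a protocol such as VRRP to move the virtual IP.

```python
class FailoverPair:
    """Tracks which of two load balancers currently holds the virtual IP."""

    def __init__(self, heartbeat_timeout=3.0):
        self.vip_owner = "lb-active"
        self._timeout = heartbeat_timeout
        self._last_heartbeat = 0.0

    def heartbeat(self, now):
        # Called whenever the active node reports in.
        self._last_heartbeat = now

    def check(self, now):
        # The standby claims the virtual IP once heartbeats stop arriving.
        if self.vip_owner == "lb-active" and now - self._last_heartbeat > self._timeout:
            self.vip_owner = "lb-standby"
        return self.vip_owner


pair = FailoverPair(heartbeat_timeout=3.0)
pair.heartbeat(now=10.0)
before = pair.check(now=12.0)  # last heartbeat is 2 s old: no change
after = pair.check(now=14.5)   # last heartbeat is 4.5 s old: standby takes over
```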

Global Load Balancing (DNS)

DNS-based routing directs users to the nearest regional load balancer. A user in Tokyo routes to servers in Tokyo; a user in New York routes to servers in US East.
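
As a rough sketch, the DNS layer can be modeled as a lookup from the client's region to a regional load balancer, with a fallback when that region is unavailable. The region names, hostnames, and fallback rule are all assumptions made for this example.

```python
# Hypothetical region names and hostnames, for illustration only.
REGIONAL_ENDPOINTS = {
    "ap-northeast": "lb.tokyo.example.com",
    "us-east": "lb.us-east.example.com",
}


def resolve(client_region, healthy_regions, fallback_region="us-east"):
    """Return the load balancer hostname a DNS answer would point at."""
    if client_region in REGIONAL_ENDPOINTS and client_region in healthy_regions:
        return REGIONAL_ENDPOINTS[client_region]
    # Unknown or unhealthy region: send the client to the fallback region.
    return REGIONAL_ENDPOINTS[fallback_region]


tokyo_answer = resolve("ap-northeast", healthy_regions={"ap-northeast", "us-east"})
failover_answer = resolve("ap-northeast", healthy_regions={"us-east"})
```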

Health Checks

Health checks determine which servers are available to receive traffic.

Active Health Checks

The load balancer periodically sends requests to each server to verify availability.

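health_check:
  path: /health            # endpoint requested on each server
  interval: 10s            # time between probes
  timeout: 5s              # a probe with no response in this time counts as a failure
  unhealthy_threshold: 3   # consecutive failures before a server is marked unhealthy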
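
The threshold logic in a configuration like this can be sketched as a counter of consecutive failed probes. The class name is hypothetical, and the rule that a single successful probe restores the server is an assumption; many load balancers use a separate healthy threshold.

```python
class HealthTracker:
    """Marks a server unhealthy after N consecutive failed probes."""

    def __init__(self, unhealthy_threshold=3):
        self._threshold = unhealthy_threshold
        self._failures = 0
        self.healthy = True

    def record_probe(self, succeeded):
        if succeeded:
            # Assumption: one success resets the counter and restores the server.
            self._failures = 0
            self.healthy = True
        else:
            self._failures += 1
            if self._failures >= self._threshold:
                self.healthy = False
        return self.healthy


tracker = HealthTracker(unhealthy_threshold=3)
probes = [False, False, True, False, False, False]
states = [tracker.record_probe(p) for p in probes]
print(states)  # [True, True, True, True, True, False]
```

With a 10-second interval and a threshold of 3, a failed server is detected after roughly 30 seconds of failing probes.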

Passive Health Checks

The load balancer monitors actual traffic responses. Servers returning excessive errors are marked unhealthy.

5xx errors --> Mark unhealthy after threshold

Passive health checks detect issues that active checks may miss, such as servers returning errors only for certain types of requests.
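
A minimal sketch of passive monitoring, assuming the load balancer keeps a sliding window of recent responses per server and compares the 5xx rate against a limit. The window size, the limit, and the class name are illustrative assumptions.

```python
from collections import deque


class PassiveMonitor:
    """Judges server health from the status codes of real traffic."""

    def __init__(self, window=10, max_error_rate=0.5):
        self._recent = deque(maxlen=window)  # True for each 5xx response
        self._max_error_rate = max_error_rate

    def observe(self, status_code):
        self._recent.append(500 <= status_code <= 599)

    @property
    def healthy(self):
        # Withhold judgment until the window is full.
        if len(self._recent) < self._recent.maxlen:
            return True
        return sum(self._recent) / len(self._recent) <= self._max_error_rate


monitor = PassiveMonitor(window=4, max_error_rate=0.5)
for code in [200, 500, 200, 503]:
    monitor.observe(code)
at_limit = monitor.healthy    # 2 of 4 responses failed: still within the limit
monitor.observe(502)
over_limit = monitor.healthy  # 3 of the last 4 responses failed
```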

Session Persistence

Session persistence routes requests from the same user to the same server.

Sticky Sessions

The load balancer tracks which server handled a user's initial request and routes subsequent requests to the same server.

Disadvantages:

Uneven load distribution (some servers may receive more long-duration sessions)
Session loss if the server fails
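
The tracking described above can be sketched as a lookup table keyed by a session identifier (for example, a cookie value). The class name and the use of round robin for first-time assignment are assumptions for this example.

```python
from itertools import cycle


class StickyBalancer:
    """Remembers which server each session was first assigned to."""

    def __init__(self, servers):
        self._rotation = cycle(list(servers))
        self._assignments = {}  # session id -> server

    def route(self, session_id):
        server = self._assignments.get(session_id)
        if server is None:
            # First request for this session: assign the next server in rotation.
            server = next(self._rotation)
            self._assignments[session_id] = server
        return server


lb = StickyBalancer(["app-1", "app-2"])
alice_first = lb.route("alice")
bob_first = lb.route("bob")
alice_again = lb.route("alice")
```

Because the assignment table pins each session to one server, that session's data is lost if the server fails, which is the second disadvantage listed above.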

Stateless Alternatives

Keeping servers stateless eliminates the need for session persistence, because any server can handle any request.

Approach	Description
External session store	Store sessions in Redis; any server can access them
Stateless tokens	Use signed tokens such as JWTs that carry the session data
Client storage	Store appropriate state in the browser
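
The stateless token idea can be illustrated with a simplified signed token built from the standard library. This is not a real JWT and omits expiry; the secret, function names, and token layout are assumptions for the sketch, and a production system would typically use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-change-me"  # hypothetical key shared by all servers


def issue_token(session):
    """Encode session data and append an HMAC signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(session, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode("ascii")
    return (payload + b"." + signature).decode("ascii")


def read_token(token):
    """Verify the signature and return the session data."""
    payload, _, signature = token.encode("ascii").rpartition(b".")
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature")
    return json.loads(base64.urlsafe_b64decode(payload))


token = issue_token({"user": "alice", "cart_items": 2})
session = read_token(token)  # any server holding SECRET can do this
```

Since every server can verify the token independently, the load balancer is free to send each request to any server.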

SSL Termination

SSL termination (more precisely TLS termination, since TLS has replaced SSL) decrypts HTTPS traffic at the load balancer, which then forwards unencrypted HTTP to backend servers.

Client --> HTTPS --> Load Balancer --> HTTP --> Servers

Aspect	Description
Certificate management	Certificates managed in one location
Server resources	Backend servers do not perform encryption/decryption
Routing capability	Load balancer can inspect decrypted traffic for routing decisions

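Note: Traffic between the load balancer and servers is unencrypted. This is commonly accepted when that traffic stays within a private network (such as a VPC); where compliance rules or the threat model require encryption in transit, the load balancer can re-encrypt traffic to the backends instead.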

Design Considerations

Topic	Considerations
Load balancer failure	Redundant load balancers, DNS failover, health checks
Load balancer scaling	Multiple load balancers behind DNS, auto-scaling, cloud-managed solutions
Session management	Prefer stateless design; use external session stores when state is required
Layer selection	Layer 4 for raw performance; Layer 7 for routing flexibility; often both
