
Consistent Hashing

Consistent hashing distributes data across multiple servers while minimizing disruption when servers are added or removed.

Problem with Simple Hashing

With traditional modular hashing, keys map to servers by computing the hash of the key and taking the modulo with the number of servers.

Example with 4 servers: The hash of "user_123" modulo 4 equals 2, routing to Server 2. The hash of "user_456" modulo 4 equals 0, routing to Server 0.
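As a minimal Python sketch of this routing (the exact server a given key lands on depends on the hash function, so the specific numbers in the example above are illustrative):

```python
import hashlib

def server_for(key: str, num_servers: int) -> int:
    """Route a key with simple modular hashing."""
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

print(server_for("user_123", 4))  # some server in 0..3
print(server_for("user_456", 4))
```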

Problem: When servers are added or removed, the modulo changes, causing nearly all keys to map to different servers.

For example, with 4 servers, a key might hash to server 2. After adding a fifth server, that same key hashes to server 3. This means the data must be moved to the new location.

With N servers and K keys, adding one server remaps approximately K*N/(N+1) keys (nearly all of them), since a key stays on its server only when hash % N happens to equal hash % (N+1).
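A quick simulation makes the scale concrete. This sketch assumes MD5 as the hash and 100,000 synthetic keys; going from 4 to 5 servers remaps roughly 80% of them (N/(N+1) with N = 4):

```python
import hashlib

def server_for(key: str, num_servers: int) -> int:
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_servers

keys = [f"user_{i}" for i in range(100_000)]
before = {k: server_for(k, 4) for k in keys}
after = {k: server_for(k, 5) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.1%} of keys moved")  # roughly 80% in this run
```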

For a cache, this invalidates almost all cached data simultaneously, causing a "thundering herd" to the database.

How Consistent Hashing Works

Servers are arranged on a ring (0 to 2^32 - 1).

Step 1: Place Servers on the Ring

Hash each server's identifier to get its position:

position(server) = hash(server_id) % 2^32
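In code, that might look like the following sketch (the server names are hypothetical and MD5 stands in for whatever hash function the cluster uses):

```python
import hashlib

RING_SIZE = 2**32

def ring_position(server_id: str) -> int:
    """Map a server identifier to a position on the ring [0, 2^32)."""
    digest = hashlib.md5(server_id.encode()).hexdigest()
    return int(digest, 16) % RING_SIZE

for server in ["server-a", "server-b", "server-c"]:
    print(server, ring_position(server))
```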


Step 2: Place Keys on the Ring

Hash each key to get its position on the ring, using the same hash function modulo 2^32.

Step 3: Find the Server

Walk clockwise from the key's position until reaching a server. That server stores the key.
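Putting the three steps together, here is a small sketch (hypothetical server names, MD5 as the hash) that keeps server positions in a sorted list and does the clockwise walk with a binary search:

```python
import bisect
import hashlib

RING_SIZE = 2**32

def ring_position(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

# Step 1: place (hypothetical) servers on the ring.
servers = ["server-a", "server-b", "server-c"]
ring = {ring_position(s): s for s in servers}
positions = sorted(ring)

def lookup(key: str) -> str:
    pos = ring_position(key)                   # Step 2: place the key on the ring
    idx = bisect.bisect_left(positions, pos)   # Step 3: first server position >= key
    if idx == len(positions):                  # past the last server: wrap around
        idx = 0
    return ring[positions[idx]]

print(lookup("user_123"))
```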


Server Changes

Server removed: Only keys that mapped to that server move to the next server clockwise.

Server added: Only keys between the new server and its predecessor move.

Adding one server to N servers moves only about K/(N+1) keys (roughly K/N) instead of nearly all of them.
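To see this, the following sketch (same assumptions as above) maps 100,000 synthetic keys before and after adding a fifth server. Only the keys now owned by the new server have moved; without virtual nodes the exact fraction depends on where the new server lands on the ring, and vnodes (next section) keep it close to K/N:

```python
import bisect
import hashlib

RING_SIZE = 2**32

def ring_position(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

def build(servers):
    ring = {ring_position(s): s for s in servers}
    return sorted(ring), ring

def lookup(key, positions, ring):
    idx = bisect.bisect_left(positions, ring_position(key))
    return ring[positions[idx % len(positions)]]  # % handles the wrap-around

keys = [f"user_{i}" for i in range(100_000)]
old = build(["server-a", "server-b", "server-c", "server-d"])
new = build(["server-a", "server-b", "server-c", "server-d", "server-e"])

moved = sum(1 for k in keys if lookup(k, *old) != lookup(k, *new))
print(f"{moved / len(keys):.1%} of keys moved")  # only the arc owned by server-e
```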

Uneven Distribution Problem

Basic consistent hashing can result in uneven distribution with few servers:

Server | Load | Distribution
S1     | 60%  | One server may own most of the ring
S2     | 40%  |

Virtual Nodes (Vnodes)

Virtual nodes solve the uneven distribution problem by giving each physical server multiple positions on the ring.

Instead of one position per server, create many "virtual nodes." Each physical server (like S1 or S2) is represented by multiple virtual nodes (S1_0, S1_1, S1_2, etc.).

Each virtual node gets its own position on the ring by hashing a combination of the server identifier and the virtual node number.

Benefits:

  • More even distribution across servers
  • When a server fails, its load spreads across many servers
  • Heterogeneous hardware: powerful servers receive more vnodes

Trade-off: More vnodes increase memory usage for ring metadata

Typical values: 100-200 virtual nodes per physical server
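One way to see the effect is to measure how much of the ring each physical server owns with and without vnodes. This sketch (hypothetical servers S1-S3, MD5 hash, "server#i" as the vnode identifier) computes each server's share of the ring:

```python
import hashlib

RING_SIZE = 2**32

def ring_position(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

def ownership(servers, vnodes):
    """Fraction of the ring each physical server owns."""
    ring = {}
    for s in servers:
        for i in range(vnodes):
            ring[ring_position(f"{s}#{i}")] = s   # e.g. "S1#0", "S1#1", ...
    positions = sorted(ring)
    share = {s: 0 for s in servers}
    # Each server at position cur owns the arc from the previous position to cur.
    for prev, cur in zip([positions[-1] - RING_SIZE] + positions[:-1], positions):
        share[ring[cur]] += cur - prev
    return {s: share[s] / RING_SIZE for s in servers}

servers = ["S1", "S2", "S3"]
print(ownership(servers, vnodes=1))    # often noticeably uneven
print(ownership(servers, vnodes=150))  # close to 1/3 each
```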

Implementation

A consistent hash implementation maintains a ring data structure that maps positions to nodes, along with a sorted list of all positions on the ring.

Adding a node: For each virtual node (typically 150 per physical server), compute a hash of the server identifier combined with the virtual node number. Store the mapping from this position to the physical node, and add the position to the sorted list.

Looking up a key: Hash the key to get its position on the ring. Walk through the sorted positions to find the first server position that is greater than or equal to the key's position. If no such position exists (the key position is past the last server), wrap around to the first server on the ring.
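A compact sketch of such an implementation in Python (MD5 as the hash, 150 vnodes, a plain sorted list plus bisect for the lookup; a production ring would also need node removal, thread safety, and likely a faster hash):

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent hash ring with virtual nodes."""

    RING_SIZE = 2**32

    def __init__(self, vnodes: int = 150):
        self.vnodes = vnodes
        self.ring = {}        # position -> physical node
        self.positions = []   # sorted list of all positions

    def _hash(self, value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16) % self.RING_SIZE

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")    # server id combined with vnode number
            self.ring[pos] = node
            bisect.insort(self.positions, pos)

    def get_node(self, key: str) -> str:
        if not self.positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self.positions, self._hash(key))
        if idx == len(self.positions):         # past the last server: wrap to the first
            idx = 0
        return self.ring[self.positions[idx]]

ring = ConsistentHashRing()
for node in ["cache-1", "cache-2", "cache-3"]:
    ring.add_node(node)
print(ring.get_node("user_123"))
```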

Replication with Consistent Hashing

For fault tolerance, store each key on multiple servers: walk clockwise from the key's position and select the next N distinct physical servers.


Skip virtual nodes of the same physical server when selecting replicas.
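A sketch of that replica selection (hypothetical node names, 150 vnodes, MD5 hash); the walk visits each ring position at most once and skips vnodes belonging to servers already chosen:

```python
import bisect
import hashlib

RING_SIZE = 2**32

def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % RING_SIZE

# Hypothetical ring with 150 vnodes per physical node: position -> node.
ring = {}
for node in ["cache-1", "cache-2", "cache-3", "cache-4"]:
    for i in range(150):
        ring[_hash(f"{node}#{i}")] = node
positions = sorted(ring)

def get_replicas(key: str, n: int) -> list[str]:
    """Walk clockwise from the key, collecting n distinct physical servers."""
    replicas = []
    start = bisect.bisect_left(positions, _hash(key))
    for step in range(len(positions)):
        node = ring[positions[(start + step) % len(positions)]]
        if node not in replicas:       # skip vnodes of servers already selected
            replicas.append(node)
        if len(replicas) == n:
            break
    return replicas

print(get_replicas("user_123", n=3))
```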

Production Usage

System            | Use Case
Amazon DynamoDB   | Partition data across nodes
Apache Cassandra  | Distribute data across ring
Discord           | Route users to chat servers
Akamai CDN        | Distribute content to edge servers
Memcached clients | Distribute cache entries

Comparison with Alternatives

Approach        | Redistribution on Change | Load Balance     | Complexity
Modular hash    | ~100% of keys            | Even             | Simple
Consistent hash | ~1/N of keys             | Even with vnodes | Medium
Range-based     | Manual rebalance         | Can be uneven    | Simple
Directory-based | Zero                     | Even             | High (lookup overhead)

Common Questions

Why not use a lookup table?

A directory-based approach works but requires a highly available lookup service. Consistent hashing needs no central lookup - any client that knows the server list can compute the mapping independently.

How to handle hot keys?

Options include:

  • Add a random suffix to split the key across nodes: hot_key_1, hot_key_2 (see the sketch after this list)
  • Use a local cache before the distributed layer
  • Replicate hot data to more servers
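As a sketch of the suffix approach for a read-heavy hot key (the FANOUT value and key names are assumptions): the value is stored under several suffixed copies that hash to different nodes, readers pick one copy at random, and writers refresh every copy:

```python
import random

FANOUT = 4  # number of suffixed copies of the hot key (tuning assumption)

def read_key(hot_key: str) -> str:
    """Readers pick one copy at random, spreading load across nodes."""
    return f"{hot_key}_{random.randrange(FANOUT)}"

def write_keys(hot_key: str) -> list[str]:
    """Writers update every copy so the suffixes stay consistent."""
    return [f"{hot_key}_{i}" for i in range(FANOUT)]

print(read_key("hot_key"), write_keys("hot_key"))
```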

What hash function to use?

MD5 or SHA-1 provide good distribution. Speed is not critical since hashing occurs once per request. Security is not a concern - only uniform distribution is required.

How many virtual nodes?

More vnodes improve balance but increase memory usage. 100-200 is typical. With 3 physical servers and 150 vnodes each, the ring has 450 positions using approximately 20KB of metadata.

Summary

  1. Simple hashing fails at scale because server changes cause massive redistribution
  2. Consistent hashing minimizes redistribution - only K/N keys move when adding/removing servers
  3. Virtual nodes solve uneven distribution by giving each server multiple positions
  4. Replication is natural - walk further around the ring
  5. Trade-offs exist - more vnodes means better balance but more memory