Design a Unique ID Generator

Related Concepts: Snowflake ID Architecture, Clock Synchronization (NTP), Bit Manipulation, Distributed Coordination, Database Auto-Increment Limitations, Time-Based Ordering, Worker ID Assignment

Design a system that generates unique identifiers across a distributed system without requiring coordination between servers.

Step 1: Requirements and Scope

Functional Requirements

Generate globally unique IDs
IDs should be sortable by time (newer IDs are larger)
IDs should be numeric (64 bits)
No coordination required between ID generators

Non-Functional Requirements

Requirement	Target	Rationale
Uniqueness	100% guaranteed	Duplicate IDs cause data corruption
Latency	< 1ms	ID generation on critical path
Throughput	10K+ IDs/sec/server	High-volume systems
Availability	99.999%	Core infrastructure
Ordering	Time-sortable	Efficient indexing, debugging

Scale Estimation

100,000 IDs per second system-wide
1000 servers generating IDs
100 IDs per second per server average
Burst: 10,000 IDs per second per server

Step 2: Why Common Solutions Fall Short

Auto-Increment Database

Loading diagram...

Problem	Impact
Single point of failure	Database down = no IDs
Network latency	1-10ms per ID
Scaling	Sharding auto-increment is complex

UUID (128-bit)

Problem	Impact
128 bits	Twice the storage needed
Not time-sortable	Random ordering hurts indexes
Not numeric	Contains letters (a-f)
Poor index locality	Random distribution = random I/O

Ticket Server (Batches)

Loading diagram...

Advantages	Disadvantages
Simple concept	Ticket server is SPOF
No clock dependency	Need multiple ticket servers
Sequential within range	Range exhaustion handling

Step 3: The Snowflake Approach

Twitter's Snowflake generates 64-bit IDs with embedded structure.

Bit Layout

Loading diagram...

Component Details

Component	Bits	Range	Purpose
Sign	1	0	Keeps ID positive
Timestamp	41	69 years	Time since custom epoch
Machine ID	10	0-1023	Unique server identifier
Sequence	12	0-4095	Counter per millisecond

Capacity Analysis

Metric	Calculation	Result
Time range	2^41 milliseconds	~69 years
Max machines	2^10	1,024 servers
IDs per ms per machine	2^12	4,096 IDs
IDs per second per machine	4,096 x 1,000	4 million
Total system capacity	1,024 x 4M	4 billion/sec

ID Generation Flow

Loading diagram...

Step 4: Alternative Approaches

Multi-Leader Auto-Increment

Loading diagram...

Advantages	Disadvantages
Uses existing DB infrastructure	Adding servers requires reconfiguration
Simple concept	Not time-sortable
No external dependencies	Gaps when servers fail

ULID (Universally Unique Lexicographically Sortable ID)

Loading diagram...

Advantages	Disadvantages
No machine ID needed	128 bits (not 64)
Time-sortable	Alphanumeric output
No coordination	Theoretical collision risk

Comparison Table

Approach	Bits	Sortable	Coordination	Throughput	Clock Dependent
Auto-increment	64	Yes	Per insert	Low	No
UUID v4	128	No	None	High	No
UUID v7	128	Yes	None	High	Yes
Snowflake	64	Yes	Setup only	Very High	Yes
Ticket server	64	Within batch	Per batch	High	No
ULID	128	Yes	None	High	Yes

Step 5: Clock Synchronization

The Problem

Loading diagram...

Solutions

Problem	Solution	Tradeoff
Clock backward	Reject / wait until caught up	Brief unavailability
Clock drift	Monitor NTP sync	Operational overhead
Clock forward	Cap maximum timestamp	May pause generation

Handling Backward Clock

Strategy	Behavior	When to Use
Reject	Throw error, fail fast	Strict consistency needed
Wait	Block until clock catches up	Small jumps (< 1s)
Logical increment	Use last_timestamp + 1	Availability priority

Step 6: Machine ID Assignment

Options

Loading diagram...

Method	Advantages	Disadvantages	Use Case
Config file	Simple, explicit	Manual management	Static clusters
Zookeeper	Dynamic, handles failures	Extra dependency	Large clusters
IP-based	No coordination	Collision risk	Stateless containers
Cloud metadata	Cloud-native	Vendor-specific	Cloud deployments

Zookeeper Assignment Flow

Loading diagram...

Step 7: System Architecture

Loading diagram...

Deployment Options

Option	Architecture	Use Case
Library	Embedded in each service	Simple, low latency
Service	Dedicated ID service	Centralized management
Sidecar	Container per app instance	Kubernetes environments

Step 8: Customizing Bit Allocation

Alternative Layouts

Use Case	Timestamp	Datacenter	Machine	Sequence
Standard	41 bits	-	10 bits	12 bits
Multi-DC	41 bits	5 bits	5 bits	12 bits
High volume	39 bits	-	10 bits	14 bits
Long lifespan	43 bits	-	8 bits	12 bits

Tradeoffs

More bits for...	Gain	Cost
Timestamp	Longer lifespan	Fewer machines or sequence
Machine ID	More servers	Shorter lifespan or sequence
Sequence	Higher throughput	Fewer machines or shorter lifespan
Datacenter	Multi-region	Fewer machines

Production Examples

System	ID Format	Notable Features
Twitter Snowflake	64-bit	Original implementation, open-sourced
Instagram	64-bit	Sharded PostgreSQL-based
Discord	64-bit Snowflake	Same as Twitter, epoch 2015
MongoDB ObjectId	96-bit	Timestamp + machine + PID + counter
Cassandra TimeUUID	128-bit	UUID v1 variant
Stripe	Prefixed IDs	`ch_1234...` for readability

Summary: Key Design Decisions

Decision	Options	Recommendation
ID format	UUID, Snowflake, ULID	Snowflake for 64-bit sortable
Bit allocation	Standard vs custom	Standard unless specific needs
Clock handling	Reject vs wait vs increment	Wait for small jumps, reject large
Machine ID	Config, Zookeeper, IP-based	Zookeeper for dynamic scaling
Deployment	Library vs service	Library for simplicity
Epoch	Unix epoch vs custom	Custom to maximize timestamp range

Step 1: Requirements and Scope​

Functional Requirements​

Non-Functional Requirements​

Scale Estimation​

Step 2: Why Common Solutions Fall Short​

Auto-Increment Database​

UUID (128-bit)​

Ticket Server (Batches)​

Step 3: The Snowflake Approach​

Bit Layout​

Component Details​

Capacity Analysis​

ID Generation Flow​

Step 4: Alternative Approaches​

Multi-Leader Auto-Increment​

ULID (Universally Unique Lexicographically Sortable ID)​

Comparison Table​

Step 5: Clock Synchronization​

The Problem​

Solutions​

Handling Backward Clock​

Step 6: Machine ID Assignment​

Options​

Zookeeper Assignment Flow​

Step 7: System Architecture​

Deployment Options​

Step 8: Customizing Bit Allocation​

Alternative Layouts​

Tradeoffs​

Production Examples​

Summary: Key Design Decisions​

In this article

Step 1: Requirements and Scope

Functional Requirements

Non-Functional Requirements

Scale Estimation

Step 2: Why Common Solutions Fall Short

Auto-Increment Database

UUID (128-bit)

Ticket Server (Batches)

Step 3: The Snowflake Approach

Bit Layout

Component Details

Capacity Analysis

ID Generation Flow

Step 4: Alternative Approaches

Multi-Leader Auto-Increment

ULID (Universally Unique Lexicographically Sortable ID)

Comparison Table

Step 5: Clock Synchronization

The Problem

Solutions

Handling Backward Clock

Step 6: Machine ID Assignment

Options

Zookeeper Assignment Flow

Step 7: System Architecture

Deployment Options

Step 8: Customizing Bit Allocation

Alternative Layouts

Tradeoffs

Production Examples

Summary: Key Design Decisions