Design a Unique ID Generator
Related Concepts: Snowflake ID Architecture, Clock Synchronization (NTP), Bit Manipulation, Distributed Coordination, Database Auto-Increment Limitations, Time-Based Ordering, Worker ID Assignment
Design a system that generates unique identifiers across a distributed system without requiring coordination between servers.
Step 1: Requirements and Scope
Functional Requirements
- Generate globally unique IDs
- IDs should be sortable by time (newer IDs are larger)
- IDs should be numeric (64 bits)
- No coordination required between ID generators
Non-Functional Requirements
| Requirement | Target | Rationale |
|---|---|---|
| Uniqueness | 100% guaranteed | Duplicate IDs cause data corruption |
| Latency | < 1ms | ID generation on critical path |
| Throughput | 10K+ IDs/sec/server | High-volume systems |
| Availability | 99.999% | Core infrastructure |
| Ordering | Time-sortable | Efficient indexing, debugging |
Scale Estimation
- 100,000 IDs per second system-wide
- 1000 servers generating IDs
- 100 IDs per second per server average
- Burst: 10,000 IDs per second per server
Step 2: Why Common Solutions Fall Short
Auto-Increment Database
Loading diagram...
| Problem | Impact |
|---|---|
| Single point of failure | Database down = no IDs |
| Network latency | 1-10ms per ID |
| Scaling | Sharding auto-increment is complex |
UUID (128-bit)
| Problem | Impact |
|---|---|
| 128 bits | Twice the storage needed |
| Not time-sortable | Random ordering hurts indexes |
| Not numeric | Contains letters (a-f) |
| Poor index locality | Random distribution = random I/O |
Ticket Server (Batches)
Loading diagram...
| Advantages | Disadvantages |
|---|---|
| Simple concept | Ticket server is SPOF |
| No clock dependency | Need multiple ticket servers |
| Sequential within range | Range exhaustion handling |
Step 3: The Snowflake Approach
Twitter's Snowflake generates 64-bit IDs with embedded structure.
Bit Layout
Loading diagram...
Component Details
| Component | Bits | Range | Purpose |
|---|---|---|---|
| Sign | 1 | 0 | Keeps ID positive |
| Timestamp | 41 | 69 years | Time since custom epoch |
| Machine ID | 10 | 0-1023 | Unique server identifier |
| Sequence | 12 | 0-4095 | Counter per millisecond |
Capacity Analysis
| Metric | Calculation | Result |
|---|---|---|
| Time range | 2^41 milliseconds | ~69 years |
| Max machines | 2^10 | 1,024 servers |
| IDs per ms per machine | 2^12 | 4,096 IDs |
| IDs per second per machine | 4,096 x 1,000 | 4 million |
| Total system capacity | 1,024 x 4M | 4 billion/sec |
ID Generation Flow
Loading diagram...
Step 4: Alternative Approaches
Multi-Leader Auto-Increment
Loading diagram...
| Advantages | Disadvantages |
|---|---|
| Uses existing DB infrastructure | Adding servers requires reconfiguration |
| Simple concept | Not time-sortable |
| No external dependencies | Gaps when servers fail |
ULID (Universally Unique Lexicographically Sortable ID)
Loading diagram...
| Advantages | Disadvantages |
|---|---|
| No machine ID needed | 128 bits (not 64) |
| Time-sortable | Alphanumeric output |
| No coordination | Theoretical collision risk |
Comparison Table
| Approach | Bits | Sortable | Coordination | Throughput | Clock Dependent |
|---|---|---|---|---|---|
| Auto-increment | 64 | Yes | Per insert | Low | No |
| UUID v4 | 128 | No | None | High | No |
| UUID v7 | 128 | Yes | None | High | Yes |
| Snowflake | 64 | Yes | Setup only | Very High | Yes |
| Ticket server | 64 | Within batch | Per batch | High | No |
| ULID | 128 | Yes | None | High | Yes |
Step 5: Clock Synchronization
The Problem
Loading diagram...
Solutions
| Problem | Solution | Tradeoff |
|---|---|---|
| Clock backward | Reject / wait until caught up | Brief unavailability |
| Clock drift | Monitor NTP sync | Operational overhead |
| Clock forward | Cap maximum timestamp | May pause generation |
Handling Backward Clock
| Strategy | Behavior | When to Use |
|---|---|---|
| Reject | Throw error, fail fast | Strict consistency needed |
| Wait | Block until clock catches up | Small jumps (< 1s) |
| Logical increment | Use last_timestamp + 1 | Availability priority |
Step 6: Machine ID Assignment
Options
Loading diagram...
| Method | Advantages | Disadvantages | Use Case |
|---|---|---|---|
| Config file | Simple, explicit | Manual management | Static clusters |
| Zookeeper | Dynamic, handles failures | Extra dependency | Large clusters |
| IP-based | No coordination | Collision risk | Stateless containers |
| Cloud metadata | Cloud-native | Vendor-specific | Cloud deployments |
Zookeeper Assignment Flow
Loading diagram...
Step 7: System Architecture
Loading diagram...
Deployment Options
| Option | Architecture | Use Case |
|---|---|---|
| Library | Embedded in each service | Simple, low latency |
| Service | Dedicated ID service | Centralized management |
| Sidecar | Container per app instance | Kubernetes environments |
Step 8: Customizing Bit Allocation
Alternative Layouts
| Use Case | Timestamp | Datacenter | Machine | Sequence |
|---|---|---|---|---|
| Standard | 41 bits | - | 10 bits | 12 bits |
| Multi-DC | 41 bits | 5 bits | 5 bits | 12 bits |
| High volume | 39 bits | - | 10 bits | 14 bits |
| Long lifespan | 43 bits | - | 8 bits | 12 bits |
Tradeoffs
| More bits for... | Gain | Cost |
|---|---|---|
| Timestamp | Longer lifespan | Fewer machines or sequence |
| Machine ID | More servers | Shorter lifespan or sequence |
| Sequence | Higher throughput | Fewer machines or shorter lifespan |
| Datacenter | Multi-region | Fewer machines |
Production Examples
| System | ID Format | Notable Features |
|---|---|---|
| Twitter Snowflake | 64-bit | Original implementation, open-sourced |
| 64-bit | Sharded PostgreSQL-based | |
| Discord | 64-bit Snowflake | Same as Twitter, epoch 2015 |
| MongoDB ObjectId | 96-bit | Timestamp + machine + PID + counter |
| Cassandra TimeUUID | 128-bit | UUID v1 variant |
| Stripe | Prefixed IDs | ch_1234... for readability |
Summary: Key Design Decisions
| Decision | Options | Recommendation |
|---|---|---|
| ID format | UUID, Snowflake, ULID | Snowflake for 64-bit sortable |
| Bit allocation | Standard vs custom | Standard unless specific needs |
| Clock handling | Reject vs wait vs increment | Wait for small jumps, reject large |
| Machine ID | Config, Zookeeper, IP-based | Zookeeper for dynamic scaling |
| Deployment | Library vs service | Library for simplicity |
| Epoch | Unix epoch vs custom | Custom to maximize timestamp range |