Design Search Autocomplete

Design a typeahead/autocomplete system that suggests search queries as users type.

Related Concepts: Trie Data Structure, Prefix Matching, Ranking Algorithm, Caching (Edge/CDN), Sampling for Analytics, MapReduce for Aggregation, Sharding by Prefix, Real-Time Trending Updates

Step 1: Requirements and Scope

Functional Requirements

  • As users type, return top 10 suggestions matching the prefix
  • Suggestions ranked by popularity/relevance
  • Support multiple languages
  • Handle trending queries (fast updates)
  • Personalization (optional)

Non-Functional Requirements

| Requirement | Target | Rationale |
| --- | --- | --- |
| Latency | < 100ms (ideally < 50ms) | Must feel instant |
| Availability | 99.99% | Core search feature |
| Scalability | 10B queries/day | Massive search volume |
| Freshness | Minutes for trending | Current events matter |

Scale Estimation

  • 10 billion searches per day (~120K/second)
  • 5 million unique queries in index
  • Average query length: 20 characters
  • ~5 keystrokes per search (with autocomplete)
  • Suggestion requests: 5 x 120K = 600K/second
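
A quick back-of-envelope check of these figures (the constants below are just the assumptions listed above):

```python
# Back-of-envelope check of the traffic numbers above.
searches_per_day = 10_000_000_000
seconds_per_day = 24 * 60 * 60                    # 86,400

search_qps = searches_per_day / seconds_per_day
print(f"Search QPS: ~{search_qps:,.0f}")          # ~115,741 (~120K/second)

keystrokes_per_search = 5                         # keystrokes that trigger a suggestion request
suggestion_qps = keystrokes_per_search * search_qps
print(f"Suggestion QPS: ~{suggestion_qps:,.0f}")  # ~578,704 (~600K/second)
```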

Step 2: The Core Challenge

Users expect suggestions instantly as they type. With millions of queries and sub-100ms latency requirements, how do you efficiently find and rank matching prefixes?

Step 3: Data Structure - Trie

A trie (prefix tree) is ideal for prefix matching.

Trie Structure

Each node of the trie represents a prefix; edges are labeled with single characters, and nodes that terminate a full query store that query's popularity count.

Trie Lookup Complexity

| Operation | Complexity | Notes |
| --- | --- | --- |
| Find prefix node | O(L) | L = prefix length |
| Get all descendants | O(N) | N = nodes in subtree |
| Get top K | O(N log K) | Need to traverse all |

Problem: Finding Top K is Slow

Traversing the entire subtree for each keystroke is too slow.

Solution: Cache top suggestions at each node.

Now lookup is O(L) where L = prefix length.
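
A minimal in-memory sketch of this approach (class and method names such as `AutocompleteTrie` and `precompute_top_k` are illustrative, not from the source): each node keeps its children plus a precomputed list of the top-K completions in its subtree, so serving a request only walks the prefix.

```python
K = 10  # number of suggestions cached per node


class TrieNode:
    __slots__ = ("children", "count", "top_k")

    def __init__(self):
        self.children = {}   # char -> TrieNode
        self.count = 0       # popularity if a query ends here, else 0
        self.top_k = []      # cached [(count, query), ...] for this node's subtree


class AutocompleteTrie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, query: str, count: int) -> None:
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
        node.count = count

    def precompute_top_k(self) -> None:
        """Bottom-up DFS: cache the K most popular completions at every node."""
        self._precompute(self.root, "")

    def _precompute(self, node: TrieNode, prefix: str) -> list:
        candidates = [(node.count, prefix)] if node.count > 0 else []
        for ch, child in node.children.items():
            candidates.extend(self._precompute(child, prefix + ch))
        node.top_k = sorted(candidates, key=lambda c: -c[0])[:K]
        return node.top_k

    def suggest(self, prefix: str) -> list:
        """O(L) lookup: walk the prefix and return its cached suggestions."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        return [query for _, query in node.top_k]


# Illustrative usage with counts as produced by the batch pipeline:
trie = AutocompleteTrie()
for query, count in [("apple", 500), ("apple stock", 100),
                     ("apple event", 20), ("app store", 80)]:
    trie.insert(query, count)
trie.precompute_top_k()
print(trie.suggest("app"))  # ['apple', 'apple stock', 'app store', 'apple event']
```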

Step 4: High-Level Architecture

Clients send prefix requests through a load balancer to autocomplete servers that hold the trie in memory and merge in trending scores from Redis; an offline pipeline (Step 6) rebuilds the trie from query logs and ships new snapshots to the servers.

Step 5: Query Flow

Request Path

Client → Load balancer → Autocomplete server (in-memory trie lookup) → Trending merge (Redis) → Response.

Response Time Breakdown

| Step | Latency | Notes |
| --- | --- | --- |
| Network (client to LB) | 10-50ms | Varies by location |
| Load balancer | < 1ms | Fast routing |
| Trie lookup | < 1ms | In-memory, O(L) |
| Trending merge | < 1ms | Redis read |
| Network (LB to client) | 10-50ms | Varies |
| Total | ~50-100ms | Target: < 100ms |

Step 6: Data Collection Pipeline

Pipeline Architecture

Raw query logs feed a batch aggregation job (MapReduce-style) that produces the last seven days of query counts; a trie builder turns these into a new trie snapshot for the serving fleet, while a parallel streaming path updates trending counts in Redis.

Aggregation Query

| Step | Input | Output |
| --- | --- | --- |
| Group by query | Raw logs | Query → count |
| Filter recent | All time | Last 7 days |
| Filter spam | All queries | Clean queries |
| Normalize | Mixed case | Lowercase |
| Top N | All queries | Top 5M by count |
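
A simplified, single-machine version of this aggregation, assuming each log record carries a query string and a timestamp (the field names and the spam heuristic are illustrative):

```python
import re
import time
from collections import Counter

SEVEN_DAYS = 7 * 24 * 3600
TOP_N = 5_000_000


def looks_like_spam(query: str) -> bool:
    """Placeholder spam check: overly long queries or queries with no word characters."""
    return len(query) > 100 or not re.search(r"\w", query)


def aggregate(log_records, now=None):
    """Reduce raw search logs to the top-N (query, count) pairs used to build the trie."""
    now = now or time.time()
    counts = Counter()
    for record in log_records:                   # e.g. {"query": "Apple Stock", "ts": 1700000000}
        if now - record["ts"] > SEVEN_DAYS:      # keep only the last 7 days
            continue
        query = record["query"].strip().lower()  # normalize: trim + lowercase
        if not query or looks_like_spam(query):  # drop spam / garbage
            continue
        counts[query] += 1
    return counts.most_common(TOP_N)             # top N queries by count
```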

Trie Building Process

| Step | Action | Complexity |
| --- | --- | --- |
| 1 | Insert all queries | O(Q × L) |
| 2 | Compute top-K at each node | O(N) DFS traversal |
| 3 | Serialize to binary format | O(N) |
| 4 | Compress | ~50% size reduction |

Real-Time Updates

Trending counts are kept in a Redis sorted set per prefix and updated as queries stream in:

| Operation | Command | Purpose |
| --- | --- | --- |
| Increment score | `ZINCRBY trending:app 1 "app store"` | Count recent queries |
| Get top | `ZREVRANGE trending:app 0 9` | Top 10 trending |
| Expire old | `ZREMRANGEBYSCORE trending:app 0 {old_timestamp}` | Remove stale entries |
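
With the redis-py client, this could look roughly as follows; the sketch uses hourly bucketed keys with EXPIRE to age out stale counts, as an assumed stand-in for the ZREMRANGEBYSCORE cleanup above:

```python
import time

import redis  # redis-py client (assumption: trending counts live in Redis sorted sets)

r = redis.Redis()


def record_query(prefix: str, query: str) -> None:
    """Bump the trending count for `query` under `prefix` in the current hourly bucket."""
    bucket = int(time.time() // 3600)   # hour-sized window is an assumption
    key = f"trending:{prefix}:{bucket}"
    r.zincrby(key, 1, query)            # cf. ZINCRBY trending:app 1 "app store"
    r.expire(key, 2 * 3600)             # old buckets age out automatically


def top_trending(prefix: str, k: int = 10):
    """Top-k trending queries for `prefix` (cf. ZREVRANGE trending:app 0 9)."""
    bucket = int(time.time() // 3600)
    key = f"trending:{prefix}:{bucket}"
    return [member.decode() for member in r.zrevrange(key, 0, k - 1)]
```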

Step 7: Ranking

Suggestions are ranked by combining the base popularity score from the batch pipeline with a boost for queries that are currently trending:

| Query | Base Score | Trending Boost | Final Score |
| --- | --- | --- | --- |
| "apple stock" | 100 | +50 (trending) | 150 |
| "apple" | 500 | 0 | 500 |
| "apple event" | 20 | +200 (trending) | 220 |

Step 8: Scaling Strategies

Trie Size Estimation

| Factor | Value |
| --- | --- |
| Unique queries | 5 million |
| Avg query length | 20 characters |
| Node overhead | ~100 bytes (pointers, cached suggestions) |
| Estimated size | 500 MB - 2 GB |

Conclusion: Fits in memory on a single server.

Sharding (If Needed)

| Sharding Strategy | Pros | Cons |
| --- | --- | --- |
| By first character | Simple routing | Uneven distribution (more 's' than 'x') |
| By hash of prefix | Even distribution | Need to query multiple shards |
| Full replication | Simple, fast | Higher memory cost |

Recommendation: Full replication (trie is small enough)

Replication for Availability

| Configuration | Purpose |
| --- | --- |
| 3+ replicas per region | High availability |
| Multi-region deployment | Low latency globally |
| Read-only replicas | Horizontal scaling |

Step 9: Optimizations

Client-Side Optimizations

| Optimization | Implementation | Benefit |
| --- | --- | --- |
| Debouncing | Wait 100ms after typing stops | ~80% fewer requests |
| Prefetching | Request likely next prefix | Perceived zero latency |
| Local caching | Cache recent suggestions | Offline support |
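
Debouncing normally lives in the client, but the pattern is language-agnostic; a minimal timer-based sketch in Python, using the 100ms delay from the table above:

```python
import threading


class Debouncer:
    """Invoke `fn` only after `delay` seconds have passed with no further calls."""

    def __init__(self, fn, delay=0.1):
        self.fn = fn
        self.delay = delay
        self._timer = None

    def __call__(self, *args, **kwargs):
        if self._timer is not None:
            self._timer.cancel()  # a new keystroke cancels the pending request
        self._timer = threading.Timer(self.delay, self.fn, args, kwargs)
        self._timer.start()


# Usage: wrap the function that calls the suggestion API.
fetch_suggestions = Debouncer(lambda prefix: print("GET /suggest?q=" + prefix))
for prefix in ["a", "ap", "app"]:  # rapid keystrokes
    fetch_suggestions(prefix)      # only "app" triggers a request
```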

Server-Side Optimizations

| Optimization | Implementation | Benefit |
| --- | --- | --- |
| Bloom filter | Skip prefixes with no matches | Saves trie traversal |
| Compression | Compress trie nodes | ~50% memory reduction |
| CDN caching | Cache popular prefixes at edge | Lower latency |
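
For the Bloom filter, one option is to populate it with every prefix that has at least one suggestion, so dead-end prefixes can be rejected without a trie lookup; a minimal sketch (sizes and hash count are illustrative):

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, small chance of false positives."""

    def __init__(self, size_bits: int = 1 << 23, num_hashes: int = 4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


# Populate with every prefix of every indexed query (done once at trie-build time).
bloom = BloomFilter()
for query in ["apple", "app store"]:
    for i in range(1, len(query) + 1):
        bloom.add(query[:i])

print(bloom.might_contain("app"))  # True  -> worth doing the trie lookup
print(bloom.might_contain("xyz"))  # almost certainly False -> skip the trie entirely
```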

Filtering Inappropriate Content

| Filter Type | Implementation |
| --- | --- |
| Profanity | Blocklist lookup |
| Spam | Pattern detection |
| Personal info | Regex (SSN, phone) |
| Legal | Country-specific blocks |
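
One way these filters might be applied before a suggestion is served (the blocklist terms and regexes below are illustrative placeholders, not a complete policy):

```python
import re

PROFANITY_BLOCKLIST = {"badword1", "badword2"}               # illustrative placeholder terms
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")                # e.g. 123-45-6789
PHONE_RE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")  # simple US-style phone pattern


def is_suggestable(query: str, blocked_in_country: frozenset = frozenset()) -> bool:
    """Return False if the suggestion should be filtered out before serving."""
    words = set(query.lower().split())
    if words & PROFANITY_BLOCKLIST:                        # profanity: blocklist lookup
        return False
    if SSN_RE.search(query) or PHONE_RE.search(query):     # personal info: regex
        return False
    if query.lower() in blocked_in_country:                # legal: country-specific blocks
        return False
    return True
```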

Step 10: Multi-Language Support

Language-Specific Tries

Each supported language (or script) gets its own trie, selected at query time from the user's language setting or the detected input script.

Special Considerations

| Language | Challenge | Solution |
| --- | --- | --- |
| Chinese | No word boundaries | Character-based trie |
| Japanese | Multiple scripts | Separate tries per script |
| Arabic | Right-to-left script | RTL-aware rendering |
| Emoji | Special characters | Unicode handling |

Step 11: Personalization

Personal Suggestions

Personalized suggestions come from layering per-user signals on top of the global top-10 results rather than replacing them.

Personalization Signals

| Signal | Weight | Source |
| --- | --- | --- |
| Recent searches | High | User's last 100 queries |
| Click history | Medium | What the user clicked before |
| Location | Medium | Local businesses, news |
| Demographics | Low | Age, interests (if known) |

Real-World Systems

| System | Scale | Notable Features |
| --- | --- | --- |
| Google | 5B searches/day | ML ranking, personalized results |
| Amazon | Product search | Category-aware suggestions |
| YouTube | Video search | Thumbnail previews |
| Spotify | Music search | Mixed artist/song/playlist results |
| Elasticsearch | Open source | Completion suggester |

Summary: Key Design Decisions

| Decision | Options | Recommendation |
| --- | --- | --- |
| Data structure | Trie, inverted index | Trie with cached top-K |
| Storage | Memory, Redis, Elasticsearch | In-memory trie |
| Updates | Batch only, real-time | Batch + streaming for trending |
| Sharding | By prefix, full replication | Full replication (trie is small) |
| Ranking | Frequency, ML | Frequency with trending boost |
| Personalization | None, full | Optional layer on top |