Design Search Autocomplete
Design a typeahead/autocomplete system that suggests search queries as users type.
Related Concepts: Trie Data Structure, Prefix Matching, Ranking Algorithm, Caching (Edge/CDN), Sampling for Analytics, MapReduce for Aggregation, Sharding by Prefix, Real-Time Trending Updates
Step 1: Requirements and Scope
Functional Requirements
- As users type, return top 10 suggestions matching the prefix
- Suggestions ranked by popularity/relevance
- Support multiple languages
- Handle trending queries (fast updates)
- Personalization (optional)
Non-Functional Requirements
| Requirement | Target | Rationale |
|---|---|---|
| Latency | < 100ms (ideally < 50ms) | Must feel instant |
| Availability | 99.99% | Core search feature |
| Scalability | 10B queries/day | Massive search volume |
| Freshness | Minutes for trending | Current events matter |
Scale Estimation
- 10 billion searches per day (~120K/second)
- 5 million unique queries in index
- Average query length: 20 characters
- ~5 keystrokes per search (with autocomplete)
- Suggestion requests: 5 x 120K = 600K/second
Step 2: The Core Challenge
Users expect suggestions instantly as they type. With millions of queries and sub-100ms latency requirements, how do you efficiently find and rank matching prefixes?
Step 3: Data Structure - Trie
A trie (prefix tree) is ideal for prefix matching.
Trie Structure
Each edge is a character, each node represents the prefix spelled by the path from the root, and nodes that complete a real query carry its popularity count.
Trie Lookup Complexity
| Operation | Complexity | Example |
|---|---|---|
| Find prefix node | O(L) | L = prefix length |
| Get all descendants | O(N) | N = nodes in subtree |
| Get top K | O(N log K) | Need to traverse all |
Problem: Finding Top K is Slow
Traversing the entire subtree on every keystroke is too slow: at 600K requests/second, a short prefix like "a" can match a large fraction of the 5 million indexed queries.
Solution: Cache top suggestions at each node.
Now lookup is O(L), where L = prefix length: walk L characters to the prefix node and return its precomputed top-K list.
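A minimal Python sketch of this layout (the class names and fixed K are illustrative, not from any particular system): each node caches the best (query, score) pairs for its subtree, so serving a keystroke is an O(L) pointer walk plus an O(1) list read.

```python
K = 10  # suggestions returned per prefix

class TrieNode:
    def __init__(self) -> None:
        self.children: dict[str, "TrieNode"] = {}
        self.count: int = 0                        # popularity if a query ends here
        self.top_k: list[tuple[str, int]] = []     # cached (query, score), best first

class SuggestionTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def lookup(self, prefix: str) -> list[str]:
        """O(L) walk to the prefix node, then an O(1) read of its cached list."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []                          # nothing starts with this prefix
        return [query for query, _ in node.top_k[:K]]
```

The caches trade memory for latency: each node stores at most K extra entries, and how they are filled offline is shown in Step 6.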
Step 4: High-Level Architecture
Clients send each keystroke through a load balancer to stateless suggestion servers, which answer from an in-memory trie and merge in trending scores from Redis; an offline pipeline periodically rebuilds the trie from search logs.
Step 5: Query Flow
Request Path
Client → Load balancer → Suggestion service (trie lookup, then trending merge) → Client
Response Time Breakdown
| Step | Latency | Notes |
|---|---|---|
| Network (client to LB) | 10-50ms | Varies by location |
| Load balancer | < 1ms | Fast routing |
| Trie lookup | < 1ms | In-memory, O(L) |
| Trending merge | < 1ms | Redis read |
| Network (LB to client) | 10-50ms | Varies |
| Total | ~25-105ms | Sum of the rows above; within the < 100ms target |
Step 6: Data Collection Pipeline
Pipeline Architecture
Search logs flow into an aggregation job (MapReduce), whose output feeds a trie builder that ships versioned snapshots to the suggestion servers.
Aggregation Steps
| Step | Input | Output |
|---|---|---|
| Group by query | Raw logs | Query to count |
| Filter recent | All time | Last 7 days |
| Filter spam | All queries | Clean queries |
| Normalize | Mixed case | Lowercase |
| Top N | All queries | Top 5M by count |
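At real scale this runs as a MapReduce/Spark job over sharded logs; here is a single-process Python sketch of the same steps (the one-query-per-line log format and the spam heuristic are assumptions for illustration):

```python
import re
from collections import Counter

SPAM_RE = re.compile(r"https?://")         # illustrative spam heuristic

def aggregate(log_lines, top_n=5_000_000):
    """Group by query, filter spam, normalize case, keep the top N by count."""
    counts = Counter()
    for line in log_lines:                 # assume logs already limited to last 7 days
        query = line.strip().lower()       # normalize: lowercase
        if not query or SPAM_RE.search(query):
            continue                       # filter spam
        counts[query] += 1                 # group by query
    return counts.most_common(top_n)       # top N by count -> [(query, count), ...]
```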
Trie Building Process
| Step | Action | Complexity |
|---|---|---|
| 1 | Insert all queries | O(Q x L) |
| 2 | Compute top-K at each node | O(N) DFS traversal |
| 3 | Serialize to binary format | O(N) |
| 4 | Compress | ~50% size reduction |
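Continuing the sketch from Step 3, steps 1 and 2 look like this: a bulk insert, then one post-order DFS. Because each child's list is already its subtree's final top-K, a parent only merges K-sized lists, keeping the pass linear in trie size (helper names are illustrative).

```python
import heapq

def insert(root: TrieNode, query: str, count: int) -> None:
    """Load one aggregated (query, count) pair into the trie."""
    node = root
    for ch in query:
        node = node.children.setdefault(ch, TrieNode())
    node.count = count

def compute_top_k(node: TrieNode, prefix: str = "", k: int = 10) -> None:
    """Post-order DFS: each node merges its children's already-final top-k lists."""
    candidates: list[tuple[str, int]] = []
    if node.count:
        candidates.append((prefix, node.count))    # this node completes a real query
    for ch, child in node.children.items():
        compute_top_k(child, prefix + ch, k)
        candidates.extend(child.top_k)             # at most k entries per child
    node.top_k = heapq.nlargest(k, candidates, key=lambda qc: qc[1])
```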
Step 7: Handling Trending Queries
Real-Time Updates
The live query stream feeds a stream processor that increments per-prefix counters in Redis; suggestion servers merge these trending scores into responses at read time.
Trending Storage (Redis)
| Operation | Command | Purpose |
|---|---|---|
| Increment score | ZINCRBY trending:app:{window} 1 "app store" | Count queries for prefix "app" in the current time window |
| Get top | ZREVRANGE trending:app:{window} 0 9 | Top 10 trending for the prefix |
| Expire old | EXPIRE trending:app:{window} 900 | Whole windows age out automatically (scores are counts, not timestamps) |
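Because the sorted-set scores are counts rather than timestamps, staleness is handled by bucketing keys into short time windows and letting Redis expire them. A redis-py sketch of that pattern (window size, TTL, and key names are assumptions):

```python
import time
import redis

r = redis.Redis(decode_responses=True)
WINDOW_SECONDS = 300      # 5-minute buckets (assumed)
WINDOW_TTL = 900          # keep a few windows before they expire

def current_key(prefix: str) -> str:
    window = int(time.time()) // WINDOW_SECONDS
    return f"trending:{prefix}:{window}"

def record_query(prefix: str, query: str) -> None:
    """Count one occurrence of query against the current window for its prefix."""
    key = current_key(prefix)
    pipe = r.pipeline()
    pipe.zincrby(key, 1, query)    # ZINCRBY key 1 member
    pipe.expire(key, WINDOW_TTL)   # stale windows age out on their own
    pipe.execute()

def top_trending(prefix: str, k: int = 10):
    """Top-k trending queries for the prefix in the current window."""
    return r.zrevrange(current_key(prefix), 0, k - 1, withscores=True)
```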
Merging Base + Trending
| Query | Base Score | Trending Boost | Final Score |
|---|---|---|---|
| "apple stock" | 100 | +50 (trending) | 150 |
| "apple" | 500 | 0 | 500 |
| "apple event" | 20 | +200 (trending) | 220 |
Step 8: Scaling Strategies
Trie Size Estimation
| Factor | Value |
|---|---|
| Unique queries | 5 million |
| Avg query length | 20 characters |
| Node overhead | ~100 bytes (pointers, cached suggestions) |
| Estimated size | 500 MB - 2 GB |
With prefix sharing, the 5M × 20 = 100M inserted characters collapse into tens of millions of unique nodes; at ~100 bytes per node that is roughly 0.5-2 GB.
Conclusion: Even at the high end, the trie fits in memory on a single server.
Sharding (If Needed)
| Sharding Strategy | Pros | Cons |
|---|---|---|
| By first character | Simple routing | Uneven distribution (more 's' than 'x') |
| By hash of prefix | Even distribution | Need to query multiple shards |
| Full replication | Simple, fast | Higher memory cost |
Recommendation: Full replication (trie is small enough)
Replication for Availability
| Configuration | Purpose |
|---|---|
| 3+ replicas per region | High availability |
| Multi-region deployment | Low latency globally |
| Read-only replicas | Horizontal scaling |
Step 9: Optimizations
Client-Side Optimizations
| Optimization | Implementation | Benefit |
|---|---|---|
| Debouncing | Wait 100ms after typing stops | 80% fewer requests |
| Prefetching | Request likely next prefix | Perceived zero latency |
| Local caching | Cache recent suggestions | Offline support |
Server-Side Optimizations
| Optimization | Implementation | Benefit |
|---|---|---|
| Bloom filter | Skip prefixes with no matches | Save trie traversal |
| Compression | Compress trie nodes | 50% memory reduction |
| CDN caching | Cache popular prefixes at edge | Lower latency |
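As a sketch of the Bloom-filter row (the bit-array size and hash count are placeholder values): populate the filter offline with every prefix that exists in the trie, then consult it before walking.

```python
import hashlib

class BloomFilter:
    """Tiny Bloom filter; a real system sizes bits/hashes from its error budget."""
    def __init__(self, size_bits: int = 8 * 1024 * 1024, hashes: int = 3):
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

# At serve time: skip the trie walk when the prefix definitely has no matches.
# if not prefix_filter.might_contain(prefix): return []
```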
Filtering Inappropriate Content
| Filter Type | Implementation |
|---|---|
| Profanity | Blocklist lookup |
| Spam | Pattern detection |
| Personal info | Regex (SSN, phone) |
| Legal | Country-specific blocks |
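A minimal sketch of these filters applied at index time (the blocklist entries and regexes are placeholders; real spam detection is far more involved):

```python
import re

BLOCKLIST = {"badword1", "badword2"}           # placeholder profanity list
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # US SSN pattern
PHONE_RE = re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b")

def is_suggestible(query: str, country_blocks: frozenset = frozenset()) -> bool:
    """Run index-time filters; queries that fail never enter the trie."""
    if set(query.lower().split()) & BLOCKLIST:          # profanity
        return False
    if SSN_RE.search(query) or PHONE_RE.search(query):  # personal info
        return False
    if query.lower() in country_blocks:                 # legal, per-country
        return False
    return True
```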
Step 10: Multi-Language Support
Language-Specific Tries
Detect the request language (from the client locale or a language-ID model) and route each request to a dedicated per-language trie, as sketched below.
Special Considerations
| Language | Challenge | Solution |
|---|---|---|
| Chinese | No word boundaries | Character-based trie |
| Japanese | Multiple scripts | Separate tries per script |
| Arabic | Right-to-left script | Match in logical order; client handles bidi rendering |
| Emoji | Multi-codepoint sequences | Key trie edges on Unicode code points, not bytes |
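A sketch of the routing, reusing `SuggestionTrie` from Step 3 (the Unicode-script heuristic stands in for a real language-ID model):

```python
TRIES = {"en": SuggestionTrie(), "zh": SuggestionTrie(), "ar": SuggestionTrie()}

def detect_language(prefix: str) -> str:
    """Crude script-range routing; a real system would use a trained detector."""
    if any("\u4e00" <= ch <= "\u9fff" for ch in prefix):
        return "zh"                            # CJK unified ideographs
    if any("\u0600" <= ch <= "\u06ff" for ch in prefix):
        return "ar"                            # Arabic block
    return "en"

def suggest(prefix: str) -> list[str]:
    return TRIES[detect_language(prefix)].lookup(prefix)
```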
Step 11: Personalization
Personal Suggestions
Serve the shared global suggestions first, then re-rank them with a lightweight per-user profile (recent queries, clicks, location) fetched at request time.
Personalization Signals
| Signal | Weight | Source |
|---|---|---|
| Recent searches | High | User's last 100 queries |
| Click history | Medium | What user clicked before |
| Location | Medium | Local businesses, news |
| Demographics | Low | Age, interests (if known) |
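A sketch of re-ranking with these signals (the weight values and the multiplicative boost are hypothetical):

```python
WEIGHTS = {"recent_search": 3.0, "clicked_before": 2.0, "local": 2.0}  # hypothetical

def personalize(suggestions: list[tuple[str, int]],
                recent: set[str], clicked: set[str], local: set[str]) -> list[str]:
    """Re-rank global (query, base_score) pairs by multiplying in per-user boosts."""
    def score(query: str, base: int) -> float:
        boost = 1.0
        if query in recent:
            boost *= WEIGHTS["recent_search"]
        if query in clicked:
            boost *= WEIGHTS["clicked_before"]
        if query in local:
            boost *= WEIGHTS["local"]
        return base * boost
    ranked = sorted(suggestions, key=lambda qs: score(*qs), reverse=True)
    return [query for query, _ in ranked]
```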
Real-World Systems
| System | Scale | Notable Features |
|---|---|---|
| Google | ~5B searches/day | ML ranking, personalized |
| Amazon | Product search | Category-aware suggestions |
| YouTube | Video search | Thumbnail previews |
| Spotify | Music search | Artist/song/playlist mixed |
| Elasticsearch | Open source | Completion suggester |
Summary: Key Design Decisions
| Decision | Options | Recommendation |
|---|---|---|
| Data structure | Trie, Inverted index | Trie with cached top-K |
| Storage | Memory, Redis, Elasticsearch | In-memory trie |
| Updates | Batch only, Real-time | Batch + streaming for trending |
| Sharding | By prefix, Full replication | Full replication (trie is small) |
| Ranking | Frequency, ML | Frequency with trending boost |
| Personalization | None, Full | Optional layer on top |