Design an E-commerce Recommendation System
Design a recommendation system for an e-commerce platform like Amazon.
Requirements
Functional:
- "Customers who bought this also bought"
- Personalized homepage recommendations
- "Frequently bought together" bundles
- Search result personalization
Non-functional:
- Real-time serving: return recommendations in under 100 ms
- Handle 100M+ products
- Support 100M+ users
Metrics
Offline
| Metric | Description |
|---|---|
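| Recall@K | Fraction of relevant items that appear in the top K |
| NDCG | Ranking quality, with hits near the top weighted more heavily |
| Coverage | % of the catalog that appears in at least one recommendation list |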
| Diversity | Variety of recommendations |
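As a rough illustration, the first three offline metrics can be computed as follows. This is a minimal sketch that assumes binary relevance (an item is either relevant or not); the function names and the toy data are illustrative, not taken from any particular library.

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by DCG of an ideal ranking."""
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, item in enumerate(ranked[:k]) if item in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def catalog_coverage(recommendation_lists, catalog_size):
    """Share of the catalog that shows up in at least one recommendation list."""
    recommended = set()
    for recs in recommendation_lists:
        recommended.update(recs)
    return len(recommended) / catalog_size

# Toy example: one user's ranking and the items they actually bought.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "d", "z"}
print(recall_at_k(ranked, relevant, k=4))
print(ndcg_at_k(ranked, relevant, k=4))
print(catalog_coverage([["a", "b"], ["b", "c"]], catalog_size=10))
```

In practice these values are averaged over all users in a held-out evaluation set.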
Online
| Metric | Description |
|---|---|
| CTR | Click-through rate |
| Conversion | Share of sessions that end in a purchase |
| Revenue per session | Total revenue divided by the number of sessions |
| Cart additions | Items added to cart |
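The online metrics can be aggregated from session logs along these lines. The log schema (`impressions`, `clicks`, `cart_adds`, `purchases`, `revenue`) is a hypothetical one chosen for this sketch, and conversion is assumed here to mean the share of sessions with at least one purchase.

```python
def online_metrics(sessions):
    """Aggregate online metrics from per-session logs (hypothetical schema)."""
    n = len(sessions)
    impressions = sum(s["impressions"] for s in sessions)
    clicks = sum(s["clicks"] for s in sessions)
    converting = sum(1 for s in sessions if s["purchases"] > 0)
    revenue = sum(s["revenue"] for s in sessions)
    cart_adds = sum(s["cart_adds"] for s in sessions)
    return {
        "ctr": clicks / impressions if impressions else 0.0,
        "conversion": converting / n if n else 0.0,
        "revenue_per_session": revenue / n if n else 0.0,
        "cart_additions_per_session": cart_adds / n if n else 0.0,
    }

# Toy logs for four sessions.
sessions = [
    {"impressions": 20, "clicks": 2, "cart_adds": 1, "purchases": 1, "revenue": 40.0},
    {"impressions": 10, "clicks": 1, "cart_adds": 0, "purchases": 0, "revenue": 0.0},
    {"impressions": 10, "clicks": 1, "cart_adds": 2, "purchases": 0, "revenue": 0.0},
    {"impressions": 10, "clicks": 0, "cart_adds": 0, "purchases": 1, "revenue": 20.0},
]
metrics = online_metrics(sessions)
print(metrics)
```

Note that revenue per session divides by all sessions, including those without a purchase, which is what distinguishes it from average order value.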
Architecture
Candidate Generation
Collaborative Filtering
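Item-based CF: Find items that are frequently purchased together. Build a co-purchase matrix that counts how often each pair of items appears in the same transaction, then, for a given item, retrieve the top-N items with the highest co-purchase count. Raw counts favor popular items, so they are often normalized (for example, with cosine or Jaccard similarity).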
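A minimal sketch of the co-purchase approach, using raw counts and an in-memory dictionary. The function names and the sample baskets are illustrative assumptions; a production system would typically compute this in a batch job over the full order history.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_co_purchase(transactions):
    """Count how often each pair of items appears in the same transaction."""
    co_counts = defaultdict(Counter)
    for basket in transactions:
        # Deduplicate within a basket so quantity does not inflate the count.
        for a, b in combinations(sorted(set(basket)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts

def top_n_similar(co_counts, item, n):
    """Return the n items most often bought together with `item`."""
    neighbors = co_counts.get(item, Counter())
    return [other for other, _ in neighbors.most_common(n)]

transactions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
    ["case", "screen protector"],
    ["phone", "case"],
]
co_counts = build_co_purchase(transactions)
print(top_n_similar(co_counts, "phone", n=2))
```

Here "case" co-occurs with "phone" three times and "charger" twice, so they are returned in that order.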
Two-Tower Model
A neural network approach for candidate retrieval:
| Tower | Architecture | Output |
|---|---|---|
| User tower | Dense(256) -> ReLU -> Dense(128) | 128-dim user embedding |
| Item tower | Dense(256) -> ReLU -> Dense(128) | 128-dim item embedding |
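Compute relevance as the dot product of the user and item embeddings. Because the item tower does not depend on the user, item embeddings can be pre-computed offline and stored in a nearest-neighbor index, so only the user tower has to run at request time.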
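The retrieval step can be sketched in plain Python as follows. Several things here are simplifying assumptions: the weights are random rather than trained, the layer sizes are tiny instead of 256 and 128, the feature vectors are made up, and the search is brute force where a real system would use an approximate nearest-neighbor index.

```python
import random

def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def init_layer(rng, n_in, n_out):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def make_tower(rng, n_in, hidden, out):
    return [init_layer(rng, n_in, hidden), init_layer(rng, hidden, out)]

def embed(tower, features):
    """Dense -> ReLU -> Dense, mirroring the tower layout in the table."""
    (w1, b1), (w2, b2) = tower
    return dense(relu(dense(features, w1, b1)), w2, b2)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve(user_embedding, item_index, k):
    """Brute-force top-k by dot product over pre-computed item embeddings."""
    scored = sorted(item_index.items(),
                    key=lambda kv: dot(user_embedding, kv[1]), reverse=True)
    return [item_id for item_id, _ in scored[:k]]

rng = random.Random(0)
user_tower = make_tower(rng, n_in=4, hidden=8, out=3)
item_tower = make_tower(rng, n_in=5, hidden=8, out=3)

# Offline: embed every item once and store the result.
item_features = {
    "sku-1": [1.0, 0.0, 0.3, 0.8, 0.1],
    "sku-2": [0.0, 1.0, 0.9, 0.2, 0.4],
    "sku-3": [0.5, 0.5, 0.1, 0.1, 0.9],
    "sku-4": [0.2, 0.7, 0.6, 0.4, 0.0],
}
item_index = {sku: embed(item_tower, f) for sku, f in item_features.items()}

# Online: embed the user and look up the nearest items.
user_embedding = embed(user_tower, [1.0, 0.0, 0.5, 0.2])
top = retrieve(user_embedding, item_index, k=2)
print(top)
```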
Features
User Features
| Feature | Description |
|---|---|
| Purchase history | Past items bought |
| Browse history | Items viewed |
| Cart history | Items added to cart |
| Search queries | Recent searches |
| Demographics | Age, location |
| Price sensitivity | Average purchase price |
Item Features
| Feature | Description |
|---|---|
| Category | Product category hierarchy |
| Brand | Brand information |
| Price | Current price |
| Rating | Average review rating |
| Visual embedding | Product image features |
| Text embedding | Title/description |
| Popularity | Sales velocity |
Context Features
| Feature | Description |
|---|---|
| Current page | Homepage, PDP, cart |
| Time of day | Morning, evening |
| Device | Mobile, desktop |
| Season | Holiday, back-to-school |
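Categorical context features like these are commonly one-hot encoded before being fed to a model. The sketch below assumes small, fixed vocabularies taken from the example values in the table; the vocabulary lists and function names are illustrative, and unknown values simply encode as all zeros.

```python
# Hypothetical vocabularies, based on the example values in the table.
PAGES = ["homepage", "pdp", "cart"]
TIMES_OF_DAY = ["morning", "evening"]
DEVICES = ["mobile", "desktop"]
SEASONS = ["holiday", "back-to-school"]

def one_hot(value, vocabulary):
    """1.0 at the position of `value`, 0.0 elsewhere (all zeros if unknown)."""
    return [1.0 if value == v else 0.0 for v in vocabulary]

def encode_context(page, time_of_day, device, season):
    """Concatenate the one-hot encodings into a single context vector."""
    return (one_hot(page, PAGES)
            + one_hot(time_of_day, TIMES_OF_DAY)
            + one_hot(device, DEVICES)
            + one_hot(season, SEASONS))

context_vector = encode_context("pdp", "evening", "mobile", "holiday")
print(context_vector)
```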
Ranking Model
A deep neural network to predict purchase probability:
| Layer | Configuration |
|---|---|
| Input | Concatenated user, item, and context features |
| Hidden 1 | Dense(256) -> ReLU -> Dropout(0.2) |
| Hidden 2 | Dense(128) -> ReLU |
| Output | Dense(1) -> Sigmoid |
The model predicts P(purchase | user, item, context).
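The forward pass of such a network at inference time might look like the sketch below. The weights are random rather than trained, the layer sizes are shrunk for readability, and the feature vectors are made up; dropout is omitted because it is only active during training.

```python
import math
import random

def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_layer(rng, n_in, n_out):
    """Random (untrained) weights and zero biases, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def predict_purchase_probability(params, user, item, context):
    """Input -> Hidden 1 -> Hidden 2 -> sigmoid output, as in the table."""
    x = user + item + context              # concatenate the feature vectors
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(dense(x, w1, b1))            # dropout is a no-op at inference
    h2 = relu(dense(h1, w2, b2))
    return sigmoid(dense(h2, w3, b3)[0])

rng = random.Random(42)
params = [init_layer(rng, 6, 8), init_layer(rng, 8, 4), init_layer(rng, 4, 1)]

p = predict_purchase_probability(
    params,
    user=[0.3, 1.0],      # hypothetical user features
    item=[0.8, 0.1],      # hypothetical item features
    context=[1.0, 0.0],   # hypothetical context features
)
print(p)
```

The sigmoid keeps the output strictly between 0 and 1, so it can be read as a probability once the model is trained with a binary cross-entropy loss on purchase labels.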
Serving
Recommendation serving process:
- Candidate generation: Collect candidates from multiple sources (collaborative filtering, popular items, two-tower model)
- Feature retrieval: Fetch user features and item features for all candidates
- Scoring: Apply the ranking model to score each candidate
- Ranking: Sort candidates by predicted score in descending order
- Business rules: Filter for in-stock items, apply diversity constraints, insert sponsored items
- Return: Top N items after filtering
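The steps above can be sketched as a single function. The candidate sources, feature lookup, scoring function, and stock check are passed in as callables; all of them are hypothetical stand-ins for real services, and only the in-stock rule is applied here to keep the example short.

```python
def recommend(user_id, candidate_sources, get_features, score, in_stock, n):
    """Run the serving pipeline for one request and return the top n item ids."""
    # 1. Candidate generation: merge sources, dropping duplicates.
    candidates, seen = [], set()
    for source in candidate_sources:
        for item in source(user_id):
            if item not in seen:
                seen.add(item)
                candidates.append(item)

    # 2-3. Feature retrieval and scoring for each candidate.
    scored = [(score(get_features(user_id, item)), item) for item in candidates]

    # 4. Ranking: highest predicted score first.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    # 5. Business rules (only the in-stock filter in this sketch).
    filtered = [item for _, item in scored if in_stock(item)]

    # 6. Return the top n after filtering.
    return filtered[:n]

# Stub components standing in for real services.
def cf_source(user_id):
    return ["a", "b"]

def popular_source(user_id):
    return ["b", "c", "d"]

fake_scores = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.7}

result = recommend(
    "user-1",
    candidate_sources=[cf_source, popular_source],
    get_features=lambda user_id, item: {"item": item},
    score=lambda features: fake_scores[features["item"]],
    in_stock=lambda item: item != "d",
    n=2,
)
print(result)
```

Filtering after ranking, rather than before, means the pipeline may need to score more candidates than it returns; in a real system the candidate count is tuned so that enough items survive the business rules.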
Cold Start
New Users
| Approach | Description |
|---|---|
| Popular items | Recommend best-sellers |
| Category browsing | Use current page context |
| Preference questionnaire | Quick preference survey |
| Demographic-based | Age/location-based recommendations |
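One simple way to combine the first two approaches is to start from best-sellers and move items from the category the user is currently browsing to the front. This is only a sketch under that assumption; the function and its inputs are hypothetical.

```python
def new_user_recommendations(best_sellers, item_category, current_category=None, n=5):
    """Best-sellers, with items from the category being browsed listed first."""
    if current_category is None:
        return best_sellers[:n]
    in_category = [i for i in best_sellers
                   if item_category.get(i) == current_category]
    others = [i for i in best_sellers
              if item_category.get(i) != current_category]
    return (in_category + others)[:n]

best_sellers = ["tv", "novel", "laptop", "cookbook"]  # already sorted by sales
item_category = {"tv": "electronics", "novel": "books",
                 "laptop": "electronics", "cookbook": "books"}

print(new_user_recommendations(best_sellers, item_category, "books", n=3))
```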
New Items
| Approach | Description |
|---|---|
| Content-based similarity | Match to existing items |
| Category placement | Recommend in relevant categories |
| New arrivals boost | Give exposure to new items |
| A/B test exposure | Random exposure for data collection |
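Content-based similarity for a new item can be sketched as a cosine-similarity lookup over content embeddings (for example, the text or visual embeddings listed under item features). The two-dimensional vectors and item names below are made up for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar_existing(new_embedding, catalog_embeddings, k):
    """Return the k existing items whose content embedding is closest."""
    ranked = sorted(catalog_embeddings,
                    key=lambda item: cosine(new_embedding, catalog_embeddings[item]),
                    reverse=True)
    return ranked[:k]

catalog_embeddings = {
    "x": [1.0, 0.0],
    "y": [0.0, 1.0],
    "z": [0.7, 0.7],
}
new_item_embedding = [1.0, 0.1]
print(most_similar_existing(new_item_embedding, catalog_embeddings, k=2))
```

The new item can then borrow signals from its nearest neighbors, such as their co-purchase lists, until it has interaction data of its own.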
Business Rules
After ML ranking, apply business logic:
| Rule | Purpose |
|---|---|
| Category diversity | Prevent recommendations from being dominated by one category |
| In-stock filter | Remove items that are out of stock |
| Margin boost | Boost higher-margin items when relevance is similar |
| Sponsored insertion | Insert sponsored items at designated slots |
The ML model ranks by relevance and does not have access to inventory or margin data, so business rules enforce these constraints after ranking.
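A possible post-ranking pass that applies the four rules in the table is sketched below. The specifics are assumptions made for illustration: "similar relevance" is approximated by grouping scores into buckets of width `tolerance`, diversity is a per-category cap, and sponsored slots are given as a mapping from list position to item id.

```python
import math

def apply_business_rules(ranked, catalog, sponsored,
                         max_per_category=2, tolerance=0.05, n=4):
    """ranked: list of (item_id, score); catalog: item_id -> metadata dict."""
    # In-stock filter: drop unavailable items.
    available = [(i, s) for i, s in ranked if catalog[i]["in_stock"]]

    # Margin boost: within the same score bucket, prefer the higher margin.
    available.sort(
        key=lambda p: (math.floor(p[1] / tolerance), catalog[p[0]]["margin"]),
        reverse=True,
    )

    # Category diversity: cap the number of items per category.
    per_category, organic = {}, []
    for item, _ in available:
        category = catalog[item]["category"]
        if per_category.get(category, 0) < max_per_category:
            per_category[category] = per_category.get(category, 0) + 1
            organic.append(item)

    # Sponsored insertion: place sponsored items at their designated slots.
    result = list(organic)
    for slot, item in sorted(sponsored.items()):
        result.insert(min(slot, len(result)), item)
    return result[:n]

catalog = {
    "a": {"category": "audio", "in_stock": True, "margin": 0.10},
    "b": {"category": "audio", "in_stock": True, "margin": 0.30},
    "c": {"category": "audio", "in_stock": True, "margin": 0.20},
    "d": {"category": "video", "in_stock": False, "margin": 0.40},
    "e": {"category": "video", "in_stock": True, "margin": 0.15},
}
ranked = [("a", 0.93), ("b", 0.91), ("c", 0.72), ("d", 0.70), ("e", 0.56)]
final = apply_business_rules(ranked, catalog, sponsored={1: "ad-1"})
print(final)
```

In this example "d" is removed because it is out of stock, "b" moves ahead of "a" because their scores fall in the same bucket and "b" has the higher margin, "c" is dropped by the per-category cap, and the sponsored item lands in slot 1.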
Reference
| Topic | Description |
|---|---|
| Inventory constraints | ML model ranks by relevance. Business rules filter unavailable items. |
| Margin optimization | Boost higher-margin items when relevance is similar. |
| Category diversity | Prevent recommendations from being dominated by one category. |
| Sponsored products | Insert at designated slots without competing with organic recommendations. |