Design a Spotify Recommendation System

Design a machine learning system to recommend songs and playlists on Spotify.

Requirements

Functional:

  • Recommend songs for the "Discover Weekly" playlist
  • Recommend songs for radio stations
  • Real-time "next song" recommendations
  • Personalized home page recommendations

Non-functional:

  • Generate Discover Weekly playlists for 400M+ users every week
  • Serve real-time recommendations in under 100ms
  • Handle a catalog of 80M+ songs

Metrics

Offline Metrics

| Metric | Description |
|---|---|
| Recall@K | Fraction of user's future listens in top K |
| NDCG | Ranking quality of recommendations |
| Coverage | Percentage of catalog recommended |
| Diversity | Variety in recommendations |
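
The two ranking metrics in the table can be made concrete with a minimal sketch. It assumes binary relevance (a song is relevant if the user listened to it in the held-out period); the function names are illustrative, not taken from any particular library.

```python
import math

def recall_at_k(recommended, relevant, k):
    """Fraction of the relevant (future-listened) songs that appear in the top k."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(recommended[:k]) & relevant) / len(relevant)

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    relevant = set(relevant)
    # Positions are 0-based here, so the discount is log2(position + 2).
    dcg = sum(1.0 / math.log2(pos + 2)
              for pos, song in enumerate(recommended[:k]) if song in relevant)
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

recommended = ["a", "b", "c", "d"]      # model output, best first
future_listens = {"b", "d", "z"}        # held-out listens
print(recall_at_k(recommended, future_listens, k=4))
print(ndcg_at_k(recommended, future_listens, k=4))
```

Recall@K ignores order within the top K, while NDCG rewards placing relevant songs earlier, so the two are usually reported together.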

Online Metrics

| Metric | Description |
|---|---|
| Stream Rate | Songs played / songs recommended |
| Skip Rate | Fraction of played songs skipped within 30 seconds |
| Save Rate | Fraction of played songs added to the library |
| Listen Time | Total listening duration |

Business Metrics

  • Monthly Active Users (MAU)
  • Premium conversion rate
  • Session duration
  • Artist discovery rate

Architecture

[Diagram: system architecture]

Candidate Generation

Approach 1: Collaborative Filtering

Users with similar listening histories tend to like the same new songs.

Matrix Factorization: Decompose the user-song interaction matrix R into user factors (U) and song factors (V), so that R ≈ U Vᵀ.

To recommend for user i:

  1. Get user embedding U[i] (128-dim vector)
  2. Compute dot product with all song embeddings V
  3. Return top-N highest scoring songs
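
The three steps above can be sketched with NumPy. This is a brute-force version on toy random factors, assuming U and V have already been learned; the `exclude` parameter (for songs the user has already heard) is an addition for illustration.

```python
import numpy as np

def recommend_top_n(user_vec, song_matrix, n, exclude=()):
    """Score every song by dot product with the user vector; return top-n indices."""
    scores = (song_matrix @ user_vec).astype(float)   # shape: (num_songs,)
    if len(exclude):
        scores[list(exclude)] = -np.inf               # never recommend these
    n = min(n, scores.shape[0])
    top = np.argpartition(-scores, n - 1)[:n]         # unordered top-n, linear time
    return top[np.argsort(-scores[top])]              # sort only the n winners

rng = np.random.default_rng(0)
U = rng.normal(size=(100, 128))     # toy user factors, one 128-dim row per user
V = rng.normal(size=(2_000, 128))   # toy song factors
print(recommend_top_n(U[0], V, n=10))
```

Scoring every song is linear in catalog size, which is acceptable for a batch job but is the reason the serving path relies on an approximate nearest neighbor index instead.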

Approach 2: Content-Based Filtering

Recommend songs similar to those the user already likes.

Audio Features:

  • Tempo, key, loudness
  • Danceability, energy, valence
  • Audio embeddings from neural networks

Metadata Features:

  • Genre, artist, album
  • Release year
  • Lyrics embeddings
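
One simple way to use these features is to average the vectors of songs a user likes into a taste profile and rank candidates by cosine similarity to it. The sketch below assumes every song is already represented as a numeric, non-zero vector with features scaled to comparable ranges (raw tempo in BPM would otherwise dominate 0-1 features such as energy); the function names are illustrative.

```python
import numpy as np

def taste_profile(liked_vectors):
    """Average the L2-normalized feature vectors of songs the user liked."""
    liked = np.asarray(liked_vectors, dtype=float)
    liked = liked / np.linalg.norm(liked, axis=1, keepdims=True)
    return liked.mean(axis=0)

def rank_by_similarity(profile, candidates):
    """Return (candidate indices best first, cosine similarities)."""
    cands = np.asarray(candidates, dtype=float)
    sims = cands @ profile / (np.linalg.norm(cands, axis=1) * np.linalg.norm(profile))
    return np.argsort(-sims), sims

# Toy 3-dim vectors: [energy, danceability, valence], each in [0, 1].
liked = [[0.9, 0.8, 0.7], [0.8, 0.9, 0.6]]
candidates = [[0.1, 0.2, 0.9], [0.85, 0.85, 0.65], [0.5, 0.5, 0.5]]
order, sims = rank_by_similarity(taste_profile(liked), candidates)
print(order, np.round(sims, 3))
```

Because it needs no interaction data for the candidate songs, the same scoring also works for tracks that were released yesterday.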

Approach 3: Two-Tower Model

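A neural network approach to candidate retrieval: one tower encodes user and context features, the other encodes song features, and candidates are retrieved by similarity between the two embeddings.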

[Diagram: two-tower model]
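
A minimal forward-pass sketch of a two-tower model in NumPy follows. The layer sizes, the input dimensions, and the in-batch softmax loss (each user's listened song is the positive, other songs in the batch act as negatives) are assumptions chosen for illustration; the source does not specify them, and a real system would train the towers with an autodiff framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_tower(in_dim, hidden_dim, out_dim):
    """Randomly initialized two-layer MLP parameters (hypothetical sizes)."""
    return {
        "W1": rng.normal(scale=0.1, size=(in_dim, hidden_dim)), "b1": np.zeros(hidden_dim),
        "W2": rng.normal(scale=0.1, size=(hidden_dim, out_dim)), "b2": np.zeros(out_dim),
    }

def tower(params, x):
    """Map raw features to unit-length embeddings."""
    h = np.maximum(0.0, x @ params["W1"] + params["b1"])          # ReLU
    z = h @ params["W2"] + params["b2"]
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)

def in_batch_softmax_loss(user_emb, song_emb, temperature=0.1):
    """Row i's positive is song i; all other songs in the batch are negatives."""
    logits = user_emb @ song_emb.T / temperature                  # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)           # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

user_tower = init_tower(in_dim=32, hidden_dim=64, out_dim=128)
song_tower = init_tower(in_dim=48, hidden_dim=64, out_dim=128)
user_emb = tower(user_tower, rng.normal(size=(8, 32)))   # 8 toy users
song_emb = tower(song_tower, rng.normal(size=(8, 48)))   # their 8 listened songs
print(in_batch_softmax_loss(user_emb, song_emb))
```

Since the two towers only interact through a dot product, song embeddings can be precomputed and indexed offline, and only the user tower has to run at request time.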

Feature Engineering

User Features

| Feature | Description |
|---|---|
| listening_history | Sequence of recent tracks |
| top_artists | Most listened artists |
| top_genres | Most listened genres |
| listening_time_distribution | When they listen |
| skip_rate | How often they skip |
| playlist_creation_behavior | Playlists they've made |
| premium_status | Subscription tier |

Song Features

| Feature | Description |
|---|---|
| audio_features | Spotify's audio analysis (tempo, energy, etc.) |
| audio_embedding | Neural network embedding of audio |
| artist_embedding | Artist representation |
| genre_embedding | Genre representation |
| popularity | Global and regional popularity |
| release_date | When released |
| lyrics_embedding | NLP embedding of lyrics |

Context Features

| Feature | Description |
|---|---|
| time_of_day | Morning, afternoon, evening, night |
| day_of_week | Weekday vs weekend |
| device_type | Phone, desktop, smart speaker |
| activity_context | Workout, focus, party (if available) |

Model Architecture

Discover Weekly Pipeline

Weekly batch job to generate personalized playlists.

| Step | Description |
|---|---|
| 1 | Candidate Generation: 1000 songs from CF neighbors, content-based similar songs, trending in user's genres |
| 2 | Ranking Model: Predict P(stream), P(save), P(skip). Combine into final score. |
| 3 | Diversity Optimization: Ensure artist diversity, mix familiar and new, genre balance |
| 4 | Quality Filters: Remove explicit if preference set, remove recently played, remove disliked artists |
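
Steps 2 to 4 can be sketched as a single greedy pass. The score weights, the per-artist cap, and the candidate dictionary keys below are hypothetical choices for illustration; the source only says the three predictions are combined and that artist diversity is enforced.

```python
def final_score(p_stream, p_save, p_skip, w_stream=1.0, w_save=2.0, w_skip=1.5):
    """Combine predicted probabilities; skips count against a song (weights assumed)."""
    return w_stream * p_stream + w_save * p_save - w_skip * p_skip

def build_playlist(candidates, size=30, max_per_artist=2, excluded_tracks=frozenset()):
    """Greedy selection: best score first, with an artist cap and an exclusion filter."""
    ranked = sorted(
        candidates,
        key=lambda c: final_score(c["p_stream"], c["p_save"], c["p_skip"]),
        reverse=True,
    )
    per_artist, playlist = {}, []
    for c in ranked:
        if c["track_id"] in excluded_tracks:                    # quality filter
            continue
        if per_artist.get(c["artist_id"], 0) >= max_per_artist:  # diversity cap
            continue
        playlist.append(c["track_id"])
        per_artist[c["artist_id"]] = per_artist.get(c["artist_id"], 0) + 1
        if len(playlist) == size:
            break
    return playlist

candidates = [
    {"track_id": "a1", "artist_id": "A", "p_stream": 0.9, "p_save": 0.5, "p_skip": 0.1},
    {"track_id": "a2", "artist_id": "A", "p_stream": 0.8, "p_save": 0.4, "p_skip": 0.1},
    {"track_id": "a3", "artist_id": "A", "p_stream": 0.7, "p_save": 0.3, "p_skip": 0.1},
    {"track_id": "b1", "artist_id": "B", "p_stream": 0.5, "p_save": 0.1, "p_skip": 0.2},
]
print(build_playlist(candidates, size=3))
```

In this toy run the third song by artist A outscores the song by artist B but is dropped by the cap, which is the diversity trade-off in miniature.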

Real-time Radio Recommendations

[Diagram: real-time radio recommendation flow]

Training

Data Collection

| Signal Type | Event | Interpretation |
|---|---|---|
| Positive | Stream complete (over 30s) | User engaged with song |
| Positive | Save to library | Strong preference |
| Positive | Add to playlist | Curated preference |
| Positive | Repeat listen | Very strong signal |
| Negative | Skip early (under 30s) | User didn't like |
| Negative | Remove from playlist | Changed preference |
| Negative | Hide song | Explicit dislike |
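
A minimal sketch of turning raw events into binary training labels, following the table above. The event schema (`type`, `seconds_played`) and the event names are hypothetical, and treating a play of exactly 30 seconds as positive is an assumption, since the table only covers "over" and "under" 30s.

```python
POSITIVE_EVENTS = {"save_to_library", "add_to_playlist", "repeat_listen"}
NEGATIVE_EVENTS = {"remove_from_playlist", "hide_song"}
STREAM_THRESHOLD_SECONDS = 30

def label_event(event):
    """Return 1 for a positive signal, 0 for a negative one, None if unrecognized."""
    kind = event["type"]
    if kind == "play":
        # A play counts as a completed stream once it reaches the threshold,
        # otherwise it is treated as an early skip.
        return 1 if event["seconds_played"] >= STREAM_THRESHOLD_SECONDS else 0
    if kind in POSITIVE_EVENTS:
        return 1
    if kind in NEGATIVE_EVENTS:
        return 0
    return None

events = [
    {"type": "play", "seconds_played": 187},
    {"type": "play", "seconds_played": 12},
    {"type": "hide_song"},
]
print([label_event(e) for e in events])
```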

Loss Function

Multi-task learning with weighted losses:

| Task | Weight | Rationale |
|---|---|---|
| Stream prediction | w1 | Primary engagement signal |
| Save prediction | w2 | Strong preference indicator |
| Skip prediction | w3 | Negative signal |
| Playlist add prediction | w4 | Curated preference |

Total Loss = w1 × stream_loss + w2 × save_loss + w3 × skip_loss + w4 × playlist_loss
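
The weighted sum can be sketched as follows, assuming each task is a binary prediction trained with cross-entropy. The weight values are placeholders; the source leaves w1 to w4 unspecified.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean binary cross-entropy, with predictions clipped away from 0 and 1."""
    y = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(p_pred, dtype=float), eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def multi_task_loss(labels, preds, weights):
    """Total loss = sum over tasks of weight * per-task loss."""
    return sum(w * binary_cross_entropy(labels[task], preds[task])
               for task, w in weights.items())

weights = {"stream": 1.0, "save": 2.0, "skip": 1.0, "playlist": 1.5}  # assumed values
labels = {"stream": [1, 0, 1], "save": [0, 0, 1], "skip": [0, 1, 0], "playlist": [0, 0, 1]}
preds = {"stream": [0.8, 0.3, 0.9], "save": [0.1, 0.1, 0.6],
         "skip": [0.2, 0.7, 0.1], "playlist": [0.1, 0.05, 0.4]}
print(multi_task_loss(labels, preds, weights))
```

In practice the weights are tuned so that rare but valuable events such as saves are not drowned out by the far more frequent stream signal.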

Handling Implicit Feedback

Listening data is implicit: a song the user has not played is not necessarily one they dislike.

Solutions:

  • Weighted matrix factorization
  • Bayesian personalized ranking (BPR)
  • Negative sampling strategies
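
As one illustration, the BPR objective for a single training triple is the negative log-sigmoid of the score gap between a song the user played and a sampled unplayed song. The sketch below pairs it with uniform negative sampling, which is the simplest strategy and an assumption here; popularity-weighted sampling is a common alternative.

```python
import numpy as np

def bpr_loss(user_vec, pos_song_vec, neg_song_vec):
    """-log(sigmoid(score_pos - score_neg)), computed in a numerically stable way."""
    diff = user_vec @ pos_song_vec - user_vec @ neg_song_vec
    return float(np.logaddexp(0.0, -diff))

def sample_negative(num_songs, played, rng):
    """Uniformly sample a song index the user has not played."""
    if len(played) >= num_songs:
        raise ValueError("user has played every song; no negative available")
    while True:
        j = int(rng.integers(num_songs))
        if j not in played:
            return j

rng = np.random.default_rng(0)
V = rng.normal(size=(50, 16))   # toy song factors
u = rng.normal(size=16)         # toy user factors
played = {3, 7, 11}
neg = sample_negative(len(V), played, rng)
print(bpr_loss(u, V[3], V[neg]))
```

The loss only asks that played songs rank above unplayed ones, which matches the observation that an unplayed song is an unknown rather than a dislike.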

Cold Start Problem

New Users

| Approach | Description |
|---|---|
| Onboarding survey | Ask for favorite artists/genres |
| Popular items | Recommend globally popular songs |
| Demographic similarity | Use age- and location-based recommendations |
| Quick learning | Rapidly update from first few interactions |

New Songs

| Approach | Description |
|---|---|
| Content-based | Use audio features and metadata |
| Artist fans | Recommend to fans of the artist |
| Editorial playlists | Human curation for initial exposure |
| Exploration budget | Allocate slots for new content |
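
The exploration budget can be sketched as reserving a fixed fraction of playlist slots for new songs. The 10% default and the choice to append exploration songs at the end are assumptions for illustration; a production system might interleave them or pick them with a bandit policy.

```python
def fill_with_exploration(ranked, new_songs, size, explore_fraction=0.1):
    """Reserve a share of the slots for new songs; fill the rest from the ranked list."""
    n_explore = min(len(new_songs), int(size * explore_fraction))
    explore = new_songs[:n_explore]
    # Skip ranked songs that are already in the exploration set to avoid duplicates.
    exploit = [s for s in ranked if s not in explore][: size - n_explore]
    return exploit + explore

ranked = [f"r{i}" for i in range(20)]   # personalized ranking, best first
new_songs = ["n0", "n1", "n2"]          # recently released, little interaction data
print(fill_with_exploration(ranked, new_songs, size=10, explore_fraction=0.2))
```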

Serving Infrastructure

Embedding Index

Use Approximate Nearest Neighbor (ANN) search for fast similarity lookup:

| Component | Configuration | Purpose |
|---|---|---|
| Index Type | Annoy / HNSW / Faiss | Trade-off: build time vs. query speed |
| Embedding Dim | 128 | Balance expressiveness vs. storage |
| Distance Metric | Angular (cosine) | Normalized similarity |
| Index Trees (Annoy only) | 100 | More trees = better recall, but a larger index and slower build |
| Query K | 100-200 | Candidates for ranking stage |

Query Flow:

  1. User embedding -> ANN index query
  2. Return top-K most similar song embeddings
  3. Latency target: under 10ms for 100 results from 50M songs
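
A back-of-envelope check on the storage side of the embedding dimension trade-off: the raw vectors alone, before any index overhead, take `num_songs × dim × bytes_per_value`. The sketch assumes uncompressed float32 values and evaluates both the 50M figure from the latency target and the 80M+ catalog from the requirements.

```python
def raw_embedding_bytes(num_songs, dim=128, bytes_per_value=4):
    """Size of the uncompressed embedding matrix (float32 by default)."""
    return num_songs * dim * bytes_per_value

for num_songs in (50_000_000, 80_000_000):
    gb = raw_embedding_bytes(num_songs) / 1e9
    print(f"{num_songs:,} songs x 128 dims x 4 bytes = {gb:.2f} GB")
# prints 25.60 GB for 50M songs and 40.96 GB for 80M songs
```

Tree or graph structures add to this, so in practice the index is often sharded or the vectors are quantized to fit in memory on serving hosts.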

Caching Strategy

| Data | Cache TTL | Update Frequency |
|---|---|---|
| User Embedding | 1 hour | Recomputed on significant activity |
| Song Embeddings | No expiry | Refreshed by the daily batch job |
| Candidate Pool | 15 minutes | Per user, invalidated on context change |
| Ranking Scores | 5 minutes | Short-lived, context-dependent |
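
The TTL and invalidation behavior in the table can be sketched with a small in-process cache. This is only an illustration of the semantics; the class name is hypothetical and a real deployment would more likely use a shared store such as Redis or Memcached.

```python
import time

class TTLCache:
    """Key-value cache whose entries expire ttl_seconds after being set."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock          # injectable so expiry can be tested without sleeping
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]    # lazily evict expired entries on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def invalidate(self, key):
        """Explicit eviction, e.g. when the user's context changes."""
        self._store.pop(key, None)

candidate_pool = TTLCache(ttl_seconds=15 * 60)   # 15 minutes, as in the table
candidate_pool.set("user:42", ["song_a", "song_b"])
print(candidate_pool.get("user:42"))
candidate_pool.invalidate("user:42")             # context changed
print(candidate_pool.get("user:42"))
```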

Monitoring

Quality Metrics

| Metric | Formula | Target |
|---|---|---|
| Stream Rate | Streams / Impressions | Over 60% |
| Skip Rate | Early Skips / Streams | Under 25% |
| Save Rate | Saves / Streams | Over 5% |
| Discovery Rate | New Artist Streams / Total | 15-30% |
| Diversity Score | Unique Artists / Total Streams | Over 40% |
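
The formulas and targets above translate directly into a small check over aggregated counts. The function and argument names are illustrative, "Total" in the discovery rate is read as total streams, and the counts in the example are made up.

```python
def quality_metrics(impressions, streams, early_skips, saves,
                    new_artist_streams, unique_artists):
    """Compute the monitoring ratios from raw event counts."""
    def ratio(num, den):
        return num / den if den else 0.0
    return {
        "stream_rate": ratio(streams, impressions),
        "skip_rate": ratio(early_skips, streams),
        "save_rate": ratio(saves, streams),
        "discovery_rate": ratio(new_artist_streams, streams),
        "diversity_score": ratio(unique_artists, streams),
    }

def within_targets(m):
    """Compare each metric against the target from the table."""
    return {
        "stream_rate": m["stream_rate"] > 0.60,
        "skip_rate": m["skip_rate"] < 0.25,
        "save_rate": m["save_rate"] > 0.05,
        "discovery_rate": 0.15 <= m["discovery_rate"] <= 0.30,
        "diversity_score": m["diversity_score"] > 0.40,
    }

metrics = quality_metrics(impressions=1000, streams=650, early_skips=130,
                          saves=39, new_artist_streams=130, unique_artists=290)
print(metrics)
print(within_targets(metrics))
```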

A/B Testing

  • Test new models against production
  • Segment by user type (new, casual, power users)
  • Monitor for novelty effects
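
A minimal sketch of comparing stream rate between a control and a treatment arm with a two-proportion z-test. It treats every impression as independent, which is a simplifying assumption; impressions from the same user are correlated, so a real analysis would usually aggregate per user first. The counts are made up.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference in rates, B minus A."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail probability
    return z, p_value

# Control: 6,000 streams from 10,000 impressions; treatment: 6,200 from 10,000.
z, p = two_proportion_z_test(6000, 10_000, 6200, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Running the same test separately per user segment, and again after a few weeks, helps separate a durable improvement from a novelty effect.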

Reference

| Topic | Description |
|---|---|
| Exploitation vs exploration | Playing safe favorites keeps satisfaction stable. Discovery keeps the experience fresh. |
| Personalization depth | Too personalized feels stale. Too random feels irrelevant. |
| Computation trade-off | Real-time is fresher but more expensive. Batch is efficient but stale. |
| Popularity bias | Popular songs are safe bets. Long-tail discovery differentiates the platform. |