Design YouTube

Design a video streaming platform that lets users upload, watch, and share videos.

Related Concepts: Video Encoding/Transcoding (FFmpeg), Adaptive Bitrate Streaming (HLS/DASH), CDN Distribution, Blob Storage (S3), Asynchronous Processing (Message Queue), Chunked Upload, Thumbnail Generation, Metadata Indexing

Step 1: Requirements and Scope

Functional Requirements

  • Users can upload videos
  • Users can watch videos (streaming playback)
  • Users can search for videos
  • Users can like, comment on, and share videos
  • Support multiple video qualities (360p, 720p, 1080p, 4K)
  • Recommendations (optional)

Non-Functional Requirements

| Requirement | Target | Rationale |
| --- | --- | --- |
| Latency | < 200 ms start time | User experience |
| Availability | 99.99% | Core feature |
| Consistency | Eventual for metadata | Video data is immutable |
| Durability | No video loss | Content is valuable |

Scale Estimation

YouTube-scale numbers:

  • 2 billion monthly active users
  • 500 hours of video uploaded per minute
  • 1 billion video views per day
  • Average video length: 5 minutes
  • Storage formats: 360p, 720p, 1080p, 4K

Upload bandwidth:

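  • 500 hours/minute x 60 minutes = 30,000 hours of video uploaded per hour
  • At 720p (1.5 Mbps): 1.5 Mb/s x 3,600 s = 5,400 Mb, or ~0.675 GB per hour of video (megabits divided by 8 give megabytes)
  • Total: 30,000 x 0.675 GB = ~20 TB uploaded per hour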

Storage (rough):

  • Store multiple resolutions per video
  • 1080p video: ~3 GB/hour
  • With multiple formats: ~10 GB/hour per video
  • Growth at these rates: 30,000 video-hours/hour x 8,760 hours/year x 10 GB = ~2.6 EB per year, before deduplication or tiering
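
The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the inputs quoted in this section (500 hours/minute, 1.5 Mbps for 720p, ~10 GB per video-hour across all formats); the constant and function names are illustrative. Note that Mbps means megabits per second, so the conversion to bytes divides by 8.

```python
# Back-of-envelope estimate from the figures quoted above (decimal units).
UPLOAD_HOURS_PER_MINUTE = 500      # hours of video uploaded each minute
BITRATE_720P_MBPS = 1.5            # megabits per second
ALL_FORMATS_GB_PER_HOUR = 10       # rough total across all renditions


def gb_per_video_hour(bitrate_mbps: float) -> float:
    """Convert a bitrate in megabits/second to gigabytes per hour of video."""
    megabits = bitrate_mbps * 3600
    return megabits / 8 / 1000     # megabits -> megabytes -> gigabytes


video_hours_per_hour = UPLOAD_HOURS_PER_MINUTE * 60
ingest_tb_per_hour = video_hours_per_hour * gb_per_video_hour(BITRATE_720P_MBPS) / 1000
storage_pb_per_year = video_hours_per_hour * 24 * 365 * ALL_FORMATS_GB_PER_HOUR / 1_000_000

if __name__ == "__main__":
    print(f"video-hours uploaded per hour: {video_hours_per_hour}")
    print(f"720p ingest: {ingest_tb_per_hour:.2f} TB/hour")
    print(f"encoded storage growth: {storage_pb_per_year:.0f} PB/year")
```

Changing a single input (for example, assuming a higher average upload bitrate) shows how sensitive the totals are to the assumptions.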

Step 2: High-Level Architecture

Key Components:

  • CDN: Delivers video content to users globally
  • Upload API: Accepts video uploads and lets clients resume interrupted transfers
  • Encoding Pipeline: Transcodes videos to multiple formats
  • Blob Storage: Stores raw and processed video files
  • Metadata Service: Stores and serves video metadata (title, description, processing status) and user data

Step 3: Video Upload Flow

Requirements

Users upload large files (potentially hours of 4K video). The upload must be:

  • Resumable (network failures happen)
  • Validated (no corrupt files)
  • Processed asynchronously (encoding takes time)

Upload Flow

Resumable Uploads

Large files need chunked, resumable uploads:

| Step | Action | Purpose |
| --- | --- | --- |
| 1 | Client requests upload | Get upload_id and signed URL |
| 2 | Client uploads in chunks | Typically 5-10 MB chunks |
| 3 | Server tracks progress | Each chunk acknowledged |
| 4 | On failure, resume from last chunk | Avoid restarting entire upload |
| 5 | Client confirms completion | Trigger processing |
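
The server-side bookkeeping behind these steps can be sketched as follows. This is a minimal in-memory illustration, not a production design: the `UploadSession` class, its fields, and the 8 MB chunk size are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MB, an assumed value within the 5-10 MB range


@dataclass
class UploadSession:
    """Tracks which chunks of one upload have been acknowledged."""
    upload_id: str
    total_size: int
    received: Set[int] = field(default_factory=set)

    @property
    def total_chunks(self) -> int:
        return -(-self.total_size // CHUNK_SIZE)  # ceiling division

    def acknowledge(self, index: int) -> None:
        if not 0 <= index < self.total_chunks:
            raise ValueError(f"chunk index {index} out of range")
        self.received.add(index)  # idempotent: re-sent chunks are harmless

    def missing_chunks(self) -> List[int]:
        """Chunks the client still has to send when it resumes."""
        return [i for i in range(self.total_chunks) if i not in self.received]

    def is_complete(self) -> bool:
        return len(self.received) == self.total_chunks
```

After a network failure, the client would ask for `missing_chunks()` and re-send only those, which is what makes the upload resumable.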

Pre-signed URLs

| Approach | Pros | Cons |
| --- | --- | --- |
| Upload through API | Simple | API servers become bottleneck |
| Pre-signed URL to S3 | Scalable, direct | More complex client |

Recommendation: Pre-signed URLs. Let clients upload directly to object storage.
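
To illustrate the idea behind a pre-signed URL, here is a simplified sketch using an HMAC over the object key and an expiry time. It is not the signing scheme S3 actually uses (in practice the cloud SDK generates the URL); the secret, the URL layout, and both function names are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import urlencode

SECRET = b"demo-secret"  # hypothetical signing key held by the backend


def _signature(object_key: str, expires_at: int, secret: bytes) -> str:
    payload = f"PUT\n{object_key}\n{expires_at}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sign_upload_url(base_url: str, object_key: str, expires_at: int,
                    secret: bytes = SECRET) -> str:
    """Return a URL that authorizes one PUT to object_key until expires_at."""
    query = urlencode({"expires": expires_at,
                       "signature": _signature(object_key, expires_at, secret)})
    return f"{base_url}/{object_key}?{query}"


def verify_upload(object_key: str, expires_at: int, signature: str, now: int,
                  secret: bytes = SECRET) -> bool:
    """Storage-side check: reject expired or tampered requests."""
    if now > expires_at:
        return False
    expected = _signature(object_key, expires_at, secret)
    return hmac.compare_digest(expected, signature)
```

The point of the design is that the API server only signs a short string; the video bytes flow from the client straight to storage.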

Step 4: Video Encoding Pipeline

Purpose

Users upload videos in various formats (MP4, MOV, AVI, MKV...) and resolutions. The system must:

  1. Normalize to standard formats
  2. Create multiple quality levels
  3. Optimize for streaming

Encoding Pipeline

Output Formats

| Resolution | Bitrate | Use Case |
| --- | --- | --- |
| 360p | 0.4 Mbps | Mobile data saver |
| 720p | 1.5 Mbps | Standard mobile |
| 1080p | 4 Mbps | Desktop, good connection |
| 4K | 15 Mbps | High-end devices |
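
As a rough illustration, the ladder above could be turned into FFmpeg invocations like this. The sketch only builds the argument list and does not run FFmpeg. The H.264/AAC codecs, the 6-second segment length, and the output path layout are assumptions, not something this design prescribes.

```python
from typing import List

# (target height in pixels, video bitrate) per rendition, from the table above
LADDER = {
    "360p": (360, "400k"),
    "720p": (720, "1500k"),
    "1080p": (1080, "4000k"),
    "4K": (2160, "15000k"),
}


def ffmpeg_hls_command(source: str, rendition: str, out_dir: str) -> List[str]:
    """Build an FFmpeg command that encodes one rendition as an HLS playlist."""
    height, bitrate = LADDER[rendition]
    return [
        "ffmpeg", "-i", source,
        "-vf", f"scale=-2:{height}",        # keep aspect ratio, even width
        "-c:v", "libx264", "-b:v", bitrate,
        "-c:a", "aac",
        "-f", "hls",
        "-hls_time", "6",                   # assumed segment length in seconds
        "-hls_playlist_type", "vod",
        f"{out_dir}/{rendition}/playlist.m3u8",
    ]
```

A worker could pass the returned list to `subprocess.run` once per rendition.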

Parallel Encoding

A 10-minute video takes ~10 minutes to encode sequentially, assuming the encoder runs at roughly real-time speed.

Solution: Segment-based parallel encoding


| Approach | Time for 10 min video | Workers |
| --- | --- | --- |
| Sequential | ~10 minutes | 1 |
| Parallel (5 workers) | ~2 minutes | 5 |
| Parallel (10 workers) | ~1 minute | 10 |
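
The split, encode, and reassemble pattern can be sketched with a worker pool. In this illustration `encode_segment` is a placeholder that only returns a file name; a real worker would invoke the transcoder, and real splits would need to fall on keyframe boundaries. All names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Segment = Tuple[int, int]  # (start_second, end_second)


def split_into_segments(duration_s: int, segment_s: int) -> List[Segment]:
    """Return (start, end) pairs that cover the whole video."""
    return [(start, min(start + segment_s, duration_s))
            for start in range(0, duration_s, segment_s)]


def encode_segment(segment: Segment) -> str:
    """Placeholder for the real transcoding step."""
    start, end = segment
    return f"segment_{start:05d}_{end:05d}.ts"


def encode_parallel(duration_s: int, segment_s: int, workers: int) -> List[str]:
    segments = split_into_segments(duration_s, segment_s)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, so the encoded
        # segments can be stitched back together without re-sorting.
        return list(pool.map(encode_segment, segments))
```

With ten 60-second segments and ten workers, each worker handles about one minute of video, which is where the table's ~1 minute estimate comes from (ignoring split and merge overhead).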

Encoding Infrastructure

| Option | Pros | Cons |
| --- | --- | --- |
| Self-managed EC2/GCE | Full control | Ops overhead |
| AWS Elastic Transcoder | Managed, scalable | Cost at scale |
| Custom encoding farm | Optimized for workload | Complex |

YouTube's approach: Custom encoding infrastructure, scheduled on Google's internal cluster manager (Borg), for cost efficiency at scale.

Step 5: Video Storage

Storage Tiers

Not all videos are accessed equally. Optimize storage costs:

| Tier | Storage Type | Access Pattern | Cost |
| --- | --- | --- | --- |
| Hot | SSD / Standard S3 | Frequent (popular videos) | $$$ |
| Warm | HDD / S3 IA | Occasional | $$ |
| Cold | Glacier / Archive | Rare (old videos) | $ |
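
A tiering policy can be as simple as a rule on recent access. The thresholds below (1,000 views in 30 days, 90 days since the last view) are made-up values for illustration only; a real policy would be tuned against storage and retrieval costs.

```python
def choose_tier(views_last_30d: int, days_since_last_view: int) -> str:
    """Map a video's recent access pattern to a storage tier."""
    if views_last_30d >= 1000:          # assumed "popular" threshold
        return "hot"
    if days_since_last_view <= 90:      # assumed "occasionally watched" window
        return "warm"
    return "cold"
```

A periodic job could run this over access statistics and move objects whose tier has changed.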

Storage Organization

Videos are organized in a hierarchical folder structure. At the top level, each video has a folder identified by its video_id. Within each video folder:

  • A raw subfolder contains the original uploaded file
  • An encoded subfolder contains subfolders for each resolution (360p, 720p, 1080p), with each resolution folder holding numbered segment files plus the HLS/DASH manifest file
  • A thumbnails subfolder contains the default thumbnail and preview frames
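
In object storage this hierarchy is usually expressed as key prefixes rather than real folders. A minimal sketch of key builders for the layout described above; the file names (`segment_00001.ts`, `playlist.m3u8`, `default.jpg`) are assumptions.

```python
def raw_key(video_id: str, filename: str) -> str:
    return f"{video_id}/raw/{filename}"


def segment_key(video_id: str, resolution: str, index: int) -> str:
    return f"{video_id}/encoded/{resolution}/segment_{index:05d}.ts"


def manifest_key(video_id: str, resolution: str) -> str:
    return f"{video_id}/encoded/{resolution}/playlist.m3u8"


def thumbnail_key(video_id: str, name: str = "default.jpg") -> str:
    return f"{video_id}/thumbnails/{name}"
```

Keeping key construction in one place means the encoder, the CDN origin, and cleanup jobs all agree on where a given file lives.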

Content Addressing

Use content-addressed storage for deduplication:

| Approach | How It Works | Savings |
| --- | --- | --- |
| Video-level | Hash entire video | Low (slight re-encodes differ) |
| Segment-level | Hash each segment | Medium (common intros/outros) |
| Block-level | Hash small blocks | High (requires more compute) |
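
Segment-level content addressing can be sketched with a hash-keyed store: identical segments map to the same digest and are stored once. The `SegmentStore` class is an in-memory stand-in for blob storage, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from typing import Dict


class SegmentStore:
    """Stores each distinct segment once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}
        self.bytes_saved = 0

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest in self.blobs:
            self.bytes_saved += len(data)   # duplicate: nothing new to write
        else:
            self.blobs[digest] = data
        return digest                        # the video references this digest
```

Each video then keeps an ordered list of digests instead of its own copy of every segment.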

Step 6: Video Streaming

Streaming Protocols

| Protocol | How It Works | Use Case |
| --- | --- | --- |
| HLS | HTTP-based, chunks | iOS, Safari, default choice |
| DASH | HTTP-based, adaptive | Cross-platform, YouTube uses |
| RTMP | Persistent connection | Legacy, live streaming |

Adaptive Bitrate Streaming

The player automatically switches quality based on network conditions.
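
A simple throughput-based selection rule illustrates the idea. Real players also consider buffer level and switch history; this sketch only picks the highest rendition whose bitrate fits within a safety margin of the measured throughput. The 0.8 safety factor is an assumption.

```python
# (name, bitrate in Mbps), ascending, from the output format table
RENDITIONS = [("360p", 0.4), ("720p", 1.5), ("1080p", 4.0), ("4K", 15.0)]


def pick_rendition(throughput_mbps: float, safety: float = 0.8) -> str:
    """Return the highest quality that fits the measured bandwidth."""
    budget = throughput_mbps * safety
    choice = RENDITIONS[0][0]           # always fall back to the lowest quality
    for name, bitrate in RENDITIONS:
        if bitrate <= budget:
            choice = name
    return choice
```

The player would re-run this after each downloaded segment, so a drop in throughput leads to a lower rendition on the next request.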

Manifest File

The HLS manifest (m3u8 file) lists all available quality levels with their bandwidth requirements and resolutions. It contains entries for each resolution option: 360p at approximately 400 Kbps for low-bandwidth connections, 720p at 1.5 Mbps for standard quality, and 1080p at 4 Mbps for high-definition playback. Each entry points to that resolution's segment playlist, allowing the player to switch between quality levels based on network conditions.
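
A master playlist with these three variants could be generated as follows. The tags (`#EXTM3U`, `#EXT-X-STREAM-INF` with `BANDWIDTH` and `RESOLUTION`) are standard HLS; the pixel dimensions assume 16:9 video and the per-variant playlist paths are hypothetical.

```python
from typing import List, Tuple

# (name, bandwidth in bits per second, pixel resolution)
VARIANTS: List[Tuple[str, int, str]] = [
    ("360p", 400_000, "640x360"),
    ("720p", 1_500_000, "1280x720"),
    ("1080p", 4_000_000, "1920x1080"),
]


def master_playlist(variants: List[Tuple[str, int, str]] = VARIANTS) -> str:
    """Render an HLS master playlist listing one entry per quality level."""
    lines = ["#EXTM3U"]
    for name, bandwidth, resolution in variants:
        lines.append(f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(f"{name}/playlist.m3u8")   # that variant's segment playlist
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(master_playlist())
```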

Step 7: Content Delivery Network (CDN)

CDN Architecture

Without CDN: User in Tokyo requests video stored in US -> 200ms+ latency

With CDN: Video cached at Tokyo edge server -> 20ms latency

CDN Caching Strategy

| Content Type | Cache Duration | Rationale |
| --- | --- | --- |
| Popular videos | Days-weeks | Frequently accessed |
| Long-tail videos | Hours | May not be accessed again |
| Thumbnails | Weeks | Small, frequently shown |
| Manifests | Minutes | May be updated |
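
One way the origin could communicate these durations to the CDN is through `Cache-Control` headers. The specific TTL values below are illustrative picks within the ranges in the table, and the content type labels are hypothetical.

```python
TTL_SECONDS = {
    "popular_video": 7 * 24 * 3600,     # days-weeks: assume one week
    "long_tail_video": 6 * 3600,        # hours: assume six hours
    "thumbnail": 14 * 24 * 3600,        # weeks: assume two weeks
    "manifest": 5 * 60,                 # minutes: assume five minutes
}


def cache_control(content_type: str) -> str:
    """Return the Cache-Control header value for a response."""
    return f"public, max-age={TTL_SECONDS[content_type]}"
```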

Multi-CDN Strategy

YouTube uses multiple CDNs:

  • Google's private network (most traffic)
  • ISP peering (cache inside ISP networks)
  • Commercial CDNs (backup/overflow)

| Benefit | Description |
| --- | --- |
| Redundancy | CDN outage does not take down service |
| Cost optimization | Route to cheapest option |
| Performance | Choose fastest for each user |
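
A routing decision that balances these three benefits might look like the sketch below: skip unhealthy CDNs, prefer the cheapest one that meets a latency target, and otherwise fall back to the fastest. The `Cdn` fields and the 100 ms target are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cdn:
    name: str
    healthy: bool
    latency_ms: float      # measured latency to this user
    cost_per_gb: float


def choose_cdn(cdns: List[Cdn], max_latency_ms: float = 100.0) -> Cdn:
    healthy = [c for c in cdns if c.healthy]            # redundancy
    if not healthy:
        raise RuntimeError("no healthy CDN available")
    fast_enough = [c for c in healthy if c.latency_ms <= max_latency_ms]
    if fast_enough:
        return min(fast_enough, key=lambda c: c.cost_per_gb)   # cost
    return min(healthy, key=lambda c: c.latency_ms)            # performance
```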

Step 8: Metadata and Search

Metadata Schema

The videos table stores core video metadata:

  • video_id: Unique identifier (primary key)
  • user_id: Uploader's account
  • title and description: User-provided content
  • duration_seconds: Video length
  • upload_time: When the video was uploaded
  • status: Processing state (processing, ready, or failed)
  • view_count and like_count: Engagement metrics

The video_formats table tracks encoded versions with a composite primary key of video_id and resolution. Each row stores the resolution (like "720p"), bitrate, and storage path for that encoded version.
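
The two tables can be sketched as DDL. This example uses SQLite purely so it runs anywhere; the column types, the `bitrate_kbps` and `storage_path` column names, and the `CHECK` constraint are assumptions layered on the description above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE videos (
    video_id         TEXT PRIMARY KEY,
    user_id          TEXT NOT NULL,
    title            TEXT NOT NULL,
    description      TEXT,
    duration_seconds INTEGER,
    upload_time      TEXT,
    status           TEXT NOT NULL
                     CHECK (status IN ('processing', 'ready', 'failed')),
    view_count       INTEGER NOT NULL DEFAULT 0,
    like_count       INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE video_formats (
    video_id     TEXT NOT NULL REFERENCES videos(video_id),
    resolution   TEXT NOT NULL,
    bitrate_kbps INTEGER,
    storage_path TEXT,
    PRIMARY KEY (video_id, resolution)
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the metadata schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

The composite primary key guarantees at most one row per video and resolution, so re-running an encoding job cannot create duplicate format entries.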

View Count Problem

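Naive approach: UPDATE videos SET view_count = view_count + 1 WHERE video_id = ?

At YouTube scale (1B views/day, roughly 11,600 writes per second on average), running this once per view creates massive database contention, especially on rows for popular videos.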

Solution: Batch counting

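Buffer view increments in a fast in-memory store such as Redis, then periodically flush the aggregated counts to the database so each video receives one update per batch instead of one per view.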
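
A minimal sketch of batch counting, with plain in-memory counters standing in for both the Redis buffer and the `videos.view_count` column. The class and method names are hypothetical.

```python
from collections import Counter


class ViewCounter:
    def __init__(self) -> None:
        self.pending: Counter = Counter()   # stand-in for the Redis buffer
        self.stored: Counter = Counter()    # stand-in for videos.view_count

    def record_view(self, video_id: str) -> None:
        """Cheap in-memory increment; no database write per view."""
        self.pending[video_id] += 1

    def flush(self) -> int:
        """Apply one aggregated update per video and clear the buffer.

        Returns the number of database writes that were needed.
        """
        batch, self.pending = self.pending, Counter()
        for video_id, delta in batch.items():
            self.stored[video_id] += delta  # UPDATE ... SET view_count = view_count + delta
        return len(batch)
```

The trade-off is that displayed counts lag by up to one flush interval, which fits the eventual consistency target for metadata.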

Search Implementation

| Component | Technology | Purpose |
| --- | --- | --- |
| Primary search | Elasticsearch | Full-text search on titles, descriptions |
| Autocomplete | Trie/Prefix tree | Instant suggestions |
| Trending | Redis | Fast access to popular searches |
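
The autocomplete row can be illustrated with a small trie. This sketch returns suggestions in alphabetical order; a production system would more likely rank by popularity. The class names are hypothetical.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_end = False


class Autocomplete:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, phrase: str) -> None:
        node = self.root
        for ch in phrase.lower():
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def suggest(self, prefix: str, limit: int = 5) -> List[str]:
        """Return up to `limit` stored phrases that start with `prefix`."""
        prefix = prefix.lower()
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results: List[str] = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_end:
                results.append(text)
            # Push in reverse order so the smallest letter is popped first.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results
```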

Step 9: Handling Failures

Upload Failures

| Failure | Detection | Recovery |
| --- | --- | --- |
| Network timeout | Client timeout | Resume from last chunk |
| Corrupt chunk | Checksum mismatch | Retry chunk upload |
| Storage failure | Write error | Retry to different region |
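
Checksum-based detection and retry for a single chunk can be sketched as below. The `send` callable is a hypothetical transport that uploads the chunk and returns the checksum the server computed; SHA-256 and the limit of three attempts are assumptions.

```python
import hashlib
from typing import Callable


def upload_chunk_with_retry(chunk: bytes,
                            send: Callable[[bytes], str],
                            max_attempts: int = 3) -> int:
    """Upload one chunk, retrying on checksum mismatch.

    Returns the number of attempts used; raises IOError if all fail.
    """
    expected = hashlib.sha256(chunk).hexdigest()
    for attempt in range(1, max_attempts + 1):
        if send(chunk) == expected:      # server saw the same bytes we sent
            return attempt
    raise IOError(f"chunk failed checksum after {max_attempts} attempts")
```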

Encoding Failures

| Failure | Detection | Recovery |
| --- | --- | --- |
| Worker crash | Heartbeat timeout | Re-queue segment |
| Corrupt output | Validation check | Re-encode |
| Resource exhaustion | OOM error | Smaller segments |
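
Heartbeat-based recovery for crashed workers can be sketched as a periodic sweep: any in-flight segment whose worker has not reported within the timeout goes back on the queue. The data structures and the 30-second timeout are assumptions for illustration.

```python
from collections import deque
from typing import Deque, Dict, List


def requeue_stalled(queue: Deque[str],
                    heartbeats: Dict[str, float],
                    now: float,
                    timeout_s: float = 30.0) -> List[str]:
    """Move segments with stale heartbeats back onto the work queue.

    `heartbeats` maps an in-flight segment id to the time of the last
    heartbeat from the worker encoding it.
    """
    stalled = sorted(seg for seg, seen in heartbeats.items()
                     if now - seen > timeout_s)
    for seg in stalled:
        del heartbeats[seg]      # no longer considered in flight
        queue.append(seg)        # another worker will pick it up
    return stalled
```

Because a slow worker might still finish after its segment is re-queued, segment writes should be idempotent (for example, keyed by segment index).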

Playback Failures

| Failure | Detection | Recovery |
| --- | --- | --- |
| CDN cache miss | Object not found in edge cache | Edge fetches from origin |
| Quality unavailable | Manifest lookup | Fall back to lower quality |
| Network degradation | Buffering events | Switch to lower bitrate |

Step 10: Cost Optimization

Video platforms are expensive. Key cost drivers:

| Cost Area | At YouTube Scale | Optimization |
| --- | --- | --- |
| Storage | 700+ PB | Tiered storage, dedup |
| CDN/Bandwidth | Massive | Private network, peering |
| Encoding | 500 hrs/min uploaded | Efficient codecs, parallelization |
| Compute | Transcoding, ML | Spot instances, efficient scheduling |

Codec Evolution

| Codec | Bitrate Savings | Adoption |
| --- | --- | --- |
| H.264 | Baseline | Universal |
| H.265/HEVC | 25-50% vs H.264 | Growing |
| VP9 | 30-50% vs H.264 | YouTube default |
| AV1 | 30% vs VP9 | New standard |

YouTube aggressively pushes VP9/AV1 to reduce bandwidth costs.

Real-World Systems

| Company | Notable Design Choice |
| --- | --- |
| YouTube | VP9/AV1 codecs, private CDN (Google's network), Bigtable for metadata |
| Netflix | Per-title encoding (each video gets optimal settings), Open Connect CDN |
| Twitch | Optimized for live (lower latency transcoding), HLS |
| TikTok | Short-form optimized, aggressive caching, quick startup |

Summary: Key Design Decisions

| Decision | Options | Recommendation |
| --- | --- | --- |
| Upload method | Through API, Direct to storage | Pre-signed URLs to S3 |
| Encoding | Sequential, Parallel | Parallel segment encoding |
| Streaming protocol | HLS, DASH, RTMP | HLS/DASH with adaptive bitrate |
| Storage | Single tier, Tiered | Tiered (hot/warm/cold) |
| CDN | Single CDN, Multi-CDN | Multi-CDN with private network |
| View counting | Real-time, Batched | Batched with Redis buffer |