Design a Music Streaming Service
Understanding the Problem
Design a music streaming service like Spotify that can serve 100M+ songs to millions of concurrent users worldwide.
Functional Requirements:
- Search for songs, artists, and albums
- Play songs with seamless streaming
- Create and manage playlists
- Get personalized recommendations (e.g. Discover Weekly)
- Download songs for offline listening
Non-Functional Requirements:
- Low playback latency: Song playback must start within 200ms of pressing play
- Gapless playback: No silence between consecutive tracks
- Massive scale: Handle 50M concurrent streams
- Global availability: Low-latency streaming worldwide via CDN edge nodes
Estimation
Let's size this system:
- 500M total users, 50M concurrent streams at peak
- 100M songs in the catalog, average ~3 MB per song (compressed) = ~300 TB of audio storage
- 200M playlist operations/day: creates, adds, deletes, reorders
- Bandwidth: 50M streams × 160 kbps average = ~8 Tbps aggregate bandwidth
- Metadata: 100M songs × ~5 KB metadata each = ~500 GB (easily fits in memory)
The main challenges are: (1) serving audio at massive scale via CDN, (2) pre-fetching the next track for gapless playback, and (3) building a recommendation engine that keeps users engaged.
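The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated in this section, with decimal units (1 MB = 10^6 bytes); the constant and variable names are illustrative. Note that 100M songs at ~3 MB each works out to roughly 300 TB.

```python
# Back-of-envelope sizing from the figures in this section (decimal units).
SONGS = 100_000_000
AVG_SONG_BYTES = 3 * 10**6             # ~3 MB per compressed song
CONCURRENT_STREAMS = 50_000_000
AVG_BITRATE_BPS = 160_000              # 160 kbps
METADATA_BYTES_PER_SONG = 5 * 10**3    # ~5 KB
PLAYLIST_OPS_PER_DAY = 200_000_000
SECONDS_PER_DAY = 86_400

audio_storage_tb = SONGS * AVG_SONG_BYTES / 10**12
peak_bandwidth_tbps = CONCURRENT_STREAMS * AVG_BITRATE_BPS / 10**12
metadata_gb = SONGS * METADATA_BYTES_PER_SONG / 10**9
playlist_writes_per_sec = PLAYLIST_OPS_PER_DAY / SECONDS_PER_DAY

print(f"audio storage:   {audio_storage_tb:.0f} TB")
print(f"peak bandwidth:  {peak_bandwidth_tbps:.0f} Tbps")
print(f"metadata:        {metadata_gb:.0f} GB")
print(f"playlist writes: {playlist_writes_per_sec:.0f}/s (daily average)")
```

The storage figure covers a single encoding of each song; storing every quality level would multiply it accordingly.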
Audio Encoding & Chunked Streaming
Music is encoded at different quality levels to adapt to network conditions:
- Low: 96 kbps (Ogg Vorbis), mobile data saver
- Normal: 160 kbps (Ogg Vorbis), default quality
- High: 320 kbps (Ogg Vorbis), premium tier
Spotify uses Ogg Vorbis for streaming; Apple Music uses AAC. Both are lossy but perceptually excellent at 160+ kbps.
Chunked streaming: Songs are split into ~10-second chunks, similar to HLS/DASH for video. This enables:
- Adaptive bitrate: Switch quality mid-song based on bandwidth
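- Fast start: Begin playback as soon as the start of the first chunk is buffered rather than waiting for the whole file, which is what makes the ~200ms start target reachable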
- Pre-fetch: Start loading the next track's first chunks while the current song is still playing (gapless playback)
- Seek: Jump to any point without downloading the entire file
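The chunking arithmetic and a simple adaptive bitrate rule can be sketched as follows. This is an illustration under stated assumptions, not how any particular client works: the function names and the 0.8 safety factor are hypothetical, and chunk sizes are nominal because Vorbis is variable-bitrate.

```python
CHUNK_SECONDS = 10                 # ~10-second chunks, as described above
BITRATES_KBPS = (96, 160, 320)     # Low, Normal, High tiers

def chunk_size_bytes(bitrate_kbps, seconds=CHUNK_SECONDS):
    """Nominal size of one chunk at the given bitrate."""
    return bitrate_kbps * 1000 * seconds // 8

def chunk_index_for_seek(position_seconds, seconds=CHUNK_SECONDS):
    """Index of the chunk that contains the requested playback position."""
    return int(position_seconds // seconds)

def pick_bitrate(measured_kbps, safety_factor=0.8):
    """Highest tier that fits within a fraction of the measured bandwidth.

    safety_factor is an assumed headroom value; falls back to the lowest
    tier when even that does not fit.
    """
    budget = measured_kbps * safety_factor
    candidates = [b for b in BITRATES_KBPS if b <= budget]
    return max(candidates) if candidates else min(BITRATES_KBPS)

print(chunk_size_bytes(160))       # 200000 bytes per 10 s chunk at 160 kbps
print(chunk_index_for_seek(95))    # 9: seeking to 1:35 fetches the tenth chunk
print(pick_bitrate(250))           # 160
```

At 160 kbps a chunk is about 200 KB, so a seek only needs the one chunk covering the target position, and a pre-fetch of the next track only needs its first chunk or two.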
Recommendation Engine
Personalized recommendations are the killer feature that keeps users on the platform. Spotify's Discover Weekly uses a multi-stage pipeline:
Collaborative Filtering:
- "Users who liked song X also liked song Y"
- Matrix factorization on the user-song interaction matrix (billions of plays)
- Works well for popular songs but struggles with new/niche tracks (cold start problem)
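The "users who liked X also liked Y" idea can be illustrated with a plain co-occurrence count. Note that this is a deliberately simpler technique than the matrix factorization mentioned above, swapped in only to keep the example short; the function names and sample data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(user_likes):
    """Count how often each pair of songs is liked by the same user."""
    co = defaultdict(Counter)
    for songs in user_likes.values():
        for a, b in combinations(sorted(set(songs)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def also_liked(co, song, k=3):
    """Top-k songs most often liked together with `song`."""
    return [s for s, _ in co[song].most_common(k)]

# Hypothetical interaction data: user -> liked songs.
user_likes = {
    "u1": ["x", "y", "z"],
    "u2": ["x", "y"],
    "u3": ["x", "z"],
    "u4": ["x", "y"],
}
co = build_cooccurrence(user_likes)
print(also_liked(co, "x"))           # ['y', 'z']
print(also_liked(co, "new_song"))    # []: no interactions yet (cold start)
```

The empty result for a song with no plays is the cold start problem in miniature: with no interaction data there is nothing to correlate.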
Content-Based Filtering:
- Analyze audio features: tempo, key, energy, danceability, acousticness
- Use deep learning models on raw audio spectrograms
- Solves the cold start problem: new songs can be recommended based on their audio features
Hybrid Approach:
- Combine collaborative and content-based signals
- Add contextual features: time of day, listening history, skip patterns
- Re-rank with business rules (boost new releases, licensed content)
- Spotify runs this pipeline weekly for Discover Weekly, daily for Daily Mix
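One way to combine the two signals is a weighted blend that leans on content similarity for songs with few plays and on collaborative scores once interaction data accumulates. The sketch below is an assumption for illustration only: the weighting formula, the constant k=50, the boost parameter, and the function names are not taken from any real system.

```python
import math

def cosine(a, b):
    """Cosine similarity between two audio-feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def hybrid_score(cf_score, content_score, plays, boost=1.0, k=50):
    """Blend collaborative and content-based scores.

    The collaborative weight grows with the song's play count (assumed
    formula plays / (plays + k)); `boost` stands in for a business rule
    such as promoting new releases.
    """
    w = plays / (plays + k)
    return boost * (w * cf_score + (1 - w) * content_score)

# A brand-new song is scored purely on content similarity.
print(hybrid_score(cf_score=0.0, content_score=0.8, plays=0))    # 0.8
# A song with 50 plays weights both signals equally under k=50.
print(hybrid_score(cf_score=0.9, content_score=0.1, plays=50))   # 0.5
```

In this sketch the content score would come from something like `cosine` applied to feature vectors (tempo, energy, and so on) of the candidate song and the user's recent listening.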
Playlist Management & Offline Sync
Playlists:
- Stored as ordered lists of song IDs in a database
- Support collaborative playlists (multiple editors) with operational transforms or CRDTs for conflict resolution
- 200M operations/day averages ~2,300 writes/second, which is manageable with sharding by user_id
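Routing playlist writes by user_id and applying a reorder to an ordered list of song IDs might look like the sketch below. The shard count of 64, the hash choice, and the function names are assumptions for illustration; conflict resolution for collaborative playlists (OT or CRDTs) is out of scope here.

```python
import hashlib

NUM_SHARDS = 64  # hypothetical shard count

def shard_for_user(user_id, num_shards=NUM_SHARDS):
    """Map a user_id to a shard with a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so is unsuitable for routing.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def move_track(song_ids, from_index, to_index):
    """Return a new playlist order with one track moved."""
    reordered = list(song_ids)
    reordered.insert(to_index, reordered.pop(from_index))
    return reordered

print(move_track(["a", "b", "c", "d"], 0, 2))   # ['b', 'c', 'a', 'd']
```

With 64 shards, the ~2,300 writes/second average comes to roughly 36 writes/second per shard, assuming users are spread evenly.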
Offline Download:
- Songs are downloaded encrypted (DRM) to the device
- Playback requires a valid license token (checked periodically when online)
- Sync service tracks which songs are cached locally and handles cleanup when storage is low
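The periodic license check and the low-storage cleanup can be sketched as two small functions. Everything here is hypothetical: the 30-day offline window, the least-recently-played eviction policy, and all names are assumptions chosen for illustration, and real DRM involves encrypted keys that this sketch does not model.

```python
from dataclasses import dataclass

MAX_OFFLINE_SECONDS = 30 * 24 * 3600  # assumed 30-day offline window

@dataclass
class OfflineLicense:
    song_id: str
    last_validated_at: float  # Unix time of the last successful online check

def can_play_offline(lic, now, max_offline=MAX_OFFLINE_SECONDS):
    """Allow playback only if the license was validated recently enough."""
    return now - lic.last_validated_at <= max_offline

def songs_to_evict(cached, bytes_needed):
    """Pick least-recently-played songs until enough space is freed.

    `cached` is a list of (song_id, size_bytes, last_played_at) tuples.
    """
    evicted, freed = [], 0
    for song_id, size_bytes, _ in sorted(cached, key=lambda c: c[2]):
        if freed >= bytes_needed:
            break
        evicted.append(song_id)
        freed += size_bytes
    return evicted

cached = [("a", 3_000_000, 100.0), ("b", 3_000_000, 50.0), ("c", 3_000_000, 200.0)]
print(songs_to_evict(cached, 4_000_000))   # ['b', 'a']
```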
Royalty & Licensing:
- Every stream is logged for royalty calculations
- Stream events go to a Kafka pipeline for aggregation
- Royalties are calculated from the aggregated stream counts according to licensing agreements (pro-rata or user-centric model)
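The difference between the two payout models shows up clearly on a toy example. In the sketch below, pro-rata splits one revenue pool by each artist's share of all streams, while user-centric splits each user's own payment across the artists that user played. The function names, the flat per-user revenue, and the sample data are illustrative assumptions; real agreements add many more terms.

```python
from collections import Counter, defaultdict

def pro_rata(streams, revenue_pool):
    """Split one pool by each artist's share of total streams."""
    counts = Counter(artist for _, artist in streams)
    total = sum(counts.values())
    return {artist: revenue_pool * n / total for artist, n in counts.items()}

def user_centric(streams, revenue_per_user):
    """Split each user's payment across the artists that user streamed."""
    per_user = defaultdict(Counter)
    for user, artist in streams:
        per_user[user][artist] += 1
    payouts = defaultdict(float)
    for counts in per_user.values():
        total = sum(counts.values())
        for artist, n in counts.items():
            payouts[artist] += revenue_per_user * n / total
    return dict(payouts)

# Hypothetical log: u1 streams artist A nine times, u2 streams artist B once.
streams = [("u1", "A")] * 9 + [("u2", "B")]
print(pro_rata(streams, revenue_pool=20.0))        # {'A': 18.0, 'B': 2.0}
print(user_centric(streams, revenue_per_user=10.0))  # {'A': 10.0, 'B': 10.0}
```

Both users pay the same amount, but under pro-rata the heavy listener's plays pull most of the pool toward artist A, whereas user-centric gives each listener's payment only to the artists they actually played.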