In the previous article, you learned about data partitioning and sharding. Now let us design a real system: a URL shortener like bit.ly.
This is one of the most popular system design interview questions. It looks simple but touches many core concepts: hashing, databases, caching, and scaling.
Step 1: Requirements Always start by clarifying what the system needs to do.
Functional Requirements Given a long URL, generate a short URL When a user visits the short URL, redirect to the original long URL Users can optionally set a custom short code Short URLs expire after a configurable time (default: 5 years) Non-Functional Requirements The system should be highly available (redirects must always work) Redirection should happen in real time (< 100ms) Short URLs should not be guessable (no sequential IDs) Not in Scope (for this design) User accounts and authentication URL analytics dashboard (we will discuss basic analytics) Paid plans and rate limiting by plan Step 2: Back-of-the-Envelope Estimation Traffic Estimation: Write (new URLs created): 100 million per day Read (redirections): 10 billion per day (100:1 read-to-write ratio) Writes per second: 100M / 86,400 = ~1,160 writes/sec Reads per second: 10B / 86,400 = ~115,740 reads/sec Peak: 2-3x average Peak writes: ~3,000/sec Peak reads: ~350,000/sec Storage Estimation: Each URL mapping: ~500 bytes (short code + long URL + metadata) Per day: 100M * 500 bytes = 50 GB/day Per year: 50 GB * 365 = ~18 TB/year 5 years (retention): ~90 TB total Short Code Length: Using Base62 (a-z, A-Z, 0-9) = 62 characters 6 characters: 62^6 = 56.8 billion combinations 7 characters: 62^7 = 3.5 trillion combinations At 100M URLs/day for 5 years = 182.5 billion URLs 7 characters is enough (3.5 trillion >> 182.5 billion) Step 3: API Design REST API: POST /api/shorten Request: { "long_url": "https://example.com/very/long/path?query=value", "custom_code": "my-link", // optional "expiration": "2031-01-01" // optional } Response: { "short_url": "https://short.ly/Ab3xK9", "long_url": "https://example.com/very/long/path?query=value", "expires_at": "2031-01-01T00:00:00Z" } GET /{shortCode} Response: HTTP 301 Redirect to the long URL Location: https://example.com/very/long/path?query=value 301 vs 302 Redirect 301 (Permanent Redirect): The browser caches the redirect. Subsequent visits go directly to the long URL without hitting your server. Less server load but you lose analytics data. 302 (Temporary Redirect): The browser does NOT cache. Every visit hits your server first. More server load but you can track every click. Choose 302 if analytics are important. Choose 301 for maximum performance with less tracking.
...