System Design #18: Design a Notification System

In the previous article, you designed a file storage system. Now let us design a notification system that sends push notifications, emails, and SMS messages to millions of users.

Every large application needs notifications. Whether it is a new message alert, an order confirmation, or a security warning, the notification system is a critical piece of infrastructure.

Step 1: Requirements

Functional Requirements
- Send push notifications (iOS and Android)
- Send email notifications
- Send SMS notifications
- Support different notification types: transactional, marketing, system alerts
- User preferences: opt-in/opt-out per channel, quiet hours
- Template-based notifications
- Delivery tracking and analytics

Non-Functional Requirements
- Soft real-time: transactional notifications within 30 seconds
- At-least-once delivery (no lost notifications)
- High throughput: 10 million notifications per minute during peaks
- No duplicate notifications (deduplication)
- Scalable to billions of notifications per day

Step 2: Estimation

Notifications per day: 5 billion
  Push: 3 billion (60%)
  Email: 1.5 billion (30%)
  SMS: 500 million (10%)

Peak load: 10 million per minute = ~167,000 per second

Per notification:
  Push: ~500 bytes payload
  Email: ~5 KB (with HTML template)
  SMS: ~200 bytes

Storage for notification history:
  5 billion * 1 KB (average) = 5 TB/day
  Retention: 30 days = 150 TB

Step 3: Notification Types

Different notifications have different priorities and requirements. ...
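The estimation figures above can be re-derived in a few lines. This is only a sanity-check sketch: the constants come from the excerpt, while the function names are made up for illustration.

```python
# Re-derive the notification estimates quoted in the excerpt.
DAILY_TOTAL = 5_000_000_000                      # notifications per day (from the text)
CHANNEL_PERCENT = {"push": 60, "email": 30, "sms": 10}
PEAK_PER_MINUTE = 10_000_000                     # peak load (from the text)
AVG_RECORD_BYTES = 1_000                         # ~1 KB per history record (from the text)
RETENTION_DAYS = 30


def per_channel(daily_total: int) -> dict[str, int]:
    """Split the daily volume across channels using integer percentages."""
    return {name: daily_total * pct // 100 for name, pct in CHANNEL_PERCENT.items()}


def peak_per_second(per_minute: int) -> int:
    """Convert a per-minute peak into a rounded per-second rate."""
    return round(per_minute / 60)


def history_storage_tb(daily_total: int, record_bytes: int, days: int) -> float:
    """Total history storage in decimal terabytes for the retention window."""
    return daily_total * record_bytes * days / 1e12


if __name__ == "__main__":
    print(per_channel(DAILY_TOTAL))
    print(peak_per_second(PEAK_PER_MINUTE), "notifications/sec at peak")
    print(history_storage_tb(DAILY_TOTAL, AVG_RECORD_BYTES, RETENTION_DAYS), "TB retained")
```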

May 29, 2026 · 11 min

Claude Opus 4.8: 5 Things That Changed for Developers

Anthropic released Claude Opus 4.8 today. It builds on Opus 4.7 with the same price but better performance. This article covers the five changes that matter most for developers.

1. It Catches Its Own Bugs, 4x More Often

This is the biggest change in Opus 4.8. Previous versions would sometimes complete a task without mentioning problems in the code. Opus 4.8 is four times more likely to flag issues it finds in its own output. ...

May 28, 2026 · 3 min

System Design #17: Design a File Storage System

In the previous article, you designed a video streaming service. Now let us design a file storage and sync system like Google Drive, Dropbox, or OneDrive.

File storage systems are complex because they need to sync files across multiple devices, handle conflicts, and deduplicate data. Let us break it down.

Step 1: Requirements

Functional Requirements
- Upload and download files
- Sync files across devices (desktop, mobile, web)
- Share files and folders with other users
- File versioning (view and restore previous versions)
- Offline access (work offline, sync when reconnected)

Non-Functional Requirements
- High reliability: files must never be lost
- Fast sync: changes appear on other devices within seconds
- Low bandwidth usage: only transfer changed parts of files
- Support 500 million users with 100 million daily active users

Step 2: Estimation

Users: 500M total, 100M DAU

Storage:
  Average files per user: 500 files
  Average file size: 500 KB
  Total files: 250 billion files
  Total storage: 500M users * 500 files * 500 KB = 125 PB

Daily activity:
  File uploads/edits: 200M per day
  File downloads: 500M per day
  File syncs: 1 billion per day (across devices)

Upload bandwidth:
  200M uploads/day * average 200 KB change = 40 TB/day upload
  40 TB / 86,400 = ~460 MB/sec average upload

Metadata:
  Each file: ~200 bytes of metadata (name, size, hash, version, path)
  250 billion files * 200 bytes = 50 TB of metadata

Step 3: Block Storage (The Key Insight)

Instead of storing files as single blobs, split them into fixed-size blocks (typically 4 MB). This is the foundation of how Dropbox and Google Drive work. ...
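The fixed-size block idea can be sketched as follows. This is a minimal illustration, not how any particular product implements it: the 4 MB block size comes from the text, while the use of SHA-256 as the block fingerprint and the function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, the typical block size mentioned in the text


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the file content into consecutive fixed-size blocks (last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Fingerprint every block. SHA-256 is an assumption; the text only says 'hash'."""
    return [hashlib.sha256(block).hexdigest() for block in split_into_blocks(data, block_size)]


def changed_blocks(old_hashes: list[str], new_hashes: list[str]) -> list[int]:
    """Indexes of blocks in the new version that differ from, or extend past, the old version."""
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]


if __name__ == "__main__":
    # A tiny block size keeps the demo readable.
    old = block_hashes(b"aaaabbbbcc", block_size=4)
    new = block_hashes(b"aaaabXbbcc", block_size=4)
    print(changed_blocks(old, new))  # only the edited block needs to be uploaded
```

Comparing hash lists is what lets a client upload only the blocks that changed. One known limitation of fixed-size blocks is that inserting bytes near the start of a file shifts every later block, so all of them look changed.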

May 28, 2026 · 10 min

System Design #16: Design a Video Streaming Service

In the previous article, you designed a news feed. Now let us design a video streaming service like YouTube or Netflix.

Video streaming is a complex system with two major pipelines: uploading and processing videos, and streaming them to viewers. Let us break it down step by step.

Step 1: Requirements

Functional Requirements
- Upload videos
- Stream/watch videos
- Search for videos
- Like, comment, and subscribe
- Video recommendations
- Multiple video quality options (360p, 720p, 1080p, 4K)

Non-Functional Requirements
- High availability: videos should always be watchable
- Low latency: video should start playing within 2 seconds
- Smooth playback: no buffering on stable connections
- Support 1 billion daily active users
- Support 5 billion video views per day

Step 2: Estimation

Daily Active Users: 1 billion
Video views per day: 5 billion
Videos uploaded per day: 500,000
Average video size (original): 500 MB
Average video duration: 5 minutes

Upload storage per day:
  500,000 videos * 500 MB = 250 TB/day (original files)

After transcoding (multiple resolutions + formats):
  Each video -> 5 resolutions * 3 formats = 15 versions
  Average transcoded version: 100 MB
  500,000 * 15 * 100 MB = 750 TB/day (transcoded files)

Total storage per day: ~1 PB/day
Total storage per year: ~365 PB/year

Streaming bandwidth:
  5 billion views/day
  Average bitrate: 5 Mbps (1080p)
  Average watch time: 3 minutes
  Total bandwidth: 5B * 5 Mbps * 180 sec = 4.5 exabits/day
  ~52 Tbps average bandwidth

Step 3: Two Main Pipelines

A video streaming service has two distinct pipelines that work independently. ...
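The jump from "4.5 exabits/day" to "~52 Tbps" is easy to lose track of, so here is the arithmetic spelled out. The inputs are the figures from the excerpt; the function names are illustrative only.

```python
# Re-derive the streaming bandwidth estimate from the excerpt.
VIEWS_PER_DAY = 5_000_000_000
AVG_BITRATE_BPS = 5_000_000        # 5 Mbps
AVG_WATCH_SECONDS = 180            # 3 minutes
SECONDS_PER_DAY = 86_400


def bits_streamed_per_day(views: int, bitrate_bps: int, watch_seconds: int) -> int:
    """Total bits delivered to viewers in one day."""
    return views * bitrate_bps * watch_seconds


def average_bandwidth_tbps(bits_per_day: int) -> float:
    """Spread the daily total evenly over the day and express it in terabits per second."""
    return bits_per_day / SECONDS_PER_DAY / 1e12


if __name__ == "__main__":
    total_bits = bits_streamed_per_day(VIEWS_PER_DAY, AVG_BITRATE_BPS, AVG_WATCH_SECONDS)
    print(total_bits / 1e18, "exabits/day")
    print(round(average_bandwidth_tbps(total_bits), 1), "Tbps on average")
```

This is an average over the whole day; real traffic peaks at certain hours, so provisioned capacity would need to be higher than this number.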

May 28, 2026 · 10 min

System Design #15: Design a News Feed

In the previous article, you designed a chat system. Now let us design a news feed: the home timeline you see on Twitter/X, Instagram, or Facebook.

The news feed is one of the most common interview questions. It tests your understanding of fan-out strategies, caching, ranking, and scale.

Step 1: Requirements

Functional Requirements
- Users can create posts (text, images, links)
- Users can follow other users
- Users see a news feed with posts from people they follow
- Posts are ranked (not just chronological)
- Trending topics section
- Like and comment on posts

Non-Functional Requirements
- News feed loads in under 200ms
- New posts appear in followers’ feeds within 5 seconds
- The system supports 500 million daily active users
- High availability: the feed should always load, even if stale

Step 2: Estimation

Users: 500 million DAU

Posts:
  Each user creates ~2 posts/day
  Total: 1 billion posts/day
  Posts per second: 1B / 86,400 = ~11,600 posts/sec

Feed Reads:
  Each user opens the feed ~10 times/day
  Total: 5 billion feed reads/day
  Reads per second: 5B / 86,400 = ~57,870 reads/sec

Following:
  Average user follows 300 people
  Some users have millions of followers (celebrities)

Storage:
  Average post: 1 KB (text + metadata)
  1 billion posts/day * 1 KB = 1 TB/day
  Per year: ~365 TB
  Media (images, videos): stored in blob storage + CDN

Step 3: The Core Problem (Feed Generation)

When a user opens their feed, the system must show recent posts from all the people they follow, ranked by relevance. There are two approaches. ...

May 28, 2026 · 10 min

System Design #14: Design a Chat System

In the previous article, you designed a URL shortener. Now let us tackle a more complex system: a real-time chat application like WhatsApp or Slack.

Chat systems are a favorite in system design interviews because they combine real-time communication, message storage, presence detection, and push notifications.

Step 1: Requirements

Functional Requirements
- One-on-one messaging
- Group messaging (up to 500 members)
- Online/offline status (presence)
- Read receipts (message seen)
- Media sharing (images, files)
- Push notifications for offline users
- Message history (persistent storage)

Non-Functional Requirements
- Real-time delivery (< 200ms for online users)
- Messages must never be lost (durability)
- Message ordering must be preserved within a conversation
- The system should support 2 billion users with 50 billion messages per day

Step 2: Back-of-the-Envelope Estimation

Users: 2 billion total, 500 million daily active users (DAU)

Messages:
  50 billion messages/day
  50B / 86,400 = ~580,000 messages/sec
  Peak: ~1.5 million messages/sec

Message size:
  Average message: 200 bytes (text)
  Media messages: ~200 KB (image thumbnail + metadata)

Storage per day:
  Text: 50B * 200 bytes = 10 TB/day
  Media: assume 5% of messages have media
  2.5B * 200 KB = 500 TB/day (media stored in blob storage)

Text storage per year: 10 TB * 365 = ~3.65 PB

Connections:
  500M concurrent WebSocket connections
  Each connection uses ~10 KB of memory
  Total memory for connections: 5 TB
  Need thousands of chat servers

Step 3: Communication Protocol

Why WebSocket?

For real-time chat, the server must push messages to clients immediately. HTTP is request-response: the client must ask for new messages. WebSocket provides a persistent, bidirectional connection. ...
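The step from "5 TB of connection memory" to "thousands of chat servers" depends on how many connections one server can hold, which the excerpt does not state. The sketch below assumes 100,000 connections per server purely for illustration; the other numbers are from the text.

```python
import math

CONCURRENT_CONNECTIONS = 500_000_000      # from the text
MEMORY_PER_CONNECTION_BYTES = 10_000      # ~10 KB, from the text
CONNECTIONS_PER_SERVER = 100_000          # ASSUMPTION: not stated in the text


def connection_memory_tb(connections: int, bytes_per_connection: int) -> float:
    """Total memory held by open connections, in decimal terabytes."""
    return connections * bytes_per_connection / 1e12


def servers_needed(connections: int, per_server: int) -> int:
    """Smallest number of servers that can hold all connections."""
    return math.ceil(connections / per_server)


if __name__ == "__main__":
    print(connection_memory_tb(CONCURRENT_CONNECTIONS, MEMORY_PER_CONNECTION_BYTES), "TB")
    print(servers_needed(CONCURRENT_CONNECTIONS, CONNECTIONS_PER_SERVER), "chat servers")
```

Under that assumption the fleet comes out in the thousands, which matches the order of magnitude in the excerpt. A different per-server limit changes the count proportionally.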

May 27, 2026 · 11 min

System Design #13: Design a URL Shortener

In the previous article, you learned about data partitioning and sharding. Now let us design a real system: a URL shortener like bit.ly.

This is one of the most popular system design interview questions. It looks simple but touches many core concepts: hashing, databases, caching, and scaling.

Step 1: Requirements

Always start by clarifying what the system needs to do.

Functional Requirements
- Given a long URL, generate a short URL
- When a user visits the short URL, redirect to the original long URL
- Users can optionally set a custom short code
- Short URLs expire after a configurable time (default: 5 years)

Non-Functional Requirements
- The system should be highly available (redirects must always work)
- Redirection should happen in real time (< 100ms)
- Short URLs should not be guessable (no sequential IDs)

Not in Scope (for this design)
- User accounts and authentication
- URL analytics dashboard (we will discuss basic analytics)
- Paid plans and rate limiting by plan

Step 2: Back-of-the-Envelope Estimation

Traffic Estimation:
  Write (new URLs created): 100 million per day
  Read (redirections): 10 billion per day (100:1 read-to-write ratio)
  Writes per second: 100M / 86,400 = ~1,160 writes/sec
  Reads per second: 10B / 86,400 = ~115,740 reads/sec
  Peak: 2-3x average
  Peak writes: ~3,000/sec
  Peak reads: ~350,000/sec

Storage Estimation:
  Each URL mapping: ~500 bytes (short code + long URL + metadata)
  Per day: 100M * 500 bytes = 50 GB/day
  Per year: 50 GB * 365 = ~18 TB/year
  5 years (retention): ~90 TB total

Short Code Length:
  Using Base62 (a-z, A-Z, 0-9) = 62 characters
  6 characters: 62^6 = 56.8 billion combinations
  7 characters: 62^7 = 3.5 trillion combinations
  At 100M URLs/day for 5 years = 182.5 billion URLs
  7 characters is enough (3.5 trillion >> 182.5 billion)

Step 3: API Design

REST API:

    POST /api/shorten

    Request:
    {
      "long_url": "https://example.com/very/long/path?query=value",
      "custom_code": "my-link",     // optional
      "expiration": "2031-01-01"    // optional
    }

    Response:
    {
      "short_url": "https://short.ly/Ab3xK9",
      "long_url": "https://example.com/very/long/path?query=value",
      "expires_at": "2031-01-01T00:00:00Z"
    }

    GET /{shortCode}

    Response: HTTP 301 Redirect to the long URL
    Location: https://example.com/very/long/path?query=value

301 vs 302 Redirect

301 (Moved Permanently): The browser may cache the redirect, so subsequent visits can go directly to the long URL without hitting your server. Less server load, but you lose analytics data.

302 (Found, a temporary redirect): The browser does not cache the redirect by default, so every visit hits your server first. More server load, but you can track every click.

Choose 302 if analytics are important. Choose 301 for maximum performance with less tracking. ...

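The short-code length reasoning in the URL shortener estimate can be checked with a small sketch. The alphabet order (a-z, A-Z, 0-9) follows the text; the function names and the choice to encode an integer ID are assumptions made for illustration.

```python
import string

# Base62 alphabet in the order given in the text: a-z, A-Z, 0-9.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)  # 62


def min_code_length(total_urls: int, base: int = BASE) -> int:
    """Smallest code length whose number of combinations covers total_urls."""
    length = 1
    while base ** length < total_urls:
        length += 1
    return length


def encode_base62(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


if __name__ == "__main__":
    urls_in_five_years = 100_000_000 * 365 * 5   # 182.5 billion, as in the text
    print(min_code_length(urls_in_five_years))   # code length needed
    print(encode_base62(urls_in_five_years))
```

Encoding a plain counter would produce sequential, guessable codes, which the requirements rule out. The integer passed to `encode_base62` is therefore assumed to come from some non-sequential source such as a random or hashed ID.
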
May 27, 2026 · 10 min

System Design #12: Data Partitioning and Sharding

In the previous article, you learned about consistent hashing. Now let us dive deep into data partitioning: how to split your database across multiple machines when one is not enough.

Why Partition Data?

A single database server has limits. It can only store so much data and handle so many queries per second. When you hit those limits, you have two choices:

- Vertical scaling: buy a bigger machine (expensive, has a ceiling)
- Horizontal scaling: split data across multiple machines (partitioning)

Single Database (No Partitioning):

  [All 500M users] --> [One Database Server]
                            |
                            |--> 10TB of data
                            |--> 50,000 queries/sec
                            |--> Single point of failure
                            |--> $$$$ for a huge machine

Partitioned Database:

  [Users A-M] --> [Database Shard 1] (5TB, 25K qps)
  [Users N-Z] --> [Database Shard 2] (5TB, 25K qps)

Each shard handles half the data and half the traffic. If one shard goes down, only half the users are affected.

Horizontal vs Vertical Partitioning

There are two ways to split data. ...
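The A-M / N-Z split in the diagram amounts to a small routing function. A minimal sketch, assuming users are routed by the first letter of a username; the shard names and the `shard_for` helper are hypothetical.

```python
# Range-based routing for the two-shard example in the diagram.
# Each entry is (first letter low bound, first letter high bound, shard name).
SHARD_RANGES = [
    ("A", "M", "shard-1"),
    ("N", "Z", "shard-2"),
]


def shard_for(username: str) -> str:
    """Return the shard that owns this user, based on the first letter of the name."""
    if not username:
        raise ValueError("username must not be empty")
    first = username[0].upper()
    for low, high, shard in SHARD_RANGES:
        if low <= first <= high:
            return shard
    raise ValueError(f"no shard covers {username!r}")


if __name__ == "__main__":
    for name in ("alice", "Mallory", "nina", "Zoe"):
        print(name, "->", shard_for(name))
```

The "half the data and half the traffic" claim only holds if names are evenly spread across the two letter ranges, which real usernames usually are not.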

May 27, 2026 · 10 min

System Design #11: Consistent Hashing

In the previous article, you learned about rate limiting algorithms. Now let us solve a fundamental problem in distributed systems: how to distribute data across multiple servers.

Consistent hashing is the answer. It is used by Amazon DynamoDB, Apache Cassandra, Akamai CDN, and Discord. Once you understand it, you will see it everywhere.

The Problem: Distributing Data Across Servers

Imagine you have a cache with 4 servers. You need to decide which server stores which data. The simplest approach is modular hashing. ...
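Modular hashing, the simple approach named above, picks a server with `hash(key) % number_of_servers`. The sketch below shows that rule and measures how many keys change servers when a fifth server is added. SHA-256 as the hash and the helper names are assumptions chosen for illustration.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash of a key. Python's built-in hash() is randomized per
    process for strings, so a digest is used instead (SHA-256 is an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def server_for(key: str, num_servers: int) -> int:
    """Modular hashing: map a key to a server index in [0, num_servers)."""
    return stable_hash(key) % num_servers


def fraction_remapped(keys: list[str], old_n: int, new_n: int) -> float:
    """Share of keys whose server changes when the server count changes."""
    moved = sum(1 for k in keys if server_for(k, old_n) != server_for(k, new_n))
    return moved / len(keys)


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(10_000)]
    print(server_for("user:42", 4))
    print(fraction_remapped(keys, old_n=4, new_n=5))
```

With a uniform hash, a key keeps its server after going from 4 to 5 servers only when its hash has the same remainder under both moduli, so most keys move. That mass reshuffling is the weakness consistent hashing is designed to avoid.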

May 26, 2026 · 11 min

System Design #10: Rate Limiting and Throttling

In the previous article, you learned about microservices and monolith architectures. Now let us talk about protecting your APIs from abuse: rate limiting.

Rate limiting controls how many requests a client can make in a given time period. Without it, a single client can overwhelm your servers, intentionally or by accident.

Why Every API Needs Rate Limiting

1. Prevent Abuse

A malicious user can send thousands of requests per second to overload your servers. Rate limiting stops them before they cause damage. ...
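"How many requests a client can make in a given time period" can be made concrete with a fixed-window counter, one of the simplest ways to enforce such a limit. This is a sketch under stated assumptions, not necessarily the approach the article goes on to describe; the class and parameter names are hypothetical.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed time window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        # (client_id, window index) -> requests seen in that window.
        # Old windows are never evicted in this sketch, so memory grows over time.
        self._counts: dict[tuple[str, int], int] = defaultdict(int)

    def allow(self, client_id: str, now: float) -> bool:
        """Return True and count the request if the client is under its limit."""
        window = int(now // self.window_seconds)
        key = (client_id, window)
        if self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True


if __name__ == "__main__":
    limiter = FixedWindowLimiter(limit=2, window_seconds=60)
    for t in (0.0, 1.0, 2.0, 60.0):
        print(t, limiter.allow("client-a", now=t))
```

The timestamp is passed in explicitly so the behavior is easy to test; a real service would typically read a clock and keep the counters in shared storage so that every API server sees the same totals.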

May 26, 2026 · 13 min