In the previous article, you designed a search engine. Now let us wrap up this series with everything you need to ace a system design interview.

This article is a cheat sheet and guide. Bookmark it and review it before your interview.

The 4-Step Framework

Most system design interviews follow the same basic structure. Use this framework to stay organized and cover everything the interviewer expects.

4-Step Framework (40 minutes total):

  Step 1: Requirements (5 minutes)
    - Clarify functional requirements (what the system does)
    - Clarify non-functional requirements (scale, latency, availability)
    - Define what is IN scope and OUT of scope

  Step 2: Estimation (5 minutes)
    - Users, traffic, storage
    - QPS (queries per second)
    - Peak vs average load

  Step 3: High-Level Design (15 minutes)
    - Architecture diagram
    - Core components and how they interact
    - API design
    - Data model

  Step 4: Deep Dive (15 minutes)
    - Interviewer picks 2-3 components to go deeper
    - Discuss trade-offs, failure modes, scaling
    - Show your knowledge of specific technologies

Step 1: Requirements (5 Minutes)

Do not skip this step. Jumping straight to the design is the number one mistake candidates make.

Questions to Ask:

  Functional:
    "What are the main features?"
    "Who are the users?"
    "What are the main use cases?"

  Non-Functional:
    "How many users? DAU?"
    "What latency is acceptable?"
    "Do we need strong consistency or is eventual OK?"
    "What is the read-to-write ratio?"

  Scope:
    "Should I design authentication?"
    "Should I handle international users?"
    "What about mobile vs web?"

  Example (URL Shortener):
    Functional: shorten URL, redirect, custom codes, analytics
    Non-functional: 100M URLs/day, < 100ms redirect, 99.99% uptime
    Out of scope: user accounts, rate limiting by plan

Step 2: Estimation (5 Minutes)

Back-of-the-envelope estimation shows you can think about scale. You do not need exact numbers; the right order of magnitude is enough.

Estimation Template:

  1. Users and Traffic:
     DAU = X million
     Requests per day = DAU * actions per user
     QPS = requests / 86,400
     Peak QPS = QPS * 2 to 5

  2. Storage:
     Data per record = X bytes
     Records per day = Y
     Storage per day = X * Y
     Storage per year = daily * 365

  3. Bandwidth:
     Incoming: write QPS * request size
     Outgoing: read QPS * response size

  4. Memory (for caching):
     Cache 20% of daily data (80/20 rule)
     Cache size = 0.2 * daily data
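
The template above can be turned into a few lines of arithmetic. The following is a minimal sketch, not a standard tool: the function name `estimate`, its parameters, and the default `peak_factor` of 3 (a value inside the template's 2 to 5 range) are assumptions chosen for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Estimate:
    avg_qps: float
    peak_qps: float
    storage_per_day_bytes: float
    storage_per_year_bytes: float
    cache_bytes: float


def estimate(dau, actions_per_user, record_bytes, records_per_day,
             peak_factor=3.0, cache_fraction=0.2):
    """Apply the four-part template.

    peak_factor and cache_fraction are assumed defaults (template says
    2 to 5 for the peak multiplier and 20% for the cache).
    """
    avg_qps = dau * actions_per_user / SECONDS_PER_DAY
    storage_per_day = record_bytes * records_per_day
    return Estimate(
        avg_qps=avg_qps,
        peak_qps=avg_qps * peak_factor,
        storage_per_day_bytes=storage_per_day,
        storage_per_year_bytes=storage_per_day * 365,
        cache_bytes=cache_fraction * storage_per_day,
    )


def bandwidth_bytes_per_sec(qps, payload_bytes):
    """Incoming uses write QPS and request size; outgoing uses read QPS and response size."""
    return qps * payload_bytes
```

In an interview you would do this in your head or on the whiteboard; writing it out once makes it easier to remember which number feeds which.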

Step 3: High-Level Design (15 Minutes)

Draw the architecture. Start with the client and work your way down.

Design Template:

  [Client] --> [Load Balancer] --> [API Servers]
                                       |
                              +--------+--------+
                              |        |        |
                        [Service A] [Service B] [Service C]
                              |        |        |
                         [Cache]  [Message Queue] [Storage]
                              |        |        |
                         [Database] [Workers] [Blob Store]

  For each component, briefly explain:
    - What it does
    - Why it is needed
    - Technology choice (Redis, Kafka, PostgreSQL, etc.)

Step 4: Deep Dive (15 Minutes)

The interviewer will pick areas to go deeper. Be ready to discuss:

  • How a specific component handles failure
  • How to scale a bottleneck
  • Trade-offs between different approaches
  • Specific algorithm details (consistent hashing, fan-out, etc.)
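
Consistent hashing is a common deep-dive request, so it helps to have a small version in mind. The sketch below is one possible in-memory implementation, assuming SHA-256 for hashing and 100 virtual nodes per server; the class name `ConsistentHashRing` and those defaults are illustrative choices, not something prescribed by this series.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._keys = []    # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        # First 8 bytes of SHA-256 as an integer ring position.
        return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")

    def add(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if pos in self._owner:   # extremely unlikely collision: skip
                continue
            self._owner[pos] = node
            bisect.insort(self._keys, pos)

    def remove(self, node):
        for i in range(self.vnodes):
            pos = self._hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._keys.remove(pos)

    def get(self, key):
        """Return the node owning the first ring position clockwise from the key."""
        if not self._keys:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[idx]]
```

The property worth stating out loud: when a server is added, the only keys that move are the ones that land on the new server. Every other key keeps its owner.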

Back-of-the-Envelope Estimation Cheat Sheet

Powers of 2

  2^10 = 1 Thousand     = 1 KB
  2^20 = 1 Million      = 1 MB
  2^30 = 1 Billion      = 1 GB
  2^40 = 1 Trillion     = 1 TB
  2^50 = 1 Quadrillion  = 1 PB

  Handy approximations:
  1 Million seconds  = ~12 days
  1 Billion seconds  = ~31 years
  1 day = 86,400 seconds (~100K for quick math)
  1 month = ~2.6 million seconds
  1 year = ~31.5 million seconds

Latency Numbers

Latency Numbers Every Developer Should Know:

  L1 cache reference:             0.5 ns
  L2 cache reference:             7 ns
  Main memory (RAM) reference:    100 ns
  SSD random read:                150 us (150,000 ns)
  HDD random read:                10 ms  (10,000,000 ns)
  Network round trip (same DC):   500 us
  Network round trip (US to EU):  150 ms

  Summary:
  RAM is roughly 1,000x faster than SSD (100 ns vs 150 us).
  SSD is roughly 70x faster than HDD (150 us vs 10 ms).
  Local network is 300x faster than cross-continent (500 us vs 150 ms).
  Always cache in memory when possible.

QPS Estimation

  Given: DAU (Daily Active Users) and actions per user

  QPS = DAU * actions_per_user / 86,400
  Peak QPS = QPS * 3 (typical peak-to-average ratio)

  Example:
    DAU = 10 million
    Actions per user = 20
    QPS = 10M * 20 / 86,400 = ~2,300 QPS
    Peak QPS = 2,300 * 3 = ~7,000 QPS

  Server capacity (rule of thumb):
    Single web server: 1,000-10,000 QPS (depends on request complexity)
    Single database: 5,000-50,000 QPS (depends on query complexity)
    Redis: 100,000+ QPS
    Kafka: 100,000+ messages/sec per partition
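
Once you have a peak QPS, the natural follow-up is how many servers that implies. A rough sketch follows; the helper names and the 50% `headroom` target (run each server at half its capacity at peak) are assumptions for illustration, not a rule from this article.

```python
import math

SECONDS_PER_DAY = 86_400


def qps(dau, actions_per_user, peak_factor=3):
    """Return (average QPS, peak QPS) using the formula above."""
    avg = dau * actions_per_user / SECONDS_PER_DAY
    return avg, avg * peak_factor


def servers_needed(peak_qps, per_server_qps, headroom=0.5):
    """Servers required so each runs at `headroom` of its capacity at peak.

    headroom=0.5 is an assumed safety margin.
    """
    return math.ceil(peak_qps / (per_server_qps * headroom))


avg, peak = qps(10_000_000, 20)
print(round(avg), round(peak))        # 2315 6944
print(servers_needed(peak, 1_000))    # 14
```

With the conservative end of the web server range (1,000 QPS each), the example workload needs about 14 servers. That is the kind of number that tells you whether a load balancer alone is enough or whether you need to think harder.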

Storage Estimation

  Per record size (typical):
    Tweet-like post:     250 bytes
    User profile:        1 KB
    Image metadata:      500 bytes
    Image file:          200 KB - 2 MB
    Video file:          50 MB - 2 GB
    Chat message:        200 bytes

  Formula:
    Daily storage = records_per_day * record_size
    Yearly storage = daily * 365
    Total storage = yearly * retention_years

  Example (chat system):
    50 billion messages/day * 200 bytes = 10 TB/day
    10 TB * 365 = ~3.65 PB/year
    5-year retention: ~18 PB
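
The chat example can be checked with a short script. This is a sketch using decimal units (1 TB = 10^12 bytes) to match the quick math above; the function names `storage_bytes` and `human` are made up for this illustration.

```python
def storage_bytes(records_per_day, record_bytes, retention_years=1):
    """Return (daily, yearly, total) storage in bytes."""
    daily = records_per_day * record_bytes
    yearly = daily * 365
    return daily, yearly, yearly * retention_years


def human(n):
    """Format a byte count with decimal units (1 KB = 1,000 bytes)."""
    for unit in ("B", "KB", "MB", "GB", "TB", "PB"):
        if n < 1000 or unit == "PB":
            return f"{n:.2f} {unit}"
        n /= 1000


daily, yearly, total = storage_bytes(50_000_000_000, 200, retention_years=5)
print(human(daily), human(yearly), human(total))
# 10.00 TB 3.65 PB 18.25 PB
```

Note that this counts raw data only. Replication, indexes, and metadata would multiply the total, so mention that you are estimating a lower bound.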

Technology Cheat Sheet

Database Selection Guide

When to use what:

  PostgreSQL / MySQL:
    - Structured data with relationships
    - Need ACID transactions
    - Complex queries and joins
    - < 10 TB data, < 50K QPS
    - Example: user accounts, orders, financial data

  MongoDB:
    - Semi-structured data (varying schemas)
    - Document-oriented access pattern
    - Need flexible schema
    - Example: product catalogs, content management

  Cassandra / ScyllaDB:
    - Write-heavy workloads (100K+ writes/sec)
    - Time-series data
    - Need linear horizontal scaling
    - Can tolerate eventual consistency
    - Example: chat messages, IoT sensor data, activity logs

  Redis:
    - Cache layer (sub-millisecond latency)
    - Session storage
    - Rate limiting counters
    - Leaderboards (sorted sets)
    - Pub/sub messaging

  Elasticsearch:
    - Full-text search
    - Log analytics
    - Autocomplete
    - Example: product search, log monitoring

  ClickHouse / TimescaleDB:
    - Analytics and aggregation queries
    - Time-series data analysis
    - Example: metrics, event analytics, dashboards

Message Queue Selection

When to use what:

  Apache Kafka:
    - High throughput (millions of messages/sec)
    - Event streaming (retain events for replay)
    - Fan-out to multiple consumers
    - Example: activity feeds, event sourcing, log aggregation

  RabbitMQ:
    - Task queues (distribute work to workers)
    - Complex routing rules
    - Often lower per-message latency than Kafka at modest throughput
    - Example: email sending, image processing tasks

  Amazon SQS:
    - Simple, managed queue (no infrastructure to manage)
    - Standard (at-least-once delivery, best-effort ordering) or FIFO (ordered, exactly-once processing)
    - Example: decoupling microservices, async processing

Caching Strategy

When to cache what:

  Cache-Aside (Lazy Loading):
    Read from cache. On miss, read from DB, write to cache.
    Best for: read-heavy workloads
    Risk: stale data until TTL expires

  Write-Through:
    Write to cache AND database simultaneously.
    Best for: data that must be fresh in cache
    Risk: higher write latency

  Write-Behind:
    Write to cache immediately, write to DB asynchronously.
    Best for: write-heavy workloads
    Risk: data loss if cache crashes before DB write

  CDN:
    Cache static content (images, videos, JS, CSS) at the edge.
    Best for: global user base, media-heavy applications
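
Cache-aside is the strategy you will reach for most often, so here is a minimal sketch of it. It uses a plain dict in place of Redis and takes the database read as a callback; the class name `CacheAside`, the `load_from_db` parameter, and the injectable `clock` are assumptions made to keep the example self-contained.

```python
import time


class CacheAside:
    """Cache-aside (lazy loading) with a TTL, backed by an in-memory dict."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db     # hypothetical DB read: key -> value
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}              # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > self._clock():
            return entry[0]                       # cache hit
        value = self._load(key)                   # miss: read from the DB
        self._store[key] = (value, self._clock() + self._ttl)
        return value

    def invalidate(self, key):
        """Call after a write to the DB to avoid serving stale data."""
        self._store.pop(key, None)
```

The stale-data risk is visible here: between a database update and TTL expiry, `get` keeps returning the old value unless the writer calls `invalidate`.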

Design Patterns Cheat Sheet

Pattern: When to Use

  Load Balancer:
    Multiple servers handling the same traffic.
    Algorithms: round robin, least connections, consistent hashing.

  Database Replication:
    Read-heavy workload. Leader handles writes, followers handle reads.

  Database Sharding:
    Single database cannot handle the data volume or traffic.
    Shard by user_id, entity_id, or geography.

  Caching:
    Read-heavy workload with frequently accessed data.
    Use Redis or Memcached.

  Message Queue:
    Async processing needed. Decouple producer and consumer.
    Use Kafka, RabbitMQ, or SQS.

  CDN:
    Serving static content to a global audience.
    Use Cloudflare, AWS CloudFront, or Akamai.

  Fan-Out on Write:
    Pre-compute results for fast reads (news feed, notifications).
    Trade-off: higher write cost, faster reads.

  Fan-Out on Read:
    Compute results at read time.
    Trade-off: lower write cost, slower reads.

  Consistent Hashing:
    Distributing data across servers with minimal redistribution
    when servers are added or removed.

  Rate Limiting:
    Protecting APIs from abuse. Token bucket or sliding window.

  Circuit Breaker:
    Preventing cascade failures in microservices.
    Stop calling a failing service, try again later.

  Saga Pattern:
    Distributed transactions across multiple services.
    Each step has a compensating action for rollback.
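
Of the patterns above, the token bucket rate limiter is small enough to write out in full during a deep dive. The following is a single-process sketch; the class name `TokenBucket`, its parameters, and the injectable `clock` are illustrative assumptions. A production limiter shared across servers would typically keep the counters in something like Redis.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if enough are available."""
        now = self._clock()
        # Refill based on elapsed time, capped at capacity.
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The trade-off to mention: `capacity` controls how large a burst you tolerate, while `rate` sets the sustained limit.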

Non-Functional Requirements Checklist

Use this list to make sure you address key concerns in your design.

Non-Functional Requirements:

  1. Scalability
     "How does the system handle 10x the current traffic?"
     --> Horizontal scaling, sharding, caching, CDN

  2. Availability
     "What happens when a server goes down?"
     --> Redundancy, failover, multiple data centers
     Target: 99.99% = 52 min downtime/year

  3. Consistency
     "Do all users see the same data at the same time?"
     --> Strong consistency (banking) vs eventual (social media)
     --> CAP theorem trade-offs

  4. Latency
     "How fast does the system respond?"
     --> Caching, CDN, database indexing, async processing
     Target: p99 < 200ms for user-facing APIs

  5. Durability
     "What if the database crashes? Is data lost?"
     --> Replication (3 copies), backups, WAL (write-ahead log)
     Target: 99.999999999% (11 nines) for critical data

  6. Security
     "How do we protect against attacks?"
     --> Authentication, authorization, encryption, rate limiting
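
The availability target in item 2 is easy to derive rather than memorize. A small sketch follows; the function names are made up for this example, and the redundancy formula assumes replica failures are independent, which real systems only approximate.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600


def downtime_minutes_per_year(availability_percent):
    """Allowed downtime per year for a given availability target."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


def redundant_availability(single, replicas):
    """Availability of `replicas` copies where any one suffices.

    Assumes independent failures (an idealization).
    """
    return 1 - (1 - single) ** replicas


print(round(downtime_minutes_per_year(99.99), 1))   # 52.6
print(round(downtime_minutes_per_year(99.9)))       # 526
```

Under the independence assumption, two replicas that are each 99% available give about 99.99% combined, which is why redundancy is the first answer to "what happens when a server goes down?"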

Common Mistakes That Fail Candidates

Mistake 1: Jumping to the solution
  Bad:  "I will use Kafka and Cassandra and Redis and..."
  Good: "Let me first understand the requirements. What scale are we targeting?"

Mistake 2: Not drawing a diagram
  Bad:  Talking without visualizing
  Good: Draw boxes and arrows. Label each component.

Mistake 3: One-size-fits-all
  Bad:  "I always use MongoDB" or "I always use microservices"
  Good: "For this use case, PostgreSQL fits because..."

Mistake 4: Ignoring trade-offs
  Bad:  "We should use strong consistency"
  Good: "Strong consistency adds latency. For this use case, eventual
         consistency is acceptable because users can tolerate a 1-second delay."

Mistake 5: Over-engineering
  Bad:  Designing for Google scale when the system has 10K users
  Good: "At this scale, a single PostgreSQL with read replicas is enough.
         I would shard only when we exceed 1TB of data."

Mistake 6: Not mentioning failure modes
  Bad:  Assuming everything works perfectly
  Good: "If Redis goes down, we fall back to the database. Latency
         increases but the system stays available."

Mistake 7: Forgetting about data
  Bad:  Designing services without thinking about the data model
  Good: "The main entity is a message with fields: id, sender, content,
         timestamp. I will partition by conversation_id."
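
The "Good" answer for Mistake 7 can be made concrete in a few lines. This is one possible sketch: the field types, the explicit `conversation_id` field, and the hash-based `partition_for` helper are assumptions added for illustration, since the quote above only names the fields and the partition key.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    id: int
    conversation_id: str   # partition key (assumed to be stored on the message)
    sender: str
    content: str
    timestamp: float


def partition_for(conversation_id, num_partitions):
    """Map a conversation to a partition with a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so SHA-256 is used to get the same answer on every server.
    """
    digest = hashlib.sha256(conversation_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Partitioning by conversation keeps all messages of one chat on the same partition, so loading a conversation touches a single shard.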

What Interviewers Actually Look For

What gets you hired:

  1. Structured approach
     You follow a clear framework. You do not ramble.

  2. Trade-off analysis
     You explain WHY you chose a technology, not just WHAT.
     "I chose Cassandra over PostgreSQL because our write load
     is 500K/sec and we need linear horizontal scaling."

  3. Scale awareness
     You think about what happens at 10x or 100x current load.
     You know when things break and how to fix them.

  4. Communication
     You explain clearly. You check in with the interviewer.
     "Does this make sense? Should I go deeper into any part?"

  5. Depth on demand
     When the interviewer asks "how would you handle X?",
     you can go 2-3 levels deeper into the details.

What does NOT matter:
  - Memorizing exact numbers (order of magnitude is enough)
  - Knowing every technology (understanding patterns matters more)
  - Having the "perfect" design (trade-off analysis > correctness)

How to Practice

Practice Plan (2 weeks):

  Week 1: Foundations
    Day 1: Review all concepts (this cheat sheet)
    Day 2: Design a URL Shortener
    Day 3: Design a Chat System
    Day 4: Design a News Feed
    Day 5: Design a Video Streaming Service
    Day 6: Design a Notification System
    Day 7: Rest / review weak areas

  Week 2: Practice
    Day 8:  Design a Ride-Sharing App (like Uber)
    Day 9:  Design a Payment System (like Stripe)
    Day 10: Design a Rate Limiter
    Day 11: Design a Search Autocomplete
    Day 12: Design a Metrics/Monitoring System
    Day 13: Mock interview with a friend
    Day 14: Review and polish

  For each design:
    - Set a 40-minute timer
    - Follow the 4-step framework
    - Draw the architecture on paper
    - Write down the trade-offs you considered
    - Compare with published solutions afterward

Top 15 System Design Problems by Interview Frequency

Ranked by how often they appear in interviews:

  1.  Design a URL Shortener (bit.ly)           -- Very Common
  2.  Design a Chat System (WhatsApp)            -- Very Common
  3.  Design a News Feed (Twitter)               -- Very Common
  4.  Design a Web Crawler                       -- Common
  5.  Design a Notification System               -- Common
  6.  Design a Rate Limiter                      -- Common
  7.  Design a Key-Value Store                   -- Common
  8.  Design a Search Autocomplete               -- Common
  9.  Design a Video Platform (YouTube)          -- Common
  10. Design a File Storage (Google Drive)        -- Common
  11. Design a Ride-Sharing App (Uber)           -- Moderate
  12. Design a Payment System (Stripe)           -- Moderate
  13. Design a Metrics/Monitoring System         -- Moderate
  14. Design a Ticket Booking System             -- Moderate
  15. Design a Social Graph (LinkedIn)           -- Moderate

  If you can design the top 10, you are prepared for most system design interviews,
  because the same patterns repeat across different systems.

Quick Reference: Design Any System

When you get a system design question you have never seen:

  1. Requirements (5 min)
     "What does the system do? How many users? What latency?"

  2. Estimation (5 min)
     "X million users * Y actions = Z QPS. Z * data_size = storage."

  3. API Design (2 min)
     "Main endpoints: POST /create, GET /read, PUT /update"

  4. Data Model (3 min)
     "Main entities: User, Item, Action. Relationships between them."

  5. High-Level Design (10 min)
     "Client -> LB -> API -> Service -> Cache -> DB"
     Draw it. Label each box.

  6. Deep Dive (15 min)
     Pick the hardest parts. Discuss:
     - How to scale the bottleneck
     - How to handle failures
     - Trade-offs you made and why

This is the final part of the System Design Tutorial series. You now have everything you need to design scalable systems and ace system design interviews. Good luck!