In the previous article, you designed a video streaming service. Now let us design a file storage and sync system like Google Drive, Dropbox, or OneDrive.
File storage systems are complex because they need to sync files across multiple devices, handle conflicts, and deduplicate data. Let us break it down.
Step 1: Requirements
Functional Requirements
- Upload and download files
- Sync files across devices (desktop, mobile, web)
- Share files and folders with other users
- File versioning (view and restore previous versions)
- Offline access (work offline, sync when reconnected)
Non-Functional Requirements
- High reliability — files must never be lost
- Fast sync — changes appear on other devices within seconds
- Low bandwidth usage — only transfer changed parts of files
- Support 500 million users with 100 million daily active users
Step 2: Estimation
Users: 500M total, 100M DAU
Storage:
Average files per user: 500 files
Average file size: 500 KB
Total files: 250 billion files
Total storage: 500M users * 500 files * 500 KB = 125 PB
Daily activity:
File uploads/edits: 200M per day
File downloads: 500M per day
File syncs: 1 billion per day (across devices)
Upload bandwidth:
200M uploads/day * average 200 KB change = 40 TB/day upload
40 TB/day / 86,400 sec/day = ~460 MB/sec average upload
Metadata:
Each file: ~200 bytes of metadata (name, size, hash, version, path)
250 billion files * 200 bytes = 50 TB of metadata
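The arithmetic above can be reproduced in a few lines of Python. This is only a sanity check of the numbers in this step, assuming decimal units (1 KB = 1,000 bytes); the variable names are illustrative.

```python
# Back-of-envelope estimates, using decimal units (1 KB = 1,000 bytes).
KB, MB, TB, PB = 10**3, 10**6, 10**12, 10**15

total_users = 500_000_000
files_per_user = 500
avg_file_size = 500 * KB

total_files = total_users * files_per_user
total_storage = total_files * avg_file_size

uploads_per_day = 200_000_000
avg_change_size = 200 * KB
daily_upload = uploads_per_day * avg_change_size
upload_rate = daily_upload / 86_400  # bytes per second (86,400 seconds per day)

metadata_per_file = 200  # bytes
total_metadata = total_files * metadata_per_file

print(f"Total files:   {total_files / 1e9:.0f} billion")
print(f"Total storage: {total_storage / PB:.0f} PB")
print(f"Daily upload:  {daily_upload / TB:.0f} TB")
print(f"Upload rate:   {upload_rate / MB:.0f} MB/sec")
print(f"Metadata:      {total_metadata / TB:.0f} TB")
```

These are averages. Peak upload traffic will be noticeably higher than the daily mean, so treat the result as a floor when sizing capacity.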
Step 3: Block Storage — The Key Insight
Instead of storing files as single blobs, split them into fixed-size blocks (typically 4 MB). Dropbox is known to use this approach, and other sync services rely on similar chunking.
Block Storage:
File: report.pdf (12 MB)
Without blocks:
Store as one 12 MB blob.
If 1 byte changes, re-upload the entire 12 MB.
With blocks (4 MB each):
Block 1: bytes 0 to 4MB hash: abc123
Block 2: bytes 4MB to 8MB hash: def456
Block 3: bytes 8MB to 12MB hash: ghi789
File metadata: [abc123, def456, ghi789]
If bytes in Block 2 are overwritten in place (the file length stays the same):
Block 1: unchanged hash: abc123 (skip upload)
Block 2: changed hash: xyz999 (upload only this block)
Block 3: unchanged hash: ghi789 (skip upload)
Upload: 4 MB instead of 12 MB (67% bandwidth savings)
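Let us sketch the block comparison in Python. This is a minimal illustration, not how any particular product implements it: the function names are made up for this example, and SHA-256 is assumed as the block hash (the same hash used in the deduplication step later).

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB, as in the example above

def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and return the SHA-256 hex digest of each."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(old: list[str], new: list[str]) -> list[int]:
    """Return indexes of blocks in `new` that differ from `old` or did not exist before."""
    return [i for i, h in enumerate(new) if i >= len(old) or old[i] != h]

# A 12 MB file, then one byte overwritten inside the second block.
original = bytes(12 * 1024 * 1024)
edited = bytearray(original)
edited[5 * 1024 * 1024] = 0xFF

old_hashes = block_hashes(original)
new_hashes = block_hashes(bytes(edited))
print(changed_blocks(old_hashes, new_hashes))  # [1]: only the second block is uploaded
```

Note that this comparison is positional. It works well for in-place edits, but inserting bytes near the start of a file shifts every later block boundary, so all following blocks would hash differently.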
Why 4 MB Blocks?
Block Size Trade-offs:
Small blocks (64 KB):
+ More granular sync (upload only tiny changes)
+ Better deduplication
- More metadata overhead (more block references per file)
- More HTTP requests for upload/download
Large blocks (16 MB):
+ Fewer metadata entries
+ Fewer HTTP requests
- Less granular sync (upload 16 MB for a 1-byte change)
- Worse deduplication
4 MB (Dropbox's choice):
Good balance between sync granularity and metadata overhead.
Most file edits modify less than 4 MB.
One HTTP request per block is manageable.
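To make the trade-off concrete, here is a small calculation for the 12 MB file from the earlier example. It is a sketch with illustrative names; it counts one block reference and one upload request per block, which is an assumption of this example.

```python
import math

KB, MB = 1024, 1024 * 1024

def blocks_needed(file_size: int, block_size: int) -> int:
    """Number of blocks (and block references in metadata) for one file."""
    return math.ceil(file_size / block_size)

file_size = 12 * MB
for label, block_size in [("64 KB", 64 * KB), ("4 MB", 4 * MB), ("16 MB", 16 * MB)]:
    n = blocks_needed(file_size, block_size)
    # A 1-byte in-place edit touches exactly one block, which is re-uploaded whole.
    reupload = min(block_size, file_size)
    print(f"{label:>6}: {n:>3} block refs, {reupload // KB:>6} KB re-uploaded for a 1-byte edit")
```

With 64 KB blocks the file needs 192 references, with 4 MB blocks it needs 3, and with 16 MB blocks a single block holds the whole file, so any edit re-uploads all 12 MB.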
Step 4: Deduplication
Content-based deduplication saves storage by storing identical blocks only once.
Deduplication:
Alex uploads photo.jpg (8 MB):
Block A1: hash = abc123
Block A2: hash = def456
Sam uploads the same photo.jpg:
Block S1: hash = abc123 --> already exists! Skip storage.
Block S2: hash = def456 --> already exists! Skip storage.
Storage used: 8 MB (not 16 MB)
Both users' file metadata points to the same blocks.
How it works:
1. Client computes SHA-256 hash of each block
2. Client asks server: "Do you have block abc123?"
3. Server: "Yes" --> skip upload
4. Server: "No" --> upload the block
Deduplication ratio (rough figures; actual savings depend heavily on the workload):
Across all users: 30-50% storage savings
Within one user (file versions): 80-90% savings
Corporate environments (shared files): 60-70% savings
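The idea can be sketched with an in-memory content-addressed store. `BlockStore` and `upload_file` are hypothetical names for this example, and the existence check happens inside `put` rather than as a separate client request, purely to keep the sketch short.

```python
import hashlib

class BlockStore:
    """In-memory stand-in for a content-addressed block store (hypothetical)."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def has(self, block_hash: str) -> bool:
        return block_hash in self._blocks

    def put(self, block: bytes) -> str:
        """Store the block under its SHA-256 hash unless it is already present."""
        block_hash = hashlib.sha256(block).hexdigest()
        if not self.has(block_hash):
            self._blocks[block_hash] = block
        return block_hash

    def bytes_stored(self) -> int:
        return sum(len(b) for b in self._blocks.values())

def upload_file(store: BlockStore, data: bytes, block_size: int) -> list[str]:
    """Return the file's block list; identical blocks are stored only once."""
    return [store.put(data[i:i + block_size]) for i in range(0, len(data), block_size)]

store = BlockStore()
photo = b"A" * 4096 + b"B" * 4096  # tiny stand-in for the 8 MB photo
alex_blocks = upload_file(store, photo, block_size=4096)
sam_blocks = upload_file(store, photo, block_size=4096)

print(alex_blocks == sam_blocks)  # True: both files reference the same blocks
print(store.bytes_stored())       # 8192: stored once, not twice
```

Because the key is derived from the content, two users never need to coordinate: identical bytes always map to the same key.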
Step 5: File Sync Engine
The sync engine detects local changes and synchronizes them with the server.
Detecting Changes
Change Detection:
Desktop Client:
1. Use filesystem watcher (inotify on Linux, FSEvents on macOS)
Detects: file created, modified, deleted, moved
2. When a change is detected:
a. Compute block hashes for the changed file
b. Compare with stored block hashes
c. Upload only the changed blocks
Mobile/Web Client:
No filesystem watcher.
Use manual sync or periodic polling.
Server-Side Change Notification:
1. When server receives a file update from Device A
2. Server sends notification to all other devices of the same user
3. Notification methods:
- WebSocket (if connected): instant notification
- Long polling: client holds open connection
- Push notification (mobile): APNs/FCM
4. Device B receives notification: "file X changed"
5. Device B downloads only the changed blocks
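Here is one way the server could pick a notification channel per device. This is a sketch only: the `Device` fields and the priority order (WebSocket first, then mobile push, then long polling) are assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Device:
    device_id: str
    kind: str                  # "desktop", "mobile", or "web"
    websocket_connected: bool

def pick_channel(device: Device) -> str:
    """Choose how to tell a device that a file changed (assumed priority order)."""
    if device.websocket_connected:
        return "websocket"
    if device.kind == "mobile":
        return "push"          # APNs/FCM
    return "long_poll"         # answered on the client's held-open request

def plan_notifications(devices: list[Device], source_id: str) -> dict[str, str]:
    """Map every device except the one that made the change to a channel."""
    return {d.device_id: pick_channel(d) for d in devices if d.device_id != source_id}

devices = [
    Device("laptop", "desktop", websocket_connected=True),
    Device("desktop", "desktop", websocket_connected=True),
    Device("phone", "mobile", websocket_connected=False),
    Device("browser", "web", websocket_connected=False),
]
plan = plan_notifications(devices, source_id="laptop")
print(plan)  # {'desktop': 'websocket', 'phone': 'push', 'browser': 'long_poll'}
```

The device that made the change is skipped because it already has the new blocks.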
Sync Flow
Sync Flow (Alex edits a file on laptop):
1. Alex edits report.pdf on laptop
2. Laptop's filesystem watcher detects the change
3. Sync engine computes new block hashes:
Block 1: abc123 (unchanged)
Block 2: xyz999 (CHANGED)
Block 3: ghi789 (unchanged)
4. Upload Block 2 (xyz999) to block storage
5. Update file metadata:
version 2: [abc123, xyz999, ghi789]
6. Server notifies Alex's desktop and phone: "report.pdf v2 available"
7. Desktop downloads Block 2 (xyz999)
8. Phone downloads Block 2 (xyz999) when on Wi-Fi
[Laptop] --changed blocks--> [Block Storage]
                                    |
                            [Update Metadata]
                                    |
                            [Notify Devices]
                              /          \
              [Desktop: sync]      [Phone: sync when ready]
Step 6: Conflict Resolution
What happens when two devices edit the same file at the same time?
Conflict Scenario:
Alex edits report.pdf on laptop (offline)
Alex also edits report.pdf on desktop (offline)
Laptop comes online: uploads version 2 (laptop changes)
Desktop comes online: uploads version 2 (desktop changes)
Conflict! Two different version 2s exist.
Resolution Strategies
Strategy 1: Last Writer Wins (simple)
The most recent upload overwrites the other.
Desktop uploaded at 10:05, laptop at 10:03.
Desktop version wins. Laptop changes are lost.
Used by: some simple systems
Problem: data loss
Strategy 2: Create Conflict Copy (Dropbox approach)
Keep both versions.
Files:
report.pdf (laptop version, synced first)
report (conflicted copy).pdf (desktop version, synced second)
The user manually resolves the conflict.
No data loss, but requires user action.
Used by: Dropbox, Google Drive (for offline edits)
Strategy 3: Operational Merge (Google Docs approach)
Track individual edit operations, not file versions.
Merge operations using Operational Transform (OT) or CRDTs.
Alex types "Hello" at position 0
Sam types "World" at position 0
Merged: "HelloWorld" or "WorldHello" (deterministic merge)
Used by: Google Docs, Notion, Figma
Only works for structured data (documents, spreadsheets)
Not practical for binary files (images, PDFs)
Recommended approach for file storage:
Online edits: lock the file (prevent concurrent edits)
Offline edits: create conflict copies (Dropbox approach)
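The conflict copy approach can be sketched as a version check at commit time. Everything here is hypothetical: the `MetadataStore` class, the `base_version` parameter, and the copy naming. The sketch also assumes that the upload arriving second is the one saved as the copy.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

@dataclass
class FileRecord:
    name: str
    version: int
    blocks: list[str]

@dataclass
class MetadataStore:
    files: dict[str, FileRecord] = field(default_factory=dict)

    def commit(self, name: str, base_version: int, blocks: list[str], device: str) -> str:
        """Accept an upload made against base_version; return the name it was saved under."""
        current = self.files.get(name)
        if current is None or current.version == base_version:
            # Nobody else committed since this device last synced: normal update.
            self.files[name] = FileRecord(name, base_version + 1, blocks)
            return name
        # A newer version already exists: keep both instead of overwriting.
        path = PurePosixPath(name)
        copy_name = f"{path.stem} ({device}'s conflicted copy){path.suffix}"
        self.files[copy_name] = FileRecord(copy_name, 1, blocks)
        return copy_name

store = MetadataStore()
store.files["report.pdf"] = FileRecord("report.pdf", 1, ["abc123", "def456", "ghi789"])

# Both devices edited version 1 while offline; the laptop reconnects first.
laptop_name = store.commit("report.pdf", 1, ["abc123", "xyz999", "ghi789"], "laptop")
desktop_name = store.commit("report.pdf", 1, ["mno111", "def456", "ghi789"], "desktop")
print(laptop_name)   # report.pdf
print(desktop_name)  # report (desktop's conflicted copy).pdf
```

The key design choice is that each upload states which version it was based on. Without that, the server cannot tell a normal update from a conflicting one.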
Step 7: File Versioning
Versioning:
File: report.pdf
Version 1 (created): blocks [abc123, def456, ghi789]
Version 2 (block 2 edited): blocks [abc123, xyz999, ghi789]
Version 3 (block 1 edited): blocks [mno111, xyz999, ghi789]
Storage for 3 versions:
Without dedup: 12 MB * 3 = 36 MB
With block dedup: only unique blocks stored
abc123 (4MB) + def456 (4MB) + ghi789 (4MB) + xyz999 (4MB) + mno111 (4MB) = 20 MB
Version metadata:
| version | blocks | timestamp | size |
|---------|--------------------------------|---------------------|-------|
| 1 | [abc123, def456, ghi789] | 2026-06-01 10:00 | 12 MB |
| 2 | [abc123, xyz999, ghi789] | 2026-06-01 14:30 | 12 MB |
| 3 | [mno111, xyz999, ghi789] | 2026-06-02 09:15 | 12 MB |
Restore version 1:
Download blocks abc123, def456, ghi789 and reassemble.
Retention policy:
Keep all versions for 30 days.
After 30 days, keep one version per week.
After 1 year, keep one version per month.
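The retention policy can be expressed as a small filter over version timestamps. This is a sketch under stated assumptions: "one per week" is read as one per ISO calendar week, "one per month" as one per calendar month, and the newest version in each bucket is the one kept. The function name is illustrative.

```python
from datetime import datetime, timedelta

def versions_to_keep(timestamps: list[datetime], now: datetime) -> list[datetime]:
    """Apply the retention policy above; return the timestamps to keep, newest first."""
    kept: list[datetime] = []
    seen_buckets: set[tuple] = set()
    for ts in sorted(timestamps, reverse=True):   # newest first
        age = now - ts
        if age <= timedelta(days=30):
            kept.append(ts)                       # keep every recent version
            continue
        if age <= timedelta(days=365):
            iso = ts.isocalendar()
            bucket = ("week", iso[0], iso[1])     # ISO year and week number
        else:
            bucket = ("month", ts.year, ts.month)
        if bucket not in seen_buckets:            # newest version in each bucket wins
            seen_buckets.add(bucket)
            kept.append(ts)
    return kept

now = datetime(2026, 6, 30)
versions = [
    datetime(2026, 6, 29), datetime(2026, 6, 28),   # under 30 days old
    datetime(2026, 5, 6), datetime(2026, 5, 4),     # same week, over 30 days old
    datetime(2025, 3, 20), datetime(2025, 3, 10),   # same month, over a year old
]
kept = versions_to_keep(versions, now)
print([ts.date().isoformat() for ts in kept])
# ['2026-06-29', '2026-06-28', '2026-05-06', '2025-03-20']
```

Dropping a version only removes its metadata row. A block can be deleted from storage only when no remaining version of any file references it.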
Step 8: Sharing and Permissions
Sharing Model:
Permission levels:
- Owner: full control (edit, share, delete)
- Editor: can edit and view
- Viewer: can only view and download
Sharing a file:
1. Owner creates a sharing record:
(file_id, user_id, permission_level)
2. Shared user receives a notification
3. File appears in their "Shared with me" folder
Sharing a link:
1. Generate a unique link token
2. Anyone with the link can access (view-only or edit)
3. Link can have expiration and password protection
Folder sharing:
Sharing a folder shares ALL files and subfolders.
Permissions cascade down the folder tree.
Individual files can have overriding permissions.
Database:
sharing (file_id, user_id, permission, created_at, expires_at)
share_links (link_token, file_id, permission, password_hash, expires_at)
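Cascading permissions with per-file overrides can be resolved by walking up the folder tree. The sketch below assumes that the nearest explicit grant wins; the paths, user names, and dictionaries are made up for this example and stand in for the sharing table.

```python
from typing import Optional

# node -> parent folder (None for the root). Hypothetical in-memory file tree.
PARENT: dict[str, Optional[str]] = {
    "/projects": None,
    "/projects/q3": "/projects",
    "/projects/q3/report.pdf": "/projects/q3",
    "/projects/q3/salaries.xlsx": "/projects/q3",
}

# (node, user) -> permission, mirroring rows of the sharing table.
SHARING: dict[tuple[str, str], str] = {
    ("/projects", "sam"): "editor",
    ("/projects/q3/salaries.xlsx", "sam"): "viewer",  # per-file override
}

def effective_permission(node: str, user: str) -> Optional[str]:
    """Walk from the node up to the root; the nearest explicit grant wins."""
    current: Optional[str] = node
    while current is not None:
        grant = SHARING.get((current, user))
        if grant is not None:
            return grant
        current = PARENT[current]
    return None  # no grant anywhere on the path: no access

print(effective_permission("/projects/q3/report.pdf", "sam"))     # editor (inherited)
print(effective_permission("/projects/q3/salaries.xlsx", "sam"))  # viewer (override)
print(effective_permission("/projects/q3/report.pdf", "jo"))      # None
```

Resolving at read time keeps sharing a folder cheap (one row), at the cost of a tree walk per access check.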
Step 9: System Architecture
Complete Architecture:
        [Client (Desktop/Mobile/Web)]
                     |
              [Load Balancer]
                     |
               [API Gateway]
             /       |        \
    [File         [Sync           [Sharing
     Service]      Service]        Service]
       |             |                |
    [Block        [Notification   [Permission
     Storage]      Service]        Service]
       |             |                |
     [S3]         [WebSocket      [PostgreSQL]
                   + Push]
Metadata:
[Metadata DB (PostgreSQL)]
- File tree (folders, names, paths)
- File versions
- Block references per version
- User storage quotas
Block Storage:
[Block Store (S3)]
- Content-addressed blocks
- Key = SHA-256 hash of block content
- Replicated across 3+ data centers
Sync:
[Sync Service]
- Receives block changes from clients
- Updates metadata
- Notifies other devices via WebSocket or push
Search:
[Search Service (Elasticsearch)]
- Index file names, content (for docs), tags
- Full-text search across all user's files
Upload Flow
Upload Flow (new file):
1. Client splits file into 4 MB blocks
2. Client computes SHA-256 hash of each block
3. Client sends block hashes to server:
"I have blocks [abc123, def456, ghi789]"
4. Server checks which blocks already exist:
"I need blocks [def456]. I already have [abc123, ghi789]."
5. Client uploads only the missing blocks
6. Server stores the blocks in S3
7. Server creates file metadata:
{ name: "report.pdf", version: 1, blocks: [...], size: 12MB }
8. Server notifies other devices of the new file
This is called "upload deduplication" — skip blocks the server already has.
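The negotiation in steps 3 to 7 can be sketched with an in-memory server. The `Server` class and its method names are hypothetical; a real service would expose these as API endpoints and keep blocks in object storage.

```python
import hashlib

class Server:
    """Hypothetical in-memory server: a block store plus file metadata."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}
        self.files: dict[str, dict] = {}

    def missing_blocks(self, hashes: list[str]) -> list[str]:
        """Step 4: report which offered hashes are not stored yet (no duplicates)."""
        return list(dict.fromkeys(h for h in hashes if h not in self.blocks))

    def put_block(self, block_hash: str, block: bytes) -> None:
        """Step 6: store a block after checking it matches the claimed hash."""
        if hashlib.sha256(block).hexdigest() != block_hash:
            raise ValueError("block content does not match its hash")
        self.blocks[block_hash] = block

    def commit_file(self, name: str, hashes: list[str], size: int) -> dict:
        """Step 7: record metadata once every referenced block is present."""
        if self.missing_blocks(hashes):
            raise ValueError("cannot commit: some blocks were never uploaded")
        meta = {"name": name, "version": 1, "blocks": hashes, "size": size}
        self.files[name] = meta
        return meta

def upload(server: Server, name: str, data: bytes, block_size: int) -> int:
    """Client side of steps 1-5; returns how many bytes were actually sent."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    hashes = [hashlib.sha256(b).hexdigest() for b in blocks]
    by_hash = dict(zip(hashes, blocks))
    sent = 0
    for h in server.missing_blocks(hashes):
        server.put_block(h, by_hash[h])
        sent += len(by_hash[h])
    server.commit_file(name, hashes, len(data))
    return sent

server = Server()
first = upload(server, "report.pdf", b"a" * 8 + b"b" * 8 + b"c" * 8, block_size=8)
second = upload(server, "report-copy.pdf", b"a" * 8 + b"x" * 8 + b"c" * 8, block_size=8)
print(first, second)  # 24 8: the second file shares two blocks with the first
```

Verifying the hash on the server matters: a client that claims the wrong hash for a block would otherwise corrupt every file that references it.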
Download Flow
Download Flow:
1. Client requests file metadata:
"Give me report.pdf version 3"
2. Server returns block list: [mno111, xyz999, ghi789]
3. Client checks local cache:
"I already have [xyz999, ghi789] from version 2"
4. Client downloads only missing block: [mno111]
5. Client reassembles the file from all blocks
Bandwidth used: 4 MB instead of 12 MB
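On the client, steps 3 to 5 reduce to a set difference followed by an ordered join. The function names and the placeholder block contents below are illustrative.

```python
def plan_download(block_list: list[str], local_cache: set[str]) -> list[str]:
    """Return the blocks to fetch, in order, skipping anything already cached."""
    return list(dict.fromkeys(h for h in block_list if h not in local_cache))

def reassemble(block_list: list[str], blocks: dict[str, bytes]) -> bytes:
    """Concatenate blocks in the order given by the file's metadata."""
    return b"".join(blocks[h] for h in block_list)

# Version 3 from the example above; the client still holds version 2's blocks.
version_3 = ["mno111", "xyz999", "ghi789"]
cache = {"abc123": b"old-1", "xyz999": b"new-2", "ghi789": b"old-3"}

to_fetch = plan_download(version_3, set(cache))
print(to_fetch)  # ['mno111']

cache["mno111"] = b"new-1"  # stand-in for the downloaded block
restored = reassemble(version_3, cache)
print(restored)  # b'new-1new-2old-3'
```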
Step 10: Scaling
Scaling Strategy:
Block Storage (S3):
- S3 scales automatically to petabytes
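- S3 is designed for 99.999999999% (11 nines) durability and stores objects redundantly across at least three Availability Zones
- Use S3 Intelligent-Tiering to move blocks between access tiers automatically
- Frequently accessed blocks stay in the frequent-access tier
- Rarely accessed blocks move to a cheaper infrequent-access tier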
Metadata Database:
- PostgreSQL with read replicas
- Shard by user_id (each user's file tree on one shard)
- 50 TB of metadata across 10-20 shards
Sync Service:
- Stateless, horizontally scaled
- Each instance handles WebSocket connections
- Redis pub/sub for cross-instance notifications
CDN:
- Shared files and public links served via CDN
- Personal files fetched directly from S3 (no CDN benefit for single-user files)
Multi-Region:
- Block storage replicated across regions
- Metadata primary in user's home region
- Cross-region sync for users who travel
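Sharding metadata by user_id needs a stable mapping from user to shard. Here is a minimal sketch using hash-modulo routing; the shard count of 16 and the function name are assumptions for this example.

```python
import hashlib

NUM_SHARDS = 16  # assumption: within the 10-20 range suggested above

def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user to a shard with a stable hash.

    Python's built-in hash() is salted per process for strings, so a
    cryptographic digest is used to get the same answer on every server.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

for user in ["alex", "sam"]:
    print(user, "-> shard", shard_for_user(user))
```

Plain modulo routing is simple, but changing the shard count remaps most users. Consistent hashing or a user-to-shard lookup table are common alternatives if you expect to add shards later.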
Common Mistakes
Storing files as single blobs. This wastes bandwidth on every sync. Split files into blocks so you only transfer changes.
No deduplication. Without dedup, identical files uploaded by different users (or file versions) waste storage. Content-addressed blocks solve this.
Synchronous conflict resolution. Do not block the upload because of a conflict. Accept both versions and resolve later (conflict copy or merge).
Polling for changes. Checking every few seconds wastes bandwidth and battery. Use WebSocket or long polling for change notifications.
Interview Tips
Start with the block storage concept. “Files are split into 4 MB blocks identified by their content hash. This enables deduplication and efficient sync.”
Explain sync with a concrete example. “When a user edits a 12 MB file, only the changed 4 MB block is uploaded and synced to other devices.”
Discuss conflict resolution. “For concurrent offline edits, I will use the conflict copy approach — keep both versions and let the user decide.”
Mention deduplication. “Content-based deduplication saves 30-50% storage. If two users upload the same photo, it is stored only once.”
Talk about notification. “I will use WebSocket for real-time sync notifications. When a file changes on one device, other devices are notified immediately.”
Related Articles
- System Design #16: Design a Video Streaming Service — Content storage and CDN
- System Design #5: Databases — SQL vs NoSQL for metadata
- System Design #7: Message Queues — Event-driven notifications
- System Design #4: Caching — CDN for shared files
What’s Next?
In the next article, System Design #18: Design a Notification System, you will learn:
- Multi-channel notifications: push, email, SMS
- Priority queues for different notification types
- Deduplication and rate limiting
- How to handle millions of notifications per minute
This is part 17 of the System Design Tutorial series. Follow along to learn system design from scratch.