System Design #8: API Design — REST, GraphQL, gRPC

In the previous article, you learned about message queues for asynchronous communication. But most communication in a system is synchronous — a client sends a request and waits for a response. That is where APIs come in.

An API (Application Programming Interface) is a contract between two systems. It defines how they communicate: what requests you can send, what responses you get back, and what format the data is in.

Good API design is critical. A bad API slows down development, confuses users, and is hard to change later.

REST APIs

REST (Representational State Transfer) is the most popular API style. Roy Fielding introduced it in his 2000 doctoral dissertation. REST uses HTTP methods and URLs to interact with resources.

Core Principles

1. Resources and URLs: Everything is a resource identified by a URL. A user is a resource. An order is a resource. A product is a resource.

Resources and URLs:

  /users           -- collection of users
  /users/123       -- a specific user (ID 123)
  /users/123/orders  -- orders belonging to user 123
  /products        -- collection of products
  /products/456    -- a specific product

2. HTTP Methods: Each method represents an action on a resource.

HTTP Methods:

  GET    /users       -- list all users
  GET    /users/123   -- get user 123
  POST   /users       -- create a new user
  PUT    /users/123   -- replace user 123 entirely
  PATCH  /users/123   -- update specific fields of user 123
  DELETE /users/123   -- delete user 123

3. Status Codes: The response status code tells the client what happened.

Common Status Codes:

  200 OK             -- success
  201 Created        -- resource created (after POST)
  204 No Content     -- success, no body (after DELETE)
  400 Bad Request    -- client sent invalid data
  401 Unauthorized   -- not authenticated
  403 Forbidden      -- authenticated but not allowed
  404 Not Found      -- resource does not exist
  409 Conflict       -- resource conflict (e.g., duplicate email)
  429 Too Many Requests -- rate limited
  500 Internal Server Error -- server bug

4. Statelessness: Each request contains all the information needed to process it. The server does not store session state between requests. Authentication tokens are sent with every request.
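The principles above can be sketched together as a tiny in-memory handler. This is only an illustration, not a real server: the `handle` function, the `USERS` store, and the ID scheme are hypothetical names chosen for the example, and nested paths such as /users/123/orders are left out.

```python
import itertools

# Hypothetical in-memory store standing in for a database.
USERS = {}
_next_id = itertools.count(1)


def handle(method, path, body=None):
    """Map an HTTP method and path to a (status_code, response_body) pair."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "users" or len(parts) > 2:
        return 404, {"error": "not found"}

    if len(parts) == 1:  # collection: /users
        if method == "GET":
            return 200, list(USERS.values())
        if method == "POST":
            if not body or not body.get("email"):
                return 400, {"error": "email is required"}
            if any(u["email"] == body["email"] for u in USERS.values()):
                return 409, {"error": "email already exists"}
            user = {**body, "id": next(_next_id)}
            USERS[user["id"]] = user
            return 201, user
        return 405, {"error": "method not allowed"}

    # single resource: /users/{id}
    if not parts[1].isdigit() or int(parts[1]) not in USERS:
        return 404, {"error": "not found"}
    user_id = int(parts[1])
    if method == "GET":
        return 200, USERS[user_id]
    if method == "PUT":  # replace the whole resource
        USERS[user_id] = {**(body or {}), "id": user_id}
        return 200, USERS[user_id]
    if method == "PATCH":  # update only the supplied fields, never the ID
        changes = {k: v for k, v in (body or {}).items() if k != "id"}
        USERS[user_id].update(changes)
        return 200, USERS[user_id]
    if method == "DELETE":
        del USERS[user_id]
        return 204, None
    return 405, {"error": "method not allowed"}
```

Every call carries everything the handler needs (method, path, body), which is the statelessness principle in miniature.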

REST Best Practices

Use nouns, not verbs, in URLs:

Good:  GET /users/123/orders
Bad:   GET /getUserOrders?userId=123

Use plural nouns:

Good:  /users, /products, /orders
Bad:   /user, /product, /order

Pagination for large collections:

GET /users?page=2&limit=20

Response:
{
  "data": [...20 users...],
  "page": 2,
  "limit": 20,
  "total": 1500,
  "next": "/users?page=3&limit=20"
}
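The response envelope above could be produced by a helper along these lines. It is a minimal sketch that assumes the full collection is already in a list; the `paginate` name and the `base_path` parameter are hypothetical.

```python
import math


def paginate(items, page, limit, base_path="/users"):
    """Return one page of `items` in the envelope shape shown above."""
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    total = len(items)
    start = (page - 1) * limit
    last_page = math.ceil(total / limit)
    return {
        "data": items[start:start + limit],
        "page": page,
        "limit": limit,
        "total": total,
        # No "next" link once the last page has been reached.
        "next": f"{base_path}?page={page + 1}&limit={limit}" if page < last_page else None,
    }
```

In a real service the slice would normally be pushed down into the database query rather than applied to an in-memory list.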

Filtering and sorting:

GET /products?category=electronics&sort=price&order=asc&min_price=100
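One way to apply those query parameters on the server is sketched below. The `query_products` function, the product dictionaries, and the list of sortable fields are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlsplit


def query_products(url, products):
    """Apply the category, min_price, sort, and order parameters from `url`."""
    params = {k: v[0] for k, v in parse_qs(urlsplit(url).query).items()}
    result = products
    if "category" in params:
        result = [p for p in result if p["category"] == params["category"]]
    if "min_price" in params:
        result = [p for p in result if p["price"] >= float(params["min_price"])]
    if "sort" in params:
        # Whitelist sortable fields so clients cannot sort on arbitrary keys.
        if params["sort"] not in {"price", "name"}:
            raise ValueError("unsupported sort field")
        descending = params.get("order", "asc") == "desc"
        result = sorted(result, key=lambda p: p[params["sort"]], reverse=descending)
    return result
```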

Consistent error responses:

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Email is required",
    "details": [
      {
        "field": "email",
        "message": "must not be empty"
      }
    ]
  }
}
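Keeping errors consistent is easier when every endpoint builds them through one helper. The sketch below assumes hypothetical `error_response` and `validate_new_user` functions and checks only the email field.

```python
def error_response(code, message, details=None):
    """Build the error envelope shown above."""
    return {"error": {"code": code, "message": message, "details": details or []}}


def validate_new_user(body):
    """Return (400, error_body) for invalid input, or (None, None) when valid."""
    details = []
    if not body.get("email"):
        details.append({"field": "email", "message": "must not be empty"})
    if details:
        return 400, error_response("VALIDATION_ERROR", "Email is required", details)
    return None, None
```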

Use HTTPS always. Never expose APIs over plain HTTP.

REST Limitations

Over-fetching: You request a user but get all 50 fields when you only need the name and email.
Under-fetching: You need a user and their orders. With REST, that is two separate requests: GET /users/123 then GET /users/123/orders.
Versioning complexity: Changing the API without breaking existing clients is hard.
No real-time updates: REST is request-response. For real-time data, you need WebSockets or Server-Sent Events.

GraphQL

GraphQL was created by Facebook in 2012 and open-sourced in 2015. It solves the over-fetching and under-fetching problems of REST by letting clients specify exactly what data they need.

How GraphQL Works

Instead of multiple endpoints, GraphQL has a single endpoint (usually POST /graphql). Clients send a query describing the exact data they want.

GraphQL Query — ask for exactly what you need:

  query {
    user(id: 123) {
      name
      email
      orders {
        id
        total
        status
      }
    }
  }

Response:
  {
    "data": {
      "user": {
        "name": "Alex",
        "email": "alex@example.com",
        "orders": [
          { "id": "1", "total": 59.99, "status": "delivered" },
          { "id": "2", "total": 29.99, "status": "shipped" }
        ]
      }
    }
  }

One request. No over-fetching. No under-fetching.
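The core idea, returning only the fields the client named, can be shown with a toy projection function. This is not how a real GraphQL server is built (real servers parse the query text and call a resolver per field); the `select` function and the nested-dictionary form of the selection are assumptions made for the illustration.

```python
def select(data, selection):
    """Return only the fields named in `selection` (a nested dict; None marks a leaf)."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is None else select(value, sub)
    return result


# Full record as the backend might store it (hypothetical extra fields included).
user = {
    "id": "123",
    "name": "Alex",
    "email": "alex@example.com",
    "phone": "555-0100",
    "orders": [
        {"id": "1", "total": 59.99, "status": "delivered", "warehouse": "A"},
        {"id": "2", "total": 29.99, "status": "shipped", "warehouse": "B"},
    ],
}

# Same shape as the query above: name, email, and three order fields.
selection = {"name": None, "email": None,
             "orders": {"id": None, "total": None, "status": None}}

response = {"data": {"user": select(user, selection)}}
```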

GraphQL Schema

The server defines a schema that describes all available data and operations.

GraphQL Schema:

  type User {
    id: ID!
    name: String!
    email: String!
    orders: [Order!]!
  }

  type Order {
    id: ID!
    total: Float!
    status: String!
    items: [Item!]!
  }

  type Query {
    user(id: ID!): User
    users(page: Int, limit: Int): [User!]!
  }

  type Mutation {
    createUser(name: String!, email: String!): User!
    updateUser(id: ID!, name: String): User!
  }

Queries read data. Mutations write data. The schema is a contract between client and server.

GraphQL Advantages

No over-fetching: clients get exactly the fields they request.
No under-fetching: nested data in a single request (user + orders + items).
Strongly typed: every query is validated against the schema before it runs, and client tooling can run the same check at build time. If a client asks for a field that does not exist, it gets a clear error.
Great for mobile apps: mobile clients often need different data than web clients. Each can request exactly what it needs.
Introspection: clients can query the schema itself to discover available data and types.

GraphQL Challenges

Complexity on the server: resolving nested queries efficiently is hard. A naive implementation causes the N+1 query problem (one query for a user's orders, then one more query per order to load its items).
Caching is harder: REST caches easily by URL. GraphQL queries are usually sent as POST requests with dynamic bodies, so standard HTTP caching does not apply out of the box. Workarounds such as persisted queries or client-side caches add complexity.
No built-in rate limiting: it is hard to predict the cost of a query. A deeply nested query can be very expensive.
Learning curve: more complex than REST for simple APIs.
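The N+1 query problem mentioned above is easiest to see by counting queries. The sketch below simulates a database with a dictionary and a query log; all names (`ITEMS`, `QUERY_LOG`, the resolver functions) are hypothetical. Batching libraries such as DataLoader apply the same idea of collecting keys first and fetching once.

```python
QUERY_LOG = []  # records every simulated database query

ITEMS = {1: ["keyboard"], 2: ["mouse", "cable"], 3: ["monitor"]}  # order_id -> items


def items_for_order(order_id):
    """Simulate one database query for a single order."""
    QUERY_LOG.append(f"SELECT items WHERE order_id = {order_id}")
    return ITEMS[order_id]


def items_for_orders(order_ids):
    """Simulate one database query covering many orders."""
    QUERY_LOG.append(f"SELECT items WHERE order_id IN {tuple(order_ids)}")
    return {oid: ITEMS[oid] for oid in order_ids}


def resolve_naive(order_ids):
    # N+1 shape: one lookup per order.
    return {oid: items_for_order(oid) for oid in order_ids}


def resolve_batched(order_ids):
    # Batched shape: collect the keys first, then issue a single lookup.
    return items_for_orders(order_ids)
```

Both resolvers return the same data; the naive one issues one query per order, while the batched one issues a single query regardless of how many orders there are.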

When to Use GraphQL

Mobile applications: different screens need different data shapes. GraphQL avoids multiple round trips on slow mobile networks.
Multiple frontend clients: web, mobile, and third-party apps each need different data. One GraphQL API serves all of them.
Complex data with relationships: when resources have many nested relationships (users -> orders -> items -> reviews).
Rapid iteration: adding new fields to the schema does not break existing clients, so most changes ship without a new API version.

Used by: Facebook, GitHub, Shopify, Twitter, Airbnb, Netflix (for some internal APIs).

gRPC

gRPC is an open-source RPC framework created by Google and released in 2015 (the name is a recursive acronym: gRPC Remote Procedure Calls). It uses Protocol Buffers for serialization and HTTP/2 for transport. gRPC is designed for high-performance communication between services.

How gRPC Works

You define your service and messages in a .proto file. gRPC generates client and server code in many languages (Go, Java, Python, C++, etc.).

Protocol Buffer definition (user.proto):

  syntax = "proto3";

  service UserService {
    rpc GetUser(GetUserRequest) returns (User);
    rpc CreateUser(CreateUserRequest) returns (User);
    rpc ListUsers(ListUsersRequest) returns (stream User);  // server streaming
  }

  message GetUserRequest {
    int64 id = 1;
  }

  message User {
    int64 id = 1;
    string name = 2;
    string email = 3;
  }

  message CreateUserRequest {
    string name = 1;
    string email = 2;
  }

  // Referenced by ListUsers above; left empty here for brevity.
  message ListUsersRequest {}

From this definition, gRPC generates typed client and server code. The client calls methods like local function calls, but the actual execution happens on a remote server.

gRPC Advantages

High performance: Protocol Buffers are binary (not text like JSON), so messages are smaller and serialization and deserialization are typically several times faster than JSON. The exact gain depends on the language, library, and payload shape.
HTTP/2: multiplexed connections, header compression, bidirectional streaming. Multiple requests over a single TCP connection.
Streaming: four call types are supported: unary (normal request-response), server streaming, client streaming, and bidirectional streaming.
Code generation: generate client libraries in any language from the .proto file. No manual API client code needed.
Strong typing: the .proto file is the contract. Mismatched types are caught at compile time, not runtime.
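To make the size difference concrete, the sketch below hand-encodes the User message from user.proto using the Protocol Buffers wire format (a key per field, varints for integers, length-prefixed bytes for strings) and compares it with JSON. It is a simplified illustration: real code would use the generated classes, and this encoder ignores negative numbers and the proto3 rule that default values are omitted.

```python
import json


def encode_varint(n):
    """Encode a non-negative integer as a Protocol Buffers varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_user(user_id, name, email):
    """Hand-encode the User message (id = 1, name = 2, email = 3)."""
    def key(field_number, wire_type):
        return encode_varint((field_number << 3) | wire_type)

    def string_field(field_number, text):
        data = text.encode("utf-8")
        # Wire type 2 is length-delimited: key, byte length, then the bytes.
        return key(field_number, 2) + encode_varint(len(data)) + data

    # Wire type 0 is a varint, used here for the int64 id.
    return key(1, 0) + encode_varint(user_id) + string_field(2, name) + string_field(3, email)


binary = encode_user(123, "Alex", "alex@example.com")
text = json.dumps({"id": 123, "name": "Alex", "email": "alex@example.com"}).encode("utf-8")
print(len(binary), "bytes as Protocol Buffers vs", len(text), "bytes as JSON")
```

Field names never appear in the binary form, only field numbers, which is where most of the savings come from.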

gRPC Streaming Types

gRPC Streaming:

  1. Unary (normal request-response):
     Client --request--> Server --response--> Client

  2. Server streaming:
     Client --request--> Server --response1--> Client
                                --response2--> Client
                                --response3--> Client
     Example: real-time stock price updates

  3. Client streaming:
     Client --request1--> Server
     Client --request2--> Server
     Client --request3--> Server --response--> Client
     Example: uploading a file in chunks

  4. Bidirectional streaming:
     Client <---> Server (both send messages at any time)
     Example: chat application, real-time collaboration
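The four shapes can be mimicked with plain Python functions and generators. Nothing below touches the network or the gRPC library, and every function name is hypothetical; the point is only the difference between one value and a stream of values on each side. The Python gRPC library uses a similar iterator-based shape for its streaming handlers.

```python
def get_user(request):
    """Unary: one request in, one response out."""
    return {"id": request["id"], "name": "Alex"}


def list_users(request):
    """Server streaming: one request in, a stream of responses out."""
    for user_id in range(1, request["count"] + 1):
        yield {"id": user_id}


def upload_chunks(request_iterator):
    """Client streaming: a stream of requests in, one response out."""
    return {"bytes_received": sum(len(chunk) for chunk in request_iterator)}


def chat(request_iterator):
    """Bidirectional streaming: responses are produced while requests arrive."""
    for message in request_iterator:
        yield f"echo: {message}"
```

This toy `chat` answers each message in lockstep; a real bidirectional stream lets either side send at any time.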

gRPC Challenges

Not browser-friendly: browsers cannot make gRPC calls directly because their network APIs do not expose the HTTP/2 features gRPC relies on, such as trailers. You need gRPC-Web or a REST gateway.
Not human-readable: binary Protocol Buffers cannot be read in a browser or with plain curl. Debugging is harder and needs schema-aware tools such as grpcurl.
Smaller ecosystem: fewer tools, tutorials, and community support compared to REST.
Schema evolution: Protocol Buffers handle backward compatibility well, but only if you follow strict rules (never reuse or renumber field numbers, reserve the numbers of deleted fields, and do not change a field's type incompatibly).

When to Use gRPC

Microservice-to-microservice communication — services talk to each other behind a load balancer. Browser compatibility is not needed.
High performance requirements — low latency, high throughput, minimal bandwidth.
Streaming — real-time data feeds, live updates, file uploads.
Polyglot environments — services written in different languages need a shared interface. The .proto file generates code for all of them.

Used by: Google, Netflix, Uber, Dropbox, Square, Lyft, Docker.

REST vs GraphQL vs gRPC

Feature	REST	GraphQL	gRPC
Format	JSON (text)	JSON (text)	Protocol Buffers (binary)
Transport	HTTP/1.1 or HTTP/2	HTTP/1.1 or HTTP/2	HTTP/2
Schema	OpenAPI/Swagger (optional)	Required (SDL)	Required (.proto)
Over/under-fetching	Common problem	Solved	Not applicable
Streaming	Limited (needs SSE or WebSockets)	Subscriptions	Native support
Browser support	Full	Full	Limited (needs proxy)
Caching	Easy (HTTP caching)	Hard	Hard
Learning curve	Low	Medium	Medium-High
Best for	Public APIs, web apps	Mobile apps, complex data	Internal microservices

Decision Framework

Choosing an API style:

  Building a public API for external developers?
    --> REST (universal, well-understood, cacheable)

  Building a mobile app with complex data needs?
    --> GraphQL (flexible queries, one round trip)

  Communication between internal microservices?
    --> gRPC (high performance, type safety, streaming)

  Need real-time streaming data?
    --> gRPC or GraphQL subscriptions

  Simple CRUD API?
    --> REST (simplest, most tooling)

Many systems use a combination: REST for public APIs, gRPC for internal services, and GraphQL for mobile clients.

API Gateway

An API gateway is a single entry point for all API requests. It sits between clients and backend services, handling cross-cutting concerns.

Without API Gateway:

  [Mobile App] --> [User Service]
  [Mobile App] --> [Order Service]
  [Mobile App] --> [Product Service]
  [Web App]    --> [User Service]
  [Web App]    --> [Order Service]

  Clients must know about every service. Each service handles authentication.


With API Gateway:

  [Mobile App] --> [API Gateway] --> [User Service]
  [Web App]    -->                --> [Order Service]
  [Partner API]-->                --> [Product Service]

  One entry point. Gateway handles:
    - Authentication and authorization
    - Rate limiting
    - Request routing
    - Load balancing
    - SSL termination
    - Request/response transformation
    - Caching
    - Logging and monitoring

Popular API Gateways: Kong, AWS API Gateway, Nginx, Envoy, Traefik.
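Two of the gateway's jobs, authentication and request routing, can be sketched in a few lines. The route table, the `X-API-Key` header name, and the key store are all assumptions for the example; production gateways are configured rather than hand-written like this.

```python
ROUTES = {  # path prefix -> backend service (hypothetical names)
    "/users": "user-service",
    "/orders": "order-service",
    "/products": "product-service",
}
API_KEYS = {"secret-key-1": "mobile-app"}  # hypothetical key store


def gateway(path, headers):
    """Authenticate the caller, then pick the backend that should receive the request."""
    client = API_KEYS.get(headers.get("X-API-Key", ""))
    if client is None:
        return 401, None  # rejected before any backend service is touched
    for prefix, service in ROUTES.items():
        if path == prefix or path.startswith(prefix + "/"):
            return 200, {"forward_to": service, "client": client}
    return 404, None
```

Because the check happens once at the edge, the backend services no longer need their own copy of the authentication logic.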

Webhooks

Webhooks reverse the usual direction of an API call (they are sometimes called reverse APIs). Instead of the client polling the server for updates, the server pushes events to the client when something happens.

Polling (inefficient):

  Client: "Any new orders?" --> Server: "No"
  Client: "Any new orders?" --> Server: "No"
  Client: "Any new orders?" --> Server: "No"
  Client: "Any new orders?" --> Server: "Yes! Here is the order."
  (Wasted 3 requests)


Webhook (efficient):

  Client registers: "Send order events to https://myapp.com/webhooks/orders"

  [Nothing happens]
  [Nothing happens]
  Server: "New order!" --> POST https://myapp.com/webhooks/orders
                           { "event": "order.created", "data": { ... } }

Used by: Stripe (payment events), GitHub (repository events), Twilio (SMS events), Shopify (store events).

Webhook best practices:

Verify webhook signatures to prevent spoofing
Respond with a 2xx status quickly, then process the event asynchronously
Handle retries: most senders retry when they do not receive a 2xx response, so make handlers idempotent (the same event may arrive more than once)
Use a queue to process webhook events (do not block on processing)
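A receiver that follows these practices might look like the sketch below. It assumes the sender signs the raw request body with HMAC-SHA256 and a shared secret and sends the hex digest along with the request; the exact scheme and header name differ between providers, so check the provider's documentation. The function names and the in-process queue are hypothetical.

```python
import hashlib
import hmac
import queue

EVENTS = queue.Queue()  # a background worker would consume from this queue


def sign(secret: bytes, payload: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature of the raw payload."""
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def verify(secret: bytes, payload: bytes, signature: str) -> bool:
    """Check the signature using a constant-time comparison."""
    expected = sign(secret, payload)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


def handle_webhook(secret: bytes, payload: bytes, signature: str) -> int:
    """Return the HTTP status code to send back to the webhook sender."""
    if not verify(secret, payload, signature):
        return 401  # reject spoofed or tampered events
    EVENTS.put(payload)  # enqueue only; the slow work happens elsewhere
    return 200
```

The signature must be computed over the exact bytes received, before any JSON parsing, or a re-serialized body may no longer match.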

API Versioning

APIs evolve over time. You add fields, remove fields, change behavior. But you cannot break existing clients. Versioning lets you make changes while keeping old clients working.

URL Path Versioning

/api/v1/users/123
/api/v2/users/123

The most common approach. Simple and clear. Easy to route in a load balancer or API gateway.

Header Versioning

GET /users/123
Accept: application/vnd.myapi.v2+json

The URL stays clean. But it is harder to test (you cannot paste the URL in a browser) and harder to cache, because caches must key on the Accept header (via Vary) as well as the URL.

Query Parameter Versioning

GET /users/123?version=2

Simple but easy to forget. Not commonly used for major versions.

Best Practice

Use URL path versioning for major versions (v1, v2). Use backward-compatible changes (adding optional fields, not removing fields) to avoid new versions as long as possible.
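A router can support both the path and header styles with a small resolver. The sketch below reuses the `/api/v1/...` layout and the `application/vnd.myapi.v2+json` media type from the examples above; the `resolve_version` function and its default of version 1 are assumptions.

```python
import re


def resolve_version(path, headers, default=1):
    """Return (version, path_without_version_prefix) for a request."""
    # URL path versioning: /api/v2/users/123
    match = re.match(r"^/api/v(\d+)(/.*)$", path)
    if match:
        return int(match.group(1)), match.group(2)
    # Header versioning: Accept: application/vnd.myapi.v2+json
    match = re.search(r"application/vnd\.myapi\.v(\d+)\+json", headers.get("Accept", ""))
    if match:
        return int(match.group(1)), path
    return default, path
```

Falling back to a fixed default keeps unversioned clients working, but it also pins them to that version until they opt in to a newer one.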

Interview Tips

When discussing API design in a system design interview:

Start with the API. “Let me define the API first. For a URL shortener, we need POST /urls to create a short URL and GET /{shortCode} to redirect.”
Choose REST for external APIs. “The public API will be REST because it is the most widely understood and easiest for third-party developers.”
Choose gRPC for internal communication. “Between microservices, I will use gRPC for lower latency and type safety.”
Mention pagination. “The list endpoint returns paginated results: GET /users?page=1&limit=20.”
Discuss rate limiting. “The API gateway handles rate limiting — 100 requests per minute per API key.”
Mention an API gateway. “All requests go through an API gateway that handles authentication, rate limiting, and routing.”
Know the trade-offs. If the interviewer asks “why not GraphQL?” be ready to explain when GraphQL is better and when it is overkill.
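If the interviewer digs into the rate limit quoted above (100 requests per minute per API key), a fixed-window counter is the simplest design to describe. The class below is a minimal single-process sketch with hypothetical names; it never evicts old windows and allows bursts at window boundaries, which a sliding-window or token-bucket design would smooth out.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Allow at most `limit` requests per API key in each fixed time window."""

    def __init__(self, limit=100, window_seconds=60, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.counts = defaultdict(int)  # (api_key, window_index) -> request count

    def allow(self, api_key):
        window_index = int(self.clock() // self.window)
        key = (api_key, window_index)
        if self.counts[key] >= self.limit:
            return False  # the caller should respond with 429 Too Many Requests
        self.counts[key] += 1
        return True
```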

Related Articles

System Design #7: Message Queues — Kafka, RabbitMQ, SQS
System Design #3: Load Balancers — Routing and balancing traffic
Go Tutorial #16: REST API with Gin — Building REST APIs in Go
Go Tutorial #26: gRPC — gRPC services in Go

What’s Next?

In the next article, System Design #9: Microservices vs Monolith, you will learn:

Monolith vs microservices: when to use each
Service discovery and inter-service communication
The Saga pattern for distributed transactions
Database per service pattern
How Netflix migrated from monolith to microservices

This is part 8 of the System Design Tutorial series. Follow along to learn system design from scratch.
