17 - API Design and Gateways
REST
REST is the most common API style for web services. It maps CRUD operations to HTTP methods:
- GET /users/123 — read a user
- POST /users — create a user
- PUT /users/123 — replace a user
- PATCH /users/123 — partially update a user
- DELETE /users/123 — delete a user
REST is simple, cacheable (GET responses can be cached by CDNs), and well-understood. Its weakness: over-fetching (you get the entire resource even if you need one field) and under-fetching (you need multiple requests to assemble related data).
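The method-to-operation mapping can be sketched as a small dispatcher. This is only an illustration: the in-memory `users` store and the `handle` function are hypothetical names, and a real service would use a web framework and a database.

```python
import itertools

users = {}                      # hypothetical in-memory store: id -> user fields
_ids = itertools.count(1)       # simple id generator for created users

def handle(method, path, body=None):
    """Dispatch a request to a CRUD action and return (status_code, payload)."""
    parts = path.strip("/").split("/")          # "/users/123" -> ["users", "123"]
    if parts[0] != "users":
        return 404, None
    if len(parts) == 1:
        if method == "POST":                    # create a user
            user_id = str(next(_ids))
            users[user_id] = dict(body or {})
            return 201, {"id": user_id, **users[user_id]}
        return 405, None
    user_id = parts[1]
    if user_id not in users:
        return 404, None
    if method == "GET":                         # read
        return 200, {"id": user_id, **users[user_id]}
    if method == "PUT":                         # replace the whole resource
        users[user_id] = dict(body or {})
        return 200, {"id": user_id, **users[user_id]}
    if method == "PATCH":                       # update only the supplied fields
        users[user_id].update(body or {})
        return 200, {"id": user_id, **users[user_id]}
    if method == "DELETE":                      # delete
        del users[user_id]
        return 204, None
    return 405, None
```

Note the difference between PUT and PATCH in this sketch: PUT discards any field missing from the request body, while PATCH leaves unmentioned fields untouched.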
GraphQL
GraphQL lets clients specify exactly what data they need in a single request. No over-fetching, no under-fetching.
    query {
      user(id: 123) {
        name
        orders(last: 5) {
          total
          status
        }
      }
    }
One request, exactly the data you need. Great for mobile clients with bandwidth constraints or complex UIs that pull from many resources.
The tradeoff: harder to cache (POST requests with dynamic queries), more complex server implementation, and potential for expensive queries that join too much data.
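To make the idea of field selection concrete, here is a rough sketch that is not a GraphQL implementation. It assumes the query has already been parsed into a nested dict (a hypothetical representation where `None` marks a leaf field), and the `depth` helper illustrates one way a server might guard against overly expensive queries.

```python
def select(data, selection):
    """Return only the fields named in `selection` (nested dict; None marks a leaf)."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is None else select(value, sub)
    return result

def depth(selection):
    """Nesting depth of a selection; a server could reject queries above some limit."""
    nested = (depth(sub) for sub in selection.values() if sub is not None)
    return 1 + max(nested, default=0)

# Hypothetical stored resource with more fields than the client wants.
user = {
    "id": 123, "name": "Ada", "email": "ada@example.com",
    "orders": [
        {"id": 1, "total": 40, "status": "shipped", "items": ["a"]},
        {"id": 2, "total": 15, "status": "pending", "items": ["b"]},
    ],
}

# Parsed form of: user { name orders { total status } }
query = {"name": None, "orders": {"total": None, "status": None}}
response = select(user, query)
```

Here `response` carries only `name` and, for each order, `total` and `status`; the `email` and `items` fields never leave the server.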
gRPC
gRPC uses Protocol Buffers for serialization and HTTP/2 for transport. Smaller payloads than JSON, strongly typed contracts, and generated client/server code from a schema definition.
Best for service-to-service communication where both sides are under your control. Not natively supported in browsers (gRPC's wire protocol requires HTTP/2 trailers which browsers don't expose — gRPC-Web exists as a workaround but requires a proxy).
API Gateway
An API gateway sits between clients and your backend services. Routing alone doesn't justify it (a load balancer or reverse proxy already does that). Its real job is centralizing logic that every service needs but shouldn't implement individually.
Without a gateway: every service has its own auth middleware, its own rate limiter, its own request logging. That's duplicated code, duplicated bugs, and inconsistent behavior. One service might validate tokens differently than another. One might forget rate limiting entirely.
With a gateway: auth, rate limiting, logging, and SSL termination happen once, in one place, before the request reaches any service.
What it handles:
- Authentication — validates tokens on every request. It doesn't handle login itself (that's the identity provider — Okta, Auth0, Cognito). The user logs in via SSO (Single Sign-On), gets a token, and sends it with every request. The gateway checks "is this token valid?" and forwards the user's identity to backend services.
- Rate limiting — enforces limits per user/API key. Individual services don't need to worry about abuse.
- Logging and metrics — every request is logged consistently. You get a single place to see all traffic.
- SSL termination — handles HTTPS. Internal services communicate over plain HTTP (faster, simpler).
- API versioning — routes /v1/orders and /v2/orders to different service versions without the client knowing.
- Protocol translation — clients speak REST over HTTP. Internally, services might use gRPC for performance. The gateway translates between them.
The other big benefit: decoupling the public API from internal architecture. You can split one service into three, merge two services into one, or rewrite a service in a different language. As long as the gateway still maps the same public endpoints to the right place, clients never notice.
Real-world examples: AWS API Gateway, Kong, Envoy, NGINX.
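The request path through a gateway (check the token, apply a per-user rate limit, then route by URL prefix) can be sketched roughly as follows. Everything here is an assumption made for illustration: the `VALID_TOKENS` table stands in for real token verification against the identity provider, the service names in `ROUTES` are made up, and the token bucket is just one common rate-limiting technique, not what any particular product uses.

```python
import time

VALID_TOKENS = {"token-abc": "user-1"}   # hypothetical; real gateways verify signed tokens
ROUTES = {                               # hypothetical public prefix -> internal service
    "/v1/orders": "orders-service-v1",
    "/v2/orders": "orders-service-v2",
}

class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""
    def __init__(self, capacity, rate, now=time.monotonic):
        self.capacity, self.rate, self.now = capacity, rate, now
        self.tokens, self.last = float(capacity), now()

    def allow(self):
        current = self.now()
        elapsed = current - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class Gateway:
    def __init__(self, capacity=2, rate=1.0, now=time.monotonic):
        self.capacity, self.rate, self.now = capacity, rate, now
        self.buckets = {}                # one bucket per authenticated user

    def handle(self, path, token):
        user = VALID_TOKENS.get(token)   # 1. authentication
        if user is None:
            return 401, "invalid token"
        if user not in self.buckets:     # 2. rate limiting
            self.buckets[user] = TokenBucket(self.capacity, self.rate, self.now)
        if not self.buckets[user].allow():
            return 429, "rate limit exceeded"
        for prefix, service in ROUTES.items():   # 3. routing
            if path == prefix or path.startswith(prefix + "/"):
                # A real gateway would proxy the request and forward the identity.
                return 200, f"{service} <- {user}"
        return 404, "no route"
```

The clock is injectable (`now`) so the rate limiter can be exercised deterministically in tests; backend services behind this sketch never see an unauthenticated or over-limit request.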
API Versioning
APIs evolve. You need a strategy for backward compatibility.
URL versioning — /v1/users, /v2/users. Clear and explicit. Easy to route at the gateway level.
Header versioning — Accept: application/vnd.api+json;version=2. Cleaner URLs but harder to test in a browser.
Query parameter — /users?version=2. Simple but pollutes the query string.
URL versioning is the most common in practice. It's obvious and works with every tool.
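The three strategies can be compared side by side in a small resolver. This is a sketch under assumptions: the `resolve_version` name, the precedence order (URL path, then Accept header, then query parameter), and the default of version 1 are choices made for the example, not a standard.

```python
import re
from urllib.parse import urlsplit, parse_qs

def resolve_version(url, headers=None, default=1):
    """Return the API version requested via URL path, Accept header, or query parameter."""
    parts = urlsplit(url)

    # URL versioning: /v2/users
    match = re.match(r"^/v(\d+)(/|$)", parts.path)
    if match:
        return int(match.group(1))

    # Header versioning: Accept: application/vnd.api+json;version=2
    accept = (headers or {}).get("Accept", "")
    match = re.search(r";\s*version=(\d+)", accept)
    if match:
        return int(match.group(1))

    # Query parameter: /users?version=2
    values = parse_qs(parts.query).get("version")
    if values and values[0].isdigit():
        return int(values[0])

    return default
```

The URL case needs nothing but the path, which is part of why it is easy to route at the gateway and easy to test from a browser or a plain HTTP client.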
Pagination
APIs that return lists need pagination. You can't return 10 million users in one response. Three approaches:
Offset-based — "skip the first 20, give me the next 10."
    GET /users?offset=20&limit=10
The database runs SELECT * FROM users ORDER BY name LIMIT 10 OFFSET 20 (the ORDER BY matters: without it, row order is not guaranteed). Simple. But it breaks when data changes between requests. Say you fetch page 1 (users 1-10). Before you fetch page 2, a new user is inserted at position 5. Now everything shifts: user 10 from page 1 is now at position 11, so it shows up again on page 2. You get a duplicate.
Page-based — same idea, different syntax.
    GET /users?page=3&size=10
This is just offset in disguise (page 3, size 10 = offset 20, limit 10). Same instability problem. But it's nice for UIs that show "Page 1, 2, 3..." buttons.
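The page-to-offset translation and the duplicate problem can both be reproduced in a few lines. This sketch uses a plain Python list to stand in for an ordered table; the helper names and the sample data are hypothetical.

```python
def page_to_offset(page, size):
    """Translate a 1-based page number and page size into (offset, limit)."""
    return (page - 1) * size, size

def fetch_offset(rows, offset, limit):
    """Mimic an ordered query with LIMIT/OFFSET on an in-memory list."""
    return rows[offset:offset + limit]

users = [f"user{i}" for i in range(1, 26)]          # user1 .. user25

page1 = fetch_offset(users, *page_to_offset(1, 10)) # user1 .. user10
users.insert(4, "new_user")                         # a write lands at position 5
page2 = fetch_offset(users, *page_to_offset(2, 10)) # every later row shifted by one

duplicates = set(page1) & set(page2)                # rows the client sees twice
```

After the insert, `user10` has moved from position 10 to position 11, so it is returned on both pages. A delete between requests causes the opposite problem: a row slides back across the page boundary and is skipped.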
Cursor-based — "give me 10 items after this specific item."
    GET /users?after=user_abc123&limit=10
The cursor points to the last item you received (usually an encoded ID or timestamp). The database runs SELECT * FROM users WHERE id > 'abc123' ORDER BY id LIMIT 10. The ORDER BY on the cursor column is essential: without it, "after this row" has no defined meaning. Even if new items are inserted, your position doesn't shift, because you're anchored to a specific item, not a numeric offset.
Why cursor wins: offset says "skip 20 rows" (fragile — rows shift). Cursor says "start after this row" (stable — that row doesn't move). The tradeoff: you can't jump to "page 47" directly. You can only go forward from where you are. That's fine for infinite scroll, not great for "go to page" UIs.
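A cursor-based endpoint can be sketched with SQLite from the Python standard library. The assumptions here: the cursor is simply the last row's integer id, base64-encoded so clients treat it as opaque; ids are positive; and the table layout and function names are made up for the example.

```python
import base64
import sqlite3

def encode_cursor(last_id):
    """Wrap the last seen id in an opaque, URL-safe string."""
    return base64.urlsafe_b64encode(str(last_id).encode()).decode()

def decode_cursor(cursor):
    return int(base64.urlsafe_b64decode(cursor.encode()).decode())

def fetch_after(conn, cursor=None, limit=10):
    """Return (rows, next_cursor); next_cursor is None when no rows remain."""
    last_id = decode_cursor(cursor) if cursor else 0   # assumes ids start above 0
    rows = conn.execute(
        "SELECT id, name FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, limit + 1),      # one extra row tells us whether another page exists
    ).fetchall()
    has_more = len(rows) > limit
    rows = rows[:limit]
    next_cursor = encode_cursor(rows[-1][0]) if has_more else None
    return rows, next_cursor

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(i * 10, f"user{i}") for i in range(1, 26)],      # ids 10, 20, ..., 250
)

page1, cursor = fetch_after(conn)
# A write lands inside the range already covered by page 1.
conn.execute("INSERT INTO users (id, name) VALUES (45, 'new_user')")
page2, cursor = fetch_after(conn, cursor)
```

Because page 2 is defined as "rows with id greater than the last one seen", the insert at id 45 does not shift it: page 2 starts at id 110 and shares no rows with page 1. The `WHERE id > ?` filter can also use the primary-key index, so the database does not have to scan and discard skipped rows the way a large OFFSET does.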
Cursor-based is the best default for APIs with frequently changing data.
Key Takeaways
- REST is the default for public APIs; GraphQL for complex client needs; gRPC for internal services
- API gateways centralize auth, rate limiting, logging, and SSL termination, and decouple the public API from internal services
- Version your APIs from day one — URL versioning is the simplest
- Use cursor-based pagination for stable results under concurrent writes