08 - Content Delivery Networks
What a CDN Does
A CDN is a network of servers distributed across the globe. It caches your content at edge locations close to users, so requests travel shorter distances.
Without a CDN, a user in Tokyo requesting content from a server in Virginia adds 150ms+ of network latency on every request. With a CDN, that content is served from a Tokyo edge node in 5-30ms (depending on last-mile network and cache state).
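To see how the per-request difference adds up, here is a rough calculation. The 150 ms and 5-30 ms figures come from the paragraph above; the count of 10 sequential requests per page is an assumption chosen for illustration, not a measurement.

```python
def added_latency_ms(round_trips: int, rtt_ms: float) -> float:
    """Total network wait when requests are made one after another."""
    return round_trips * rtt_ms


SEQUENTIAL_REQUESTS = 10  # assumed for illustration, not from the text

origin_wait = added_latency_ms(SEQUENTIAL_REQUESTS, 150)  # Tokyo -> Virginia
edge_best = added_latency_ms(SEQUENTIAL_REQUESTS, 5)      # Tokyo edge, best case
edge_worst = added_latency_ms(SEQUENTIAL_REQUESTS, 30)    # Tokyo edge, worst case

print(f"origin: {origin_wait:.0f} ms")                  # origin: 1500 ms
print(f"edge:   {edge_best:.0f}-{edge_worst:.0f} ms")   # edge:   50-300 ms
```

Real pages fetch many resources in parallel, so the actual gap is usually smaller than this serial worst case, but the direction is the same.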
Static vs Dynamic Content
CDNs originally cached only static assets: images, CSS, JavaScript, videos. These don't change between users, so caching them is straightforward.
Modern CDNs also handle dynamic content:
- Edge computing: run code at the CDN node closest to the user (Cloudflare Workers, Lambda@Edge). Instead of the request traveling to your origin server, a small function executes at the edge and builds the response right there. Use cases: A/B testing, geolocation redirects, auth token validation, personalizing responses without hitting origin.
- Dynamic caching: cache API responses with short TTLs for content that changes infrequently. Example: a product price updates every 5 minutes. Set the TTL to 30 seconds, and on a busy endpoint the vast majority of requests are served from cache while the data is at most 30 seconds stale. Use cases: weather data, leaderboards, product listings.
- Edge-side includes (ESI): split a page into fragments, cache each separately, and let the CDN assemble the final page per request. The shared header and footer are cached (same for everyone), while personalized content (e.g., "Hello, John") is fetched from origin. Result: most of the page is served from cache, and only the personalized fragment hits your server.
The rule: cache what you can at the edge, and only go to origin for what you must.
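How much a short TTL helps depends on how busy the URL is. The sketch below estimates the hit ratio at a single edge node under a simplifying assumption: traffic is steady, so each TTL window costs exactly one origin fetch and every other request in that window is a hit. The function name and the sample request rates are illustrative.

```python
def hit_ratio(requests_per_second: float, ttl_seconds: float) -> float:
    """Approximate cache hit ratio at one edge node for a single URL.

    Assumes steady traffic: one miss per TTL window, the rest are hits.
    """
    requests_per_window = requests_per_second * ttl_seconds
    if requests_per_window <= 1:
        return 0.0  # at most one request per window, so nothing is reused
    return 1 - 1 / requests_per_window


for rps in (0.1, 1, 10, 100):
    print(f"{rps:>5} req/s, TTL 30s -> {hit_ratio(rps, 30):.1%} hits")
```

Under this model a 30-second TTL needs roughly 3 or more requests per second per edge node before the hit ratio reaches 99%; quieter endpoints see less benefit.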
How CDN Caching Works
When a user requests a resource:
- The request is routed to a nearby CDN edge node, via anycast routing or DNS-based steering depending on the CDN
- The edge node checks its cache
- Cache hit: return the cached response immediately
- Cache miss: fetch from origin, cache the response, return it to the user
Subsequent requests from nearby users hit the edge cache directly. For each edge location, the origin server only handles the first request and later cache refreshes.
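The hit/miss flow can be sketched as a toy model of one edge node. This is a minimal illustration, not how any particular CDN is implemented: the `EdgeCache` class, the `origin` function, and the 60-second TTL are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one edge node: serve from cache, else fetch from origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, bytes]] = {}  # url -> (expires_at, body)

    def get(self, url: str) -> Tuple[str, bytes]:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and entry[0] > now:
            return "HIT", entry[1]
        body = self._fetch(url)  # cache miss or expired entry: go to origin
        self._store[url] = (now + self._ttl, body)
        return "MISS", body


origin_calls = []


def origin(url: str) -> bytes:
    """Stand-in for the origin server; records every request it receives."""
    origin_calls.append(url)
    return f"content of {url}".encode()


tokyo_edge = EdgeCache(origin, ttl_seconds=60)
print(tokyo_edge.get("/logo.png")[0])  # MISS
print(tokyo_edge.get("/logo.png")[0])  # HIT
print(len(origin_calls))               # 1
```

Each edge node keeps its own cache, so a second node in another region would record its own first miss for the same URL.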
Unicast vs Anycast:
| | Unicast | Anycast |
|---|---|---|
| How it works | IP 1.2.3.4 → one specific server | IP 1.2.3.4 → announced by servers in multiple locations via BGP (Border Gateway Protocol) |
| Routing | All users connect to the same server regardless of location | Internet routers pick the shortest path, sending users to the nearest node |
| Failover | Server dies → IP is unreachable | Node dies → routers stop seeing its BGP announcement, traffic shifts to next closest node |
| Used by | Traditional single-server setups | CDNs (Cloudflare, CloudFront), DNS root servers |
The CDN doesn't do the routing — the internet's own routing infrastructure (BGP) picks the shortest path automatically. No special client-side logic needed.
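The anycast behavior in the table can be modeled in a few lines. This is a deliberate simplification: real BGP path selection depends on AS-path length and operator policy rather than a single number, and the node names and path costs below are hypothetical.

```python
from typing import Dict, Optional, Set


def pick_node(path_cost: Dict[str, int], announcing: Set[str]) -> Optional[str]:
    """Return the cheapest-to-reach node that still announces the shared IP."""
    reachable = {node: cost for node, cost in path_cost.items() if node in announcing}
    if not reachable:
        return None  # nobody announces the IP, so it is unreachable
    return min(reachable, key=reachable.get)


# Hypothetical path costs as seen from a user's network in Tokyo.
costs_from_tokyo = {"tokyo": 1, "singapore": 4, "virginia": 9}
announcing = {"tokyo", "singapore", "virginia"}

print(pick_node(costs_from_tokyo, announcing))  # tokyo
announcing.discard("tokyo")                     # node fails, announcement withdrawn
print(pick_node(costs_from_tokyo, announcing))  # singapore
```

The client code never changes: it keeps connecting to the same IP, and only the set of announcing nodes differs between the two calls.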
Cache Headers
You control CDN behavior through HTTP headers:
- Cache-Control: public, max-age=31536000 (cache for one year, 365 × 24 × 60 × 60 seconds). Use for versioned assets like app.a1b2c3.js, where the filename changes when content changes.
- Cache-Control: public, max-age=0, s-maxage=60 (the browser treats its copy as stale immediately and revalidates, while the CDN caches for 60 seconds)
- Cache-Control: private (don't cache at the CDN; use for user-specific data)
- Vary: Accept-Encoding (cache different versions for different encodings)
Getting cache headers right is the difference between a CDN that helps and one that serves stale data.
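To make the browser/CDN split concrete, the sketch below derives both lifetimes from a Cache-Control value. It is a simplified reading of the header, assuming only the directives discussed here plus no-store; it ignores no-cache, must-revalidate, and heuristic freshness, and the helper names are illustrative.

```python
from typing import Dict, Optional, Tuple


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def ttls(header: str) -> Tuple[int, int]:
    """Return (browser_ttl, cdn_ttl) in seconds for a Cache-Control value."""
    d = parse_cache_control(header)
    if "no-store" in d:
        return 0, 0
    max_age = int(d.get("max-age") or 0)
    if "private" in d:
        cdn_ttl = 0  # shared caches such as CDNs must not store the response
    elif d.get("s-maxage") is not None:
        cdn_ttl = int(d["s-maxage"])  # s-maxage overrides max-age for shared caches
    else:
        cdn_ttl = max_age
    return max_age, cdn_ttl


for value in ("public, max-age=31536000",
              "public, max-age=0, s-maxage=60",
              "private"):
    print(f"{value!r} -> browser/CDN TTL {ttls(value)}")
```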
Cache Invalidation at the Edge
When you deploy new content, you need to invalidate the CDN cache. Options:
- Versioned URLs: style.v2.css or app.a1b2c3.js. The URL changes, so the old cache entry is never requested again. The best approach for static assets.
- Purge API: tell the CDN to drop specific URLs or patterns. Propagation is typically fast (seconds on many CDNs), but the exact time depends on the provider.
- Short TTLs: set max-age=60 so content refreshes every minute. Simple but adds origin load.
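Versioned URLs are the gold standard. You never need to invalidate because the URL itself is the cache key. The page that references these files still needs a short TTL or a purge, so that users pick up the new URLs.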
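One common way to produce names like app.a1b2c3.js is to embed a hash of the file's content. The sketch below assumes SHA-256 truncated to 8 hex characters; the `versioned_name` helper and the digest length are illustrative choices, and build tools typically do this for you.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


old = versioned_name("app.js", b"console.log('v1');")
new = versioned_name("app.js", b"console.log('v2');")
print(old)
print(new)
print(old != new)  # True: changed content yields a new URL, so no purge is needed
```

Because the name is derived from the content, an unchanged file keeps its URL across deploys and stays cached.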
CDN as DDoS Protection
A DDoS (Distributed Denial of Service) attack floods your server with traffic from thousands of machines to overwhelm it and make it unavailable to real users. You can't just block one IP because the traffic comes from many sources.
CDNs absorb traffic at the edge before it reaches your origin. This provides natural DDoS protection:
- Edge nodes have massive bandwidth capacity (anycast spreads attack traffic across all nodes)
- Malicious traffic is filtered before reaching your infrastructure
- Rate limiting can happen at the edge, close to the attacker
- Your origin IP stays hidden behind the CDN
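Edge rate limiting is often explained with a token bucket. The sketch below is a minimal per-client version, assuming the client IP is the key; the class name, rate, and burst values are hypothetical, and production systems combine this with other signals because distributed attacks spread requests across many IPs.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucketLimiter:
    """Per-client token bucket: `rate` requests/second with bursts up to `burst`."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate
        self._burst = burst
        self._clock = clock
        self._buckets: Dict[str, Tuple[float, float]] = {}  # ip -> (tokens, last_seen)

    def allow(self, client_ip: str) -> bool:
        now = self._clock()
        tokens, last = self._buckets.get(client_ip, (float(self._burst), now))
        # Refill for the time elapsed since this client's last request.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        allowed = tokens >= 1
        if allowed:
            tokens -= 1  # spend one token on this request
        self._buckets[client_ip] = (tokens, now)
        return allowed


limiter = TokenBucketLimiter(rate=5, burst=10)
results = [limiter.allow("203.0.113.7") for _ in range(15)]
print(results.count(True), "of", len(results), "requests allowed")
```

A burst of 15 back-to-back requests exhausts the 10-token bucket, so the later requests are rejected until tokens refill at 5 per second.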
How does the CDN distinguish attacks from legitimate traffic spikes? It uses multiple signals: traffic baseline anomalies (normal is 1K req/sec, suddenly 500K), behavioral patterns (bots don't execute JavaScript or load assets), TLS/HTTP fingerprinting (bots use different libraries than real browsers), and geographic anomalies. When uncertain, CDNs escalate progressively — JavaScript challenges, CAPTCHAs, or "under attack" mode that challenges all visitors. It's not a perfect filter, but the combination of signals catches most attacks.
This is why services like Cloudflare are both a CDN and a security product.
Key Takeaways
- CDNs reduce latency by serving content from edge nodes close to users
- Use versioned URLs for static assets to avoid invalidation headaches
- Modern CDNs handle dynamic content through edge computing
- Cache headers control what gets cached and for how long
- CDNs provide DDoS protection as a side effect of their architecture