09 - Databases: SQL vs NoSQL
📋 Jump to TakeawaysWe've covered how to handle and distribute traffic. Now let's look at where that data actually lives.
Relational Databases (SQL)
Relational databases store data in tables with rows and columns. They enforce schemas, support joins, and guarantee ACID transactions.
ACID properties:
- Atomicity — a transaction either fully completes or fully rolls back
- Consistency — data always moves from one valid state to another
- Isolation — concurrent transactions don't interfere with each other
- Durability — committed data survives crashes
Use SQL when you need strong consistency, complex queries with joins, or transactions that span multiple tables. Think: financial systems, user accounts, order processing.
Examples: PostgreSQL, MySQL, SQL Server.
Non-Relational Databases (NoSQL)
NoSQL databases trade some guarantees for flexibility and scale. They come in several flavors:
Document stores — store JSON-like documents. Flexible schema, good for varied data shapes. (MongoDB, CouchDB)
Key-value stores — simple get/set by key. Extremely fast, limited query capability. (Redis, Memcached)
Wide-column stores — rows can have different columns. Good for time-series and analytics. (Cassandra, HBase)
Graph databases — store nodes and edges. Optimized for relationship queries. (Neo4j, Amazon Neptune)
| Type | Best for | Tradeoff | Example use case |
|---|---|---|---|
| Document | Varied data shapes, nested objects, rapid iteration | No joins, weak consistency guarantees | User profiles, product catalogs, CMS |
| Key-value | Simple lookups by ID, caching, sessions | No queries beyond get/set by key | Session store, shopping cart, feature flags |
| Wide-column | High write throughput, time-series, sparse data | Must design around query patterns upfront, no joins | IoT telemetry, messaging history, analytics events |
| Graph | Relationship-heavy queries (friends-of-friends, recommendations) | Poor for non-relationship queries, harder to scale horizontally | Social networks, fraud detection, knowledge graphs |
When to Choose What
There's no universal answer. It depends on your access patterns.
Choose SQL when:
- Your data has clear relationships (users → orders → items)
- You need complex queries with joins and aggregations
- Consistency is critical (money, inventory)
- Your schema is well-defined and stable
Choose NoSQL when:
- You need horizontal scalability across many nodes
- Your data is denormalized or varies in structure
- You have simple access patterns (get by ID, range queries)
- You can tolerate eventual consistency
Many systems use both. PostgreSQL for transactional data, Redis for caching, Elasticsearch for search. Pick the right tool for each access pattern.
Data Modeling Differences
In SQL, you normalize: split data into tables and join at query time. A user's orders live in a separate orders table linked by user_id.
In NoSQL (document stores), you often embed: store the user and their orders in a single document. No joins needed, but data may be duplicated.
{
"user_id": "u_123",
"name": "Alice",
"orders": [
{"id": "o_1", "total": 49.99},
{"id": "o_2", "total": 129.00}
]
}The tradeoff: embedded documents are fast to read (one lookup) but harder to update (what if you need to change the user's name across all their orders?).
Transactions
SQL databases give you multi-statement transactions out of the box:
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;Both updates succeed or both roll back. You can't end up with money disappearing.
Most NoSQL databases don't support multi-document transactions. Operations on a single document are atomic, but if your logic spans multiple documents (transfer money between two accounts), you're on your own. Some newer NoSQL databases (MongoDB 4.0+, DynamoDB) added limited transaction support, but it's slower and more constrained than SQL.
Rule of thumb: if your domain requires "do these 3 things together or none of them," SQL is the safer choice.
Query Flexibility
SQL lets you query any column, combine conditions, and join across tables — even if you didn't plan for that query when you designed the schema:
SELECT * FROM orders WHERE status = 'pending' AND total > 100;NoSQL databases are optimized for specific access patterns. In DynamoDB, the base table can only be queried by partition key (and optionally sort key). Want to find all pending orders over $100? You need a Global Secondary Index (GSI) — and you must create it ahead of time. DynamoDB is more capable than a pure key-value store (it supports filter expressions, GSIs, and even a SQL-compatible query language called PartiQL), but you still must design your access patterns upfront.
This means SQL is better for exploratory queries and evolving requirements. NoSQL is better when you know exactly how you'll access data and can design your keys around it.
Schema Evolution
SQL (schema-on-write): you define the schema upfront. Adding a column requires ALTER TABLE, which may lock the table briefly. Every row conforms to the same structure. Migrations are explicit and versioned.
NoSQL (schema-on-read): there's no enforced schema. You can add new fields to documents without touching existing ones. Old documents simply don't have the new field. Your application code handles both shapes.
The tradeoff: NoSQL is easier to evolve quickly, but you push complexity into your application. Six months later, you have documents with 4 different shapes and your code needs if statements for each. SQL keeps the schema explicit and consistent, but migrations require coordination.
Key Takeaways
- SQL gives you ACID transactions and flexible queries; NoSQL gives you horizontal scale and schema flexibility
- Choose based on access patterns, not hype
- SQL normalizes (joins at read time); NoSQL embeds (duplicates for read speed)
- If you need multi-document transactions, SQL is the safer bet
- NoSQL requires you to know your query patterns upfront; SQL lets you query ad-hoc
- Many production systems use multiple database types together