09 - Databases: SQL vs NoSQL

We've covered how to handle and distribute traffic. Now let's look at where that data actually lives.

Relational Databases (SQL)

Relational databases store data in tables with rows and columns. They enforce schemas, support joins, and guarantee ACID transactions.

ACID properties:

Atomicity — a transaction either fully completes or fully rolls back
Consistency — data always moves from one valid state to another
Isolation — concurrent transactions don't interfere with each other
Durability — committed data survives crashes

Use SQL when you need strong consistency, complex queries with joins, or transactions that span multiple tables. Think: financial systems, user accounts, order processing.

Examples: PostgreSQL, MySQL, SQL Server.

Non-Relational Databases (NoSQL)

NoSQL databases trade some guarantees for flexibility and scale. They come in several flavors:

Document stores — store JSON-like documents. Flexible schema, good for varied data shapes. (MongoDB, CouchDB)

Key-value stores — simple get/set by key. Extremely fast, limited query capability. (Redis, Memcached)

Wide-column stores — rows can have different columns. Good for time-series and analytics. (Cassandra, HBase)

Graph databases — store nodes and edges. Optimized for relationship queries. (Neo4j, Amazon Neptune)

Type	Best for	Tradeoff	Example use case
Document	Varied data shapes, nested objects, rapid iteration	No joins, weak consistency guarantees	User profiles, product catalogs, CMS
Key-value	Simple lookups by ID, caching, sessions	No queries beyond get/set by key	Session store, shopping cart, feature flags
Wide-column	High write throughput, time-series, sparse data	Must design around query patterns upfront, no joins	IoT telemetry, messaging history, analytics events
Graph	Relationship-heavy queries (friends-of-friends, recommendations)	Poor for non-relationship queries, harder to scale horizontally	Social networks, fraud detection, knowledge graphs

When to Choose What

There's no universal answer. It depends on your access patterns.

Choose SQL when:

Your data has clear relationships (users → orders → items)
You need complex queries with joins and aggregations
Consistency is critical (money, inventory)
Your schema is well-defined and stable

Choose NoSQL when:

You need horizontal scalability across many nodes
Your data is denormalized or varies in structure
You have simple access patterns (get by ID, range queries)
You can tolerate eventual consistency

Many systems use both. PostgreSQL for transactional data, Redis for caching, Elasticsearch for search. Pick the right tool for each access pattern.

Data Modeling Differences

In SQL, you normalize: split data into tables and join at query time. A user's orders live in a separate orders table linked by user_id.

In NoSQL (document stores), you often embed: store the user and their orders in a single document. No joins needed, but data may be duplicated.

{
  "user_id": "u_123",
  "name": "Alice",
  "orders": [
    {"id": "o_1", "total": 49.99},
    {"id": "o_2", "total": 129.00}
  ]
}

The tradeoff: embedded documents are fast to read (one lookup) but harder to update (what if you need to change the user's name across all their orders?).

Transactions

SQL databases give you multi-statement transactions out of the box:

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

Both updates succeed or both roll back. You can't end up with money disappearing.

Most NoSQL databases don't support multi-document transactions. Operations on a single document are atomic, but if your logic spans multiple documents (transfer money between two accounts), you're on your own. Some newer NoSQL databases (MongoDB 4.0+, DynamoDB) added limited transaction support, but it's slower and more constrained than SQL.

Rule of thumb: if your domain requires "do these 3 things together or none of them," SQL is the safer choice.

Query Flexibility

SQL lets you query any column, combine conditions, and join across tables — even if you didn't plan for that query when you designed the schema:

SELECT * FROM orders WHERE status = 'pending' AND total > 100;

NoSQL databases are optimized for specific access patterns. In DynamoDB, the base table can only be queried by partition key (and optionally sort key). Want to find all pending orders over $100? You need a Global Secondary Index (GSI) — and you must create it ahead of time. DynamoDB is more capable than a pure key-value store (it supports filter expressions, GSIs, and even a SQL-compatible query language called PartiQL), but you still must design your access patterns upfront.

This means SQL is better for exploratory queries and evolving requirements. NoSQL is better when you know exactly how you'll access data and can design your keys around it.

Schema Evolution

SQL (schema-on-write): you define the schema upfront. Adding a column requires ALTER TABLE, which may lock the table briefly. Every row conforms to the same structure. Migrations are explicit and versioned.

NoSQL (schema-on-read): there's no enforced schema. You can add new fields to documents without touching existing ones. Old documents simply don't have the new field. Your application code handles both shapes.

The tradeoff: NoSQL is easier to evolve quickly, but you push complexity into your application. Six months later, you have documents with 4 different shapes and your code needs if statements for each. SQL keeps the schema explicit and consistent, but migrations require coordination.

Key Takeaways

SQL gives you ACID transactions and flexible queries; NoSQL gives you horizontal scale and schema flexibility
Choose based on access patterns, not hype
SQL normalizes (joins at read time); NoSQL embeds (duplicates for read speed)
If you need multi-document transactions, SQL is the safer bet
NoSQL requires you to know your query patterns upfront; SQL lets you query ad-hoc
Many production systems use multiple database types together