14. Graphs

📋 Jump to Takeaways

Graphs model relationships. Social networks (people connected by friendships), maps (cities connected by roads), dependencies (tasks that must happen before other tasks), the internet (pages connected by links). Any time you have "things" and "connections between things," you have a graph.

The Structure

A graph has vertices (nodes) and edges (connections). Edges can be:

  • Directed — one-way (A → B doesn't mean B → A)
  • Undirected — two-way (A — B means both can reach each other)
  • Weighted — edges have costs (road distances, network latency)
  • Unweighted — all edges are equal
Undirected:          Directed:           Weighted:
  A — B              A → B              A —3— B
  |   |              ↓   ↓              |     |
  C — D              C → D              2     5
                                        |     |
                                        C —1— D

Representation

Two standard ways to store a graph:

Adjacency list — each node stores its neighbors. Best for sparse graphs (few edges relative to nodes). Python's defaultdict(list) is the idiomatic choice.

from collections import defaultdict

# Adjacency list — defaultdict avoids KeyError on missing nodes
graph = defaultdict(list)
graph[0] = [1, 2]
graph[1] = [0, 3]
graph[2] = [0, 3]
graph[3] = [1, 2]

# Building from edge list
edges = [[0, 1], [0, 2], [1, 3], [2, 3]]
graph = defaultdict(list)
for u, v in edges:
    graph[u].append(v)
    graph[v].append(u)  # omit for directed graph

# Weighted adjacency list
weighted = defaultdict(list)
weighted[0] = [(1, 3), (2, 2)]  # (neighbor, weight)
weighted[1] = [(3, 5)]
weighted[2] = [(3, 1)]

Adjacency matrix — 2D array where matrix[i][j] = 1 means edge from i to j. Best for dense graphs or when you need O(1) edge lookup.

# Adjacency matrix — O(1) edge check, O(V²) space
matrix = [
    [0, 1, 1, 0],  # node 0 connects to 1, 2
    [1, 0, 0, 1],  # node 1 connects to 0, 3
    [1, 0, 0, 1],  # node 2 connects to 0, 3
    [0, 1, 1, 0],  # node 3 connects to 1, 2
]
Adjacency List Adjacency Matrix
Space O(V + E) O(V²)
Check edge exists O(degree) O(1)
Find all neighbors O(degree) O(V)
Best for Sparse graphs Dense graphs

Most interview problems use adjacency lists.

Go as deep as possible before backtracking. Uses a stack (or recursion).

# DFS — recursive
def dfs(graph: dict, node: int, visited: set, order: list):
    visited.add(node)
    order.append(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited, order)

# DFS — iterative with explicit stack
def dfs_iterative(graph: dict, start: int) -> list[int]:
    visited = set()
    stack = [start]
    order = []
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                stack.append(neighbor)
    return order

DFS explores one path completely before trying another. Good for: detecting cycles, topological sort, finding connected components, path existence.

Explore all neighbors at distance 1, then distance 2, then distance 3. Uses a deque for O(1) popleft.

from collections import deque

# BFS — finds shortest path in unweighted graphs
def bfs(graph: dict, start: int) -> dict[int, int]:
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist  # distance from start to every reachable node

BFS guarantees shortest path in unweighted graphs. The first time you reach a node is the shortest way to get there.

Important: always use collections.deque for BFS. A regular list with pop(0) is O(n) — it shifts every element. deque.popleft() is O(1).

Multi-Source BFS

Sometimes you start BFS from multiple sources simultaneously. Enqueue ALL sources at once and expand outward in parallel.

LeetCode 994 - Rotting Oranges

# Rotting oranges: how many minutes until all oranges rot?
def oranges_rotting(grid: list[list[int]]) -> int:
    rows, cols = len(grid), len(grid[0])
    queue = deque()
    fresh = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 2:
                queue.append((r, c))
            elif grid[r][c] == 1:
                fresh += 1

    minutes = 0
    dirs = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    while queue and fresh > 0:
        for _ in range(len(queue)):
            r, c = queue.popleft()
            for dr, dc in dirs:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 1:
                    grid[nr][nc] = 2
                    fresh -= 1
                    queue.append((nr, nc))
        minutes += 1

    return -1 if fresh > 0 else minutes

The key insight: by starting from all sources at once, each cell gets the minimum distance to its nearest source. One BFS pass, O(V+E) total.

Topological Sort

Topological sort orders nodes in a directed acyclic graph (DAG) so that for every edge A → B, A comes before B. Think: course prerequisites, build dependencies, task scheduling.

Kahn's Algorithm (BFS-based):

  1. Count in-degrees (how many edges point to each node)
  2. Enqueue all nodes with in-degree 0 (no dependencies)
  3. Process queue: for each node, reduce in-degree of its neighbors. If a neighbor hits 0, enqueue it.

LeetCode 210 - Course Schedule II

from collections import deque, defaultdict

def topological_sort(n: int, edges: list[list[int]]) -> list[int]:
    graph = defaultdict(list)
    in_degree = [0] * n
    for u, v in edges:
        graph[u].append(v)
        in_degree[v] += 1

    queue = deque(i for i in range(n) if in_degree[i] == 0)
    order = []

    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(order) != n:
        return []  # cycle detected — not a DAG
    return order

If the result has fewer nodes than the graph, there's a cycle — topological sort is impossible. This is also how you detect cycles in directed graphs.

Connected Components

How many separate groups exist? Run DFS/BFS from each unvisited node, count how many times you start.

LeetCode 323 - Number of Connected Components in an Undirected Graph

def count_components(n: int, graph: dict) -> int:
    visited = set()
    count = 0

    def explore(node: int):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                explore(neighbor)

    for i in range(n):
        if i not in visited:
            explore(i)
            count += 1
    return count

When DFS vs BFS

Use DFS when Use BFS when
Detecting cycles Finding shortest path (unweighted)
Topological sort Level-order processing
Path existence (any path) Nearest neighbor queries
Exploring all possibilities Minimum steps/moves
Space is tight (recursion stack) You need the closest result first

Cycle detection: In a directed graph, if DFS visits a node that's currently on the recursion stack (not just visited before), there's a cycle. This is how topological sort detects invalid DAGs — if len(order) != n, a cycle exists.

Rule of thumb: if the problem says "minimum," "shortest," or "fewest," think BFS. If it says "all paths," "detect cycle," or "can you reach," think DFS.

Grid as Graph

Many problems give you a 2D grid. Each cell is a node, adjacent cells (up/down/left/right) are edges.

LeetCode 200 - Number of Islands

def num_islands(grid: list[list[str]]) -> int:
    rows, cols = len(grid), len(grid[0])
    count = 0

    def sink(r: int, c: int):
        if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] == "0":
            return
        grid[r][c] = "0"  # mark visited
        sink(r + 1, c)
        sink(r - 1, c)
        sink(r, c + 1)
        sink(r, c - 1)

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == "1":
                sink(r, c)
                count += 1
    return count

Complexity

Algorithm Time Space
DFS O(V + E) O(V)
BFS O(V + E) O(V)
Topological sort O(V + E) O(V)

V = vertices, E = edges. Both DFS and BFS visit every node and every edge exactly once.

Practice Problems

Problem Difficulty Link
Flood Fill Easy LeetCode 733
Number of Islands Medium LeetCode 200
All Paths From Source to Target Medium LeetCode 797
Clone Graph Medium LeetCode 133
Course Schedule Medium LeetCode 207
Course Schedule II Medium LeetCode 210
Pacific Atlantic Water Flow Medium LeetCode 417
Word Ladder Hard LeetCode 127
Rotting Oranges Medium LeetCode 994

Key Takeaways

  • Graphs model relationships — vertices are things, edges are connections
  • defaultdict(list) is the idiomatic Python adjacency list
  • DFS goes deep first (stack/recursion) — cycles, topological sort, path existence
  • BFS goes wide first (deque) — shortest path in unweighted graphs, minimum steps
  • Always use deque for BFS — list.pop(0) is O(n), deque.popleft() is O(1)
  • "Minimum/shortest/fewest" → BFS. "All paths/cycle/reachability" → DFS
  • Grids are graphs — each cell connects to its 4 (or 8) neighbors
  • Kahn's algorithm with deque is the cleanest topological sort for interviews
  • Both DFS and BFS are O(V + E) time

📖 Examples

Complete examples for this lesson.

📝 Ready to test your knowledge?

Answer the quiz below to mark this lesson complete.

Spot something off? Report an issue

© 2026 ByteLearn.dev. Free courses for developers. · Privacy