11. Heaps and Priority Queues

📋 Jump to Takeaways

A heap answers one question efficiently: what's the smallest (or largest) element right now? Not "find element X" — dicts do that. Not "give me everything sorted" — sorting does that. Just "give me the extreme value, and let me add/remove elements while maintaining that guarantee." That's O(log n) per operation instead of O(n).

The Structure

A heap is a complete binary tree stored as a list. The heap property: every parent is smaller than (min-heap) or larger than (max-heap) its children.

Min-heap as tree:        As list:
       1                 [1, 3, 2, 7, 4, 5, 6]
      / \
     3   2               Parent of i: (i-1)//2
    / \ / \              Left child:  2*i + 1
   7  4 5  6             Right child: 2*i + 2

No pointers needed. The tree structure is implicit in the list indices.

Python's `heapq` Module

Python provides heapq — a min-heap implementation that operates on regular lists. No special class needed.

import heapq

nums = [5, 3, 8, 1, 9, 2]

# heapify — transform list into a heap in-place, O(n)
heapq.heapify(nums)  # nums is now [1, 3, 2, 5, 9, 8]

# heappush — add element, O(log n)
heapq.heappush(nums, 4)

# heappop — remove and return smallest, O(log n)
smallest = heapq.heappop(nums)  # 1

# peek — look at smallest without removing, O(1)
smallest = nums[0]

# heappushpop — push then pop (more efficient than separate operations)
result = heapq.heappushpop(nums, 6)

# heapreplace — pop then push (more efficient than separate operations)
result = heapq.heapreplace(nums, 7)

Key fact: heapq is min-heap only. There is no built-in max-heap in Python.

The Max-Heap Trick: Negate Values

Since heapq only supports min-heap, simulate a max-heap by negating values:

import heapq

nums = [5, 3, 8, 1, 9]

# Max-heap: negate on push, negate on pop
max_heap = [-x for x in nums]
heapq.heapify(max_heap)

# Get largest
largest = -heapq.heappop(max_heap)  # 9

# Push a new value
heapq.heappush(max_heap, -7)  # push 7 as -7

# Peek at largest
largest = -max_heap[0]

This is the standard Python idiom. Every competitive programmer and interviewer knows it.

`nlargest` and `nsmallest`

For finding the top/bottom K elements without maintaining a heap yourself:

import heapq

nums = [5, 3, 8, 1, 9, 2, 7]

# K largest elements
heapq.nlargest(3, nums)   # [9, 8, 7]

# K smallest elements
heapq.nsmallest(3, nums)  # [1, 2, 3]

# With key function
tasks = [('low', 3), ('high', 1), ('med', 2)]
heapq.nsmallest(2, tasks, key=lambda x: x[1])  # [('high', 1), ('med', 2)]

When to use what:

nlargest/nsmallest — best when K is small relative to N
sorted(nums)[:k] — best when K is close to N
heapq.heapify + K pops — best when you need to process elements one at a time

Tuple Priority

heapq compares elements directly. For tuples, it compares element by element — first element is the primary key, second breaks ties, etc.

import heapq

# Priority queue with (priority, item)
pq = []
heapq.heappush(pq, (3, "low priority"))
heapq.heappush(pq, (1, "high priority"))
heapq.heappush(pq, (2, "medium priority"))

priority, item = heapq.heappop(pq)  # (1, "high priority")

Tie-breaking problem: If priorities are equal and items aren't comparable, Python raises a TypeError. Solution: add a unique counter as a tiebreaker. Comparable types: str, int, float, tuple, list. Not comparable: dict, set, custom objects without __lt__.

import heapq

class PriorityQueue:
    def __init__(self):
        self._heap = []
        self._counter = 0

    def push(self, priority, item):
        heapq.heappush(self._heap, (priority, self._counter, item))
        self._counter += 1

    def pop(self):
        priority, _, item = heapq.heappop(self._heap)
        return priority, item

# Usage — items are dicts (not comparable without counter)
pq = PriorityQueue()
pq.push(3, {"task": "email"})
pq.push(1, {"task": "deploy"})
pq.push(1, {"task": "fix bug"})   # same priority — FIFO via counter

print(pq.pop())  # (1, {"task": "deploy"}) — highest priority, arrived first
print(pq.pop())  # (1, {"task": "fix bug"}) — same priority, arrived second
print(pq.pop())  # (3, {"task": "email"})

# Without the counter, (1, {"task": "deploy"}) < (1, {"task": "fix bug"})
# would crash with TypeError — dicts aren't comparable

The counter ensures FIFO order for equal priorities and prevents comparison of incomparable items.

Class with `lt` (Alternative)

Instead of tuples, define __lt__ on a class — heapq only needs < to work:

import heapq

class Task:
    def __init__(self, priority, name):
        self.priority = priority
        self.name = name

    def __lt__(self, other):
        return self.priority < other.priority

pq = []
heapq.heappush(pq, Task(3, "email"))
heapq.heappush(pq, Task(1, "deploy"))
print(heapq.heappop(pq).name)  # "deploy"

No counter trick needed — Python never compares beyond __lt__. Use this when objects have many fields. For interviews, tuples are faster to write.

`@dataclass(order=True)` (Cleanest)

Auto-generates comparison methods from fields. Exclude fields with field(compare=False):

from dataclasses import dataclass, field
from typing import Any
import heapq

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)

pq = []
heapq.heappush(pq, PrioritizedItem(3, "email"))
heapq.heappush(pq, PrioritizedItem(1, "deploy"))
print(heapq.heappop(pq).item)  # "deploy"

No __lt__ boilerplate — order=True handles it. The compare=False field is ignored during sorting, so incomparable items (dicts, custom objects) won't crash.

Approach	Use case
`(priority, counter, item)` tuple	Interviews — fast, minimal code
`__lt__` class	Many fields, custom comparison logic
`@dataclass(order=True)`	Clean production code, exclude fields easily

Pattern: Top-K Elements

Find the K largest (or smallest) elements from a stream or array. Use a heap of size K.

LeetCode 215 - Kth Largest Element in an Array

import heapq

def findKthLargest(nums: list[int], k: int) -> int:
    heap = []
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)  # remove smallest, keeping K largest
    return heap[0]  # smallest of the K largest = Kth largest

Why not sort? Sorting is O(n log n). A heap of size K processes each element in O(log k). When k << n, the heap wins. Also works on streams where you don't have all data upfront.

LeetCode 347 - Top K Frequent Elements

import heapq
from collections import Counter

def topKFrequent(nums: list[int], k: int) -> list[int]:
    counts = Counter(nums)
    return [x for x, _ in heapq.nlargest(k, counts.items(), key=lambda x: x[1])]

Without nlargest — min-heap of size k, evict least frequent:

def topKFrequent(nums: list[int], k: int) -> list[int]:
    freq = Counter(nums)
    heap = []
    for num, count in freq.items():
        heapq.heappush(heap, (count, num))
        if len(heap) > k:
            heapq.heappop(heap)  # evict smallest frequency
    return [num for count, num in heap]
# O(n log k) — heap never exceeds size k

Pattern: Merge K Sorted Lists

Merge K sorted sequences into one sorted sequence. Push the first element of each list into a min-heap. Pop the smallest, push the next element from that list.

LeetCode 23 - Merge K Sorted Lists

import heapq

def mergeKLists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
    heap: list[tuple[int, int, ListNode]] = []  # (val, index, node)
    # Push the head of each list; index breaks ties (avoids comparing ListNodes)
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst.val, i, lst))

    dummy = curr = ListNode(0)  # dummy head to simplify appending
    while heap:
        val, i, node = heapq.heappop(heap)  # smallest across all k lists
        curr.next = node          # append to result
        curr = curr.next
        if node.next:             # push the next node from the same list
            heapq.heappush(heap, (node.next.val, i, node.next))

    return dummy.next  # skip dummy, return actual head

O(n log k) where n is total elements and k is number of lists. The index i serves as a tiebreaker to avoid comparing ListNode objects.

Pattern: Running Median (Two Heaps)

Maintain a max-heap for the lower half and a min-heap for the upper half. The median is always at the top of one or both heaps.

LeetCode 295 - Find Median from Data Stream

import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negated) — lower half
        self.hi = []  # min-heap — upper half

    def addNum(self, num: int) -> None:
        heapq.heappush(self.lo, -num)
        # Ensure max of lo <= min of hi
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Balance sizes: lo can have at most 1 more than hi
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self) -> float:
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2

Each insertion is O(log n). Getting the median is O(1).

Operation Costs

Operation	Heap	Sorted List	Unsorted List
Find min	O(1)	O(1)	O(n)
Insert	O(log n)	O(n)*	O(1)
Extract min	O(log n)	O(1)**	O(n)
Build from list	O(n)	O(n log n)	O(1)

*O(n) due to shifting. **O(1) if popping from the end of a sorted list (max).

The heap is the sweet spot: fast extraction AND fast insertion.

When to Use Heaps

"Give me the K largest/smallest" → min/max heap of size K
Scheduling by priority → priority queue
Merging sorted streams → min-heap
Running statistics (median, percentiles) → two heaps
Dijkstra's shortest path → min-heap of distances
Huffman coding → min-heap of frequencies

When NOT to Use Heaps

You need to search for arbitrary elements → dict
You need all elements sorted → just sort the list
You only need min OR max once → min()/max() is O(n) and simpler
You need the kth element once from static data → heapq.nlargest(k, nums)[-1] or quickselect

Practice Problems

Problem	Difficulty	Link
Kth Largest Element in an Array	Medium	LeetCode 215
Top K Frequent Elements	Medium	LeetCode 347
Merge K Sorted Lists	Hard	LeetCode 23
Find Median from Data Stream	Hard	LeetCode 295
Task Scheduler	Medium	LeetCode 621
Last Stone Weight	Easy	LeetCode 1046

Key Takeaways

heapq operates on regular Python lists — no special class needed
heapq is min-heap only — negate values for max-heap behavior
heapify() is O(n), heappush()/heappop() are O(log n)
Use tuples for priority: (priority, tiebreaker, item)
Add a counter as tiebreaker when items aren't comparable
Top-K, merge K sorted, and running median are the three core heap patterns
nlargest(k, iterable) and nsmallest(k, iterable) are convenient for one-shot queries