11. Heaps and Priority Queues

📋 Jump to Takeaways

A heap answers one question efficiently: what's the smallest (or largest) element right now? Not "find element X" — dicts do that. Not "give me everything sorted" — sorting does that. Just "give me the extreme value, and let me add/remove elements while maintaining that guarantee." That's O(log n) per operation instead of O(n).

The Structure

A heap is a complete binary tree stored as a list. The heap property: every parent is smaller than (min-heap) or larger than (max-heap) its children.

Min-heap as tree:        As list:
       1                 [1, 3, 2, 7, 4, 5, 6]
      / \
     3   2               Parent of i: (i-1)//2
    / \ / \              Left child:  2*i + 1
   7  4 5  6             Right child: 2*i + 2

No pointers needed. The tree structure is implicit in the list indices.

Python's heapq Module

Python provides heapq — a min-heap implementation that operates on regular lists. No special class needed.

import heapq

nums = [5, 3, 8, 1, 9, 2]

# heapify — transform list into a heap in-place, O(n)
heapq.heapify(nums)  # nums is now [1, 3, 2, 5, 9, 8]

# heappush — add element, O(log n)
heapq.heappush(nums, 4)

# heappop — remove and return smallest, O(log n)
smallest = heapq.heappop(nums)  # 1

# peek — look at smallest without removing, O(1)
smallest = nums[0]

# heappushpop — push then pop (more efficient than separate operations)
result = heapq.heappushpop(nums, 6)

# heapreplace — pop then push (more efficient than separate operations)
result = heapq.heapreplace(nums, 7)

Key fact: heapq is min-heap only. There is no built-in max-heap in Python.

The Max-Heap Trick: Negate Values

Since heapq only supports min-heap, simulate a max-heap by negating values:

import heapq

nums = [5, 3, 8, 1, 9]

# Max-heap: negate on push, negate on pop
max_heap = [-x for x in nums]
heapq.heapify(max_heap)

# Get largest
largest = -heapq.heappop(max_heap)  # 9

# Push a new value
heapq.heappush(max_heap, -7)  # push 7 as -7

# Peek at largest
largest = -max_heap[0]

This is the standard Python idiom. Every competitive programmer and interviewer knows it.

nlargest and nsmallest

For finding the top/bottom K elements without maintaining a heap yourself:

import heapq

nums = [5, 3, 8, 1, 9, 2, 7]

# K largest elements
heapq.nlargest(3, nums)   # [9, 8, 7]

# K smallest elements
heapq.nsmallest(3, nums)  # [1, 2, 3]

# With key function
tasks = [('low', 3), ('high', 1), ('med', 2)]
heapq.nsmallest(2, tasks, key=lambda x: x[1])  # [('high', 1), ('med', 2)]

When to use what:

  • nlargest/nsmallest — best when K is small relative to N
  • sorted(nums)[:k] — best when K is close to N
  • heapq.heapify + K pops — best when you need to process elements one at a time

Tuple Priority

heapq compares elements directly. For tuples, it compares element by element — first element is the primary key, second breaks ties, etc.

import heapq

# Priority queue with (priority, item)
pq = []
heapq.heappush(pq, (3, "low priority"))
heapq.heappush(pq, (1, "high priority"))
heapq.heappush(pq, (2, "medium priority"))

priority, item = heapq.heappop(pq)  # (1, "high priority")

Tie-breaking problem: If priorities are equal and items aren't comparable, Python raises a TypeError. Solution: add a unique counter as a tiebreaker. Comparable types: str, int, float, tuple, list. Not comparable: dict, set, custom objects without __lt__.

import heapq

class PriorityQueue:
    def __init__(self):
        self._heap = []
        self._counter = 0

    def push(self, priority, item):
        heapq.heappush(self._heap, (priority, self._counter, item))
        self._counter += 1

    def pop(self):
        priority, _, item = heapq.heappop(self._heap)
        return priority, item
# Usage — items are dicts (not comparable without counter)
pq = PriorityQueue()
pq.push(3, {"task": "email"})
pq.push(1, {"task": "deploy"})
pq.push(1, {"task": "fix bug"})   # same priority — FIFO via counter

print(pq.pop())  # (1, {"task": "deploy"}) — highest priority, arrived first
print(pq.pop())  # (1, {"task": "fix bug"}) — same priority, arrived second
print(pq.pop())  # (3, {"task": "email"})

# Without the counter, (1, {"task": "deploy"}) < (1, {"task": "fix bug"})
# would crash with TypeError — dicts aren't comparable

The counter ensures FIFO order for equal priorities and prevents comparison of incomparable items.

Class with __lt__ (Alternative)

Instead of tuples, define __lt__ on a class — heapq only needs < to work:

import heapq

class Task:
    def __init__(self, priority, name):
        self.priority = priority
        self.name = name

    def __lt__(self, other):
        return self.priority < other.priority

pq = []
heapq.heappush(pq, Task(3, "email"))
heapq.heappush(pq, Task(1, "deploy"))
print(heapq.heappop(pq).name)  # "deploy"

No counter trick needed — Python never compares beyond __lt__. Use this when objects have many fields. For interviews, tuples are faster to write.

@dataclass(order=True) (Cleanest)

Auto-generates comparison methods from fields. Exclude fields with field(compare=False):

from dataclasses import dataclass, field
from typing import Any
import heapq

@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)

pq = []
heapq.heappush(pq, PrioritizedItem(3, "email"))
heapq.heappush(pq, PrioritizedItem(1, "deploy"))
print(heapq.heappop(pq).item)  # "deploy"

No __lt__ boilerplate — order=True handles it. The compare=False field is ignored during sorting, so incomparable items (dicts, custom objects) won't crash.

Approach Use case
(priority, counter, item) tuple Interviews — fast, minimal code
__lt__ class Many fields, custom comparison logic
@dataclass(order=True) Clean production code, exclude fields easily

Pattern: Top-K Elements

Find the K largest (or smallest) elements from a stream or array. Use a heap of size K.

LeetCode 215 - Kth Largest Element in an Array

import heapq

def findKthLargest(nums: list[int], k: int) -> int:
    heap = []
    for n in nums:
        heapq.heappush(heap, n)
        if len(heap) > k:
            heapq.heappop(heap)  # remove smallest, keeping K largest
    return heap[0]  # smallest of the K largest = Kth largest

Why not sort? Sorting is O(n log n). A heap of size K processes each element in O(log k). When k << n, the heap wins. Also works on streams where you don't have all data upfront.

LeetCode 347 - Top K Frequent Elements

import heapq
from collections import Counter

def topKFrequent(nums: list[int], k: int) -> list[int]:
    counts = Counter(nums)
    return [x for x, _ in heapq.nlargest(k, counts.items(), key=lambda x: x[1])]

Without nlargest — min-heap of size k, evict least frequent:

def topKFrequent(nums: list[int], k: int) -> list[int]:
    freq = Counter(nums)
    heap = []
    for num, count in freq.items():
        heapq.heappush(heap, (count, num))
        if len(heap) > k:
            heapq.heappop(heap)  # evict smallest frequency
    return [num for count, num in heap]
# O(n log k) — heap never exceeds size k

Pattern: Merge K Sorted Lists

Merge K sorted sequences into one sorted sequence. Push the first element of each list into a min-heap. Pop the smallest, push the next element from that list.

LeetCode 23 - Merge K Sorted Lists

import heapq

def mergeKLists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
    heap: list[tuple[int, int, ListNode]] = []  # (val, index, node)
    # Push the head of each list; index breaks ties (avoids comparing ListNodes)
    for i, lst in enumerate(lists):
        if lst:
            heapq.heappush(heap, (lst.val, i, lst))

    dummy = curr = ListNode(0)  # dummy head to simplify appending
    while heap:
        val, i, node = heapq.heappop(heap)  # smallest across all k lists
        curr.next = node          # append to result
        curr = curr.next
        if node.next:             # push the next node from the same list
            heapq.heappush(heap, (node.next.val, i, node.next))

    return dummy.next  # skip dummy, return actual head

O(n log k) where n is total elements and k is number of lists. The index i serves as a tiebreaker to avoid comparing ListNode objects.

Pattern: Running Median (Two Heaps)

Maintain a max-heap for the lower half and a min-heap for the upper half. The median is always at the top of one or both heaps.

LeetCode 295 - Find Median from Data Stream

import heapq

class MedianFinder:
    def __init__(self):
        self.lo = []  # max-heap (negated) — lower half
        self.hi = []  # min-heap — upper half

    def addNum(self, num: int) -> None:
        heapq.heappush(self.lo, -num)
        # Ensure max of lo <= min of hi
        heapq.heappush(self.hi, -heapq.heappop(self.lo))
        # Balance sizes: lo can have at most 1 more than hi
        if len(self.hi) > len(self.lo):
            heapq.heappush(self.lo, -heapq.heappop(self.hi))

    def findMedian(self) -> float:
        if len(self.lo) > len(self.hi):
            return -self.lo[0]
        return (-self.lo[0] + self.hi[0]) / 2

Each insertion is O(log n). Getting the median is O(1).

Operation Costs

Operation Heap Sorted List Unsorted List
Find min O(1) O(1) O(n)
Insert O(log n) O(n)* O(1)
Extract min O(log n) O(1)** O(n)
Build from list O(n) O(n log n) O(1)

*O(n) due to shifting. **O(1) if popping from the end of a sorted list (max).

The heap is the sweet spot: fast extraction AND fast insertion.

When to Use Heaps

  • "Give me the K largest/smallest" → min/max heap of size K
  • Scheduling by priority → priority queue
  • Merging sorted streams → min-heap
  • Running statistics (median, percentiles) → two heaps
  • Dijkstra's shortest path → min-heap of distances
  • Huffman coding → min-heap of frequencies

When NOT to Use Heaps

  • You need to search for arbitrary elements → dict
  • You need all elements sorted → just sort the list
  • You only need min OR max once → min()/max() is O(n) and simpler
  • You need the kth element once from static data → heapq.nlargest(k, nums)[-1] or quickselect

Practice Problems

Problem Difficulty Link
Kth Largest Element in an Array Medium LeetCode 215
Top K Frequent Elements Medium LeetCode 347
Merge K Sorted Lists Hard LeetCode 23
Find Median from Data Stream Hard LeetCode 295
Task Scheduler Medium LeetCode 621
Last Stone Weight Easy LeetCode 1046

Key Takeaways

  • heapq operates on regular Python lists — no special class needed
  • heapq is min-heap only — negate values for max-heap behavior
  • heapify() is O(n), heappush()/heappop() are O(log n)
  • Use tuples for priority: (priority, tiebreaker, item)
  • Add a counter as tiebreaker when items aren't comparable
  • Top-K, merge K sorted, and running median are the three core heap patterns
  • nlargest(k, iterable) and nsmallest(k, iterable) are convenient for one-shot queries

📝 Ready to test your knowledge?

Answer the quiz below to mark this lesson complete.

Spot something off? Report an issue

© 2026 ByteLearn.dev. Free courses for developers. · Privacy