11. Heaps and Priority Queues
📋 Jump to TakeawaysA heap answers one question efficiently: what's the smallest (or largest) element right now? Not "find element X" — dicts do that. Not "give me everything sorted" — sorting does that. Just "give me the extreme value, and let me add/remove elements while maintaining that guarantee." That's O(log n) per operation instead of O(n).
The Structure
A heap is a complete binary tree stored as a list. The heap property: every parent is smaller than (min-heap) or larger than (max-heap) its children.
Min-heap as tree: As list:
1 [1, 3, 2, 7, 4, 5, 6]
/ \
3 2 Parent of i: (i-1)//2
/ \ / \ Left child: 2*i + 1
7 4 5 6 Right child: 2*i + 2No pointers needed. The tree structure is implicit in the list indices.
Python's heapq Module
Python provides heapq — a min-heap implementation that operates on regular lists. No special class needed.
import heapq
nums = [5, 3, 8, 1, 9, 2]
# heapify — transform list into a heap in-place, O(n)
heapq.heapify(nums) # nums is now [1, 3, 2, 5, 9, 8]
# heappush — add element, O(log n)
heapq.heappush(nums, 4)
# heappop — remove and return smallest, O(log n)
smallest = heapq.heappop(nums) # 1
# peek — look at smallest without removing, O(1)
smallest = nums[0]
# heappushpop — push then pop (more efficient than separate operations)
result = heapq.heappushpop(nums, 6)
# heapreplace — pop then push (more efficient than separate operations)
result = heapq.heapreplace(nums, 7)Key fact: heapq is min-heap only. There is no built-in max-heap in Python.
The Max-Heap Trick: Negate Values
Since heapq only supports min-heap, simulate a max-heap by negating values:
import heapq
nums = [5, 3, 8, 1, 9]
# Max-heap: negate on push, negate on pop
max_heap = [-x for x in nums]
heapq.heapify(max_heap)
# Get largest
largest = -heapq.heappop(max_heap) # 9
# Push a new value
heapq.heappush(max_heap, -7) # push 7 as -7
# Peek at largest
largest = -max_heap[0]This is the standard Python idiom. Every competitive programmer and interviewer knows it.
nlargest and nsmallest
For finding the top/bottom K elements without maintaining a heap yourself:
import heapq
nums = [5, 3, 8, 1, 9, 2, 7]
# K largest elements
heapq.nlargest(3, nums) # [9, 8, 7]
# K smallest elements
heapq.nsmallest(3, nums) # [1, 2, 3]
# With key function
tasks = [('low', 3), ('high', 1), ('med', 2)]
heapq.nsmallest(2, tasks, key=lambda x: x[1]) # [('high', 1), ('med', 2)]When to use what:
nlargest/nsmallest— best when K is small relative to Nsorted(nums)[:k]— best when K is close to Nheapq.heapify+ K pops — best when you need to process elements one at a time
Tuple Priority
heapq compares elements directly. For tuples, it compares element by element — first element is the primary key, second breaks ties, etc.
import heapq
# Priority queue with (priority, item)
pq = []
heapq.heappush(pq, (3, "low priority"))
heapq.heappush(pq, (1, "high priority"))
heapq.heappush(pq, (2, "medium priority"))
priority, item = heapq.heappop(pq) # (1, "high priority")Tie-breaking problem: If priorities are equal and items aren't comparable, Python raises a TypeError. Solution: add a unique counter as a tiebreaker. Comparable types: str, int, float, tuple, list. Not comparable: dict, set, custom objects without __lt__.
import heapq
class PriorityQueue:
def __init__(self):
self._heap = []
self._counter = 0
def push(self, priority, item):
heapq.heappush(self._heap, (priority, self._counter, item))
self._counter += 1
def pop(self):
priority, _, item = heapq.heappop(self._heap)
return priority, item# Usage — items are dicts (not comparable without counter)
pq = PriorityQueue()
pq.push(3, {"task": "email"})
pq.push(1, {"task": "deploy"})
pq.push(1, {"task": "fix bug"}) # same priority — FIFO via counter
print(pq.pop()) # (1, {"task": "deploy"}) — highest priority, arrived first
print(pq.pop()) # (1, {"task": "fix bug"}) — same priority, arrived second
print(pq.pop()) # (3, {"task": "email"})
# Without the counter, (1, {"task": "deploy"}) < (1, {"task": "fix bug"})
# would crash with TypeError — dicts aren't comparableThe counter ensures FIFO order for equal priorities and prevents comparison of incomparable items.
Class with __lt__ (Alternative)
Instead of tuples, define __lt__ on a class — heapq only needs < to work:
import heapq
class Task:
def __init__(self, priority, name):
self.priority = priority
self.name = name
def __lt__(self, other):
return self.priority < other.priority
pq = []
heapq.heappush(pq, Task(3, "email"))
heapq.heappush(pq, Task(1, "deploy"))
print(heapq.heappop(pq).name) # "deploy"No counter trick needed — Python never compares beyond __lt__. Use this when objects have many fields. For interviews, tuples are faster to write.
@dataclass(order=True) (Cleanest)
Auto-generates comparison methods from fields. Exclude fields with field(compare=False):
from dataclasses import dataclass, field
from typing import Any
import heapq
@dataclass(order=True)
class PrioritizedItem:
priority: int
item: Any = field(compare=False)
pq = []
heapq.heappush(pq, PrioritizedItem(3, "email"))
heapq.heappush(pq, PrioritizedItem(1, "deploy"))
print(heapq.heappop(pq).item) # "deploy"No __lt__ boilerplate — order=True handles it. The compare=False field is ignored during sorting, so incomparable items (dicts, custom objects) won't crash.
| Approach | Use case |
|---|---|
(priority, counter, item) tuple |
Interviews — fast, minimal code |
__lt__ class |
Many fields, custom comparison logic |
@dataclass(order=True) |
Clean production code, exclude fields easily |
Pattern: Top-K Elements
Find the K largest (or smallest) elements from a stream or array. Use a heap of size K.
LeetCode 215 - Kth Largest Element in an Array
import heapq
def findKthLargest(nums: list[int], k: int) -> int:
heap = []
for n in nums:
heapq.heappush(heap, n)
if len(heap) > k:
heapq.heappop(heap) # remove smallest, keeping K largest
return heap[0] # smallest of the K largest = Kth largestWhy not sort? Sorting is O(n log n). A heap of size K processes each element in O(log k). When k << n, the heap wins. Also works on streams where you don't have all data upfront.
LeetCode 347 - Top K Frequent Elements
import heapq
from collections import Counter
def topKFrequent(nums: list[int], k: int) -> list[int]:
counts = Counter(nums)
return [x for x, _ in heapq.nlargest(k, counts.items(), key=lambda x: x[1])]Without nlargest — min-heap of size k, evict least frequent:
def topKFrequent(nums: list[int], k: int) -> list[int]:
freq = Counter(nums)
heap = []
for num, count in freq.items():
heapq.heappush(heap, (count, num))
if len(heap) > k:
heapq.heappop(heap) # evict smallest frequency
return [num for count, num in heap]
# O(n log k) — heap never exceeds size kPattern: Merge K Sorted Lists
Merge K sorted sequences into one sorted sequence. Push the first element of each list into a min-heap. Pop the smallest, push the next element from that list.
LeetCode 23 - Merge K Sorted Lists
import heapq
def mergeKLists(lists: List[Optional[ListNode]]) -> Optional[ListNode]:
heap: list[tuple[int, int, ListNode]] = [] # (val, index, node)
# Push the head of each list; index breaks ties (avoids comparing ListNodes)
for i, lst in enumerate(lists):
if lst:
heapq.heappush(heap, (lst.val, i, lst))
dummy = curr = ListNode(0) # dummy head to simplify appending
while heap:
val, i, node = heapq.heappop(heap) # smallest across all k lists
curr.next = node # append to result
curr = curr.next
if node.next: # push the next node from the same list
heapq.heappush(heap, (node.next.val, i, node.next))
return dummy.next # skip dummy, return actual headO(n log k) where n is total elements and k is number of lists. The index i serves as a tiebreaker to avoid comparing ListNode objects.
Pattern: Running Median (Two Heaps)
Maintain a max-heap for the lower half and a min-heap for the upper half. The median is always at the top of one or both heaps.
LeetCode 295 - Find Median from Data Stream
import heapq
class MedianFinder:
def __init__(self):
self.lo = [] # max-heap (negated) — lower half
self.hi = [] # min-heap — upper half
def addNum(self, num: int) -> None:
heapq.heappush(self.lo, -num)
# Ensure max of lo <= min of hi
heapq.heappush(self.hi, -heapq.heappop(self.lo))
# Balance sizes: lo can have at most 1 more than hi
if len(self.hi) > len(self.lo):
heapq.heappush(self.lo, -heapq.heappop(self.hi))
def findMedian(self) -> float:
if len(self.lo) > len(self.hi):
return -self.lo[0]
return (-self.lo[0] + self.hi[0]) / 2Each insertion is O(log n). Getting the median is O(1).
Operation Costs
| Operation | Heap | Sorted List | Unsorted List |
|---|---|---|---|
| Find min | O(1) | O(1) | O(n) |
| Insert | O(log n) | O(n)* | O(1) |
| Extract min | O(log n) | O(1)** | O(n) |
| Build from list | O(n) | O(n log n) | O(1) |
*O(n) due to shifting. **O(1) if popping from the end of a sorted list (max).
The heap is the sweet spot: fast extraction AND fast insertion.
When to Use Heaps
- "Give me the K largest/smallest" → min/max heap of size K
- Scheduling by priority → priority queue
- Merging sorted streams → min-heap
- Running statistics (median, percentiles) → two heaps
- Dijkstra's shortest path → min-heap of distances
- Huffman coding → min-heap of frequencies
When NOT to Use Heaps
- You need to search for arbitrary elements → dict
- You need all elements sorted → just sort the list
- You only need min OR max once →
min()/max()is O(n) and simpler - You need the kth element once from static data →
heapq.nlargest(k, nums)[-1]or quickselect
Practice Problems
| Problem | Difficulty | Link |
|---|---|---|
| Kth Largest Element in an Array | Medium | LeetCode 215 |
| Top K Frequent Elements | Medium | LeetCode 347 |
| Merge K Sorted Lists | Hard | LeetCode 23 |
| Find Median from Data Stream | Hard | LeetCode 295 |
| Task Scheduler | Medium | LeetCode 621 |
| Last Stone Weight | Easy | LeetCode 1046 |
Key Takeaways
heapqoperates on regular Python lists — no special class neededheapqis min-heap only — negate values for max-heap behaviorheapify()is O(n),heappush()/heappop()are O(log n)- Use tuples for priority:
(priority, tiebreaker, item) - Add a counter as tiebreaker when items aren't comparable
- Top-K, merge K sorted, and running median are the three core heap patterns
nlargest(k, iterable)andnsmallest(k, iterable)are convenient for one-shot queries