01. How to Think About Complexity

📋 Jump to Takeaways

Every algorithm has a cost. Big-O notation tells you how that cost grows as input gets bigger. It's the single most important concept in DSA — every data structure choice and every algorithm decision comes back to "what's the complexity?"

The Core Idea

Big-O measures growth rate, not exact speed. An O(n) algorithm isn't "fast" — it means the work scales linearly with input size. Double the input, double the work.

# O(n) — one loop through the data
def total(nums: list[int]) -> int:
    result = 0
    for n in nums:
        result += n
    return result

Constants don't matter. O(2n) is O(n). O(n/2) is O(n). Big-O cares about the shape of the growth curve, not the exact multiplier.

The Five Common Classes

Almost every algorithm you'll encounter falls into one of these:

Class	Name	Example
O(1)	Constant	List access by index, dict lookup
O(log n)	Logarithmic	Binary search, bisect module
O(n)	Linear	Single loop, `in` on a list
O(n log n)	Linearithmic	`sorted()`, `list.sort()` (Timsort)
O(n²)	Quadratic	Nested loops, bubble sort

With 1 million elements:

O(1) → 1 operation
O(log n) → ~20 operations
O(n) → 1,000,000 operations
O(n log n) → ~20,000,000 operations
O(n²) → 1,000,000,000,000 operations

The jump from O(n) to O(n²) is where algorithms go from "instant" to "won't finish before the heat death of the universe."

Beyond Quadratic

These classes appear in brute-force solutions and exhaustive search. They're usually too slow for large inputs but acceptable when n is small.

Class	Name	Example
O(n³)	Cubic	Three nested loops, naive matrix multiplication
O(2ⁿ)	Exponential	Recursive with two branches per call, subsets
O(n!)	Factorial	Generating all permutations

Time vs Space

Every algorithm has two costs:

Time complexity — how many operations it performs.

Space complexity — how much extra memory it needs.

# O(n) time, O(n) space — builds a new list
def double(nums: list[int]) -> list[int]:
    return [n * 2 for n in nums]

# O(n) time, O(1) space — modifies in place
def double_in_place(nums: list[int]) -> None:
    for i in range(len(nums)):
        nums[i] *= 2

Often you trade one for the other. Use extra memory to save time (caching, hash maps). Use extra time to save memory (recompute instead of store).

How to Analyze Code

Count the loops:

No loops → O(1)
One loop over n → O(n)
Nested loops over n → O(n²)
Loop that halves the problem each step → O(log n)
One loop with a sort inside → O(n log n)

# O(n²) — nested loops, both depend on n
def all_pairs(nums: list[int]) -> None:
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            print(nums[i], nums[j])

# O(n log n) — sort dominates
def sorted_unique(nums: list[int]) -> list[int]:
    nums.sort()  # O(n log n)
    result = [nums[0]]
    for i in range(1, len(nums)):  # O(n)
        if nums[i] != nums[i - 1]:
            result.append(nums[i])
    return result  # Total: O(n log n) + O(n) = O(n log n)

When you have sequential steps, take the largest. When you have nested steps, multiply.

Python-Specific Costs You Must Know

Python's built-in operations have complexities that aren't always obvious:

Operation	Time	Notes
`list[i]`	O(1)	Index access
`list.append(x)`	O(1) amortized	Dynamic array doubling
`list.pop()`	O(1)	Remove from end
`list.pop(0)`	O(n)	Shifts all elements left
`list.insert(i, x)`	O(n)	Shifts elements right
`x in list`	O(n)	Linear scan
`x in set`	O(1)	Hash lookup
`x in dict`	O(1)	Hash lookup
`dict[key]`	O(1)	Hash lookup
`sorted(list)`	O(n log n)	Timsort
`list.sort()`	O(n log n)	In-place Timsort
`"".join(parts)`	O(n)	Total chars
`s + t` (strings)	O(len(s) + len(t))	Creates new string

The critical trap: x in list is O(n), x in set is O(1). Converting a list to a set for repeated lookups can change your algorithm from O(n²) to O(n).

# O(n²) — linear search inside a loop
def has_common_brute(a: list[int], b: list[int]) -> bool:
    for x in a:        # O(n)
        if x in b:     # O(n) — list scan!
            return True
    return False

# O(n) — set lookup inside a loop
def has_common_fast(a: list[int], b: list[int]) -> bool:
    b_set = set(b)     # O(n) one-time cost
    for x in a:        # O(n)
        if x in b_set: # O(1) — hash lookup
            return True
    return False

Amortized Complexity

Some operations are usually cheap but occasionally expensive. Python's list.append() is O(1) amortized — most calls just write to the next slot, but occasionally the list doubles its capacity and copies everything.

nums = []
nums.append(1)  # O(1) — fits in capacity
nums.append(2)  # O(1)
nums.append(3)  # O(1)
nums.append(4)  # O(1)
nums.append(5)  # O(n) — capacity exceeded, allocate and copy

Over n appends, the total cost is O(n), so each append is O(1) amortized. You'll see this pattern in dynamic arrays, hash maps, and union-find.

Best, Worst, Average

Big-O usually describes the worst case. But sometimes the distinction matters:

Binary search: O(log n) worst, O(1) best (target is the middle element)
Timsort: O(n log n) worst, O(n) best (already sorted data)
Dict lookup: O(n) worst (all keys collide), O(1) average

In interviews, state the worst case unless asked otherwise. In practice, average case often matters more — Python's Timsort is faster than textbook mergesort on real data because it exploits existing order.

The Practical Rule

For competitive programming and interviews, these are the rough limits for n:

n	Max acceptable complexity
≤ 20	O(2ⁿ), O(n!)
≤ 500	O(n³)
≤ 10,000	O(n²)
≤ 1,000,000	O(n log n)
≤ 100,000,000	O(n)
Any	O(log n), O(1)

If n is 10⁶ and your solution is O(n²), it won't pass. This table tells you which algorithmic class you need before you start coding.

Python caveat: Python is ~10-100x slower than C/Go. For LeetCode, the constants matter more. A tight O(n) in Python can time out where a sloppy O(n) in C++ passes. When n approaches 10⁶, favor built-in functions (implemented in C) over pure Python loops.

Practice Problems

These problems have brute-force solutions that are too slow. The challenge is recognizing which complexity class you need and picking the right approach.

Problem	Difficulty	Why it fits
Contains Duplicate	Easy	O(n²) brute force vs O(n) with a set
Two Sum	Easy	O(n²) nested loops vs O(n) with a dict
3Sum	Medium	O(n³) brute force vs O(n²) with sorting + two pointers
Search in Rotated Sorted Array	Medium	Must be O(log n) — linear scan won't pass
Kth Largest Element in an Array	Medium	O(n log n) sort vs O(n) quickselect — input size determines which matters

Key Takeaways

Big-O measures growth rate, not absolute speed — constants are dropped
Five classes cover almost everything: O(1), O(log n), O(n), O(n log n), O(n²)
Time and space are separate costs — you often trade one for the other
Count loops to analyze: one loop = O(n), nested = O(n²), halving = O(log n)
Amortized O(1) means "usually O(1), occasionally O(n), averages out"
Use the input size to determine which complexity class your solution needs
Know Python's costs: in list is O(n), in set is O(1) — this single fact changes algorithm design
Python is slower than compiled languages — favor built-in C-implemented functions for hot paths