10 - Deadlocks & Race Conditions


Concurrency bugs are the hardest to find because they don't always happen. Your code works 99 times, then fails on the 100th. It'll pass every test on your machine and blow up in production. This lesson covers how to detect, debug, and prevent the two most common concurrency bugs.

Deadlock

A deadlock happens when goroutines are waiting for each other and none can proceed. Go detects some deadlocks at runtime and crashes with fatal error: all goroutines are asleep - deadlock!

Classic Deadlock: Unbuffered Channel

func main() {
    ch := make(chan int)
    ch <- 1 // blocks forever — no goroutine to receive
}
// fatal error: all goroutines are asleep - deadlock!

The send blocks because nobody is receiving. Since main is the only goroutine, Go detects the deadlock.

Deadlock: Two Goroutines Waiting on Each Other

func main() {
    ch1 := make(chan int)
    ch2 := make(chan int)

    go func() {
        fmt.Println("goroutine 1 started")
        val := <-ch1 // waiting for ch1
        ch2 <- val   // will send to ch2
        fmt.Println("goroutine 1 done") // never prints
    }()

    go func() {
        fmt.Println("goroutine 2 started")
        val := <-ch2 // waiting for ch2
        ch1 <- val   // will send to ch1
        fmt.Println("goroutine 2 done") // never prints
    }()

    // without this, main exits immediately and goroutines never get a chance to run
    time.Sleep(time.Second) // BAD: masks the deadlock — main isn't blocked, so Go won't detect it
    fmt.Println("done") // always prints — main doesn't know about the deadlock
}

Goroutine 1 waits for ch1. Goroutine 2 waits for ch2. Neither sends first. Deadlock. But main sleeps, wakes up, prints "done", and exits — completely unaware. Go's runtime only detects deadlocks when ALL goroutines are blocked. Since main is sleeping (not blocked on a channel), the runtime doesn't crash. The deadlocked goroutines are silently abandoned.

Deadlock: Lock Ordering

var mu1, mu2 sync.Mutex

// Goroutine A
go func() {
    mu1.Lock()
    time.Sleep(time.Millisecond) // force interleaving (deliberate, to trigger deadlock)
    mu2.Lock() // waits for mu2
    // ...
    mu2.Unlock()
    mu1.Unlock()
}()

// Goroutine B
go func() {
    mu2.Lock()
    time.Sleep(time.Millisecond) // force interleaving (deliberate, to trigger deadlock)
    mu1.Lock() // waits for mu1
    // ...
    mu1.Unlock()
    mu2.Unlock()
}()

A holds mu1, wants mu2. B holds mu2, wants mu1. Classic deadlock.

The fix: always acquire locks in the same order.

// Both goroutines lock mu1 first, then mu2
mu1.Lock()
mu2.Lock()
// ...
mu2.Unlock()
mu1.Unlock()

This is exactly how the Dining Philosophers problem (next lesson) is solved — always pick up the lower-numbered fork first.

Preventing Deadlocks

  1. Consistent lock ordering — always acquire multiple locks in the same order
  2. Timeouts — use context.WithTimeout instead of blocking forever
  3. Avoid nested locks — if you hold lock A, don't try to acquire lock B
  4. Use channels instead of mutexes when possible — channels have clearer ownership

Race Conditions

A race condition happens when multiple goroutines access shared data concurrently and at least one of them writes. The result depends on timing and is unpredictable — the same code can produce different answers on different runs. This is the bug that makes you question your sanity.

func main() {
    counter := 0
    var wg sync.WaitGroup

    for i := 0; i < 1000; i++ {
        wg.Add(1)
        go func() {
            defer wg.Done()
            counter++ // race condition
        }()
    }

    wg.Wait()
    fmt.Println(counter) // not always 1000
}

counter++ is not atomic. It's read → increment → write. Two goroutines can read the same value, both increment, and one write is lost.

The Race Detector

Go has a built-in race detector, and it's one of the best tools in the language. The compiler won't catch race conditions — your code compiles fine, runs fine most of the time, and silently corrupts data. The race detector catches what the compiler can't. Use the -race flag.

go run -race main.go

Output:

==================
WARNING: DATA RACE
Read at 0x00c0000b4010 by goroutine 8:
  main.main.func1()
      /path/main.go:14 +0x6a

Previous write at 0x00c0000b4010 by goroutine 7:
  main.main.func1()
      /path/main.go:14 +0x80

Goroutine 8 (running) created at:
  main.main()
      /path/main.go:12 +0x98
==================

The race detector tells you exactly which goroutines are racing and where. Use it in development and CI.

go test -race ./...    # run tests with race detection
go build -race         # build with race detection (slower, for testing)

Always run tests with -race. No exceptions. Make it part of your CI pipeline. If you skip this, you're flying blind.

Common Race Conditions

Map Concurrent Access

m := make(map[string]int)

// This will crash with: fatal error: concurrent map writes
var wg sync.WaitGroup
for i := 0; i < 100; i++ {
    wg.Add(1)
    go func(n int) {
        defer wg.Done()
        m[fmt.Sprintf("key-%d", n)] = n // race + crash
    }(i)
}

Maps are not goroutine-safe. Fix with a mutex or sync.Map.

Slice Append

var results []int
var wg sync.WaitGroup

for i := 0; i < 100; i++ {
    wg.Add(1)
    go func(n int) {
        defer wg.Done()
        results = append(results, n) // race condition
    }(i)
}

append may reallocate the underlying array. Multiple goroutines appending simultaneously corrupt the slice. Fix with a mutex or collect through a channel.

Loop Variable Capture

for i := 0; i < 5; i++ {
    go func() {
        fmt.Println(i) // race — captures the variable, not the value
    }()
}

By the time the goroutine runs, i may have changed. This is one of the most common Go bugs — it bites everyone at least once. Fix by passing as a parameter:

for i := 0; i < 5; i++ {
    go func(n int) {
        fmt.Println(n) // safe — n is a copy
    }(i)
}

Note: Go 1.22+ changed loop variable semantics so each iteration gets its own variable. But passing as a parameter is still clearer and works in all versions.

Debugging Techniques

1. Race Detector

Always the first tool. Catches most data races.

go test -race -count=1 ./...

-count=1 disables test caching so races are checked every time.

2. Goroutine Dump

If your program hangs, press Ctrl+\ in the terminal (on Unix-like systems) to send SIGQUIT. Go dumps the stack trace of every goroutine and exits — no setup needed, it's built into the runtime.

goroutine 1 [sleep]:
    time.Sleep(...)
    main.main()
        /path/main.go:38

goroutine 5 [chan receive]:
    main.main.func1()
        /path/main.go:28

goroutine 6 [chan receive]:
    main.main.func2()
        /path/main.go:33

You can see exactly where each goroutine is stuck. Two goroutines blocked on chan receive — instant deadlock diagnosis.

You can also dump goroutine stacks from code — useful for logging or HTTP debug endpoints:

func dumpGoroutines() {
    buf := make([]byte, 1<<16)          // 64KB buffer
    n := runtime.Stack(buf, true)        // true = capture all goroutines
    fmt.Printf("=== goroutine dump ===\n%s\n", buf[:n])
}

Call dumpGoroutines() when you suspect a hang — it prints the same output as Ctrl+\ but without killing the program.

3. pprof for Goroutine Leaks

If your goroutine count keeps growing over time, you have a leak — goroutines that start but never exit. Add a debug server to inspect them while your app is running:

import (
    "net/http"
    _ "net/http/pprof" // blank import registers the /debug/pprof/* handlers on the default mux
)

go func() {
    http.ListenAndServe("localhost:6060", nil)
}()

While your app runs, visit these in a browser:

  • http://localhost:6060/debug/pprof/goroutine?debug=1 — list all goroutines and where they're blocked
  • http://localhost:6060/debug/pprof/goroutine?debug=2 — full stack traces

If you see hundreds of goroutines stuck on the same line (e.g. chan receive or select), that's your leak. Only enable this in development or behind an internal network — don't expose it publicly.

Prevention Checklist

  • Deadlock: consistent lock ordering, timeouts, avoid nested locks
  • Data race on a variable: mutex, atomic, or channel
  • Data race on a map: sync.Map, or a plain map guarded by an RWMutex
  • Data race on a slice: mutex, or collect results via a channel
  • Goroutine leak: always pass a context, close channels, check ctx.Done()
  • Loop variable capture: pass the variable as a goroutine parameter
  • Panic in a goroutine: wrap the body with a defer recover()

Key Takeaways

  • Deadlocks: goroutines waiting on each other forever. Fix with consistent lock ordering and timeouts
  • Race conditions: concurrent access to shared data. Fix with mutexes, atomics, or channels
  • go run -race and go test -race are essential — use them always
  • Go only detects deadlocks when ALL goroutines are blocked — partial deadlocks hang silently
  • Maps and slices are not goroutine-safe — protect them or use concurrent alternatives
  • Pass loop variables as parameters to goroutines
  • Use pprof to detect goroutine leaks in long-running services
  • Make -race part of your CI pipeline


© 2026 ByteLearn.dev. Free courses for developers.