08 - Error Handling in Concurrent Code
Error handling in sequential code is simple. Check err, return it. In concurrent code, errors happen in goroutines that can't return values to the caller. You need a strategy.
Why This Is Different
In sequential code, errors bubble up through return values. In concurrent code, a goroutine can't return an error to whoever launched it — it's running independently. If you ignore this, errors disappear silently. Your program looks like it succeeded when it didn't.
You need concurrent error handling when:
- Multiple goroutines do work and any of them can fail
- You want to stop all work when the first error happens
- You need to collect errors from all goroutines, not just the first
- A goroutine might panic and you need to prevent it from crashing the whole program
Error Channels
The simplest approach: send errors through a channel.
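package main

import (
"fmt"
"sync"
"time"
)

// doWork simulates a unit of work; only worker 3 fails.
func doWork(id int) error {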
if id == 3 {
return fmt.Errorf("worker %d failed", id)
}
time.Sleep(200 * time.Millisecond) // simulate work
return nil
}
func main() {
errCh := make(chan error, 5)
var wg sync.WaitGroup
for i := 1; i <= 5; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
if err := doWork(id); err != nil {
errCh <- err
}
}(i)
}
wg.Wait()
close(errCh)
for err := range errCh {
fmt.Println("error:", err)
}
}
Buffer the error channel to match the number of goroutines. In this example nothing reads from errCh until wg.Wait() returns, so with too small a buffer a failing goroutine can block on its send and wg.Wait() never returns.
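When the number of goroutines isn't known up front, one alternative is to keep the channel unbuffered and read from it while the workers run. The following is a minimal sketch under that assumption: the helper name runAll is hypothetical, and doWork mirrors the function above without the sleep.

```go
package main

import (
	"fmt"
	"sync"
)

// doWork mirrors the example above: only worker 3 fails.
func doWork(id int) error {
	if id == 3 {
		return fmt.Errorf("worker %d failed", id)
	}
	return nil
}

// runAll (hypothetical helper) starts n workers and returns every error they report.
func runAll(n int) []error {
	errCh := make(chan error) // unbuffered: a send blocks until the loop below receives
	var wg sync.WaitGroup

	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := doWork(id); err != nil {
				errCh <- err
			}
		}(i)
	}

	// Close the channel once every worker has returned, so the range loop ends.
	go func() {
		wg.Wait()
		close(errCh)
	}()

	var errs []error
	for err := range errCh { // receives while workers are still running
		errs = append(errs, err)
	}
	return errs
}

func main() {
	for _, err := range runAll(5) {
		fmt.Println("error:", err)
	}
}
```

The trade-off is one extra goroutine in exchange for not having to size the buffer.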
First Error Wins
Often you want to stop everything on the first error. Combine an error channel with context cancellation.
func main() {
ctx, cancel := context.WithCancel(context.Background())
defer cancel()
errCh := make(chan error, 1) // buffer 1 — only need the first error
var wg sync.WaitGroup
for i := 1; i <= 5; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
select {
case <-ctx.Done():
return
default:
}
if err := doWork(id); err != nil {
select {
case errCh <- err: // send first error
cancel() // cancel all other goroutines
default: // another goroutine already sent an error
}
}
}(i)
}
wg.Wait()
close(errCh)
if err := <-errCh; err != nil {
fmt.Println("failed:", err)
} else {
fmt.Println("all succeeded")
}
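}
The first goroutine to fail sends its error and cancels the context. Other goroutines check ctx.Done() before starting their work and exit early if it is already cancelled; work that has already started runs to completion.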
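The same first-error behavior can also be sketched without a channel, by guarding a shared variable with sync.Once. This is an illustration under assumptions rather than a replacement for the code above: firstError and the context-aware doWork signature are hypothetical names introduced here.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// doWork is a context-aware variant of the earlier function (an assumption
// for this sketch): it stops waiting as soon as ctx is cancelled.
func doWork(ctx context.Context, id int) error {
	if id == 3 {
		return fmt.Errorf("worker %d failed", id)
	}
	select {
	case <-time.After(50 * time.Millisecond): // simulate work
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// firstError (hypothetical helper) runs n workers and returns the first error
// recorded, or nil if every worker succeeded.
func firstError(n int) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	var (
		wg    sync.WaitGroup
		once  sync.Once
		first error
	)
	for i := 1; i <= n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			if err := doWork(ctx, id); err != nil {
				once.Do(func() { // only the first failing goroutine runs this
					first = err
					cancel() // signal the other goroutines to stop
				})
			}
		}(i)
	}
	wg.Wait() // all writes to first happen before Wait returns
	return first
}

func main() {
	fmt.Println("result:", firstError(5))
}
```

Because cancel is only called inside once.Do, the cancellation errors returned by the other workers arrive too late to overwrite the original failure.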
errgroup
golang.org/x/sync/errgroup wraps the "First Error Wins" pattern into a clean API. It's the standard tool for concurrent error handling in Go.
The manual pattern gives you first-error behavior, but you handle everything else yourself — WaitGroup, context, cleanup. errgroup adds:
| | Manual "First Error Wins" | errgroup |
|---|---|---|
| First error returned | You build it | Built-in |
| WaitGroup | Manual wg.Add/wg.Done | Automatic |
| Context cancellation | Wire it yourself | WithContext does it for you |
| Concurrency limit | Semaphore or channel | SetLimit |
| Boilerplate | More | Less |
Here's how it looks in practice:
go get golang.org/x/sync/errgroup

import (
"context"
"fmt"
"time"

"golang.org/x/sync/errgroup"
)
func main() {
g, ctx := errgroup.WithContext(context.Background())
for i := 1; i <= 5; i++ {
id := i
g.Go(func() error {
select {
case <-ctx.Done():
return ctx.Err()
default:
}
if id == 3 {
return fmt.Errorf("worker %d failed", id)
}
time.Sleep(200 * time.Millisecond) // simulate work
fmt.Printf("worker %d done\n", id)
return nil
})
}
if err := g.Wait(); err != nil {
fmt.Println("failed:", err)
} else {
fmt.Println("all succeeded")
}
}
errgroup.WithContext creates a group and a derived context. When any goroutine returns an error, the context is cancelled. g.Wait() blocks until all goroutines finish and returns the first error.
Important: cancelling the context doesn't kill goroutines — they have to check ctx.Done() themselves. If a goroutine ignores the context, it keeps running until it finishes on its own.
errgroup with Concurrency Limit
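errgroup can also cap the number of goroutines running at once with SetLimit. Because errgroup lives in golang.org/x/sync rather than the standard library, SetLimit's availability depends on the module version you have, not on the Go release.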
func main() {
g, ctx := errgroup.WithContext(context.Background())
g.SetLimit(3) // max 3 concurrent goroutines
urls := []string{
"https://go.dev",
"https://pkg.go.dev",
"https://github.com",
"https://example.com",
"https://httpbin.org/get",
}
for _, url := range urls {
url := url
g.Go(func() error {
req, err := http.NewRequestWithContext(ctx, "GET", url, nil)
if err != nil {
return fmt.Errorf("%s: %w", url, err)
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
return fmt.Errorf("%s: %w", url, err)
}
resp.Body.Close()
fmt.Printf("%s: %d\n", url, resp.StatusCode)
return nil
})
}
if err := g.Wait(); err != nil {
fmt.Println("error:", err)
}
}
SetLimit(3) means at most 3 goroutines run at once. This combines errgroup with semaphore behavior, so there is no need for a separate semaphore.
Notice that http.NewRequestWithContext(ctx, ...) ties each request to the errgroup's context. If one request fails, the context is cancelled and in-flight requests abort immediately. If we used http.Get(url) instead (no context), the other requests would keep running until they finish on their own.
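For comparison, this is roughly the semaphore that SetLimit saves you from writing: a buffered channel whose capacity is the concurrency limit. It is a standard-library-only sketch; runLimited and its peak counter are hypothetical, added only so the limit can be observed.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// runLimited (hypothetical) runs `tasks` goroutines with at most `limit`
// working at once and returns the highest concurrency it observed.
func runLimited(tasks, limit int) int32 {
	sem := make(chan struct{}, limit) // capacity = max concurrent workers
	var (
		wg      sync.WaitGroup
		running int32
		peak    int32
	)
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		sem <- struct{}{} // acquire a slot; blocks while `limit` workers are busy
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when the worker returns

			n := atomic.AddInt32(&running, 1)
			// Record the peak with a compare-and-swap loop.
			for {
				p := atomic.LoadInt32(&peak)
				if n <= p || atomic.CompareAndSwapInt32(&peak, p, n) {
					break
				}
			}
			time.Sleep(10 * time.Millisecond) // simulate work
			atomic.AddInt32(&running, -1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("peak concurrency:", runLimited(10, 3))
}
```

Acquiring the slot in the launching loop, before the goroutine starts, means the loop itself blocks once the limit is reached.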
Collecting All Errors
Sometimes you want every error, not just the first one. errgroup only returns the first. For all errors, use a channel or a mutex-protected slice.
func main() {
var (
mu sync.Mutex
errs []error
wg sync.WaitGroup
)
for i := 1; i <= 5; i++ {
wg.Add(1)
go func(id int) {
defer wg.Done()
if err := doWork(id); err != nil {
mu.Lock()
errs = append(errs, err)
mu.Unlock()
}
}(i)
}
wg.Wait()
if len(errs) > 0 {
fmt.Printf("%d errors:\n", len(errs))
for _, err := range errs {
fmt.Println(" -", err)
}
}
}
Then combine with errors.Join (Go 1.20+):
wg.Wait()
if len(errs) > 0 {
combined := errors.Join(errs...)
fmt.Println(combined)
// the combined error is unwrappable — you can check for specific errors
if errors.Is(combined, ErrTimeout) {
// handles any ErrTimeout in the list
}
}
errors.Join merges the slice into a single error that works with errors.Is() and errors.As(). errors.Join does no locking of its own, and the slice is only safe to read once the goroutines have stopped appending to it, so call it after wg.Wait().
Panic Recovery in Goroutines
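An unrecovered panic in any goroutine crashes the entire program, and recover only works in a deferred function running in the goroutine that panicked. Always recover in goroutines that might panic.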
func safeGo(fn func(), wg *sync.WaitGroup) {
go func() {
defer wg.Done()
defer func() {
if r := recover(); r != nil {
fmt.Println("recovered from panic:", r)
}
}()
fn()
}()
}
func main() {
var wg sync.WaitGroup
wg.Add(1)
safeGo(func() {
panic("something went wrong")
}, &wg)
wg.Wait()
fmt.Println("program still running")
}
In production, log the panic with a stack trace and continue. A safeGo wrapper is a common utility.
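One way to capture the stack trace is runtime/debug.Stack, called inside the deferred recover. The sketch below goes a step further and turns the panic into an error, so it can travel through the same channels as ordinary failures. The name safeCall is hypothetical, and printing stands in for whatever logger you use.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"sync"
)

// safeCall (hypothetical) runs fn and converts a panic into an error that
// carries the stack trace captured at recovery time.
func safeCall(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic: %v\n%s", r, debug.Stack())
		}
	}()
	fn()
	return nil
}

func main() {
	errCh := make(chan error, 1)
	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		// The panic is recovered inside safeCall, in this same goroutine.
		errCh <- safeCall(func() { panic("something went wrong") })
	}()

	wg.Wait()
	close(errCh)
	if err := <-errCh; err != nil {
		fmt.Println("worker failed:", err) // a real service would log this instead
	}
	fmt.Println("program still running")
}
```

Returning an error rather than only printing keeps the panic visible to the caller, which matters when the result decides whether other work should be cancelled.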
Which Approach to Use
| Scenario | Tool |
|---|---|
| Stop on first error | errgroup.WithContext |
| Stop on first error + limit concurrency | errgroup with SetLimit |
| Collect all errors | Mutex + slice, or buffered error channel |
| Fire-and-forget with safety | safeGo wrapper with panic recovery |
| Custom cancellation logic | Error channel + context.WithCancel |
For most cases, errgroup is the right answer. It handles the WaitGroup, context cancellation, and first-error collection in one package.
Key Takeaways
- Goroutines can't return errors — use channels or errgroup
- Buffer error channels to prevent goroutines from blocking
- errgroup.WithContext cancels the shared context on the first error; goroutines still have to check it to stop
- errgroup.SetLimit(n) combines error handling with concurrency limiting
- Collect all errors with a mutex-protected slice when you need every failure
- Always recover panics in goroutines — an unrecovered panic kills the program
- errgroup is the standard tool: use it unless you need custom behavior