Goroutines
A goroutine is a function call that runs concurrently with the rest of the
program. Starting cost: a 2KB stack and one keyword (go). The runtime
scheduler multiplexes thousands of them onto a handful of OS threads. You can
spawn 100,000 goroutines without breaking a sweat; that's the headline feature.
1 · The intuition
An OS thread typically reserves ~2MB of stack and takes a few microseconds to start, so 100,000 threads is out of reach on most machines. A goroutine starts with a 2KB stack that grows on demand: when a call would overflow it, the runtime copies the stack into a larger allocation. The scheduler multiplexes goroutines onto a small pool of OS threads and, by default, runs Go code on at most one thread per CPU core at a time.
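To see on-demand stack growth in action, here is a minimal sketch. It runs a deliberately deep recursion inside a goroutine; the helper names `depth` and `deepInGoroutine` and the depth of 100,000 are illustrative choices, not part of the original text.

```go
package main

import "fmt"

// depth recurses n levels and returns how deep it got.
// Every level needs its own stack frame, so a large n cannot fit
// in the goroutine's small starting stack.
func depth(n int) int {
	if n == 0 {
		return 0
	}
	return 1 + depth(n-1)
}

// deepInGoroutine runs the recursion on a fresh goroutine and
// hands the result back over a channel.
func deepInGoroutine(n int) int {
	result := make(chan int)
	go func() {
		result <- depth(n)
	}()
	return <-result
}

func main() {
	// The runtime grows the stack as the recursion deepens;
	// the code never asks for more stack explicitly.
	fmt.Println("reached depth", deepInGoroutine(100_000))
}
```

Nothing in the program configures a stack size. The goroutine simply starts small and the runtime enlarges its stack when the recursion needs it.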
2 · Try it — the bare minimum
package main

import (
	"fmt"
	"time"
)

func say(name string) {
	for i := 0; i < 3; i++ {
		fmt.Println(name, i)
		time.Sleep(100 * time.Millisecond)
	}
}

func main() {
	go say("alice")
	go say("bob")
	say("carol")
}
// Output interleaves all three; exact order varies.
Three goroutines run concurrently. Note: when main returns, the
entire program ends, even if other goroutines are still working. Use
synchronization primitives (next sections) to wait for them.
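As a small preview of what waiting can look like, the sketch below has the caller block until a goroutine finishes, by ranging over a channel that the goroutine closes when it is done. The helper `collect` is hypothetical, introduced only for this illustration.

```go
package main

import "fmt"

// collect starts a goroutine that produces n lines, then waits for
// all of them before returning.
func collect(name string, n int) []string {
	lines := make(chan string)
	go func() {
		defer close(lines) // closing tells the receiver no more values are coming
		for i := 0; i < n; i++ {
			lines <- fmt.Sprintf("%s %d", name, i)
		}
	}()

	var out []string
	for l := range lines { // blocks until the goroutine closes the channel
		out = append(out, l)
	}
	return out
}

func main() {
	// main cannot return early here: collect only returns once
	// the goroutine has finished and closed the channel.
	for _, l := range collect("alice", 3) {
		fmt.Println(l)
	}
}
```

The design choice here is that the goroutine owns the channel and closes it, so the receiver has an unambiguous "finished" signal.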
3 · Wait for completion — sync.WaitGroup
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1) // increment counter
		go func(n int) {
			defer wg.Done() // decrement on exit
			fmt.Printf("worker %d\n", n)
		}(i) // pass i as an argument; don't capture the loop var
	}
	wg.Wait() // block until counter is 0
	fmt.Println("all done")
}
4 · The scheduler — what the runtime is doing
The Go scheduler follows the M:P:G model. M = OS thread. P = processor (a logical scheduling context), one per CPU core by default. G = goroutine. Each P has a local run queue of Gs; when a P empties its queue, it steals work from a busier P.
Goroutines yield at function calls, channel sends/receives, system calls, GC synchronisation points, and (since 1.14) preemptively when they run too long. You don't think about any of this — but it's why Go scales.
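The P and G counts described above can be inspected from inside a program. This is a minimal sketch using the standard runtime package; the helper names `currentPs` and `setPs` are made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// currentPs reports how many Ps the scheduler is using.
// Passing 0 to GOMAXPROCS queries the value without changing it.
func currentPs() int { return runtime.GOMAXPROCS(0) }

// setPs changes the number of Ps and returns the previous setting.
func setPs(n int) int { return runtime.GOMAXPROCS(n) }

func main() {
	fmt.Println("logical CPUs:", runtime.NumCPU())
	fmt.Println("Ps:", currentPs())

	prev := setPs(2) // similar in effect to setting GOMAXPROCS=2 in the environment
	fmt.Println("Ps after change:", currentPs())
	setPs(prev) // restore the original setting

	fmt.Println("goroutines right now:", runtime.NumGoroutine())
}
```

Changing the P count at runtime is rarely needed in practice; the sketch does it only to show that the number is a setting rather than a fixed property of the machine.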
runtime.NumGoroutine() reports the current goroutine count. Running
GOMAXPROCS=4 go run main.go limits the program to 4 Ps. The
scheduler internals live in runtime/proc.go (~6000 LOC), worth a
skim once you're comfortable.
5 · The cost of one goroutine
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	start := time.Now()
	var wg sync.WaitGroup
	n := 100_000
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			time.Sleep(10 * time.Millisecond)
		}()
	}
	// Sampled once, right after the spawn loop. Goroutines that have
	// already finished are not counted, so this can be lower than n.
	alive := runtime.NumGoroutine()
	wg.Wait()
	fmt.Printf("spawned %d goroutines\n", n)
	fmt.Printf("alive after spawning: %d\n", alive)
	fmt.Printf("total time: %v\n", time.Since(start))
}
100,000 goroutines in roughly 50ms of wall time (the exact figure depends on your machine). The same workload on OS threads would either exhaust memory or take seconds. This is the lever that lets Go HTTP servers handle 100k concurrent connections per box.
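The memory side of the cost can be estimated too. The sketch below parks a batch of goroutines and compares the stack memory the runtime reports before and after. The helper `stackBytesPerGoroutine` and the batch size of 10,000 are assumptions for illustration, and the result is an approximation, not a precise accounting.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackBytesPerGoroutine parks n goroutines, measures how much stack
// memory the runtime reports in use, and returns the average per goroutine.
func stackBytesPerGoroutine(n int) uint64 {
	var before, after runtime.MemStats
	runtime.GC() // settle memory so the baseline is less noisy
	runtime.ReadMemStats(&before)

	var started, exited sync.WaitGroup
	release := make(chan struct{})
	started.Add(n)
	exited.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer exited.Done()
			started.Done()
			<-release // stay alive until the measurement is taken
		}()
	}
	started.Wait()
	runtime.ReadMemStats(&after)

	close(release) // let every goroutine finish
	exited.Wait()

	if after.StackInuse <= before.StackInuse {
		return 0
	}
	return (after.StackInuse - before.StackInuse) / uint64(n)
}

func main() {
	fmt.Printf("about %d bytes of stack per goroutine\n", stackBytesPerGoroutine(10_000))
}
```

The figure should land near the starting stack size, though it varies with Go version and platform, and it covers only stack memory, not the small descriptor the runtime also keeps for each goroutine.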
6 · The patterns you'll write
import "sync"

// Job is an assumed interface: anything with a Process method.
type Job interface{ Process() }

// Fan-out: distribute work across N goroutines
func fanOut(jobs []Job, workers int) {
	var wg sync.WaitGroup
	in := make(chan Job, workers)
	// Start workers
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range in {
				job.Process()
			}
		}()
	}
	// Feed
	for _, j := range jobs {
		in <- j
	}
	close(in) // lets each worker's range loop end
	wg.Wait()
}

// Bounded worker pool (preferred over "go for every request")
type Pool struct {
	sem chan struct{}
}

func NewPool(size int) *Pool { return &Pool{sem: make(chan struct{}, size)} }

// Run blocks while size tasks are already in flight, then starts f.
func (p *Pool) Run(f func()) {
	p.sem <- struct{}{} // acquire
	go func() {
		defer func() { <-p.sem }() // release
		f()
	}()
}
7 · From the wild
func (srv *Server) Serve(l net.Listener) error {
	// The main accept loop: one goroutine here.
	for {
		rw, err := l.Accept()
		if err != nil {
			return err
		}
		// Each connection becomes its own goroutine. Cheap.
		c := srv.newConn(rw)
		go c.serve(srv.connCtx())
	}
}
// 100k concurrent HTTP connections = 100k goroutines.
// Without goroutines, you'd need a thread pool + epoll-style event loop.
// With goroutines, the scheduler turns blocking I/O into "just yield to another G".
8 · Coming from another language?
| If you know… | The bridge |
|---|---|
| Python | ≈ asyncio.create_task but without async/await — every function is "async" implicitly. No GIL — true parallelism by default. |
| JavaScript / Node | ≈ a promise that doesn't block, but not single-threaded either. There is no separate event loop; the scheduler plays that role. |
| Java | ≈ Thread in usage but tiny in cost, like virtual threads (Project Loom). Go had them about a decade before Loom. |
| Rust | ≈ tokio::spawn but no async/await. The Go runtime is the executor; in Rust you choose one. |
| Erlang | The closest match. Lightweight processes, scheduled by a runtime, communicate via messages (channels). |
9 · Common mistakes
- Goroutine leaks. A goroutine blocked receiving from a channel that nobody ever sends on or closes never exits. Always have a cancellation path: context.Context, a closed signal channel, or a timeout.
- Capturing the loop variable. Pre-1.22, for i := range xs { go func() { use(i) }() } shares one i across all iterations, so the goroutines typically see its final value. On older Go versions, pass it as an argument or shadow it with i := i.
- Sharing variables without synchronization. Two goroutines writing to the same variable is a data race. Run with -race to catch it.
- Unbounded goroutine spawning. 10M goroutines can fit in RAM on a large machine (about 20GB of stacks at 2KB each), but the scheduler thrashes. Use a worker pool with bounded concurrency.
- Assuming order. Goroutine output interleaves arbitrarily. If you need order, serialize through a channel or a mutex.
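The cancellation path from the first item can be sketched as follows. This is one possible shape, assuming a worker that reads from a jobs channel; the names `worker` and `stopsWhenCancelled` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker processes jobs until the channel is closed or ctx is cancelled.
// Without the ctx.Done() case it would block forever on an idle channel.
func worker(ctx context.Context, jobs <-chan int, exited chan<- struct{}) {
	defer close(exited)
	for {
		select {
		case <-ctx.Done():
			return
		case j, ok := <-jobs:
			if !ok {
				return
			}
			fmt.Println("job", j)
		}
	}
}

// stopsWhenCancelled starts a worker on a channel nobody sends to,
// cancels it, and reports whether it exited within one second.
func stopsWhenCancelled() bool {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan int)
	exited := make(chan struct{})
	go worker(ctx, jobs, exited)

	cancel()
	select {
	case <-exited:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("worker exited after cancel:", stopsWhenCancelled())
}
```

The select gives the goroutine two ways out: the jobs channel closing, or the context being cancelled. Either one is enough to keep it from leaking.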
10 · Exercises (~15 min)
- Race detector. Write two goroutines writing to the same int. Run with go run -race main.go. Watch it fire.
- Goroutine count. Spawn 10000 goroutines that sleep for 1 second. Use runtime.NumGoroutine() before, during, and after. What does the peak look like?
- Bounded pool. Adapt the pattern from section 6 to limit a workload of 1000 tasks to 10 concurrent. Measure: how does total time compare to "spawn all 1000 at once"?
- Leak it. Start a goroutine that blocks on <-ch, where ch is never sent to or closed. Print runtime.NumGoroutine() after main's sleep. The leak is visible.
11 · When it clicks
- You spawn a goroutine for any concurrent work without thinking about cost.
- You instinctively pair every go func() with a "how does this exit" plan.
- You reach for a bounded pool over unbounded spawning.
- You run go test -race as part of every test pass.