Deadlocks and Races
Concurrency bugs show up rarely, hide from tests, and crash production at the worst times. Two of the most damaging are deadlocks and race conditions.
Race Conditions
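A race condition is a bug whose outcome depends on the relative order of operations that are not synchronized. The classic example is two threads incrementing a shared counter without a lock: each increment is a read, an add, and a write, so both threads can read the same value and one update is lost. The fix is to put shared state behind a lock, an atomic operation, or a single owning thread.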
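As a minimal sketch of the lock-based fix, the following guards a shared counter so that every increment happens as one indivisible step. The names `Counter` and `run_workers` are made up for illustration and do not come from any library.

```python
import threading


class Counter:
    """A shared counter whose increments are guarded by a lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # "+= 1" is a read, an add, and a write. Holding the lock makes
        # those three steps look like one to every other thread.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run_workers(counter, n_threads=4, n_increments=10_000):
    """Run n_threads threads that each increment the counter n_increments times."""

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run_workers(Counter())` always returns `n_threads * n_increments`. Removing the `with self._lock:` line in `increment` makes lost updates possible, although whether you actually observe one depends on the interpreter and the timing.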
Races are hard to find because they may only show up under load, on certain hardware, or in production. ThreadSanitizer (C and C++) detects data races directly, while Java Flight Recorder and Python's faulthandler are more useful for diagnosing lock contention and hung threads. Careful design still beats hunting bugs after the fact.
Deadlock
A deadlock happens when two or more threads each hold a resource that another one needs, and none will let go. It can occur only if all four of these conditions hold at once:
- Mutual exclusion: each lock is held by one thread at a time
- Hold and wait: a thread holds one lock while waiting for another
- No preemption: locks must be released voluntarily
- Circular wait: thread A waits on a lock held by B, which waits on a lock held by A (or a longer cycle of threads)
Break any one and you cannot deadlock.
Prevention Strategies
The simplest fix is to always acquire locks in the same global order. If everyone takes lock A before lock B, there is no circular wait. The example above shows this pattern.
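One way to get a global order when locks belong to objects is to give each object a fixed rank and always lock the lower rank first. The sketch below assumes a hypothetical `Account` class with a caller-supplied `rank`. Neither the class nor the `transfer` function comes from the original text.

```python
import threading


class Account:
    """Hypothetical account with its own lock and a fixed ordering rank."""

    def __init__(self, rank, balance):
        self.rank = rank          # unique, never changes; defines the lock order
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    """Move amount from src to dst while holding both account locks."""
    if src is dst:
        return  # nothing to do, and locking the same Lock twice would hang
    # Lock the lower-ranked account first regardless of transfer direction,
    # so every thread acquires the two locks in the same order.
    first, second = (src, dst) if src.rank < dst.rank else (dst, src)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount
```

Two threads running `transfer(a, b, 1)` and `transfer(b, a, 1)` at the same time both take the lock of the lower-ranked account first, so the circular-wait condition cannot arise.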
Other strategies:
- Try-lock with timeout: give up and retry if a lock cannot be taken quickly
- Lock-free data structures for hot paths
- Single-writer designs where only one thread mutates a piece of state
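The try-lock strategy from the list above can be sketched with the `timeout` argument of `threading.Lock.acquire`. The helper name `acquire_both` and its default values are assumptions chosen for illustration, so tune them for your workload.

```python
import random
import threading
import time


def acquire_both(lock_a, lock_b, timeout=0.05, max_attempts=20):
    """Try to take both locks. Return True on success, False after giving up.

    On success the caller owns both locks and must release them.
    """
    for attempt in range(max_attempts):
        lock_a.acquire()
        # Wait at most `timeout` seconds for the second lock.
        if lock_b.acquire(timeout=timeout):
            return True
        # Could not get lock_b: release lock_a so we never hold one lock
        # while waiting indefinitely for another, then back off for a
        # random interval that grows with each failed attempt.
        lock_a.release()
        time.sleep(random.uniform(0, timeout * (attempt + 1)))
    return False
```

A caller would typically wrap the critical section in `try`/`finally` and release `lock_b` and then `lock_a` when done. The random backoff reduces the chance that two threads keep retrying in lockstep.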
Detection
Run your code under tooling that reports lock-order inversions. A Java thread dump (for example from jstack) labels detected deadlocks explicitly. ThreadSanitizer flags lock-order inversions in C and C++ programs. In production, threads that stay blocked on a lock indefinitely are a deadlock symptom.
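To make "lock-order inversion" concrete, here is a toy checker that records which lock was held while another was acquired and then looks for a cycle in those records. It illustrates the idea only and is not how any particular tool is implemented. The class `LockOrderGraph` and its methods are hypothetical.

```python
from collections import defaultdict


class LockOrderGraph:
    """Records 'held X while acquiring Y' edges and looks for cycles."""

    def __init__(self):
        self._edges = defaultdict(set)

    def record(self, held, acquiring):
        """Note that some thread acquired `acquiring` while holding `held`."""
        self._edges[held].add(acquiring)

    def find_inversion(self):
        """Return a list of lock names forming a cycle, or None if there is none."""
        visiting, done = [], set()

        def visit(node):
            if node in done:
                return None
            if node in visiting:
                # Reached a lock already on the current path: that is a cycle.
                return visiting[visiting.index(node):] + [node]
            visiting.append(node)
            for nxt in self._edges.get(node, ()):
                cycle = visit(nxt)
                if cycle:
                    return cycle
            visiting.pop()
            done.add(node)
            return None

        for node in list(self._edges):
            cycle = visit(node)
            if cycle:
                return cycle
        return None
```

Recording `("A", "B")` from one thread and `("B", "A")` from another yields a cycle through A and B, which means the program can deadlock even if it did not do so on this run.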
Try It Yourself
- Modify the example so one thread takes lock A then B and the other takes B then A, and observe a real deadlock.
- Add a try_lock with a timeout and back off on failure.
- Sketch the dining-philosophers problem and propose two solutions.