What is a race condition?
Welcome to racecondition’s blog. Since this is the first post, let’s start with the idea behind the name: the race condition.
A race condition happens when the result of your program depends on the timing/order of concurrent operations that access shared state. If two things “race” to read/modify the same data, the outcome can change run-to-run—leading to flaky bugs, security issues, and production pain.
Below are two practical examples: one in Python (threads) and one on microcontrollers (ISR vs main loop). Each includes the bug and a fix.
Python: When i += 1 Isn’t Atomic
In CPython there’s a Global Interpreter Lock (GIL), but it does not make compound operations like x += 1 atomic. That statement expands to read → add → write, which can interleave across threads.
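You can watch the expansion yourself with the standard dis module. Exact opcode names vary across CPython versions, but the separate load, add, and store are always visible:

import dis

counter = 0

def inc():
    global counter
    counter += 1  # looks like one step...

# ...but disassembles to a load, an in-place add, and a store.
# A thread switch between any two of those ops can lose an update.
dis.dis(inc)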
The Python example below launches four threads that increment a shared counter concurrently. Because no synchronization is used, some increments are lost, and the final counter value is lower than expected.
import threading

N_THREADS = 4
N_INCREMENTS = 100_000

counter = 0  # shared state

def worker():
    global counter
    for _ in range(N_INCREMENTS):
        # Not atomic: read -> add -> write
        counter += 1

threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Expected:", N_THREADS * N_INCREMENTS)
print("Actual :", counter)  # often less than expected
You’ll often see Actual != Expected, because increments get lost when threads interleave (exactly how often depends on your CPython version and thread switch interval).
Fix 1: Use a Lock
The fix below wraps the increment in a threading.Lock, so only one thread at a time can run the read-modify-write, preventing lost updates.
import threading

N_THREADS = 4
N_INCREMENTS = 100_000

counter = 0
lock = threading.Lock()

def worker():
    global counter
    for _ in range(N_INCREMENTS):
        with lock:  # critical section
            counter += 1

threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Expected:", N_THREADS * N_INCREMENTS)
print("Actual :", counter)  # matches expected
Fix 2: Avoid shared mutability (Queues / Actors)
The version below pushes one item per increment onto a thread-safe queue.Queue; the main thread tallies the items after the workers finish, so no thread ever mutates a shared counter.
import queue
import threading

N_THREADS = 4
N_INCREMENTS = 100_000

q = queue.Queue()

def worker():
    for _ in range(N_INCREMENTS):
        q.put(1)  # no shared mutable counter in the threads

threads = [threading.Thread(target=worker) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Safe to drain with empty() here: all producers have finished.
total = 0
while not q.empty():
    total += q.get()

print("Expected:", N_THREADS * N_INCREMENTS)
print("Actual :", total)
Queues serialize access and remove the need for a shared counter entirely.
Key takeaways (Python)
x += 1 is not atomic.
Use threading.Lock, higher-level concurrency primitives (Queue, concurrent.futures), or design out shared state entirely, as sketched below.
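As a sketch of that last option, here is a concurrent.futures variant that designs out the shared state: each worker counts privately and only hands back its final result, so there is nothing to lock.

import concurrent.futures

N_THREADS = 4
N_INCREMENTS = 100_000

def worker():
    local = 0  # private to this thread; nothing is shared
    for _ in range(N_INCREMENTS):
        local += 1
    return local  # hand the result back once, at the end

with concurrent.futures.ThreadPoolExecutor(max_workers=N_THREADS) as pool:
    futures = [pool.submit(worker) for _ in range(N_THREADS)]
    total = sum(f.result() for f in futures)

print("Expected:", N_THREADS * N_INCREMENTS)
print("Actual :", total)  # always matches: no shared mutable state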
Embedded C (Microcontrollers): ISR vs Main Loop
On MCUs, a common race is between an interrupt service routine (ISR) and the main loop manipulating the same variable. Classic failure: a read-modify-write in main interleaves with an ISR update.
The bug: Lost events due to interleaving
Imagine an ISR increments a tick counter each millisecond. The main loop consumes pending ticks by decrementing the counter:
#include <stdint.h>
#include <stdbool.h>

volatile uint32_t tick_count = 0;  // updated in ISR

// Called by SysTick or a timer interrupt at 1 kHz
void SysTick_Handler(void) {
    tick_count++;  // producer
}

int main(void) {
    // init systick/timer...
    for (;;) {
        if (tick_count > 0) {
            // --- RACE WINDOW ---
            // tick_count-- is a read-modify-write; the ISR's
            // tick_count++ can land between its load and store.
            tick_count--;  // consume one tick (non-atomic!)
            // -------------------
            // Symptom: "missed" ticks -> drift, timing jitter, slow loops.
        }
        // ... do other work ...
    }
}
What goes wrong? tick_count-- is a non-atomic read-modify-write. If the ISR fires between the load and the store-back, the main loop writes a stale value and the ISR's increment is silently overwritten, dropping an event.
Note: volatile only addresses visibility/optimization, not atomicity.
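To see why, here is roughly what the main loop's decrement becomes on a Cortex-M core (illustrative only; the exact instructions depend on your compiler and flags):

// tick_count--; compiles to something like:
//
//   LDR  r1, [r0]      ; read tick_count
//   ; <-- if the ISR fires here, its increment is about to be lost
//   SUBS r1, r1, #1    ; modify
//   STR  r1, [r0]      ; write back, overwriting the ISR's update

volatile forces the LDR and STR to actually happen, but nothing stops the ISR from running between them.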
Fix 1: Make the decrement atomic (short critical section)
Disable the interrupt briefly around the read-modify-write. Keep it as short as possible.
#include <stdint.h>
#include <stdbool.h>

volatile uint32_t tick_count = 0;

void SysTick_Handler(void) {
    tick_count++;
}

static inline uint32_t irq_save(void) {
    // Cortex-M example:
    uint32_t primask;
    __asm volatile ("MRS %0, PRIMASK" : "=r" (primask));
    __asm volatile ("CPSID i");  // disable IRQs
    return primask;
}

static inline void irq_restore(uint32_t primask) {
    __asm volatile ("MSR PRIMASK, %0" :: "r" (primask));
}

int main(void) {
    for (;;) {
        uint32_t key = irq_save();  // enter critical section
        if (tick_count > 0) {
            tick_count--;           // atomic now w.r.t. the ISR
        }
        irq_restore(key);           // exit critical section
        // ... rest of loop ...
    }
}
On AVR you’d use cli() / sei(). On some SDKs (STM32 HAL, ESP-IDF, Zephyr, FreeRTOS) there are helpers/macros for critical sections—prefer those.
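For instance, on AVR with avr-libc the same critical section can be written with ATOMIC_BLOCK from <util/atomic.h> (a sketch; ATOMIC_RESTORESTATE saves and restores the interrupt flag in SREG for you):

#include <stdint.h>
#include <util/atomic.h>  // avr-libc

volatile uint32_t tick_count = 0;

// Consume one pending tick, if any; called from the main loop.
void consume_one_tick(void) {
    ATOMIC_BLOCK(ATOMIC_RESTORESTATE) {  // IRQs disabled inside the block
        if (tick_count > 0) {
            tick_count--;
        }
    }  // interrupt state restored here
}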
Fix 2: Swap-and-drain (minimize IRQ mask time)
Grab the whole count at once atomically, then process without interrupts masked:
#include <stdint.h>

volatile uint32_t tick_count = 0;

void SysTick_Handler(void) {
    tick_count++;
}

// irq_save()/irq_restore() as defined in Fix 1

int main(void) {
    for (;;) {
        // atomically copy and reset
        uint32_t key = irq_save();
        uint32_t pending = tick_count;
        tick_count = 0;
        irq_restore(key);

        // handle all pending ticks with interrupts enabled
        while (pending--) {
            // ... do 1 tick's worth of work ...
        }
    }
}
This pattern reduces interrupt-off time, improving latency.
Fix 3: Use true atomics if available
On some toolchains/architectures you can use C11 atomics or compiler builtins:
#include <stdatomic.h>
#include <stdint.h>

_Atomic uint32_t tick_count = 0;

void SysTick_Handler(void) {
    atomic_fetch_add(&tick_count, 1);
}

int main(void) {
    for (;;) {
        // Decrement only if positive, atomically:
        uint32_t old = atomic_load(&tick_count);
        while (old > 0 &&
               !atomic_compare_exchange_weak(&tick_count, &old, old - 1)) {
            // on failure, the CAS reloads `old` with the current value
        }
        if (old > 0) {
            // consumed one tick
        }
    }
}
Support for lock-free atomics on small MCUs varies; check your compiler and core.
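A quick way to check is C11's own lock-free queries; a sketch (if the type is not lock-free, the "atomic" operations go through an internal lock, which is generally unsafe to contend from an ISR):

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// Compile-time check: 2 means "always lock-free" for int-sized types.
_Static_assert(ATOMIC_INT_LOCK_FREE == 2, "int atomics are not lock-free");

// Run-time check for the exact type you use:
static _Atomic uint32_t probe;

bool counter_atomics_ok(void) {
    return atomic_is_lock_free(&probe);
}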
Key takeaways (MCU)
ISRs and main must not do unsynchronized read-modify-write on shared data.
volatile is necessary for visibility, but not sufficient—you also need atomicity.
Use brief critical sections, swap-and-drain, or C11 atomics where supported.
Design Patterns That Prevent Races
Protect shared state with locks/critical sections or true atomics.
Prefer message passing (queues/mailboxes): ISRs push events; the main loop drains them (see the ring-buffer sketch after this list).
Immutable data and ownership transfer reduce shared mutable state.
Time-bounded critical sections: keep interrupts masked for as little as possible.
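To make the message-passing bullet concrete, here is a minimal single-producer/single-consumer ring buffer sketch for the ISR-to-main case (names and sizes are illustrative). It assumes a single-core MCU where aligned 32-bit loads and stores are atomic, as on Cortex-M. The ISR is the only writer of head and main is the only writer of tail, so neither side performs a shared read-modify-write:

#include <stdint.h>
#include <stdbool.h>

#define QSIZE 64u  /* must be a power of two */

static volatile uint8_t buf[QSIZE];
static volatile uint32_t head = 0;  /* written only by the ISR */
static volatile uint32_t tail = 0;  /* written only by main    */

/* Producer side: called from the ISR. */
bool q_push(uint8_t ev) {
    uint32_t h = head;
    if (((h + 1u) & (QSIZE - 1u)) == tail) {
        return false;                /* full: drop the event */
    }
    buf[h] = ev;
    head = (h + 1u) & (QSIZE - 1u);  /* publish after the data write */
    return true;
}

/* Consumer side: called from the main loop. */
bool q_pop(uint8_t *ev) {
    uint32_t t = tail;
    if (t == head) {
        return false;                /* empty */
    }
    *ev = buf[t];
    tail = (t + 1u) & (QSIZE - 1u);
    return true;
}

Size QSIZE for the worst-case burst, and count the false returns from q_push so dropped events are at least visible.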
Why This Blog Exists
I named this place racecondition.blog because modern systems—from Python services to bare-metal firmware—are full of tiny, invisible “races.” Spotting and fixing them is a superpower. Here you’ll find hands-on posts spanning software, embedded, robotics, and edge AI, always with real code and reproducible patterns.