Async Rust Without the Footguns: Tokio Patterns in 2024
TL;DR: Async Rust is fine when you respect a few rules: bounded channels, structured concurrency via JoinSet, careful select! arms, and a clear cancellation story. Most prod incidents come from violating one of these. Patterns matter more than language tricks.
Last week’s overview said async Rust on Tokio is the boring choice in 2024. Boring doesn’t mean fuss-free. The runtime has sharp edges, and the patterns you reach for shape whether your service is reliable. This post is the pattern catalog I wish I had three years ago.
The four rules
If you internalize four things, async Rust gets considerably less scary:
- Bounded channels everywhere. Unbounded means unbounded memory.
- Cancellation is implicit; design for it. Futures drop when the task does. Anything that needs cleanup needs to know.
- Structured concurrency wins. JoinSet for groups; avoid orphan tokio::spawn when you can.
- select! arms must be cancellation-safe. Otherwise you lose work.
Each one is its own pattern. Walk through them.
Rule 1: bounded channels
// Wrong — under load, OOM
let (tx, mut rx) = tokio::sync::mpsc::unbounded_channel();
// Right — back-pressure built in
let (tx, mut rx) = tokio::sync::mpsc::channel::<Job>(1024);
The bounded channel applies back-pressure. When the consumer is slow, the producer’s tx.send().await suspends the sending task until there’s room, without blocking the thread. The latency budget propagates back up the call chain. Without back-pressure, the queue grows until the process dies.
Pick the bound based on the realistic burst size and the cost of memory. 1024 is a reasonable default for typical message sizes. For large messages, drop to 64-256.
try_send exists for cases where you want to fail fast instead of blocking — overload protection, dropping non-essential work. Use it deliberately.
Rule 2: cancellation is implicit
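When a Tokio future is cancelled, because its task was aborted via JoinHandle::abort, or the select! it was racing in chose another arm, or the runtime is shutting down, the future is dropped at its current await point. Locals held across that await run their destructors. Any work in flight is abandoned. Note that merely dropping a JoinHandle does not cancel the task; it detaches it and the task keeps running.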
This is great for cancellation semantics. It’s a footgun if your Future was holding important state that should have been committed.
// Risky — if cancelled, the database write never happens
async fn handle_request(db: &PgPool, msg: Message) -> Result<()> {
    let processed = expensive_compute(&msg).await?;
    db.write(&processed).await?; // never reached if cancelled before this
    Ok(())
}
If the function is cancelled between expensive_compute and db.write, the work is lost. Fix it with explicit transactional semantics. There are three options: commit before any cancellable await, design the work to be idempotent and retryable, or use tokio::spawn to detach the commit step so it survives the parent’s cancellation:
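async fn handle_request(db: PgPool, msg: Message) -> Result<()> {
    let processed = expensive_compute(&msg).await?;
    // The write runs on its own task, so dropping this future does not stop it.
    let handle = tokio::spawn(async move {
        db.write(&processed).await
    });
    // First ? handles the JoinError, second ? handles the write error.
    handle.await??;
    Ok(())
}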
The detached task continues even if the caller is cancelled. The caller’s handle.await is also cancellable, but the work itself runs to completion.
Rule 3: structured concurrency via JoinSet
Reaching for tokio::spawn and storing the JoinHandle somewhere is fine for one-off tasks. For groups of related work, JoinSet is the pattern.
use tokio::task::JoinSet;
async fn fan_out(urls: Vec<String>) -> Vec<Result<String>> {
    let mut set = JoinSet::new();
    for url in urls {
        set.spawn(async move { fetch(url).await });
    }
    // Results arrive in completion order, not input order.
    let mut results = Vec::new();
    while let Some(res) = set.join_next().await {
        results.push(res.expect("task panicked"));
    }
    results
}
When the JoinSet is dropped, all its tasks are cancelled. This gives you structured concurrency: if the caller goes away, the spawned children go away too. No orphan tasks running after the request that started them is done.
Compare to the naive version that spawns and stores handles separately. That version requires you to remember to cancel on every error path. Forget once and you have leaked tasks.
For unordered collection with early termination, JoinSet::join_next is the primitive. For ordered results, build them into the task’s return value and sort post-hoc, or use a different pattern (channels, futures::stream).
Rule 4: cancellation-safe select arms
select! is one of Tokio’s most useful primitives and one of the most dangerous. The rule: each arm must be cancellation-safe — that is, dropping the future mid-execution must not lose work.
// Dangerous — receiving from a stream is not cancellation-safe in general
tokio::select! {
    item = stream.next() => process(item),
    _ = shutdown.recv() => return,
}
If shutdown fires while stream.next() was in the middle of pulling a frame from the network, that frame is lost. For a TCP-backed stream the next call probably resumes correctly. For other streams it depends on the implementation.
The safe versions of common operations:
- tokio::sync::mpsc::Receiver::recv: cancellation-safe
- tokio::sync::oneshot::Receiver::await: cancellation-safe
- tokio::time::sleep: cancellation-safe
- tokio::io::AsyncReadExt::read on a borrowed stream: generally safe; the next read picks up where this one left off
- futures::Stream::next: depends on the stream
- A library you didn’t write: assume unsafe until proven otherwise
The defensive pattern: park the borrow outside the select, store partial state in your task struct, and resume on the next iteration.
loop {
    tokio::select! {
        item = some_safe_recv(&mut rx) => {
            // The handler runs after the select resolves, so it is not raced.
            handle_item(item).await;
        }
        _ = shutdown.changed() => return,
    }
}
Spawning blocking work
CPU-bound work run inline in an async task blocks its worker thread, and every other task scheduled on that thread stalls with it. Use spawn_blocking:
let result = tokio::task::spawn_blocking(move || {
    expensive_cpu_bound_thing(input)
}).await?;
spawn_blocking puts the work on a separate thread pool sized for blocking operations. The default pool is large (capped at 512 threads unless you change max_blocking_threads on the runtime builder); the runtime keeps moving.
The flip side: don’t spawn_blocking for things that aren’t actually blocking. The overhead of moving work across threads is real.
Tracing your futures
Tokio Console (tokio-console) is the must-have debugging tool in 2024. Enable it in your service:
// Cargo.toml
// console-subscriber = "0.2"
#[tokio::main(flavor = "multi_thread")]
async fn main() {
    console_subscriber::init();
    // your service
}
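Then run tokio-console from another terminal. The console relies on Tokio’s task instrumentation, so the service must be built with RUSTFLAGS="--cfg tokio_unstable" and Tokio’s tracing feature enabled. You see a live task list, the futures they’re blocked on, lock contention, and task ages. The first time you find an unbounded queue or a task that never completes via the console, it pays for itself.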
Production use: gate console-subscriber behind a feature flag and enable on a single canary pod for debugging. The overhead isn’t enormous but it’s not free.
Common Pitfalls
- Holding a sync Mutex across an await. Use tokio::sync::Mutex if the lock must span an await; parking_lot::Mutex is fine when the lock guard is dropped before the await.
- Arc<Mutex<HashMap>> for shared state. Works, but contended. Prefer dashmap for concurrent maps, or partition state by key.
- Forgetting that JoinHandle::abort is not synchronous. Calling abort returns immediately; the task is cancelled at its next await point. If the task is in a tight CPU loop without awaits, abort won’t help. Yield.
- tokio::spawn without a JoinSet for groups. Orphan tasks survive their parent. They keep running, hold resources, occupy memory. Use JoinSet.
- Recursion in async fns. Async recursion needs Box::pin(future) because the future size is otherwise infinite. The compiler tells you; don’t fight it.
- Mixing runtimes. Don’t call tokio::spawn from inside an async-std task. The runtimes are not interoperable in the general case.
The pitfall that personally cost my team a Sunday: an unbounded channel between a fast producer (a database changefeed) and a slow consumer (an external API). Memory grew silently for hours. The fix was a bounded channel with explicit backpressure, plus a metric on queue depth.
Wrapping Up
Async Rust’s footguns are catalogued and dodgeable. The four rules — bounded channels, cancellation discipline, structured concurrency, cancellation-safe selects — cover most production incidents I’ve debugged. Internalize them and the language gets considerably calmer.
Next post walks through building a complete Axum service top to bottom — handlers, extractors, middleware, tracing, the full setup that takes a week of small mistakes to learn on your own. The Tokio tutorial covers more of the runtime primitives if you want canonical references.