Safe Shared State in Rust: Arc, Mutex, and the Channel You Should Pick
TL;DR: Most Rust concurrency bugs in production aren’t data races (the compiler catches those); they’re contention, lock ordering, and async deadlocks. Pick Arc<Mutex<T>> only when state is genuinely shared and small. Reach for channels (mpsc, broadcast, watch, or oneshot) whenever the access pattern is “one writer talks to one or many readers.”
I’ve reviewed enough Rust code now to spot a pattern: developers coming from Go or Java reach for Arc<Mutex<HashMap>> reflexively. It works, the compiler accepts it, the tests pass, and then in production p99 latency spikes because four tasks are queued behind a write lock that shouldn’t have been a write lock at all.
The Rust standard library gives you the primitives. Tokio gives you the async-aware versions. The hard part is choosing. This post is the decision framework I use, with code, for the four cases I see weekly. If you’re new to async patterns in Rust, my Tokio patterns post covers the runtime mechanics this builds on.
The Three Questions to Ask First
Before you reach for any primitive, answer:
- Does the state actually need to be shared, or can you move ownership? If task A produces values and task B consumes them, you don’t share — you channel.
- Is the access read-heavy or write-heavy? Read-heavy with rare writes is the only case where RwLock beats Mutex in practice. Mixed access usually performs worse under RwLock because of writer starvation logic.
- Are you in async or blocking code? Holding a std::sync::Mutex across .await is a deadlock waiting to happen. Use tokio::sync::Mutex there, or restructure so you don’t hold the lock across an await.
Get those right and 80% of the choices make themselves.
When Arc<Mutex<T>> Is Correct
A small, frequently-mutated piece of state that multiple tasks read and write — a connection pool’s idle list, a rate limiter’s token bucket, a shared counter. The lock is held for microseconds. The state is small.
use std::sync::Arc;
use parking_lot::Mutex;

#[derive(Default)]
struct RateLimit {
    tokens: u32,
    capacity: u32,
}

#[derive(Clone)]
pub struct Limiter(Arc<Mutex<RateLimit>>);

impl Limiter {
    pub fn new(capacity: u32) -> Self {
        Self(Arc::new(Mutex::new(RateLimit { tokens: capacity, capacity })))
    }

    pub fn try_acquire(&self) -> bool {
        let mut g = self.0.lock();
        if g.tokens == 0 { return false; }
        g.tokens -= 1;
        true
    }

    pub fn refill(&self, n: u32) {
        let mut g = self.0.lock();
        // saturating_add avoids a u32 overflow when n is large.
        g.tokens = g.tokens.saturating_add(n).min(g.capacity);
    }
}
Two choices worth defending. First, parking_lot::Mutex instead of std::sync::Mutex. It’s faster on contended workloads, doesn’t poison on panic, and has a smaller memory footprint. For new code, I default to it. Second, no RwLock. The critical section is two reads and a write — RwLock would be slower because of its larger atomic footprint, and there are no concurrent readers worth optimizing for.
Most importantly: this lock is never held across an .await. The whole operation is lock → check → mutate → drop.
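To see the lock → check → mutate → drop shape under real contention, here is a dependency-free sketch. It swaps in std::sync::Mutex for parking_lot (an assumption made only so the snippet compiles without extra crates), trims the struct to the one field it needs, and uses a hypothetical helper named hammer. Several threads race on one limiter, and the number of successful acquisitions never exceeds the capacity.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct RateLimit {
    tokens: u32,
}

#[derive(Clone)]
pub struct Limiter(Arc<Mutex<RateLimit>>);

impl Limiter {
    pub fn new(capacity: u32) -> Self {
        Self(Arc::new(Mutex::new(RateLimit { tokens: capacity })))
    }

    pub fn try_acquire(&self) -> bool {
        // std's lock() returns a Result because of poisoning; unwrap for brevity.
        let mut g = self.0.lock().unwrap();
        if g.tokens == 0 {
            return false;
        }
        g.tokens -= 1;
        true
    } // guard dropped here, critical section over
}

/// Hypothetical helper: `threads` workers each attempt `attempts` acquisitions;
/// the total number of successes is returned.
pub fn hammer(capacity: u32, threads: usize, attempts: usize) -> usize {
    let limiter = Limiter::new(capacity);
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let l = limiter.clone();
            thread::spawn(move || (0..attempts).filter(|_| l.try_acquire()).count())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).sum()
}
```

With a capacity of 50 and 8 threads making 100 attempts each, hammer(50, 8, 100) returns exactly 50: the mutex makes the check and the decrement one indivisible step.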
When RwLock Earns Its Keep
Configuration that’s loaded once, read by every request, and reloaded on SIGHUP. Routing tables. Feature flag snapshots. The read-to-write ratio is 10,000:1 or higher.
use arc_swap::ArcSwap;
use std::sync::Arc;

#[derive(Clone)]
pub struct Config {
    pub upstream: String,
    pub timeout_ms: u32,
}

pub struct ConfigStore(ArcSwap<Config>);

impl ConfigStore {
    pub fn new(initial: Config) -> Arc<Self> {
        Arc::new(Self(ArcSwap::from_pointee(initial)))
    }

    pub fn load(&self) -> Arc<Config> {
        self.0.load_full()
    }

    pub fn reload(&self, next: Config) {
        self.0.store(Arc::new(next));
    }
}
Note what I did: I skipped RwLock entirely. For this access pattern, arc-swap is faster. Readers never take a lock: load_full() hands back an Arc<Config> they can hold as long as they want. Writers atomically swap in a new pointer. Old readers see the old config; new readers see the new one. No contention.
RwLock is correct when you really do need exclusive write access in-place. In my experience that’s a much smaller set of cases than people assume. When you can express writes as “replace the whole thing,” arc-swap beats RwLock and Mutex both.
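If pulling in arc-swap isn’t an option, the same “replace the whole thing” idea can be approximated with the standard library alone. The sketch below is a stand-in under stated assumptions, not arc-swap’s implementation: it keeps an Arc<Config> behind a std::sync::RwLock, readers hold the read lock only long enough to clone the pointer, and writers replace the pointer in one assignment. SnapshotStore and old_and_new_timeouts are hypothetical names.

```rust
use std::sync::{Arc, RwLock};

#[derive(Clone, Debug, PartialEq)]
pub struct Config {
    pub upstream: String,
    pub timeout_ms: u32,
}

/// Hypothetical std-only stand-in for the ArcSwap-based store.
pub struct SnapshotStore(RwLock<Arc<Config>>);

impl SnapshotStore {
    pub fn new(initial: Config) -> Self {
        Self(RwLock::new(Arc::new(initial)))
    }

    /// Clone the pointer (a refcount bump), then release the read lock.
    pub fn load(&self) -> Arc<Config> {
        let guard = self.0.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace the whole config; snapshots handed out earlier stay unchanged.
    pub fn reload(&self, next: Config) {
        *self.0.write().unwrap() = Arc::new(next);
    }
}

/// Returns (timeout seen by an old snapshot, timeout seen by a fresh load).
pub fn old_and_new_timeouts() -> (u32, u32) {
    let store = SnapshotStore::new(Config { upstream: "a.internal".into(), timeout_ms: 100 });
    let old = store.load();
    store.reload(Config { upstream: "b.internal".into(), timeout_ms: 250 });
    (old.timeout_ms, store.load().timeout_ms)
}
```

Unlike arc-swap, readers here still touch a lock, so a reload briefly blocks them. The point of the sketch is the ownership shape: the snapshot taken before the reload keeps reporting 100 while new loads see 250.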
Channels Are the Default
If you take one thing from this post: when in doubt, channel. Tokio gives you four flavors, each with a clear use case.
use tokio::sync::{broadcast, mpsc, oneshot, watch};
mpsc::channel(N) — multiple producers, one consumer. The bread-and-butter. Backpressure is built in: producers .await when the channel is full. Use when you have N tasks producing work and one task draining it.
let (tx, mut rx) = mpsc::channel::<Job>(256);
tokio::spawn(async move {
    while let Some(job) = rx.recv().await {
        process(job).await;
    }
});
for j in incoming {
    tx.send(j).await.expect("worker died");
}
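The same shape exists outside async code. As a dependency-free illustration (an analogue, not Tokio’s API), std::sync::mpsc::sync_channel gives a bounded queue where send blocks the producing thread while the buffer is full, which is the blocking counterpart of awaiting on a full Tokio channel. The function name fan_in and the u64 job type are hypothetical.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical demo: `producers` threads each send the jobs 1..=per_producer
/// through a bounded channel; one consumer drains it and returns the sum.
pub fn fan_in(producers: u64, per_producer: u64, bound: usize) -> u64 {
    let (tx, rx) = sync_channel::<u64>(bound);

    // Single consumer: it owns the receiver, so `total` needs no lock.
    let consumer = thread::spawn(move || {
        let mut total = 0;
        // recv() returns Err once every sender has been dropped.
        while let Ok(job) = rx.recv() {
            total += job;
        }
        total
    });

    let handles: Vec<_> = (0..producers)
        .map(|_| {
            let tx = tx.clone();
            thread::spawn(move || {
                for job in 1..=per_producer {
                    // Blocks while the queue already holds `bound` items.
                    tx.send(job).expect("consumer died");
                }
            })
        })
        .collect();

    // Drop the original sender so the consumer can observe disconnection.
    drop(tx);
    for h in handles {
        h.join().unwrap();
    }
    consumer.join().unwrap()
}
```

Even with a bound as small as 2, fan_in(4, 10, 2) completes and returns 220: producers simply wait their turn instead of piling work up in memory.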
broadcast::channel(N): many producers, many consumers, and each consumer receives every message. The channel retains at most N messages; a receiver that falls further behind loses the oldest ones and gets RecvError::Lagged on its next recv. Use for fan-out events: cache invalidation pings, shutdown signals to many tasks.
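let (tx, _) = broadcast::channel::<Event>(64);
// Each task calls tx.subscribe() to get its own Receiver.
let mut rx = tx.subscribe();
tokio::spawn(async move {
    loop {
        match rx.recv().await {
            Ok(ev) => handle(ev).await,
            // Fell behind: `skipped` messages were overwritten. Keep receiving.
            Err(broadcast::error::RecvError::Lagged(skipped)) => {
                eprintln!("receiver lagged, skipped {skipped} events");
            }
            // Every sender has been dropped.
            Err(broadcast::error::RecvError::Closed) => break,
        }
    }
});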
watch::channel(initial) — one producer, many consumers, each receives only the latest value. Perfect for config reload, leader election state, “is the system healthy” flags. Readers can .borrow() synchronously or .changed().await to wait for the next update.
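let (tx, mut rx) = watch::channel(Config::default());
tokio::spawn(async move {
    // changed() returns Err once the sender is dropped; exit instead of panicking.
    while rx.changed().await.is_ok() {
        // Clone out of the borrow so it isn't held across the await.
        let snapshot = rx.borrow().clone();
        apply(snapshot).await;
    }
});
tx.send(new_config).unwrap();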
oneshot::channel() — exactly one send, exactly one receive. The reply channel for a request/response pattern across tasks. Use when you spawn work and need its result back later.
let (reply_tx, reply_rx) = oneshot::channel();
work_tx.send(Job { payload, reply: reply_tx }).await?;
let result = reply_rx.await?;
When you find yourself reaching for Arc<Mutex<HashMap<RequestId, Response>>> to thread responses back to callers, you almost certainly wanted oneshot instead.
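A dependency-free sketch of that request/response shape follows. The standard library has no oneshot type, so this uses a std::sync::mpsc channel that is sent on exactly once as the reply slot (an assumption for illustration; in Tokio code oneshot plays this role). Job, its fields, and double_via_worker are hypothetical names.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// A request that carries its own reply channel.
struct Job {
    payload: u32,
    reply: Sender<u32>,
}

/// Hypothetical demo: send one job to a worker thread and wait for its answer.
pub fn double_via_worker(payload: u32) -> Option<u32> {
    let (work_tx, work_rx) = channel::<Job>();

    let worker = thread::spawn(move || {
        for job in work_rx {
            // Ignore a send error: the caller may have stopped waiting.
            let _ = job.reply.send(job.payload * 2);
        }
    });

    // One fresh reply channel per request; no shared map of pending responses.
    let (reply_tx, reply_rx) = channel();
    work_tx.send(Job { payload, reply: reply_tx }).ok()?;
    let result = reply_rx.recv().ok();

    drop(work_tx); // closes the queue so the worker loop ends
    worker.join().ok()?;
    result
}
```

The response travels back along a channel that only this caller owns, so there is nothing to key by request id and nothing to lock.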
The Async-Mutex Trap
Two locks. std::sync::Mutex (and parking_lot::Mutex) are synchronous — they block the thread until the lock is acquired. tokio::sync::Mutex is async — it yields if the lock is held.
The rule:
- If you hold the lock across a .await, use tokio::sync::Mutex.
- If the critical section is purely synchronous, use parking_lot::Mutex (faster, no await).
The mistake I see most: people use tokio::sync::Mutex everywhere “for safety.” It’s actually slower than parking_lot when no await happens inside the critical section, because async mutexes have more bookkeeping. Worse, it makes an await inside a lock-held block look harmless: every other task that needs the lock now waits on your I/O, and two tasks that take two such locks in opposite order deadlock just as they would with a blocking mutex.
// Bad: blocking mutex across an await
let g = sync_mutex.lock();
let data = fetch_from_upstream(&g.url).await; // deadlock under load
drop(g);
// Better: copy what you need, drop the lock, then await
let url = { sync_mutex.lock().url.clone() };
let data = fetch_from_upstream(&url).await;
Tokio’s synchronization primitives docs spell out the contract for each type. Worth a careful read once a year.
A Worked Example, Pulling It Together
Imagine a metrics aggregator that ingests events on many tasks, batches them, and flushes to an upstream every second. Here’s the shape with the right primitives:
use std::sync::Arc;
use std::time::Duration;
use arc_swap::ArcSwap;
use parking_lot::Mutex;
use tokio::sync::{mpsc, watch};
use tokio::time::interval;

#[derive(Clone, Default)]
pub struct Settings { batch_size: usize, endpoint: String }

#[derive(Clone)]
struct Metric { name: String, value: f64 }

pub async fn run(settings: Arc<ArcSwap<Settings>>, mut shutdown: watch::Receiver<bool>) {
    // In a real service, clones of `tx` are handed to the producer tasks.
    let (tx, mut rx) = mpsc::channel::<Metric>(8_192);
    let buf = Arc::new(Mutex::new(Vec::with_capacity(1024)));

    // Ingest task
    let buf_in = buf.clone();
    tokio::spawn(async move {
        while let Some(m) = rx.recv().await {
            buf_in.lock().push(m);
        }
    });

    // Flush task
    let mut tick = interval(Duration::from_secs(1));
    loop {
        tokio::select! {
            _ = tick.tick() => {
                let cfg = settings.load_full();
                // The guard is a temporary: it is dropped before the await below.
                let batch: Vec<Metric> = std::mem::take(&mut *buf.lock());
                if !batch.is_empty() { send(&cfg.endpoint, batch).await; }
            }
            changed = shutdown.changed() => {
                // Err means the shutdown sender is gone; stop rather than spin.
                if changed.is_err() || *shutdown.borrow() { break; }
            }
        }
    }
    drop(tx);
}

async fn send(_endpoint: &str, _batch: Vec<Metric>) { /* ... */ }
Four primitives, each chosen for its strengths. mpsc for backpressured ingest. parking_lot::Mutex for the short-held batch buffer (never crosses an await). arc-swap for hot-reloadable config. watch for shutdown signaling.
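The one subtle move in the flush branch is taking the whole buffer while holding the lock for only an instant. Isolated as a dependency-free sketch (std::sync::Mutex substituted for parking_lot, with drain_batch and drain_demo as hypothetical helper names), it looks roughly like this:

```rust
use std::sync::Mutex;

/// Swap the shared buffer for an empty Vec and return what was in it.
/// The lock is held only for the swap, not while the batch is processed.
pub fn drain_batch<T>(buf: &Mutex<Vec<T>>) -> Vec<T> {
    std::mem::take(&mut *buf.lock().unwrap())
}

/// Hypothetical demo: returns (size of the drained batch, items left behind).
pub fn drain_demo() -> (usize, usize) {
    let buf = Mutex::new(vec![1.0_f64, 2.5, 4.0]);
    let batch = drain_batch(&buf);
    // The guard is already gone here, so slow I/O on `batch` blocks nobody.
    let left = buf.lock().unwrap().len();
    (batch.len(), left)
}
```

One trade-off worth knowing: std::mem::take leaves behind a Vec with no capacity, so the initial pre-allocation is lost after the first flush. If that matters, std::mem::replace with a fresh Vec::with_capacity keeps it.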
Common Pitfalls
Cloning Arc is cheap, but not free. Reference count touches are atomic. On hot paths with thousands of clones per second, it shows up in profiles. Pass &Arc<T> where you can.
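A temporary MutexGuard lives until the end of the enclosing statement. match mutex.lock().pop() { ... } and if let Some(x) = mutex.lock().pop() { ... } hold the lock through the entire body. A plain if mutex.lock().is_empty() { ... } does not: the guard is released once the condition has been evaluated. When you do want the lock held for a whole block, say so explicitly with let _g = mutex.lock();.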
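Where a temporary guard is released can be checked directly rather than taken on faith. This sketch uses std::sync::Mutex and try_lock (which fails while the lock is held) to probe from inside the body of a plain if and from inside a match arm. guard_scopes is a hypothetical name.

```rust
use std::sync::Mutex;

/// Returns (lock still held inside `if` body, lock still held inside `match` arm).
pub fn guard_scopes() -> (bool, bool) {
    let m = Mutex::new(vec![1, 2, 3]);

    // Plain `if`: the guard is a temporary of the condition expression and is
    // dropped before the body runs, so try_lock succeeds.
    let mut held_in_if = false;
    if !m.lock().unwrap().is_empty() {
        held_in_if = m.try_lock().is_err();
    }

    // `match`: temporaries in the scrutinee live until the end of the match,
    // so the guard is still alive inside the arm and try_lock fails.
    let held_in_match = match m.lock().unwrap().pop() {
        Some(_) => m.try_lock().is_err(),
        None => false,
    };

    (held_in_if, held_in_match)
}
```

Calling guard_scopes() reports the lock as free inside the if body and held inside the match arm. In the match case, taking the same non-reentrant lock again with a blocking lock() instead of try_lock would deadlock or panic.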
Don’t lock to read a Copy type. If your state is a single integer or flag, store it in an AtomicU64 or AtomicBool and skip the mutex.
Bounded vs unbounded channels. Unbounded means OOM under producer overload. Bounded means producers feel backpressure. Default to bounded; pick the bound as the longest tolerable queueing delay multiplied by the expected throughput.
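As a worked instance of that sizing rule, with made-up numbers: if 50 ms of queueing is tolerable and the consumer handles about 2,000 messages per second, the bound is 0.05 × 2,000 = 100. The sketch below computes that and uses std’s bounded sync_channel (standing in for a Tokio bounded channel) to show a full queue refusing work instead of growing. bound_for and overflow_count are hypothetical helpers.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::time::Duration;

/// bound = tolerable queueing delay * throughput, rounded up, at least 1.
pub fn bound_for(max_delay: Duration, msgs_per_sec: u64) -> usize {
    // Work in integer milliseconds to avoid floating-point rounding surprises.
    let product = max_delay.as_millis() * msgs_per_sec as u128;
    (((product + 999) / 1000).max(1)) as usize
}

/// Offer `offered` messages with nobody draining; count how many are refused.
pub fn overflow_count(bound: usize, offered: usize) -> usize {
    // `_rx` keeps the receiver alive so sends fail with Full, not Disconnected.
    let (tx, _rx) = sync_channel::<usize>(bound);
    (0..offered)
        .filter(|&i| matches!(tx.try_send(i), Err(TrySendError::Full(_))))
        .count()
}
```

With a bound of 100 and 130 messages offered to a stalled consumer, 30 are refused. That refusal is the signal an unbounded channel never gives you.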
broadcast drops messages for slow receivers. If a consumer falls more than N messages behind, the oldest messages are overwritten and its next recv returns RecvError::Lagged(n), where n is the number of messages it missed. Handle it, or use watch if you only care about latest-value semantics.
Wrapping Up
Shared state is unavoidable, but it’s rarely the right default. Channels are the message-passing escape hatch that turns most “shared state” problems into ownership problems the compiler can reason about. When you do need a lock, pick the smallest, simplest one that fits — and never hold it across an await.
The Rust toolbox here is unusually rich. Spend an afternoon learning which tool fits which shape and you’ll spend the rest of the year writing concurrent code that just works.