Production HTTP APIs with axum 0.6

July 25, 2023 · 8 min read · by Muhammad Amal · programming

TL;DR — axum 0.6’s extractor model with State<T> is the right shape for typed shared state / Wrap your error type as a newtype around anyhow::Error and implement IntoResponse once / tower-http’s TraceLayer plus tracing gives you structured logs and request IDs for free

axum reached the shape I’m comfortable shipping to production around the 0.6 release. The extractor model, typed state, and tower middleware integration are stable, the ecosystem of middleware crates is filling out, and the performance is competitive with anything in the JVM or Go space. This post is the skeleton of a real service — not a hello-world — with the integrations I reach for by default.

This builds on the error-handling patterns from my previous post and the async model from the tokio post. If you haven’t read those, the code below will still make sense, but the why might not.

The dependency stack

[package]
name = "api"
version = "0.5.0"
edition = "2021"
rust-version = "1.71"

[dependencies]
axum = "0.6"
tokio = { version = "1.29", features = ["full"] }
tower = "0.4"
tower-http = { version = "0.4", features = ["trace", "cors", "timeout", "limit", "compression-gzip"] }
hyper = { version = "0.14", features = ["full"] }

sqlx = { version = "0.7", features = ["postgres", "runtime-tokio-rustls", "uuid", "chrono", "macros"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
uuid = { version = "1.4", features = ["v4", "serde"] }
chrono = { version = "0.4", features = ["serde"] }

tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "json"] }

anyhow = "1.0"
thiserror = "1.0"

# Used by AppState below
reqwest = "0.11"
secrecy = "0.8"

axum 0.6 sits on hyper 0.14. hyper 1.0 is on the way but has not shipped as of July 2023, so staying on 0.14 is the boring choice. tower-http 0.4 is the release line that is compatible with axum 0.6.

The application state pattern

axum 0.6 made State<T> a first-class extractor. The pattern is: one struct holds everything handlers need (DB pool, HTTP clients, config), wrapped in a clone-friendly form, and extracted by handlers as needed.

use axum::extract::State;
use sqlx::PgPool;
use std::sync::Arc;

#[derive(Clone)]
pub struct AppState {
    pub db: PgPool,
    pub http: reqwest::Client,
    pub config: Arc<Config>,
}

pub struct Config {
    pub jwt_secret: secrecy::Secret<String>,
    pub origin: String,
}

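PgPool and reqwest::Client are both cheap to clone (they wrap Arc internally), so the outer struct can derive Clone. Config does not derive Clone here, so it sits behind an Arc: cloning the state bumps a reference count instead of copying the configuration, and the JWT secret is never duplicated in memory.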
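To make the cost of cloning concrete, here is a minimal std-only sketch. The Config and AppState below are simplified stand-ins (no sqlx, reqwest, or secrecy), and new_state and clone_shares_config are hypothetical helper names used only for illustration.

```rust
use std::sync::Arc;

// Simplified stand-in for the post's Config; deliberately not Clone.
pub struct Config {
    pub origin: String,
}

// Stand-in for AppState, reduced to the Arc-wrapped field.
#[derive(Clone)]
pub struct AppState {
    pub config: Arc<Config>,
}

/// Builds a state value the way startup code would, once per process.
pub fn new_state(origin: &str) -> AppState {
    AppState {
        config: Arc::new(Config {
            origin: origin.to_string(),
        }),
    }
}

/// Clones the state once and reports whether both handles point at the same
/// Config allocation, plus the reference count while the clone is alive.
pub fn clone_shares_config(state: &AppState) -> (bool, usize) {
    let cloned = state.clone();
    let same_allocation = Arc::ptr_eq(&state.config, &cloned.config);
    let count = Arc::strong_count(&state.config);
    (same_allocation, count)
}
```

The derive works even though Config is not Clone, because Arc<T> is Clone for any T. The clone only touches the reference count, which is why passing the state to every handler stays cheap.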

Handlers extract state by type:

use axum::{extract::{Path, State}, Json};

// `User` must derive sqlx::FromRow (for query_as) and serde::Serialize (for Json).
async fn get_user(
    State(state): State<AppState>,
    Path(id): Path<uuid::Uuid>,
) -> Result<Json<User>, AppError> {
    let user = sqlx::query_as::<_, User>("SELECT * FROM users WHERE id = $1")
        .bind(id)
        .fetch_optional(&state.db)
        .await?
        .ok_or_else(|| AppError::not_found("user"))?;
    Ok(Json(user))
}

The router and middleware stack

use axum::{routing::{get, post}, Router};
use tower_http::{
    trace::TraceLayer,
    cors::CorsLayer,
    timeout::TimeoutLayer,
    limit::RequestBodyLimitLayer,
    compression::CompressionLayer,
};
use std::time::Duration;

pub fn build_router(state: AppState) -> Router {
    let api = Router::new()
        .route("/users", post(create_user))
        .route("/users/:id", get(get_user).patch(update_user));

    Router::new()
        .nest("/v1", api)
        // Probes live outside the versioned API (see Health checks).
        .route("/live", get(live))
        .route("/ready", get(ready))
        // Each .layer() wraps everything added so far; the last one is outermost.
        .layer(CompressionLayer::new())
        .layer(RequestBodyLimitLayer::new(1024 * 1024)) // 1 MiB
        .layer(TimeoutLayer::new(Duration::from_secs(30)))
        .layer(CorsLayer::permissive()) // tighten in prod
        .layer(TraceLayer::new_for_http())
        // Attach the state once, after all routes are registered.
        .with_state(state)
}

Middleware order matters. Each .layer() call wraps everything added before it, so the last layer in the chain is the outermost and sees the request first. TraceLayer goes last so that it sees the request before the timeout or limit middleware can reject it; that way you get logs for rejected requests too. Compression can sit innermost; you only want to compress responses you’re actually returning.
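The wrapping rule is easy to get backwards, so here is a toy model of it. It uses no tower or axum types: Handler, layer, build, and run_once are hypothetical names, and each "layer" only records when it sees the request and the response. The point is only the nesting order, under the assumption that a layer added later wraps the ones added earlier.

```rust
// A "service" in this toy model is a closure that appends to a log.
type Handler = Box<dyn Fn(&mut Vec<String>)>;

/// Wraps `inner` so that `name` is logged before and after the inner call.
fn layer(name: &'static str, inner: Handler) -> Handler {
    Box::new(move |log| {
        log.push(format!("{name}: request"));
        inner(log);
        log.push(format!("{name}: response"));
    })
}

/// Applies layers in the same textual order as the router above:
/// compression first, trace last.
fn build() -> Handler {
    let handler: Handler = Box::new(|log| log.push("handler".to_string()));
    let svc = layer("compression", handler);
    let svc = layer("timeout", svc);
    layer("trace", svc)
}

/// Sends one request through the stack and returns the recorded order.
pub fn run_once() -> Vec<String> {
    let mut log = Vec::new();
    build()(&mut log);
    log
}
```

In the recorded log the layer applied last ("trace") is the first to see the request and the last to see the response, which is the property the ordering advice relies on.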

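The body limit is critical. axum 0.6 already caps the Json, Form, String, and Bytes extractors at 2 MB by default (see DefaultBodyLimit), but a handler that reads the body any other way gets no cap, and an unbounded body lets a malicious client stream 50 GB at your service. An explicit RequestBodyLimitLayer covers every route. 1 MiB is a reasonable default for JSON APIs; raise it per route for upload endpoints.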
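The principle behind any body limit is to stop reading as soon as the cap is exceeded instead of buffering first and checking afterwards. The sketch below shows that idea with std::io only; it is not how tower-http implements RequestBodyLimitLayer, and BodyError and read_limited are hypothetical names.

```rust
use std::io::{self, Read};

#[derive(Debug, PartialEq)]
pub enum BodyError {
    /// The body was longer than the configured limit.
    TooLarge { limit: usize },
    /// The underlying reader failed.
    Io(io::ErrorKind),
}

/// Reads a body from `reader`, never buffering more than `limit + 1` bytes.
pub fn read_limited<R: Read>(reader: R, limit: usize) -> Result<Vec<u8>, BodyError> {
    let mut buf = Vec::new();
    // Allow one byte past the limit so "exactly at the limit" and
    // "over the limit" can be told apart.
    let mut capped = reader.take((limit as u64).saturating_add(1));
    capped
        .read_to_end(&mut buf)
        .map_err(|e| BodyError::Io(e.kind()))?;
    if buf.len() > limit {
        return Err(BodyError::TooLarge { limit });
    }
    Ok(buf)
}
```

The same shape applies to multipart or streaming uploads: count bytes as they arrive and bail out at the threshold, rather than trusting a declared Content-Length.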

The error type

Lifting the pattern from the error-handling post:

use axum::{http::StatusCode, response::{IntoResponse, Response}, Json};

#[derive(Debug)]
pub struct AppError {
    inner: anyhow::Error,
    status: StatusCode,
    code: &'static str,
}

impl AppError {
    pub fn not_found(what: &str) -> Self {
        Self {
            inner: anyhow::anyhow!("{what} not found"),
            status: StatusCode::NOT_FOUND,
            code: "not_found",
        }
    }

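    pub fn bad_request(msg: impl Into<String>) -> Self {
        Self {
            inner: anyhow::anyhow!(msg.into()),
            status: StatusCode::BAD_REQUEST,
            code: "bad_request",
        }
    }

    // Used by the AuthUser extractor below.
    pub fn unauthorized() -> Self {
        Self {
            inner: anyhow::anyhow!("missing or invalid credentials"),
            status: StatusCode::UNAUTHORIZED,
            code: "unauthorized",
        }
    }
}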

impl<E> From<E> for AppError where E: Into<anyhow::Error> {
    fn from(e: E) -> Self {
        Self {
            inner: e.into(),
            status: StatusCode::INTERNAL_SERVER_ERROR,
            code: "internal_error",
        }
    }
}

impl IntoResponse for AppError {
    fn into_response(self) -> Response {
        if self.status.is_server_error() {
            tracing::error!(error.message = %self.inner, error.chain = ?self.inner.chain().collect::<Vec<_>>(), "request failed");
        }
        let body = Json(serde_json::json!({
            "error": { "code": self.code, "message": self.status.canonical_reason() }
        }));
        (self.status, body).into_response()
    }
}

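The blanket From impl is the convenience that makes ? work everywhere: any error that converts into anyhow::Error becomes an AppError with status 500, and specific cases get explicit constructors. The catch is that AppError must not implement std::error::Error itself, or the blanket impl would overlap with the standard library’s From<T> for T.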
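One consequence of a blanket conversion is that every error propagated with ? turns into a 500 unless it is mapped first. The sketch below shows the difference using a simplified std-only stand-in for AppError: a u16 replaces StatusCode, a boxed std error replaces anyhow::Error, and parse_page_unmapped and parse_page_mapped are hypothetical functions.

```rust
use std::error::Error;

// Simplified stand-in for the post's AppError (no axum or anyhow types).
#[derive(Debug)]
pub struct AppError {
    pub status: u16,
    pub code: &'static str,
    pub inner: Box<dyn Error + Send + Sync>,
}

impl AppError {
    pub fn bad_request(msg: impl Into<String>) -> Self {
        let msg: String = msg.into();
        Self { status: 400, code: "bad_request", inner: msg.into() }
    }
}

// Blanket conversion: any std error becomes a 500 when propagated with `?`.
// This compiles only because AppError does not implement Error itself.
impl<E> From<E> for AppError
where
    E: Error + Send + Sync + 'static,
{
    fn from(e: E) -> Self {
        Self { status: 500, code: "internal_error", inner: Box::new(e) }
    }
}

/// Lets `?` convert the parse failure: the client's typo is reported as 500.
pub fn parse_page_unmapped(raw: &str) -> Result<u32, AppError> {
    let page = raw.parse::<u32>()?;
    Ok(page)
}

/// Maps the failure explicitly: the same typo is reported as 400.
pub fn parse_page_mapped(raw: &str) -> Result<u32, AppError> {
    raw.parse::<u32>()
        .map_err(|e| AppError::bad_request(format!("invalid page: {e}")))
}
```

The rule of thumb that follows: use ? freely for failures that really are the server's fault, and reach for map_err with an explicit constructor wherever the input came from the client.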

Extractors for cross-cutting concerns

axum extractors are how you handle authn and request validation in a way that composes. A common pattern: an extractor that pulls the user out of a JWT, and handlers that take it as a parameter.

use axum::{async_trait, extract::FromRequestParts, http::request::Parts};

pub struct AuthUser {
    pub id: uuid::Uuid,
    pub email: String,
}

#[async_trait]
impl FromRequestParts<AppState> for AuthUser {
    type Rejection = AppError;

    async fn from_request_parts(parts: &mut Parts, state: &AppState) -> Result<Self, Self::Rejection> {
        let auth = parts.headers
            .get(axum::http::header::AUTHORIZATION)
            .and_then(|v| v.to_str().ok())
            .ok_or_else(|| AppError::unauthorized())?;
        let token = auth.strip_prefix("Bearer ")
            .ok_or_else(|| AppError::unauthorized())?;
        let claims = verify_jwt(token, &state.config.jwt_secret)
            .map_err(|_| AppError::unauthorized())?;
        Ok(AuthUser { id: claims.sub, email: claims.email })
    }
}

// Now any handler can opt in:
async fn me(user: AuthUser) -> Json<UserSummary> {
    Json(UserSummary { id: user.id, email: user.email })
}

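The handler signature is the documentation. If a route needs auth, it has an AuthUser parameter. If it doesn’t, it doesn’t. A handler cannot read the user’s identity without declaring the parameter, so the check cannot be skipped by accident inside the handler body. What the compiler cannot do is tell you that a route which should be protected is missing the parameter; that still needs review or tests.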
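The header parsing inside such an extractor is small enough to pull into a pure function and test on its own. The sketch below is one way to do that, with two deliberate differences from a plain strip_prefix("Bearer "): the scheme name is compared case-insensitively (HTTP auth schemes are case-insensitive), and an empty token is rejected. bearer_token is a hypothetical helper name.

```rust
/// Extracts the token from an Authorization header value if it uses the
/// Bearer scheme. Returns None for a missing header, another scheme,
/// or an empty token.
pub fn bearer_token(header: Option<&str>) -> Option<&str> {
    // Split "<scheme> <credentials>" at the first space.
    let (scheme, token) = header?.split_once(' ')?;
    let token = token.trim();
    if scheme.eq_ignore_ascii_case("bearer") && !token.is_empty() {
        Some(token)
    } else {
        None
    }
}
```

Inside from_request_parts, each None case would map to the same unauthorized rejection, so the client learns nothing about which check failed.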

Tracing and request IDs

A production HTTP service needs request IDs that flow through logs. The pattern I use is a middleware that generates an ID (or accepts one from the x-request-id header), and a tracing span around each request:

use tower_http::trace::{DefaultOnResponse, TraceLayer};
use tracing::Level;
use uuid::Uuid;

let trace_layer = TraceLayer::new_for_http()
    .make_span_with(|req: &hyper::Request<_>| {
        let request_id = req.headers()
            .get("x-request-id")
            .and_then(|v| v.to_str().ok())
            .map(|s| s.to_string())
            .unwrap_or_else(|| Uuid::new_v4().to_string());
        tracing::info_span!(
            "http_request",
            method = %req.method(),
            uri = %req.uri(),
            request_id = %request_id,
        )
    })
    .on_response(DefaultOnResponse::new().level(Level::INFO));

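Combined with tracing_subscriber::fmt().json(), every log line emitted inside the span carries the request ID, method, and URI. In a real deployment you’d ship those to your log aggregator and filter by request ID when debugging. Note that an ID generated this way exists only in the span; to echo it back in the response or forward it to upstream calls, tower-http’s request-id feature provides SetRequestIdLayer and PropagateRequestIdLayer.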
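Because a client-supplied x-request-id ends up verbatim in the logs, it can be worth checking it before trusting it. The sketch below is one possible policy, not something the original setup does: the 64-character cap and the allowed character set are assumptions, and resolve_request_id is a hypothetical helper. The generator is passed in as a closure so the function stays std-only; in the service it would be something like Uuid::new_v4().to_string().

```rust
/// Upper bound on accepted IDs; an assumption for this sketch, not a standard.
const MAX_REQUEST_ID_LEN: usize = 64;

/// Accepts short IDs made of ASCII letters, digits, '-' and '_'.
fn is_acceptable(id: &str) -> bool {
    !id.is_empty()
        && id.len() <= MAX_REQUEST_ID_LEN
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'_')
}

/// Uses the inbound ID when it passes validation, otherwise calls `generate`.
pub fn resolve_request_id(
    inbound: Option<&str>,
    generate: impl FnOnce() -> String,
) -> String {
    match inbound {
        Some(id) if is_acceptable(id) => id.to_string(),
        _ => generate(),
    }
}
```

Called from the make_span_with closure, this would replace the map/unwrap_or_else chain while keeping the span fields unchanged.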

For initialization:

pub fn init_tracing() {
    tracing_subscriber::fmt()
        .json()
        .with_env_filter(tracing_subscriber::EnvFilter::try_from_default_env()
            .unwrap_or_else(|_| "info,sqlx=warn,hyper=warn".into()))
        .with_target(false)
        .with_current_span(false)
        .with_span_list(true)
        .init();
}

Graceful shutdown

A service that doesn’t drain in-flight requests on SIGTERM is going to drop traffic during deploys. axum 0.6 supports this via hyper’s with_graceful_shutdown:

use tokio::signal;

pub async fn run(state: AppState, addr: std::net::SocketAddr) -> anyhow::Result<()> {
    let app = build_router(state);
    let listener = tokio::net::TcpListener::bind(addr).await?;
    tracing::info!(?addr, "listening");

    axum::Server::from_tcp(listener.into_std()?)?
        .serve(app.into_make_service())
        .with_graceful_shutdown(shutdown_signal())
        .await?;
    Ok(())
}

async fn shutdown_signal() {
    let ctrl_c = async { signal::ctrl_c().await.ok(); };

    #[cfg(unix)]
    let terminate = async {
        signal::unix::signal(signal::unix::SignalKind::terminate())
            .expect("install signal handler")
            .recv()
            .await;
    };

    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();

    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }
    tracing::info!("shutdown signal received, draining");
}

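Kubernetes sends SIGTERM to the pod and waits up to terminationGracePeriodSeconds (default 30) before sending SIGKILL. With this in place, the listener stops accepting new connections, in-flight requests finish, and the process exits cleanly, provided the slowest request completes within the grace period. The 30-second TimeoutLayer above matches the default grace period exactly, so leave some margin by lowering the request timeout or raising the grace period.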

Health checks

A health endpoint should distinguish liveness from readiness. Liveness is “is the process alive” (almost always 200 OK if the handler runs at all). Readiness is “can this instance serve traffic right now” — DB reachable, upstream dependencies reachable.

async fn live() -> &'static str { "ok" }

async fn ready(State(state): State<AppState>) -> Result<&'static str, AppError> {
    sqlx::query("SELECT 1")
        .execute(&state.db)
        .await?;
    Ok("ok")
}

In Kubernetes, point livenessProbe at /live and readinessProbe at /ready. A failing readiness drops the pod from service load balancing without restarting it; a failing liveness restarts the container.
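When readiness depends on more than one dependency, it helps to separate the decision from the I/O. The sketch below shows only the decision: Check and readiness are hypothetical names, the boolean results are assumed to come from probes such as the SELECT 1 above, and returning 503 for "not ready" is a common convention rather than something the probe requires (Kubernetes treats any status from 200 to 399 as success and anything else as failure).

```rust
/// Result of probing one dependency (database, upstream API, ...).
pub struct Check {
    pub name: &'static str,
    pub healthy: bool,
}

/// Returns 200 when every dependency is healthy, otherwise 503 together
/// with the names of the failing dependencies for the response body or logs.
pub fn readiness(checks: &[Check]) -> (u16, Vec<&'static str>) {
    let failing: Vec<&'static str> = checks
        .iter()
        .filter(|c| !c.healthy)
        .map(|c| c.name)
        .collect();
    let status = if failing.is_empty() { 200 } else { 503 };
    (status, failing)
}
```

Liveness deliberately has no equivalent: if it checked dependencies too, a database outage would restart every pod instead of just taking them out of rotation.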

Common Pitfalls

Sharing a single reqwest::Client is right; building one per request is wrong. The client owns a connection pool — putting it in AppState and cloning is the intended pattern. The same applies to PgPool.
Returning String errors from handlers. Loses type information, makes consistent error responses hard, leaks internal messages. Use a typed AppError.
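Forgetting with_state(state). A Router<AppState> that never receives its state cannot be served: into_make_service is only available on Router<()>, and the resulting compiler error is confusing. Sub-routers that share the parent’s state type can be nested or merged as they are; the state just has to be attached once before the router is served.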
No timeout on outbound HTTP. A reqwest::Client without a default timeout will hang forever on a slow upstream. Set timeout(Duration::from_secs(10)) at client construction.
Putting auth checks inside handlers. Use an extractor. Putting it in the handler body means every handler can forget to do it. The extractor pattern makes auth a type-system requirement.
Not validating Content-Length. RequestBodyLimitLayer handles the worst of this, but if you accept multipart or streaming uploads, validate sizes as you read.
Returning 500 for client errors. Audit the error mapping. A user submitting bad JSON should get 400, not 500.

Wrapping Up

axum 0.6 hit the right balance for production work — typed, async-native, tower-compatible, with an ecosystem of middleware that covers the boring parts. The patterns above are what I reach for on any new service: typed state, an AppError newtype, structured tracing, graceful shutdown, real health probes. The axum docs cover the surface area; the tower-http docs are worth reading for the middleware catalog.

Next post is on Rustls vs OpenSSL — the TLS layer underneath all of this, which has real implications for static binaries, FIPS compliance, and deployment complexity.
