Shipping Rust to Kubernetes, Smaller Images and Faster Cold Starts
March 27, 2024 · 8 min read · by Muhammad Amal · programming

TL;DR — Rust on Kubernetes can ship as a 15MB image that cold-starts in under 100ms, but only if you respect three things: a real multi-stage Dockerfile with cargo-chef for caching, a distroless or scratch runtime, and a static or near-static binary built against musl or a pinned glibc. This post is the recipe I actually use.

Most of the Rust services I’ve shipped end up in Kubernetes. The runtime story is good — there’s no JVM to warm up, no GC pause storms, no node_modules bloat. But the build story is more nuanced than “FROM rust && cargo build”. Done naively, your image is 1.5GB, your CI takes 8 minutes per push, and your container surface area is wider than the Node app you replaced.

I’ll walk through what I do now in 2024, the pieces that matter, and where the trade-offs hide. The HTTP service plumbing in my Axum tracing post and the metrics setup in the observability post are what runs inside these images.

The Goal: Three Numbers

Every Rust service I ship targets:

  • Image size under 30MB. Smaller is faster to pull on a cold node.
  • Cold start under 200ms from container start to first 200 OK on /readyz. Helps autoscaling and graceful rollouts.
  • Build time under 90 seconds in CI on a warm cache. Slow CI is a tax on every developer.

Those are achievable. They’re not the absolute minimums, but they’re the sweet spot where you stop fighting tools.
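
The cold-start number is the easiest of the three to fool yourself about, so it helps to measure it the same way every time. Below is a minimal std-only sketch of such a measurement, assuming the service answers plain HTTP on /readyz; the function names, the 250ms timeouts and the 5ms polling interval are illustrative choices of mine, not part of any tool.

```rust
use std::io::{Read, Write};
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// True when a raw HTTP/1.x response starts with a 200 status line.
fn is_ok_status(response: &str) -> bool {
    let mut parts = response.split_whitespace();
    matches!(
        (parts.next(), parts.next()),
        (Some(version), Some("200")) if version.starts_with("HTTP/1.")
    )
}

/// Sends one GET /readyz and reports whether the answer was a 200.
fn probe_ready(addr: SocketAddr) -> std::io::Result<bool> {
    let mut stream = TcpStream::connect_timeout(&addr, Duration::from_millis(250))?;
    stream.set_read_timeout(Some(Duration::from_millis(250)))?;
    stream.write_all(b"GET /readyz HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")?;
    // The status line fits comfortably in the first 64 bytes.
    let mut buf = [0u8; 64];
    let n = stream.read(&mut buf)?;
    Ok(is_ok_status(&String::from_utf8_lossy(&buf[..n])))
}

/// Polls until the first 200 on /readyz; returns None if the budget runs out.
fn time_to_ready(addr: SocketAddr, budget: Duration) -> Option<Duration> {
    let start = Instant::now();
    while start.elapsed() < budget {
        if matches!(probe_ready(addr), Ok(true)) {
            return Some(start.elapsed());
        }
        sleep(Duration::from_millis(5));
    }
    None
}
```

Start the container, call time_to_ready right away against its published port, and compare the result with the 200ms target. The measurement includes the polling granularity, so treat it as an upper bound.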

The Dockerfile

The whole shape:

# syntax=docker/dockerfile:1.6

# 1. Plan dependencies for caching
FROM rust:1.76-slim-bookworm AS chef
RUN cargo install cargo-chef --version 0.1.66 --locked
WORKDIR /app

FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

# 2. Cook dependencies (cached layer)
FROM chef AS builder
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json

# 3. Build the actual app
COPY . .
RUN cargo build --release --bin myservice
RUN strip target/release/myservice

# 4. Runtime: distroless
FROM gcr.io/distroless/cc-debian12:nonroot AS runtime
WORKDIR /app
COPY --from=builder /app/target/release/myservice /app/myservice
USER nonroot:nonroot
EXPOSE 8080
ENTRYPOINT ["/app/myservice"]

That’s the whole thing. Four stages (chef, planner, builder, runtime), each doing one job. The trick is cargo-chef in steps one and two: it computes a recipe of all your dependencies and builds them in a layer that only rebuilds when the recipe changes, which in practice means when Cargo.toml or Cargo.lock changes. Your app source changing, which is most builds, only triggers step three.

On a warm cache, a one-line code change rebuilds in 20–30 seconds. On a cold cache, the dependencies dominate at 3–5 minutes for a typical service. That’s the right ratio.

The distroless cc image gives you libc, libgcc, and not much else. ~24MB base, no shell, no package manager, no busybox. The attack surface is roughly “your binary and what it dynamically links against.”

Going Smaller With musl or scratch

If your binary doesn’t need glibc-specific features, statically link against musl and use a scratch base. The image is just your binary.

FROM rust:1.76-slim-bookworm AS builder
RUN rustup target add x86_64-unknown-linux-musl
RUN apt-get update && apt-get install -y musl-tools && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . .
RUN cargo build --release --target x86_64-unknown-linux-musl --bin myservice

FROM scratch
COPY --from=builder /app/target/x86_64-unknown-linux-musl/release/myservice /myservice
EXPOSE 8080
ENTRYPOINT ["/myservice"]

This produces images in the 8–15MB range. Caveats:

  1. musl’s allocator is slower than glibc’s. For allocation-heavy services, latencies can regress 5–15%. Drop in mimalloc or jemallocator to recover.
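  2. DNS needs thought in scratch. A static musl binary still resolves names through musl’s built-in getaddrinfo, which Tokio calls by default and which reads the /etc/resolv.conf that Kubernetes mounts into the pod, but musl’s resolver does not behave exactly like glibc’s. If your service does outbound HTTPS, reqwest’s rustls-tls feature avoids the OpenSSL dependency. Use hickory-resolver for a pure-Rust DNS resolver if you want to bypass libc’s resolver entirely.
  3. No /etc/ssl/certs in scratch. Use webpki-roots to compile Mozilla’s root bundle into the binary, or copy a CA bundle in explicitly (for example /etc/ssl/certs/ca-certificates.crt from a Debian-based stage). rustls-native-certs does not help on its own: it loads the platform’s certificate store at runtime, and scratch has none.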
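
A missing CA bundle usually shows up as a confusing TLS error on the first outbound call. One way to fail fast is a startup preflight. The sketch below is std-only and hypothetical: the function names and the path list are assumptions of mine, and it only checks that a bundle file exists, not that it parses.

```rust
use std::env;
use std::path::PathBuf;

/// Common CA bundle locations; illustrative, not exhaustive.
const DEFAULT_CA_PATHS: &[&str] = &[
    "/etc/ssl/certs/ca-certificates.crt", // Debian-based images
    "/etc/ssl/cert.pem",                  // Alpine
];

/// Returns the first candidate that exists as a regular file.
fn first_existing(candidates: &[PathBuf]) -> Option<PathBuf> {
    candidates.iter().find(|p| p.is_file()).cloned()
}

/// Resolves the CA bundle: SSL_CERT_FILE wins, then the default paths.
fn find_ca_bundle() -> Option<PathBuf> {
    let mut candidates: Vec<PathBuf> = Vec::new();
    if let Some(path) = env::var_os("SSL_CERT_FILE") {
        candidates.push(PathBuf::from(path));
    }
    candidates.extend(DEFAULT_CA_PATHS.iter().map(|p| PathBuf::from(*p)));
    first_existing(&candidates)
}
```

Calling find_ca_bundle() early in main() and exiting with a clear message when it returns None turns a runtime mystery into a deploy-time failure. If you compile the roots into the binary instead, the check is unnecessary.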

For most services I use distroless cc and skip the musl tax. For sidecars and tiny utilities I use scratch.

Cross-Compilation for ARM

Half my fleet runs on ARM64 nodes for the price difference. Cross-compiling from an amd64 builder uses cross or, increasingly, just buildx with a foreign-arch target.

# Cross-compile to ARM64 from any host
FROM --platform=$BUILDPLATFORM rust:1.76-slim-bookworm AS builder
RUN rustup target add aarch64-unknown-linux-gnu
RUN apt-get update && apt-get install -y g++-aarch64-linux-gnu && rm -rf /var/lib/apt/lists/*
ENV CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=aarch64-linux-gnu-gcc \
    CC_aarch64_unknown_linux_gnu=aarch64-linux-gnu-gcc
WORKDIR /app
COPY . .
RUN cargo build --release --target aarch64-unknown-linux-gnu --bin myservice

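Build it with docker buildx build --platform linux/amd64,linux/arm64 ... and you get a manifest list that runs natively on either architecture, provided the builder stage picks its Rust target from buildx’s TARGETARCH build argument rather than hardcoding aarch64 as the snippet above does. The container runtime on each node then pulls the variant that matches the node’s architecture; no node affinity is involved.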

The C-linker dance is the only annoying part. If a transitive dependency tries to compile C code (ring, openssl-sys), it needs the cross-toolchain available. aws-lc-rs and rustls are usually less fussy than openssl-sys.

Cold Start Tuning

The container is small. The binary still has to load and initialize. Three things help:

LTO and codegen-units. Cargo.toml release profile:

[profile.release]
lto           = "fat"
codegen-units = 1
strip         = true
opt-level     = 3

This costs ~30% more build time but typically buys you 10–20% smaller binaries and faster startup. Worth it for production builds.

Lazy static init. Avoid heavy work in main() before the readiness check passes. Open the DB pool, register metrics, then immediately start serving /healthz as 200. Do schema migrations and warmup queries in a background task that flips /readyz to 200 when ready.

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    observability::init("myservice")?;
    let app_state = AppState::new().await?;     // cheap: pool config only

    let router = build_router(app_state.clone());
    let listener = tokio::net::TcpListener::bind("0.0.0.0:8080").await?;

    // Background: warm caches, run migrations, then flip readiness
    let warm_state = app_state.clone();
    tokio::spawn(async move {
        // Flip readiness only when warmup succeeds; on failure /readyz keeps failing
        match warm_state.warmup().await {
            Ok(_) => warm_state.ready.store(true, std::sync::atomic::Ordering::SeqCst),
            Err(e) => tracing::error!(error = ?e, "warmup failed"),
        }
    });

    axum::serve(listener, router).await?;
    Ok(())
}

Kubernetes routes traffic when /readyz returns 200. Background warmup means the pod is up faster, even if it can’t serve real requests for another second or two.
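
The split between the two probes comes down to one shared flag. Here is a minimal std-only sketch of that state, assuming an atomic boolean like the ready field used in main() above. The Readiness type and its method names are hypothetical, and in a real service the two status functions would sit behind the /healthz and /readyz handlers.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Shared readiness flag; cloning shares the same underlying bool.
#[derive(Clone, Default)]
struct Readiness {
    ready: Arc<AtomicBool>,
}

impl Readiness {
    /// Called by the warmup task once it has finished successfully.
    fn mark_ready(&self) {
        self.ready.store(true, Ordering::SeqCst);
    }

    /// Liveness: the process is up and serving, so always 200.
    fn healthz_status(&self) -> u16 {
        200
    }

    /// Readiness: 503 until warmup completes, then 200.
    fn readyz_status(&self) -> u16 {
        if self.ready.load(Ordering::SeqCst) {
            200
        } else {
            503
        }
    }
}
```

Keeping liveness unconditional matters: if /healthz depended on warmup, a slow migration would get the pod killed and restarted instead of simply being held out of rotation.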

PGO if you can afford it. Profile-Guided Optimization has been stable in rustc for a long time (via -C profile-generate and -C profile-use), so it works on 1.76 without nightly. The two-pass build (instrumented run, then PGO build) is a 15-minute CI investment for 5–15% startup wins on hot services. The official PGO guide has the exact incantations.

Kubernetes Resource Tuning

Two settings I see misconfigured constantly.

CPU limits often cause more harm than good. Tokio scales with available cores, and CPU throttling hurts tail latency by stalling the whole runtime once the quota for the current scheduling period is spent. Set CPU requests (so the scheduler reserves capacity) but leave limits off, or set them generously (4x requests).

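Memory limits should be a hard ceiling. Rust services are usually predictable in memory. Set requests equal to limits once you know the steady-state footprint, or keep the limit within about 2x of the request as in the example below, and let the OOM killer restart you cleanly if something runs away.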

resources:
  requests:
    cpu:    "250m"
    memory: "128Mi"
  limits:
    memory: "256Mi"   # no cpu limit
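
To see why a CPU limit stalls a multi-threaded runtime, it helps to run the numbers. The sketch below assumes the default 100ms CFS period and a worst case where every busy thread runs flat out; the function name is hypothetical and the model ignores everything else the scheduler does.

```rust
/// Default CFS bandwidth period on Linux, in milliseconds.
const CFS_PERIOD_MS: f64 = 100.0;

/// Worst-case throttled time per period, in milliseconds, when
/// `busy_threads` run flat out under a CPU limit given in millicores.
fn throttled_ms_per_period(limit_millicores: u32, busy_threads: u32) -> f64 {
    if busy_threads == 0 {
        return 0.0;
    }
    // CPU time the container may consume per period.
    let quota_ms = f64::from(limit_millicores) / 1000.0 * CFS_PERIOD_MS;
    // Wall-clock time until that quota is used up by all busy threads together.
    let wall_ms_until_spent = quota_ms / f64::from(busy_threads);
    (CFS_PERIOD_MS - wall_ms_until_spent).max(0.0)
}
```

With a 500m limit and four busy worker threads, the 50ms quota is gone after 12.5ms of wall time and the container sits throttled for the remaining 87.5ms of the period. Any request that arrives in that window waits, which is where the tail latency comes from.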

For HPA, scale on a custom metric (queue depth, request rate) rather than CPU. Rust services often sit at 5–15% CPU at peak, far below any sensible utilization target, so a CPU-based HPA never sees a reason to scale out.
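
The HPA’s core rule is a single ratio, which makes the problem easy to show. This sketch implements only that ratio as described in the Kubernetes documentation. The real controller also applies a tolerance band, min/max replica bounds and stabilization windows, all omitted here, and the function name is my own.

```rust
/// Core HPA rule: ceil(current_replicas * current_metric / target_metric).
/// Both metric values must use the same unit (percent, requests/s, ...).
fn desired_replicas(current_replicas: u32, current_metric: f64, target_metric: f64) -> u32 {
    assert!(target_metric > 0.0, "target must be positive");
    let ratio = current_metric / target_metric;
    (f64::from(current_replicas) * ratio).ceil() as u32
}
```

With three replicas at 12% CPU against a 70% target, the rule asks for one replica even at peak load. With the same three replicas at 450 requests per second per pod against a target of 200, it asks for seven, which is the signal you actually want.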

A Sample Deployment

To round it out:

apiVersion: apps/v1
kind: Deployment
metadata: { name: myservice }
spec:
  replicas: 3
  selector: { matchLabels: { app: myservice } }
  template:
    metadata:
      labels: { app: myservice }
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port:   "9090"
    spec:
      containers:
        - name: app
          image: ghcr.io/muhammadamal/myservice:1.0.0
          ports:
            - { name: http,    containerPort: 8080 }
            - { name: metrics, containerPort: 9090 }
          readinessProbe:
            httpGet: { path: /readyz, port: http }
            periodSeconds: 2
          livenessProbe:
            httpGet: { path: /healthz, port: http }
            periodSeconds: 10
          resources:
            requests: { cpu: "250m", memory: "128Mi" }
            limits:   { memory: "256Mi" }
          env:
            - { name: RUST_LOG,                  value: "info" }
            - { name: LOG_FORMAT,                value: "json" }
            - { name: OTEL_EXPORTER_OTLP_ENDPOINT, value: "http://otel-collector:4317" }

Three replicas as a starting point, separate metrics port, JSON logs, OTel pointed at a collector. That’s the whole production deployment shape for a typical Rust HTTP service.

Common Pitfalls

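Running under a tini or dumb-init wrapper. You don’t need a PID-1 reaper for a Rust service that runs a single process. You do need to install a SIGTERM handler yourself: a process running as PID 1 gets no default action for the signal, and Tokio only shuts down gracefully if you wire its signal handling into your server.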

Forgetting .dockerignore. Without one, target/ gets copied into the build context. Suddenly your “tiny” build is shipping 2GB of artifacts through the build network. Always exclude target/, .git/, and editor temp dirs.

Pinning to :latest. Every Kubernetes deploy I’ve seen go wrong because of this involved a base-image surprise. Pin to a digest: gcr.io/distroless/cc-debian12@sha256:.... CI tools like Renovate keep it current.
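
One way to enforce digest pinning is a small lint in CI that rejects any image reference without a full sha256 digest. The check below is a hypothetical std-only sketch: it validates the shape of the reference only and does not verify that the digest exists in the registry.

```rust
/// True when an image reference ends in a full sha256 digest,
/// i.e. `name@sha256:<64 lowercase hex chars>`. A tag before the `@` is allowed.
fn is_digest_pinned(image: &str) -> bool {
    match image.rsplit_once("@sha256:") {
        Some((name, hex)) => {
            !name.is_empty()
                && hex.len() == 64
                && hex.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
        }
        None => false,
    }
}
```

Run it over every FROM line and every image: field in your manifests and fail the build on the first false. That catches both :latest and bare tags.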

Static linking with OpenSSL. openssl-sys is the source of half the cross-compilation pain. Use rustls and aws-lc-rs (or ring) wherever possible.

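No shell in distroless. Means no kubectl exec into a shell. Don’t compromise the production image to get one back: use kubectl debug to attach an ephemeral container with your tools, or keep a separate debug image with the same base plus busybox (distroless publishes :debug tags for this) and deploy it only when you need it.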

Wrapping Up

Shipping Rust to Kubernetes well is a recipe, not a research project. Multi-stage Dockerfiles with cargo-chef for caching, distroless or scratch for runtime, careful cold-start tuning, and resource limits that match Rust’s actual behavior. Get those right once and your deploy pipeline stops being a thing you think about.

The reward is real: small images, fast pulls, fast starts, predictable resource use. The kind of operational profile that makes platform engineers happy and pager rotations quiet.