Building Secure CLIs in Rust with clap 4
July 18, 2023 · 8 min read · by Muhammad Amal

TL;DR:

  • clap 4 derive is the right default for new CLIs in 2023, with subcommands, env-var fallback, and built-in validation
  • Never accept secrets via command-line flags: use env vars, stdin, or a credential helper
  • Ship as a fully static binary so deployment is scp and chmod +x

A surprising fraction of the Rust I’ve written in production isn’t long-running services — it’s internal CLIs. Migration tools, ops scripts, secret rotation utilities, validators that run in CI. Rust is unusually good for this work: one binary, no runtime, fast startup, easy to harden against the kinds of mistakes that bite shell scripts.

This post is what I’ve landed on after building a dozen of these. It’s opinionated, and the opinions are about security defaults, not about clap’s API surface. For the runtime model these CLIs sit on top of, see my tokio post. For the language guarantees, see the memory safety post.

The starting Cargo.toml

[package]
name = "rotate-keys"
version = "0.4.1"
edition = "2021"
rust-version = "1.71"

[dependencies]
clap = { version = "4.3", features = ["derive", "env", "wrap_help"] }
tokio = { version = "1.29", features = ["rt", "macros", "fs", "io-util", "signal"] }
anyhow = "1.0"
thiserror = "1.0"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
secrecy = "0.8"
rpassword = "7.2"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

[profile.release]
lto = "thin"
codegen-units = 1
strip = true
panic = "abort"
overflow-checks = true

A few things worth calling out. panic = "abort" means no unwinding, which both shrinks the binary and makes the failure mode unambiguous (the process exits, rather than “thread panicked but main lived”). strip = true strips symbols as well as debug info from release builds, which matters for binaries you’ll distribute. overflow-checks = true is on because the perf cost is negligible for CLIs and the correctness gain matters when you’re parsing numeric input.

The clap 4 derive shape

clap 4’s derive API gives you a struct that documents the CLI. Subcommands are enums:

use clap::{Parser, Subcommand};
use std::path::PathBuf;

#[derive(Parser)]
#[command(name = "rotate-keys")]
#[command(version, about = "Rotate service credentials in vault", long_about = None)]
struct Cli {
    /// Path to config file
    #[arg(short, long, env = "ROTATE_KEYS_CONFIG", default_value = "/etc/rotate-keys.toml")]
    config: PathBuf,

    /// Increase verbosity (-v, -vv, -vvv)
    #[arg(short, long, action = clap::ArgAction::Count)]
    verbose: u8,

    /// Dry run — don't make changes
    #[arg(long)]
    dry_run: bool,

    #[command(subcommand)]
    command: Command,
}

#[derive(Subcommand)]
enum Command {
    /// Rotate a single key by name
    Rotate {
        /// Key identifier
        name: String,
        /// Skip the confirmation prompt
        #[arg(short, long)]
        yes: bool,
    },
    /// List rotatable keys
    List {
        /// Output format
        #[arg(short, long, value_enum, default_value_t = OutputFormat::Table)]
        format: OutputFormat,
    },
    /// Verify the current vault credential without rotating
    Verify,
}

#[derive(Clone, Copy, clap::ValueEnum)]
enum OutputFormat {
    Table,
    Json,
    Yaml,
}

This is a complete CLI surface. clap generates --help, --version, -h, and error messages for unknown flags; shell completions come from the companion clap_complete crate. The env feature lets any argument annotated with env = "..." fall back to an environment variable when the flag isn’t passed.

Secrets: never on the command line

The first rule of CLI security: command-line arguments are visible in ps, in shell history, in process listings shared with monitoring agents, and in the parent process if it forked. A token passed as --token abc123 is leaked across all of those.
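To make the ps point concrete: on Linux, a process's argument vector can be read back from /proc/<pid>/cmdline, which is readable by other users by default. The sketch below is std-only and Linux-specific; split_cmdline and read_args_of are hypothetical helper names of mine, not part of any library.

```rust
use std::fs;
use std::io;

/// Split the raw contents of /proc/<pid>/cmdline into arguments.
/// The kernel stores argv as NUL-separated byte strings; this drops the
/// trailing empty piece (and, as a simplification, any empty arguments).
fn split_cmdline(raw: &[u8]) -> Vec<String> {
    raw.split(|b| *b == 0)
        .filter(|arg| !arg.is_empty())
        .map(|arg| String::from_utf8_lossy(arg).into_owned())
        .collect()
}

/// Read the arguments of another process (Linux only, hypothetical helper).
fn read_args_of(pid: u32) -> io::Result<Vec<String>> {
    let raw = fs::read(format!("/proc/{pid}/cmdline"))?;
    Ok(split_cmdline(&raw))
}
```

Anything that can call read_args_of on your process sees --token abc123 in full, which is the whole argument for keeping secrets out of argv.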

// BAD
#[derive(Parser)]
struct Bad {
    #[arg(long)]
    api_token: String,
}

// GOOD: env var, stdin, or interactive prompt
use secrecy::{ExposeSecret, Secret}; // ExposeSecret is needed wherever the token is read
use std::io::IsTerminal;

fn load_token() -> anyhow::Result<Secret<String>> {
    // 1. Env var (preferred for non-interactive)
    if let Ok(t) = std::env::var("API_TOKEN") {
        return Ok(Secret::new(t));
    }
    // 2. Stdin if piped (IsTerminal is in std since Rust 1.70, so no extra crate)
    if !std::io::stdin().is_terminal() {
        let mut buf = String::new();
        std::io::stdin().read_line(&mut buf)?;
        return Ok(Secret::new(buf.trim_end().to_string()));
    }
    // 3. Interactive prompt with hidden input
    let t = rpassword::prompt_password("API token: ")?;
    Ok(Secret::new(t))
}

secrecy::Secret<T> wraps a value so it doesn’t show up in Debug output (with secrecy 0.8, the default Debug prints something like Secret([REDACTED alloc::string::String])). It also zeroes the inner value on drop, which is why the inner type must implement Zeroize. This is defense in depth: it won’t help if you println!("{}", secret.expose_secret()), but it stops the common case of secrets leaking into logs via tracing or dbg!.
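To show why the wrapper helps, here is a std-only sketch of the redaction idea. It is not secrecy's implementation and it does not zero memory; Redacted and Config are hypothetical types made up for illustration.

```rust
use std::fmt;

/// Hypothetical wrapper illustrating Debug redaction; not the secrecy crate.
struct Redacted(String);

impl Redacted {
    /// Explicit, greppable access to the inner value.
    fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for Redacted {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Never print the inner string, whatever the format flags are.
        f.write_str("Redacted([REDACTED])")
    }
}

/// Hypothetical config struct that derives Debug as usual.
#[derive(Debug)]
struct Config {
    endpoint: String,
    token: Redacted,
}
```

Because the derived Debug on Config delegates to each field, logging the whole struct with {:?} stays safe, and every real read of the token has to go through expose(), which is easy to audit.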

For env-var-based config, document the variables in --help and provide a .env.example in the repo. Don’t let people guess.

Validating input at the boundary

Every CLI input is untrusted. Even from your own ops team. Validate at the boundary, produce a typed value, and propagate the typed value through the rest of the program. clap 4’s value_parser is the right hook:

use clap::Parser;
use std::path::PathBuf;
use std::time::Duration;

fn parse_duration_secs(s: &str) -> Result<Duration, String> {
    let n: u64 = s.parse().map_err(|_| format!("invalid number: {s}"))?;
    if n == 0 || n > 86_400 {
        return Err("duration must be between 1 and 86400 seconds".into());
    }
    Ok(Duration::from_secs(n))
}

#[derive(Parser)]
struct Args {
    #[arg(long, default_value = "30", value_parser = parse_duration_secs)]
    timeout: Duration,

    /// Must be an absolute path under /var/lib/rotate-keys/
    #[arg(long, value_parser = parse_state_path)]
    state_dir: PathBuf,
}

fn parse_state_path(s: &str) -> Result<PathBuf, String> {
    let p = PathBuf::from(s);
    if !p.is_absolute() {
        return Err("must be an absolute path".into());
    }
    let canon = p.canonicalize().map_err(|e| e.to_string())?;
    if !canon.starts_with("/var/lib/rotate-keys/") {
        return Err("must be under /var/lib/rotate-keys/".into());
    }
    Ok(canon)
}

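The canonicalize step is important: it resolves symlinks and .., which closes the obvious path-traversal hole. If the path is being used to read or write files, canonicalize before any file operation. Two caveats: canonicalize fails if the path doesn’t exist yet, and the check only reflects the filesystem at parse time, so it doesn’t protect against a symlink swapped in afterwards.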
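The same check can be written against an arbitrary root, which makes it easy to exercise in a test. This is a sketch under my own assumptions: resolve_under is a hypothetical helper, and it requires both paths to exist.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `candidate` and accept it only if it lives under `root`.
/// Both paths must exist, because canonicalize() touches the filesystem.
fn resolve_under(root: &Path, candidate: &Path) -> io::Result<PathBuf> {
    let root = root.canonicalize()?;
    let canon = candidate.canonicalize()?;
    // Path::starts_with compares whole components, so "/data/app-evil"
    // does not count as being under "/data/app".
    if canon.starts_with(&root) {
        Ok(canon)
    } else {
        Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} escapes {}", canon.display(), root.display()),
        ))
    }
}
```

A path such as <root>/keys/../../other is rejected because the comparison happens after resolution, not on the string the user typed.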

Exit codes that mean something

CLIs are composed in shell pipelines and CI scripts. Your exit code is a public API. Pick a small set of codes and stick to them:

use std::process::ExitCode;

#[derive(thiserror::Error, Debug)]
enum CliError {
    #[error("configuration error: {0}")]
    Config(String),
    #[error("authentication failed")]
    Auth,
    #[error("network error: {0}")]
    Network(String),
    #[error("not found: {0}")]
    NotFound(String),
    #[error(transparent)]
    Other(#[from] anyhow::Error),
}

impl CliError {
    fn exit_code(&self) -> u8 {
        match self {
            CliError::Config(_) => 78,    // EX_CONFIG from sysexits.h
            CliError::Auth => 77,         // EX_NOPERM
            CliError::Network(_) => 69,   // EX_UNAVAILABLE
            CliError::NotFound(_) => 66,  // EX_NOINPUT
            CliError::Other(_) => 1,
        }
    }
}

#[tokio::main(flavor = "current_thread")]
async fn main() -> ExitCode {
    let cli = Cli::parse();
    init_tracing(cli.verbose);
    match run(cli).await {
        Ok(()) => ExitCode::SUCCESS,
        Err(e) => {
            tracing::error!("{e:#}");
            ExitCode::from(e.exit_code())
        }
    }
}

I use the sysexits.h conventions for system tools because they integrate cleanly with systemd unit restart policies and CI step exit-code-based branching. For end-user tools you can use 1 for everything; what matters is consistency within a tool.
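Seen from the caller's side, stable codes are what make branching possible. The sketch below shows one way a wrapper might interpret them; the Action enum and the retry/page policy are assumptions of mine, not something the tool prescribes.

```rust
/// What a calling wrapper might do next (hypothetical policy).
#[derive(Debug, PartialEq)]
enum Action {
    Done,
    Retry,
    PageOperator,
    Fail,
}

/// Map an exit status code to an action. The input has the same shape as
/// std::process::ExitStatus::code(), which is None on Unix when the
/// process was killed by a signal.
fn action_for(code: Option<i32>) -> Action {
    match code {
        Some(0) => Action::Done,
        Some(69) => Action::Retry,                   // EX_UNAVAILABLE: likely transient
        Some(77) | Some(78) => Action::PageOperator, // EX_NOPERM, EX_CONFIG: needs a human
        _ => Action::Fail,                           // anything else, including signals
    }
}
```

If the tool ever changed what 69 means, every wrapper like this would silently misbehave, which is why the codes deserve the same care as any other public interface.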

Static binaries for deployment

A CLI you deploy should be one file. No runtime dependencies, no shared libraries, no surprise at the destination because the libc is older. On Linux, build against musl:

# Add the musl target once
rustup target add x86_64-unknown-linux-musl

# Build
cargo build --release --target x86_64-unknown-linux-musl

# Verify it's static
file target/x86_64-unknown-linux-musl/release/rotate-keys
# expected: ELF 64-bit LSB ... statically linked (newer versions of file may print "static-pie linked") ...

# Confirm no dynamic linking
ldd target/x86_64-unknown-linux-musl/release/rotate-keys
# expected: "not a dynamic executable" or "statically linked"

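A few gotchas with musl: TLS via OpenSSL is painful (the build wants system OpenSSL); use rustls instead. DNS resolution can behave differently in some edge cases, because musl ships its own resolver rather than glibc’s. musl’s allocator is also commonly reported to be slower than glibc’s under multi-threaded load, so profile before shipping if your CLI is high-throughput.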

For macOS distribution, use lipo to build universal binaries:

cargo build --release --target aarch64-apple-darwin
cargo build --release --target x86_64-apple-darwin
lipo -create \
  target/aarch64-apple-darwin/release/rotate-keys \
  target/x86_64-apple-darwin/release/rotate-keys \
  -output rotate-keys-macos

Signal handling

Long-running CLIs (rotations that take minutes, migrations) should handle SIGINT cleanly so partial state is flushed:

use std::future::Future;
use tokio::signal;

async fn run_with_cancel(work: impl Future<Output = anyhow::Result<()>>) -> anyhow::Result<()> {
    tokio::select! {
        res = work => res,
        _ = signal::ctrl_c() => {
            tracing::warn!("interrupted, flushing state");
            // checkpoint, then exit non-zero so callers know it was cancelled
            Err(anyhow::anyhow!("interrupted by user"))
        }
    }
}
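What “flushing state” means depends on the work. One pattern I’d suggest, shown here as a std-only sketch rather than the tokio version, is to check a cancellation flag between units of work so an interrupt never lands mid-item. Outcome and process_all are hypothetical names, and the flag is assumed to be set by whatever handles the signal.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Outcome of a batch run (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Outcome {
    Completed,
    /// Interrupted; `done` is the number of items finished before stopping.
    Interrupted { done: usize },
}

/// Process items one by one, checking the cancel flag between items so
/// that an interrupt never leaves an item half-processed.
fn process_all<F>(items: &[&str], cancelled: &AtomicBool, mut step: F) -> Outcome
where
    F: FnMut(&str),
{
    for (done, item) in items.iter().copied().enumerate() {
        if cancelled.load(Ordering::SeqCst) {
            // A real tool would persist `done` here as its checkpoint.
            return Outcome::Interrupted { done };
        }
        step(item);
    }
    Outcome::Completed
}
```

The caller can turn Outcome::Interrupted into a non-zero exit code and use the count to resume on the next run.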

Common Pitfalls

  • Accepting secrets in args. Already covered, but worth repeating. Even --password-stdin style flags should read from stdin, not from the arg.
  • Mixing logs into machine-readable output. Offer --format json so humans get the table and scripts get the JSON, and keep the streams separate: eprintln! (stderr) for logs, println! (stdout) for output. That split is the standard pattern.
  • Not honoring NO_COLOR and CLICOLOR. If you colorize output, check std::env::var("NO_COLOR") and treat any non-empty value as a request for plain output. clap handles this for help output already.
  • Ignoring stdout being piped. If stdout isn’t a TTY, drop progress bars and color. std::io::stdout().is_terminal() (from std::io::IsTerminal, stable since Rust 1.70) is the check, with no extra dependency.
  • Panics on bad input. Every unwrap() in the parser path is a way for a user to get a panic message (and, with panic = "abort", an abort) instead of a useful error. Use Result everywhere user input touches.
  • Not testing the CLI surface. clap makes it easy: Cli::try_parse_from(["bin", "rotate", "foo", "--yes"]) lets you unit-test the parser without spawning a process.
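
The color and TTY pitfalls above can be folded into one small decision function. This is a sketch under my own assumptions: should_colorize and stdout_wants_color are hypothetical names, and CLICOLOR is deliberately left out to keep it short.

```rust
use std::io::IsTerminal;

/// Decide whether to colorize, given the NO_COLOR value (if set) and
/// whether the output stream is a terminal.
fn should_colorize(no_color: Option<&str>, is_tty: bool) -> bool {
    // NO_COLOR disables color when it is present and non-empty.
    let disabled = no_color.map_or(false, |v| !v.is_empty());
    is_tty && !disabled
}

/// Convenience wrapper that inspects the real environment and stdout.
fn stdout_wants_color() -> bool {
    let no_color = std::env::var("NO_COLOR").ok();
    should_colorize(no_color.as_deref(), std::io::stdout().is_terminal())
}
```

Keeping the pure decision separate from the environment lookup means the rules can be unit-tested without touching real env vars or a real terminal.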

Wrapping Up

A solid Rust CLI is one of the highest-leverage artifacts you can ship as a backend engineer. The combination of a static binary, fast startup, and the security defaults you can build in makes it the right replacement for shell scripts and Python tools in any context where reliability matters. The clap 4 docs cover the API surface; the patterns above are what holds up in practice.

Next post is error handling — thiserror for libraries, anyhow for applications, and the patterns that make ? work in real codebases without losing context.