Vault 1.14 Dynamic Secrets in Kubernetes, Past the Sidecar Demo
September 14, 2023 · 6 min read · by Muhammad Amal

TL;DR: Dynamic secrets give every pod a unique, short-lived DB user, and the user is revoked automatically when the lease expires. Use the Agent Injector for apps that can read a file and reload, and CSI for apps that can only take secrets as env vars. Database connection pools and lease TTLs interact in ways the docs do not cover; plan for it or pages will follow.

Static database passwords in Secret objects are the worst kind of secret to leak. They are long-lived, broadly shared, and rotating them means a coordinated outage. Vault’s dynamic database secrets engine fixes this: every pod that needs DB access gets a unique user with a short TTL. When the pod dies, nothing renews its lease, the lease expires, and Vault drops the user. A stolen credential stops working once its lease runs out, which is at most one TTL after renewals stop (1h in the configuration below).

The demo is straightforward. The production setup is not. I have run this in three clusters and the failure modes are consistent enough to be worth documenting. Vault 1.14.2 is the current line as I write this.

The Components

Four moving parts in a typical Kubernetes deployment.

Vault server. Highly available, ideally Raft storage, KMS-backed auto-unseal. Whether you host this yourself or use HCP Vault, the API surface is the same.

Kubernetes auth method. A Vault auth method that trusts service-account tokens from your cluster. The cluster’s TokenReview API is the trust anchor.

Database secrets engine. Configured with a connection (a privileged user that Vault uses to create ephemeral users) and a role (the template for the ephemeral user — privileges, TTL, max TTL).

Vault Agent Injector or Secrets Store CSI Driver. The thing that delivers secrets into the pod. They have different tradeoffs.

Wiring It Up

Assume Vault is running. Enable the auth method against your cluster.

vault auth enable kubernetes

vault write auth/kubernetes/config \
  kubernetes_host="https://kubernetes.default.svc.cluster.local:443" \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  token_reviewer_jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token \
  issuer="https://kubernetes.default.svc.cluster.local"

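The issuer matters on Kubernetes 1.21+, where projected service-account tokens carry the cluster’s configured issuer rather than the legacy kubernetes/serviceaccount value. Since Vault 1.9, issuer validation is off by default (disable_iss_validation=true), so this value is only enforced if you turn validation back on. With validation on and the wrong issuer, auth fails with a JWT verification error that does not name the issuer as the cause.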

Enable the database engine and configure the Postgres connection.

vault secrets enable database

vault write database/config/orders-pg \
  plugin_name=postgresql-database-plugin \
  connection_url="postgresql://{{username}}:{{password}}@orders-pg.prod:5432/orders?sslmode=require" \
  allowed_roles="orders-app" \
  username="vault_admin" \
  password="$VAULT_PG_ADMIN_PASSWORD"

vault write database/roles/orders-app \
  db_name=orders-pg \
  creation_statements="CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}'; GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO \"{{name}}\";" \
  default_ttl="1h" \
  max_ttl="24h"

A Vault policy and a role binding the policy to a Kubernetes service account.

# orders-app-policy.hcl
path "database/creds/orders-app" {
  capabilities = ["read"]
}

vault policy write orders-app orders-app-policy.hcl

vault write auth/kubernetes/role/orders-app \
  bound_service_account_names=orders-app \
  bound_service_account_namespaces=orders \
  policies=orders-app \
  ttl=1h

The Kubernetes side, using the Agent Injector annotations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-app
  namespace: orders
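spec:
  selector:
    matchLabels:
      app: orders-app   # illustrative label; must match the template labels
  template:
    metadata:
      labels:
        app: orders-app
      annotations: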
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/role: "orders-app"
        vault.hashicorp.com/agent-inject-secret-db.env: "database/creds/orders-app"
        vault.hashicorp.com/agent-inject-template-db.env: |
          {{- with secret "database/creds/orders-app" -}}
          DB_USER={{ .Data.username }}
          DB_PASS={{ .Data.password }}
          {{- end }}
        vault.hashicorp.com/agent-inject-file-db.env: "db.env"
        vault.hashicorp.com/agent-inject-perms-db.env: "0440"
    spec:
      serviceAccountName: orders-app
      containers:
        - name: app
          image: ghcr.io/myorg/orders-app:1.4.0

The injector mutates the pod, adds a sidecar that authenticates to Vault, fetches the creds, writes them to /vault/secrets/db.env, and renews the lease. The application reads the file.

Injector or CSI

The Agent Injector is the easier path. It runs as a sidecar, handles authentication and lease renewal, and atomically rewrites the file on rotation. The downside is the sidecar overhead and the fact that the app must be able to reload credentials from a file.

The Secrets Store CSI Driver mounts secrets as a volume and can sync them into Kubernetes Secret objects, which can then be projected as environment variables. No sidecar. The downside is environment variables cannot be updated in a running process — you would have to restart the pod when the credential rotates, which defeats much of the point of dynamic secrets.

My rule: injector for apps that can re-read a file or use the Vault SDK directly; CSI only for apps where env vars are the only option and you are willing to accept rolling restarts as the rotation mechanism.

The Connection-Pool Problem

This is the failure mode that catches everyone. Your app gets a Postgres credential, opens a pool of 20 connections, and the pool holds them open indefinitely. Some time later the lease expires (after 1h if nothing renews it, at max_ttl at the latest), Vault drops the user, and the pool is left holding connections and a password for a role that no longer exists. The app’s next attempt to use the pool fails with FATAL: role "v-kubernet-orders-app-..." does not exist.

There are three fixes and you need at least two of them.

Connection lifetime shorter than lease TTL. Set max_lifetime on the pool to something less than the lease’s default_ttl. The pool recycles connections before Vault drops them. In Go’s database/sql, that is db.SetConnMaxLifetime(45 * time.Minute) for a 1h lease.

Lease renewal in the Agent. The injected Vault Agent renews the lease automatically, up to max_ttl. Set max_ttl to something like 24h so the same credential can be renewed throughout a day-long run. After 24h the Agent has to fetch a new credential, and every connection opened with the old one has to be replaced.

Health-check the pool. A periodic SELECT 1 from the pool surfaces dead connections quickly instead of letting them sit until first use.

What you cannot do is hold a pooled connection forever. Dynamic secrets and forever-live connections are incompatible by design.

Common Pitfalls

Database admin user with too much privilege. The user Vault uses to create ephemeral users needs CREATEROLE on Postgres, plus the ability to grant whatever the role hands out: either ownership of the tables involved or the same privileges held WITH GRANT OPTION. It does not need SUPERUSER. I have seen Vault configured with the Postgres superuser more than once.

max_ttl too short. If max_ttl equals default_ttl, the lease cannot be renewed at all. The pod gets a credential, uses it for an hour, the credential expires, and the Agent has to fetch a brand new one — which means a brand new Postgres user, which means existing connections die. Make max_ttl significantly longer than default_ttl.

Cluster-wide Vault token. Tempting in dev to give every workload one Vault role. In prod, one role per workload, scoped to a service account. Otherwise a compromise of any pod gets DB access for every database role that token can reach.

Auto-unseal misconfigured. A Vault on file storage with Shamir unseal will not come back unattended after a node reboot at 3am: file storage does not support HA, and someone has to enter unseal keys by hand. Raft storage plus KMS auto-unseal (AWS KMS, GCP KMS, Azure Key Vault) is the only setup I recommend for production. The Vault HA reference architecture is worth reading carefully.

Forgetting about static creds. Dynamic secrets are great for databases and cloud APIs. Some secrets are inherently static: third-party API keys, signing keys, webhook secrets. Use Vault’s KV v2 engine for those, which versions every write. Rotation is manual but at least audited.

Audit log not enabled. Vault does not log requests by default. vault audit enable file file_path=/vault/logs/audit.log is the first command on any new Vault. Without it, you cannot answer “who read this secret yesterday”.

Wrapping Up

Dynamic secrets are not a drop-in upgrade. They change the lifecycle of credentials in a way that interacts with connection pools, application restart behaviour, and your incident-response playbook. The work is worth it: a stolen credential expires before the attacker has time to scan, and rotation stops being a quarterly fire drill. Once secrets are short-lived, the next thing worth tightening is what is happening inside the container at runtime, which is where Falco’s runtime detection comes in.