Pod Security Standards in 2023, Migrating Off PSPs Without Breaking Everything
TL;DR: PSP was removed in Kubernetes 1.25; PSS via the Pod Security Admission controller is the in-tree replacement. Roll out per namespace with pod-security.kubernetes.io/enforce: baseline after one to two weeks at warn and audit. The workloads that always break: privileged sidecars, node exporters with hostPath, and anything mounting /var/run/docker.sock.
PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25. Pod Security Standards is the replacement, implemented by the built-in Pod Security Admission controller. By Kubernetes 1.28, every cluster should be running PSS — but a lot of clusters are not, because the migration is genuinely fiddly when you have a long tail of legacy workloads. I have done this migration on three clusters this year. Here is the playbook that worked.
What PSS Is and Is Not
PSS is three profiles (Privileged, Baseline, Restricted) applied via namespace labels. The admission controller is in-tree and enabled by default, so there is nothing to install. The profiles are versioned by Kubernetes version; you can pin to a version such as v1.27 or use latest.
What PSS is not: a replacement for PSP’s full flexibility. PSPs let you write arbitrary policy. PSS is three opinionated profiles, take it or leave it. If you need something between Baseline and Restricted, or a custom rule, you need a policy engine — Gatekeeper, Kyverno, or similar. The Kubernetes team’s stance is that PSS handles 80% of cases simply and the rest should use a general policy engine.
The three profiles, paraphrased from the Kubernetes Pod Security Standards docs:
Privileged. Anything goes. Use for system namespaces and trusted tooling.
Baseline. Blocks known privilege escalations: no privileged containers, no host namespaces, no hostPath volumes, no Linux capabilities beyond a default set. The right floor for most application namespaces.
Restricted. Substantially stricter. Non-root required, privilege escalation disallowed, seccomp profiles required, all capabilities dropped; a read-only root filesystem is encouraged but not checked. Suitable for hardened production workloads, but several common patterns will fail without changes.
The Three Modes
Each profile can be applied in three modes per namespace, via label:
apiVersion: v1
kind: Namespace
metadata:
  name: orders
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: v1.27
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: v1.27
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: v1.27
enforce. Violating pods are rejected at admission. This is the production setting.
warn. Pods are admitted; kubectl apply returns a warning. Useful for telling developers that something will fail under a stricter profile soon.
audit. Pods are admitted, but violations are recorded in the API server’s audit log. The way you see what would be rejected if you tightened enforcement, without breaking anything.
The pattern I always use: enforce at one level, warn and audit at the next level up. Above, the namespace enforces Baseline but tells you what would fail under Restricted. When the audit log is clean for two weeks, you bump enforce to Restricted.
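The enforce-one-level, warn-and-audit-the-next pattern is mechanical enough to generate. A minimal sketch in Python, assuming a hypothetical helper called pss_labels; only the label keys come from the manifest above, and the default version is just the one used in this post:

```python
# Profiles ordered from least to most strict.
LEVELS = ["privileged", "baseline", "restricted"]
PREFIX = "pod-security.kubernetes.io"


def pss_labels(enforce: str, version: str = "v1.27") -> dict[str, str]:
    """Build namespace labels: enforce at `enforce`, warn/audit one level stricter.

    Hypothetical helper, not part of any Kubernetes client library.
    """
    if enforce not in LEVELS:
        raise ValueError(f"unknown profile: {enforce!r}")
    # Restricted has no stricter neighbour, so warn/audit stay at restricted.
    next_up = LEVELS[min(LEVELS.index(enforce) + 1, len(LEVELS) - 1)]
    labels = {}
    for mode, level in (("enforce", enforce), ("warn", next_up), ("audit", next_up)):
        labels[f"{PREFIX}/{mode}"] = level
        labels[f"{PREFIX}/{mode}-version"] = version
    return labels
```

Calling pss_labels("baseline") yields the six labels from the orders namespace above; the resulting dict can be dumped into whatever templating you already use for namespace manifests.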
The Migration, In Order
For a cluster moving off PSP (or starting from nothing), this is the order that has worked.
1. Cluster-wide defaults. Set defaults at the cluster level so that namespaces without explicit labels still get a reasonable policy. The PodSecurity admission plugin in kube-apiserver supports this via the AdmissionConfiguration file:
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      apiVersion: pod-security.admission.config.k8s.io/v1
      kind: PodSecurityConfiguration
      defaults:
        enforce: "baseline"
        enforce-version: "latest"
        audit: "restricted"
        audit-version: "latest"
        warn: "restricted"
        warn-version: "latest"
      exemptions:
        usernames: []
        runtimeClasses: []
        namespaces: ["kube-system", "kube-public", "kube-node-lease"]
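This file is mounted into the kube-apiserver and referenced with the --admission-control-config-file flag. On managed Kubernetes (EKS, GKE, AKS) you usually cannot edit API server flags directly, so whether and how you can set cluster-wide defaults varies by provider; check your cloud’s docs. Where defaults are not configurable, label every namespace explicitly instead.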
2. Audit existing workloads. Before flipping anything to enforce, run with audit everywhere and watch the audit log. The events look like this in JSON:
{
  "auditID": "...",
  "stage": "ResponseComplete",
  "annotations": {
    "pod-security.kubernetes.io/audit-violations": "would violate PodSecurity \"restricted:v1.27\": allowPrivilegeEscalation != false, ...",
    "pod-security.kubernetes.io/enforce-policy": "baseline:v1.27"
  }
}
Aggregate these. The number of distinct workloads that violate each profile is the migration backlog.
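One way to do that aggregation, sketched in Python. It assumes the audit log is available as one JSON event per line and that each event carries the standard objectRef block (namespace, resource, name); the function name violation_backlog is made up for illustration:

```python
import json
import re
from collections import defaultdict

ANNOTATION = "pod-security.kubernetes.io/audit-violations"
# Pulls the policy out of: would violate PodSecurity "restricted:v1.27": ...
POLICY_RE = re.compile(r'PodSecurity "([^"]+)"')


def violation_backlog(lines):
    """Map each violated policy to the set of distinct objects that violate it."""
    backlog = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        message = (event.get("annotations") or {}).get(ANNOTATION)
        if not message:
            continue  # not a PSS audit violation
        match = POLICY_RE.search(message)
        policy = match.group(1) if match else "unknown"
        ref = event.get("objectRef") or {}
        # A set deduplicates repeated events for the same object.
        backlog[policy].add((ref.get("namespace"), ref.get("resource"), ref.get("name")))
    return backlog
```

Feed it an open file handle or any iterable of lines. The size of each set is the number of distinct offenders for that policy, which is the backlog figure you want to track week over week.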
3. Per-namespace rollout. For each namespace:
- Label it with the target enforce level.
- Set warn and audit at the next level up.
- Wait two weeks. Watch for warnings in deploy logs and violations in audit.
- Fix or exempt offending workloads.
- Move to the next namespace.
Do not try to do this cluster-wide in one pass. The blast radius if you misjudge is too big.
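The "wait two weeks" gate is easy to make explicit rather than eyeballing it. A small Python sketch, assuming you can already extract the timestamps of audit violations for a namespace from your log pipeline; ready_to_promote and its parameters are illustrative names, and the 14-day default simply mirrors the two weeks used here:

```python
from datetime import datetime, timedelta, timezone


def ready_to_promote(violation_times, now, quiet_period=timedelta(days=14)):
    """Return True if no audit violation was recorded within `quiet_period` before `now`."""
    cutoff = now - quiet_period
    # An empty list counts as ready: nothing has violated the stricter profile.
    return all(ts < cutoff for ts in violation_times)


# Example: the last violation was 20 days ago, so the namespace has been quiet long enough.
now = datetime(2023, 9, 30, tzinfo=timezone.utc)
print(ready_to_promote([now - timedelta(days=20)], now))
```

An empty violation list only means something if the audit log was actually being collected for the whole period, so check that before trusting the result.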
4. Hold the line. Once a namespace is at enforce: restricted, that has to be the new normal. The single biggest mistake is letting one team negotiate down to baseline because their workload is “special”. A privileged sidecar deserves its own namespace, not a relaxed namespace label on the entire app.
The Workloads That Always Break
After three migrations, the same five workloads break every time. Plan for them.
Privileged DaemonSets. Falco, the GPU operator, CSI drivers, networking plugins. They legitimately need privileged mode. Put them in their own namespace with enforce: privileged. Do not exempt them in the application namespace.
Node exporter and metrics collectors. They mount /proc, /sys, or / from the host. Both Baseline and Restricted forbid hostPath volumes entirely, so there is no allowlist to lean on. These go in their own privileged namespace.
Anything mounting the docker socket. Some CI runners, image builders, and legacy tools mount /var/run/docker.sock. That is a hostPath mount, so Baseline and Restricted both block it. The right answer is to move to daemonless or rootless builders (Kaniko, BuildKit) rather than fight the policy.
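Finding these workloads ahead of time comes down to scanning pod specs for hostPath volumes. A minimal Python sketch, assuming the pod spec has already been parsed into a dict (for example from a manifest or an API response); host_path_mounts is a hypothetical helper name:

```python
def host_path_mounts(pod_spec: dict) -> list[str]:
    """Return the host paths mounted by a pod spec given as a parsed dict."""
    return [
        vol["hostPath"]["path"]
        for vol in pod_spec.get("volumes") or []
        if vol.get("hostPath")
    ]


# Illustrative spec: one docker-socket mount and one harmless emptyDir.
spec = {
    "volumes": [
        {"name": "docker", "hostPath": {"path": "/var/run/docker.sock"}},
        {"name": "scratch", "emptyDir": {}},
    ]
}
uses_docker_socket = "/var/run/docker.sock" in host_path_mounts(spec)
```

For Deployments, DaemonSets, and Jobs the pod spec lives under spec.template.spec, so pass that sub-dict rather than the top-level spec.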
Apps that run as UID 0. Baseline allows this; Restricted does not. Most Java and Node apps will run fine as non-root once you set runAsUser and runAsNonRoot, but the image has to permit it — some base images bake in root-only paths.
Apps that need extra capabilities. Restricted requires capabilities.drop: ["ALL"]. An app that needs NET_BIND_SERVICE (to listen on port 80 as non-root) has to add it back explicitly:
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  allowPrivilegeEscalation: false
  capabilities:
    drop: ["ALL"]
    add: ["NET_BIND_SERVICE"]
  seccompProfile:
    type: RuntimeDefault
This is the minimum Restricted-compliant securityContext for a non-trivial app. Save it as a snippet; you will type it a lot.
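A quick pre-flight check for that snippet can save a round trip through the admission controller. The Python sketch below covers only the fields discussed in this section and is not a reimplementation of the Restricted profile, which has further checks (volume types, host namespaces, and so on) and also accepts some of these fields at pod level. restricted_problems is a hypothetical name:

```python
ALLOWED_SECCOMP = {"RuntimeDefault", "Localhost"}


def restricted_problems(sc: dict) -> list[str]:
    """Return the problems found in a container securityContext dict (empty list = passes)."""
    problems = []
    if sc.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot must be true")
    if sc.get("runAsUser") == 0:
        problems.append("runAsUser must not be 0")
    if sc.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation must be false")
    caps = sc.get("capabilities") or {}
    if "ALL" not in (caps.get("drop") or []):
        problems.append("capabilities.drop must include ALL")
    # NET_BIND_SERVICE is the only capability Restricted lets you add back.
    extra = set(caps.get("add") or []) - {"NET_BIND_SERVICE"}
    if extra:
        problems.append(f"capabilities.add not allowed: {sorted(extra)}")
    seccomp = (sc.get("seccompProfile") or {}).get("type")
    if seccomp not in ALLOWED_SECCOMP:
        problems.append("seccompProfile.type must be RuntimeDefault or Localhost")
    return problems
```

Run it over the container security contexts in a rendered chart or manifest before you label a namespace; the admission controller remains the source of truth.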
Where PSS Is Not Enough
Three cases I have hit where PSS does not do the job and Gatekeeper or Kyverno takes over.
Custom seccomp profiles. PSS allows RuntimeDefault or a Localhost profile but does not let you mandate a specific localhost profile. If you want every pod to use a specific custom profile, that needs Gatekeeper.
Image registry restrictions. “Only images from ghcr.io/myorg are allowed” is not a PSS concept. Gatekeeper handles this cleanly.
Label and annotation requirements. “Pods must have an owner label” is also not PSS. Gatekeeper.
The two systems coexist. PSS handles the security-context-shaped checks; Gatekeeper handles everything else. There is some overlap (you can express PSS rules in Rego if you want to) but the practical advice is to use PSS where it fits and Gatekeeper where it does not.
Common Pitfalls
Forgetting enforce-version. Without a version label, the namespace defaults to “latest”, which means a Kubernetes upgrade can silently tighten policy. Pin to a specific minor version and re-evaluate on each upgrade.
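Spotting unpinned namespaces is a simple check over their labels. A Python sketch, assuming you already have each namespace's labels as a dict; unpinned_modes is an illustrative name:

```python
PREFIX = "pod-security.kubernetes.io"
MODES = ("enforce", "warn", "audit")


def unpinned_modes(labels: dict) -> list[str]:
    """Return the modes that set a profile but leave its version missing or 'latest'."""
    return [
        mode
        for mode in MODES
        if f"{PREFIX}/{mode}" in labels
        and labels.get(f"{PREFIX}/{mode}-version", "latest") == "latest"
    ]
```

A namespace that returns a non-empty list will follow whatever the next Kubernetes upgrade defines for that profile.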
Labeling kube-system. Do not. It is exempt by default in most distros for good reason. System components have legitimate need for privileged mode and will not start under Baseline.
Mistaking warnings for enforcement. A warn violation does not block the pod. I have seen teams ship known-broken workloads because they assumed the warning was preventing them. Read the actual kubectl apply output, not assumptions.
Audit log not being read. PSS audit events go to the API server’s audit log. If your audit log is not being collected and indexed, you cannot see violations. Wire it to your log pipeline before starting the migration.
Helm chart compatibility. Many community Helm charts ship with security contexts that fail Restricted. The chart maintainers are catching up but not all are there yet. Either fork the chart, override values, or live with Baseline for those namespaces until upstream supports Restricted.
One namespace per team is too coarse. A “team namespace” mixing apps, sidecars, and one-off jobs forces you to pick the lowest-common-denominator profile. Per-app namespaces let each app run at the strictest profile it can support.
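The lowest-common-denominator effect is easy to see in a few lines of Python. The sketch assumes you have catalogued the strictest profile each workload can run under; namespace_profile is a hypothetical helper:

```python
# Profiles ordered from least to most strict.
LEVELS = ["privileged", "baseline", "restricted"]


def namespace_profile(workload_profiles):
    """The strictest profile a namespace can enforce is the least strict one any workload needs."""
    return min(workload_profiles, key=LEVELS.index)


# One privileged sidecar drags a whole team namespace down to privileged.
team_namespace = namespace_profile(["restricted", "restricted", "privileged"])
# Split out, the two compliant apps can each enforce restricted.
app_namespace = namespace_profile(["restricted"])
```

The function raises ValueError on an empty list or an unknown profile name, which is the behaviour you want when the catalogue is incomplete.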
Wrapping Up
PSS is a downgrade from PSP in raw flexibility and an upgrade in operability. The migration is mostly about cataloguing legacy workloads, splitting namespaces by required privilege level, and accepting that “Restricted everywhere” is a destination, not a Day 1 state. With PSS holding the runtime configuration line, container scanning at build, signing at publish, SBOMs attached, secrets dynamic, runtime detection live, and admission policies enforced — the picture this month’s posts have been building toward is a containerised pipeline where every layer has a control, and no single compromise is enough to ship code into production. That is the working definition of digital immunity I have been using with teams this year, and the rest is execution.