DevSecOps in AI ML Pipelines, A Comprehensive Tutorial
September 8, 2025 · 9 min read · by Muhammad Amal

TL;DR — ML pipelines have more attack surface than typical app code (training data, model artifacts, notebooks, registries) and most teams secure them less. The 2025 baseline is secret scanning on every push, SAST that knows ML idioms, signed artifacts at every promotion gate, and policy enforcement at the cluster admission layer.

ML pipelines have been the embarrassing security backwater of the last five years. We talk about data poisoning and model theft at conferences while production training jobs pull arbitrary tarballs from S3, run unpinned pip install, and write checkpoints to buckets that allow public read for “convenience.” The same engineering organization that requires a multi-person review for a React change will let a data scientist push a 40GB pickle file straight to production.

The good news is that 2025 finally has the tooling to fix this without making ML teams hate you. Semgrep 1.95 ships rule packs for ML frameworks. TruffleHog v3.85 and Gitleaks 8.22 catch secrets in notebooks. Cosign 2.5 signs model artifacts the same way it signs containers. Kyverno 1.13 enforces admission policies that block unsigned models from running.

This tutorial wires these tools into a realistic ML pipeline end to end: training, evaluation, registration, deployment. I’m assuming you already have the Zero Trust identity setup from the previous post so we can focus on the pipeline controls.

1. The Pipeline We’re Securing

A typical 2025 ML pipeline has these stages and these threats.

+-------+    +-------+    +-------+    +---------+    +--------+
| code  | -> | train | -> | eval  | -> | register| -> | deploy |
+-------+    +-------+    +-------+    +---------+    +--------+
   |             |            |             |             |
   v             v            v             v             v
secrets,     data        eval         unsigned       unverified
malicious   poisoning  manipulation   artifacts      models in
deps                                                  prod

Each arrow is a promotion gate. Each box is an attack surface. The DevSecOps job is to add policy enforcement at every arrow and instrumentation in every box.

2. Step 1, Secret Scanning Across Code and Notebooks

Notebooks are the worst offenders. Data scientists paste API keys into cell 1, save the notebook, push to the repo. TruffleHog v3.85 understands .ipynb natively and Gitleaks 8.22 added smarter base64 detection in March 2025.

Run both in a pre-commit hook and in CI. They have different false-positive profiles; the union catches more than either alone.

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/trufflesecurity/trufflehog
    rev: v3.85.0
    hooks:
      - id: trufflehog
        args: ['--no-update', '--fail', 'git', 'file://.', '--since-commit', 'HEAD~1']
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.22.0
    hooks:
      - id: gitleaks

And in GitHub Actions:

# .github/workflows/secrets.yml
name: secrets
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: TruffleHog
        uses: trufflesecurity/trufflehog@v3.85.0
        with:
          extra_args: --only-verified --no-update
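      - name: Gitleaks
        # gitleaks-action is versioned separately from the gitleaks binary
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}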

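The --only-verified flag tells TruffleHog to report only secrets it has confirmed are live by authenticating against the suspected provider, so the false-positive rate drops to near zero. Use it in CI. The trade-off is that credentials TruffleHog cannot verify (internal or custom tokens) go unreported, which is one more reason to keep Gitleaks alongside it.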

2.1 Custom detectors for ML keys

Vendor-specific keys (Hugging Face, Weights & Biases, Modal, Replicate) sometimes aren’t in the default detector set. Write custom Gitleaks rules:

# .gitleaks.toml
[[rules]]
id = "huggingface-token"
description = "Hugging Face access token"
regex = '''hf_[A-Za-z0-9]{34,40}'''
keywords = ["huggingface", "hf_"]

[[rules]]
id = "wandb-key"
description = "Weights & Biases API key"
regex = '''[a-f0-9]{40}'''
path = '''.*\.netrc'''
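
Before committing a custom rule, it can help to sanity-check the pattern against the places a token hides in a notebook. The sketch below is illustrative only: it reuses the hf_ regex from the rule above, find_tokens_in_notebook is a hypothetical helper (not part of Gitleaks or TruffleHog), and it assumes the usual .ipynb JSON layout where both cell sources and saved stream outputs can carry a pasted key.

```python
import json
import re

# Same pattern as the huggingface-token rule above
HF_TOKEN = re.compile(r"hf_[A-Za-z0-9]{34,40}")

def _as_text(value):
    # Notebook JSON stores text either as one string or a list of line strings
    if isinstance(value, list):
        return "".join(str(v) for v in value)
    return value if isinstance(value, str) else ""

def find_tokens_in_notebook(notebook_json: str):
    """Return (cell_index, location, match) for each token-shaped string."""
    nb = json.loads(notebook_json)
    hits = []
    for i, cell in enumerate(nb.get("cells", [])):
        for m in HF_TOKEN.finditer(_as_text(cell.get("source"))):
            hits.append((i, "source", m.group()))
        # Saved outputs leak secrets too (e.g. a printed environment variable).
        # Only stream-style "text" outputs are checked in this sketch.
        for out in cell.get("outputs", []):
            for m in HF_TOKEN.finditer(_as_text(out.get("text"))):
                hits.append((i, "output", m.group()))
    return hits

fake = "hf_" + "A" * 34  # obviously fake token for the demo
demo_nb = json.dumps({
    "cells": [
        {"cell_type": "code", "source": [f'token = "{fake}"\n'],
         "outputs": [{"output_type": "stream", "text": [fake + "\n"]}]},
        {"cell_type": "markdown", "source": "no secrets here"},
    ]
})
hits = find_tokens_in_notebook(demo_nb)
```

The demo notebook yields two hits for the same fake token, one in the cell source and one in its saved output, which is why clearing outputs before committing matters as much as removing the key from the code.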

3. Step 2, SAST for ML Code

Semgrep 1.95 has the p/python ruleset for general Python and a p/ml-security pack that catches ML-specific issues: pickle deserialization of untrusted input, unsafe YAML loading in DVC files, torch.load without weights_only=True.

# semgrep.yml
rules:
  - id: torch-load-untrusted
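    patterns:
      # Match every call shape, then exclude the ones that opt in explicitly
      - pattern: torch.load(...)
      - pattern-not: torch.load(..., weights_only=True, ...)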
    message: |
      torch.load without weights_only=True can execute arbitrary code.
      Use torch.load(path, weights_only=True) when loading checkpoints
      from untrusted sources.
    severity: ERROR
    languages: [python]
    metadata:
      cwe: 'CWE-502: Deserialization of Untrusted Data'

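  - id: pickle-load
    # Semgrep patterns ignore comments, so a marker comment cannot be matched.
    # Suppress a reviewed call with an inline `# nosemgrep: pickle-load` instead.
    pattern-either:
      - pattern: pickle.load(...)
      - pattern: pickle.loads(...)
    message: Pickle deserialization with untrusted input. Migrate to safetensors.
    severity: ERROR
    languages: [python]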

PyTorch 2.6 flipped the weights_only default to True, but plenty of code still pins older versions. The rule flags calls that don’t set it explicitly, regardless of the installed version.
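
To make the risk concrete, here is a minimal, harmless sketch of why unpickling is code execution. It uses only the standard library; Payload is a made-up class, and eval on a fixed arithmetic string stands in for whatever an attacker would actually run.

```python
import pickle

class Payload:
    # pickle calls __reduce__ to learn how to rebuild the object; whatever
    # callable it returns is invoked by pickle.loads on the consumer's side.
    def __reduce__(self):
        return (eval, ("6 * 7",))  # harmless stand-in for attacker code

blob = pickle.dumps(Payload())

# "Loading the checkpoint": eval runs inside pickle.loads
result = pickle.loads(blob)
```

The loader never gets a Payload back. It gets the return value of the callable the file chose to run, which is the whole problem: the file, not the loader, decides what executes.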

3.1 Wiring Semgrep into CI

# .github/workflows/sast.yml
name: sast
on: [pull_request]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    container: returntocorp/semgrep:1.95
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history so the baseline ref can be resolved
      - run: |
          semgrep ci \
            --config=p/python \
            --config=p/ml-security \
            --config=./semgrep.yml \
            --baseline-commit=origin/main

The --baseline-commit flag means PR-level scans only fail on newly introduced findings. Existing tech debt doesn’t block every PR. It is a necessary compromise to keep the team using the tool.
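
Conceptually, a baseline scan is a set difference between findings on the base branch and findings on the PR head. The sketch below illustrates that idea only; it is not Semgrep's implementation, and the Finding fields are assumptions chosen for the example.

```python
from typing import NamedTuple

class Finding(NamedTuple):
    rule_id: str
    path: str
    snippet: str  # matched code, used instead of a line number

def new_findings(baseline, head):
    """Findings present on the PR head but absent from the baseline."""
    seen = set(baseline)
    return [f for f in head if f not in seen]

baseline = [Finding("pickle-load", "train.py", "pickle.load(f)")]
head = baseline + [Finding("torch-load-untrusted", "serve.py", "torch.load(p)")]

introduced = new_findings(baseline, head)
exit_code = 1 if introduced else 0  # only new findings fail the build
```

Keying on the matched snippet rather than the line number is a deliberate choice in this sketch: unrelated edits that shift code up or down a file should not turn old findings into "new" ones.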

4. Step 3, Signing Model Artifacts with Cosign

Once training finishes and evaluation passes, you have a model artifact. Sign it. Cosign 2.5 supports keyless signing via Sigstore, which means no key management to mess up.

# Sign a model file using keyless OIDC (works in GitHub Actions)
cosign sign-blob \
  --bundle model.bundle \
  --yes \
  ./checkpoints/model-v42.safetensors

# Or, if the model is packaged as an OCI artifact, sign it in the registry
# (prefer a digest reference over a mutable tag in real pipelines)
cosign sign \
  --yes \
  registry.internal/models/sentiment:v42

The cosign sign-blob command produces a Sigstore bundle containing the signature, the certificate, and the transparency log entry. Store the bundle next to the model. Verify it at every consumer.

# Verify before loading
cosign verify-blob \
  --bundle model.bundle \
  --certificate-identity 'https://github.com/myorg/ml-pipeline/.github/workflows/train.yml@refs/heads/main' \
  --certificate-oidc-issuer 'https://token.actions.githubusercontent.com' \
  ./checkpoints/model-v42.safetensors

The verification pins the model to the exact workflow file and branch that produced it. An attacker who pushes to a different branch can’t produce a passing signature.
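
On the consumer side, the check is easiest to enforce when it sits directly in front of the load call. The following is a minimal sketch, assuming the cosign binary is on PATH. build_verify_cmd and verify_model are hypothetical helper names, and the identity and issuer values are the ones from the command above.

```python
import subprocess
from pathlib import Path

IDENTITY = "https://github.com/myorg/ml-pipeline/.github/workflows/train.yml@refs/heads/main"
ISSUER = "https://token.actions.githubusercontent.com"

def build_verify_cmd(model: Path, bundle: Path) -> list[str]:
    # Mirrors the cosign verify-blob invocation shown above
    return [
        "cosign", "verify-blob",
        "--bundle", str(bundle),
        "--certificate-identity", IDENTITY,
        "--certificate-oidc-issuer", ISSUER,
        str(model),
    ]

def verify_model(model: Path, bundle: Path, run=subprocess.run) -> None:
    """Raise if cosign rejects the signature; return None on success."""
    proc = run(build_verify_cmd(model, bundle), capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(f"Refusing to load {model}: {proc.stderr.strip()}")
```

A serving process would call verify_model(path, bundle_path) before opening the weights and treat the exception as fatal. The run parameter exists only so the function can be exercised in tests without cosign installed.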

4.1 Attestations beyond signing

Cosign also signs in-toto attestations. Attach the training command, the data version, the eval results, and the SBOM as attestations alongside the signature:

cosign attest-blob \
  --predicate ./eval-results.json \
  --type custom \
  --bundle eval.bundle \
  ./checkpoints/model-v42.safetensors

Now consumers can verify not just that the model came from a trusted workflow but that it passed evaluation, with the eval numbers cryptographically tied to the model. This is the missing piece in most ML governance stories.
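
What goes into eval-results.json is up to you. One possible shape, sketched below, binds the metrics to the model's digest and lets a consumer re-check thresholds after verifying the attestation. The field names and both helper functions are assumptions for illustration, and a real pipeline would stream the model file through the hash instead of holding it in memory.

```python
import hashlib
import json

def build_eval_predicate(model_bytes: bytes, metrics: dict, dataset_version: str) -> dict:
    # Field names are illustrative, not a standard predicate schema
    return {
        "model_sha256": hashlib.sha256(model_bytes).hexdigest(),
        "dataset_version": dataset_version,
        "metrics": metrics,
    }

def passes_gate(predicate: dict, thresholds: dict) -> bool:
    """True only if every thresholded metric is present and high enough."""
    metrics = predicate["metrics"]
    return all(
        metrics.get(name, float("-inf")) >= minimum
        for name, minimum in thresholds.items()
    )

predicate = build_eval_predicate(
    b"fake-weights", {"accuracy": 0.93, "f1": 0.91}, "v1.5.0"
)
# This JSON is what would be written to eval-results.json
predicate_json = json.dumps(predicate, indent=2, sort_keys=True)
```

Treating a missing metric as a failure is the conservative choice here: a predicate that omits a safety metric should not pass a gate that requires it.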

5. Step 4, Admission Policies with Kyverno

Even with signed models, you need a runtime gate. Kyverno 1.13 enforces policies on Kubernetes resources at admission time. Block any model-serving pod that references an unsigned image or unsigned model.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-models
spec:
  validationFailureAction: Enforce
  webhookTimeoutSeconds: 30
  rules:
    - name: check-image-signature
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaces: [inference]
      verifyImages:
        - imageReferences:
            - "registry.internal/models/*"
          attestors:
            - entries:
                - keyless:
                    subject: "https://github.com/myorg/ml-pipeline/.github/workflows/train.yml@refs/heads/main"
                    issuer: "https://token.actions.githubusercontent.com"

Apply this policy and any Pod that tries to pull a model image not signed by the trusted workflow gets rejected at the API server. No bypass through kubectl apply --force. No “I’ll just deploy from my laptop this once.”

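5.1 Policy reports and exceptions

Kyverno generates PolicyReport resources for every violation. Ship them to your SIEM. The audit trail is automatic. When a workload genuinely needs to bypass the policy, declare a PolicyException instead of loosening the rule: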

apiVersion: kyverno.io/v2beta1
kind: PolicyException
metadata:
  name: temporary-eval-bypass
  namespace: eval
spec:
  exceptions:
  - policyName: require-signed-models
    ruleNames:
      - check-image-signature
  match:
    any:
      - resources:
          kinds: [Pod]
          selector:
            matchLabels:
              env: eval

Exceptions exist for evaluation environments where you want to test unsigned models. Make them explicit, scoped, and time-bounded. Track them.

6. Step 5, Training Data Provenance

The hardest part of ML DevSecOps is data. Code is reviewable; a 2TB training set is not. The discipline is: every dataset version has a hash, every hash is recorded with the training run, no training run accepts a dataset without a verified hash.

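# data_loader.py
import hashlib
from pathlib import Path

# Dataset is a placeholder for your own dataset class; it is not defined here.

EXPECTED_HASHES = {
    "v1.4.0": "8c2f5e6d9a1b3c4f...",
    "v1.5.0": "7a3b9c2d1e5f4a6b...",
}

def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    # Stream the file so large datasets are never read into memory at once
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def load_dataset(version: str, path: Path) -> "Dataset":
    # Unknown versions are rejected outright: no recorded hash, no training run
    if version not in EXPECTED_HASHES:
        raise RuntimeError(f"No recorded hash for dataset version {version}")
    expected = EXPECTED_HASHES[version]
    actual = sha256_file(path)
    if actual != expected:
        raise RuntimeError(
            f"Dataset hash mismatch for {version}: "
            f"expected {expected}, got {actual}"
        )
    return Dataset.from_path(path)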

Combine with content-addressed storage (DVC, LakeFS, or just S3 with object-version pinning) and you get verifiable data provenance for free. Record the hash in the model attestation from section 4.
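
Datasets are usually directories, not single files. One way to get a single recordable hash, sketched here under that assumption, is to hash a sorted manifest of per-file digests. dataset_digest is a hypothetical helper, and read_bytes is used only for brevity (large files should be streamed).

```python
import hashlib
import tempfile
from pathlib import Path

def dataset_digest(root: Path) -> str:
    """Hash relative paths and contents of every file, in sorted order."""
    files = sorted(
        (p.relative_to(root).as_posix(), p)
        for p in root.rglob("*") if p.is_file()
    )
    digest = hashlib.sha256()
    for rel, path in files:
        file_hash = hashlib.sha256(path.read_bytes()).hexdigest()
        # One manifest line per file; renames and edits both change the digest
        digest.update(f"{file_hash}  {rel}\n".encode())
    return digest.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "train").mkdir()
    (root / "train" / "a.csv").write_text("x,y\n1,2\n")
    (root / "labels.csv").write_text("id,label\n1,pos\n")
    before = dataset_digest(root)
    unchanged = dataset_digest(root)
    (root / "labels.csv").write_text("id,label\n1,neg\n")  # simulated tampering
    after = dataset_digest(root)
```

Sorting by relative path makes the digest independent of filesystem listing order, so the same dataset produces the same value on every machine, and a single flipped label produces a different one.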

7. The Full Pipeline Diagram

Here’s how the pieces stack up:

   developer push
        |
        v
+--------------+
| TruffleHog,  |--- block on verified secret
| Gitleaks     |
+------+-------+
        |
        v
+--------------+
| Semgrep      |--- block on new ERROR findings
+------+-------+
        |
        v
+--------------+
| training job |--- verifies dataset hash
|              |    records training args
+------+-------+
        |
        v
+--------------+
| eval suite   |--- gated on quality + safety thresholds
+------+-------+
        |
        v
+--------------+
| cosign sign  |--- keyless via Sigstore
| + attest     |    attaches eval results
+------+-------+
        |
        v
+--------------+
| OCI registry |
+------+-------+
        |
        v
+--------------+
| Kyverno      |--- admission gate on signature
|              |    + workflow identity
+------+-------+
        |
        v
   model running
   in production

Each step is independently testable and independently breakable. Don’t try to ship all of it at once.

8. Common Pitfalls

Four pitfalls that bit me or teams I’ve worked with.

8.1 Skipping the signature check “because it’s just internal”

The Kyverno policy is the only thing standing between “we sign artifacts” and “we sign artifacts and verify them.” Plenty of orgs sign and never verify. The signature is useless without the verifier.

8.2 Notebooks bypassing CI

Pre-commit hooks don’t run in Jupyter web UIs. Notebooks that get committed via the web UI (Colab, JupyterHub commit buttons) skip your hooks. Run TruffleHog as a required GitHub Actions check, not just a pre-commit.

8.3 Pickle ignorance

Loading a checkpoint from a teammate via Slack feels innocent. It’s arbitrary code execution if it’s a pickle. Forbid .pkl and .pt files in your registry, allow only .safetensors. Make it a Kyverno policy: refuse any pod whose model URL ends in .pkl.
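
The same allowlist is worth enforcing at upload time, before anything reaches the registry. The sketch below is a naming check only (it does not inspect file contents), and check_model_uri plus the example URIs are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

ALLOWED_SUFFIXES = {".safetensors"}

def check_model_uri(uri: str) -> None:
    """Raise ValueError unless the URI points at an allowlisted format."""
    suffix = PurePosixPath(urlparse(uri).path).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"Model format {suffix or '(none)'} is not allowed: {uri}")

def is_allowed(uri: str) -> bool:
    try:
        check_model_uri(uri)
    except ValueError:
        return False
    return True

results = {
    uri: is_allowed(uri)
    for uri in (
        "s3://models/sentiment/v42/model.safetensors",
        "s3://models/legacy/model.pkl",
        "https://example.internal/ckpt/model.PT",
    )
}
```

An allowlist is used here, not a blocklist of .pkl and .pt, so that a pickle saved under an unexpected extension is rejected by default.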

8.4 Secret scanning ignoring archives

If a .tar.gz of training data contains a .env file, default scanner configs miss it. Either unpack archives in CI before scanning or block archives larger than a threshold from being committed.
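
A cheap first pass is to list archive members without extracting them and flag file names that look like credentials. This is a sketch using only the standard library. The SUSPICIOUS_NAMES set and both function names are assumptions, and a name match is a prompt to unpack and scan the archive, not a substitute for doing so.

```python
import io
import tarfile
from pathlib import PurePosixPath

SUSPICIOUS_NAMES = {".env", ".netrc", "credentials.json", "id_rsa"}

def suspicious_members(archive_bytes: bytes):
    """List member paths whose file name looks like a credential file."""
    hits = []
    # mode "r:*" lets tarfile detect gzip/bz2/xz compression transparently
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:*") as tar:
        for member in tar:  # reads member headers; nothing is written to disk
            if member.isfile() and PurePosixPath(member.name).name in SUSPICIOUS_NAMES:
                hits.append(member.name)
    return hits

def _demo_archive() -> bytes:
    # Build a small in-memory .tar.gz so the example is self-contained
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in [("data/train.csv", b"x,y\n"), ("data/.env", b"KEY=fake\n")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

found = suspicious_members(_demo_archive())
```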

9. Troubleshooting

When the pipeline misbehaves, three failure modes appear repeatedly.

9.1 Cosign verification failing on a freshly built image

Usually because the workflow identity claim drifted (someone renamed the workflow file, changed the branch ref, or moved to a tag). Run cosign verify-blob --certificate-identity-regexp against the actual cert to see what claim the signature carries, then update the policy or the workflow path.

9.2 Semgrep finding the same issue on every PR

The baseline comparison isn’t working. Either you didn’t fetch enough git history (fetch-depth: 0 in the checkout), or your baseline ref doesn’t exist on the runner. Print git log --oneline -5 before the Semgrep step to confirm.

9.3 Kyverno blocking system pods

If you scope the policy too broadly, you can block kube-system pods, which can take the cluster down. Always scope the match to the namespaces you intend to gate, which leaves system namespaces untouched:

match:
  any:
    - resources:
        kinds: [Pod]
        namespaces: [inference, agents]

Not:

match:
  any:
    - resources:
        kinds: [Pod]

10. Wrapping Up

DevSecOps for ML is not about adopting twenty new tools. It’s about applying the same discipline you’d apply to a regular software supply chain to the bigger, weirder artifacts that ML pipelines produce. Sign everything that gets promoted. Verify everything before it runs. Scan code and notebooks for secrets. Use SAST that understands ML idioms. Block unsigned models at admission. Record dataset hashes alongside model hashes.

None of this is glamorous. All of it is durable. The teams that adopted these patterns in 2024 sailed through the wave of model-stealing and data-poisoning incidents that hit slower-moving orgs in 2025. The teams that didn’t are still cleaning up.

For deeper reading, the SLSA framework provides the conceptual model for supply chain security and now has an ML profile in active development. My next post in this series, securing RAG against data exfiltration, tackles the retrieval side of the same problem.