GitHub Actions Caching, actions/cache + BuildKit Registry Cache

February 23, 2022 · 6 min read · by Muhammad Amal programming

TL;DR — actions/cache@v3 for language deps and tool downloads. BuildKit cache-from/cache-to with registry backend for Docker builds. Cache key on hashFiles('**/lockfile') plus a fallback restore-keys. Done right, second-run builds are ~10× faster.

The base CI workflow got us to 90 seconds on warm cache. Most of that warmth came from setup-go’s built-in module cache. But that pattern only works for Go’s specific cache structure. For everything else — Docker layers, npm modules in a side-by-side frontend, Python wheels, custom toolchains — you need to use actions/cache@v3 directly.

This post covers the patterns I use for caching that actually hits, not just cache configs that look right and then mysteriously miss every run.

The cache contract

actions/cache@v3 works like this:

- uses: actions/cache@v3
  with:
    path: ~/.cache/go-build
    key: ${{ runner.os }}-go-${{ hashFiles('**/go.sum') }}
    restore-keys: |
      ${{ runner.os }}-go-

key is the cache identity. If a cache exists with that exact key, it’s restored. After the job, if the key didn’t exist before, the current path is saved under that key.
restore-keys is a fallback. If key doesn’t match exactly, the action tries each listed prefix in order and restores the most recently created cache whose key starts with that prefix.

The crucial detail: if key matched exactly, no save happens at end of job. Only key mismatches trigger a save. So if your key doesn’t include the right “what should invalidate this” signal, you’ll silently never update the cache.
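One common workaround for the no-save-on-exact-hit rule, sketched here as an option rather than something this setup requires: make the primary key unique per run and let restore-keys do the matching. The -go-build- naming and the github.run_id suffix are illustrative choices of mine.

```yaml
- uses: actions/cache@v3
  with:
    path: ~/.cache/go-build
    # github.run_id is unique per workflow run, so the primary key never
    # hits exactly and the post-job save always runs.
    key: ${{ runner.os }}-go-build-${{ hashFiles('**/go.sum') }}-${{ github.run_id }}
    # Restore falls back to the newest cache for the same go.sum,
    # then to the newest cache for this OS.
    restore-keys: |
      ${{ runner.os }}-go-build-${{ hashFiles('**/go.sum') }}-
      ${{ runner.os }}-go-build-
```

The cost is one new cache entry per run, which eats into the repo’s cache quota. That trade fits caches that drift between lockfile changes (build caches) better than pure dependency caches.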

Cache key patterns that work

Three keys I use frequently:

Node.js (npm)

- uses: actions/cache@v3
  with:
    path: |
      ~/.npm
      node_modules
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-

package-lock.json is the source of truth for what’s installed. Hash → key. Lock file changes → key changes → fresh cache.
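A sketch of how the install step can lean on that key, assuming npm ci is your install command; the step id npm-cache is a name I made up. On an exact hit, node_modules already matches the lockfile and the install is skipped. On a restore-keys fallback, npm ci rebuilds node_modules from scratch but can pull most tarballs from the restored ~/.npm.

```yaml
- uses: actions/cache@v3
  id: npm-cache   # hypothetical step id
  with:
    path: |
      ~/.npm
      node_modules
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-

- name: Install dependencies
  # Runs on a miss or a prefix fallback; skipped only on an exact key match.
  if: steps.npm-cache.outputs.cache-hit != 'true'
  run: npm ci
```

If the job tests against several Node versions, it is probably worth adding the version to the key, since compiled addons inside node_modules are version-specific.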

Python (pip / Poetry)

- uses: actions/cache@v3
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-py-${{ hashFiles('**/requirements*.txt', '**/poetry.lock') }}
    restore-keys: |
      ${{ runner.os }}-py-

Hashes whichever lockfile your project uses. Same pattern.
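The ~/.cache/pip path is the Linux default. If the job also runs on macOS or Windows runners, a variant is to ask pip for its cache directory instead of hardcoding it. This is a sketch: pip-cache is a made-up step id, and pip cache dir needs a reasonably recent pip (20.1 or later).

```yaml
- name: Locate pip cache
  id: pip-cache   # hypothetical step id
  shell: bash
  run: echo "dir=$(pip cache dir)" >> "$GITHUB_OUTPUT"

- uses: actions/cache@v3
  with:
    # Resolved per runner OS instead of assuming ~/.cache/pip
    path: ${{ steps.pip-cache.outputs.dir }}
    key: ${{ runner.os }}-py-${{ hashFiles('**/requirements*.txt', '**/poetry.lock') }}
    restore-keys: |
      ${{ runner.os }}-py-
```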

Custom toolchain / large download

- uses: actions/cache@v3
  id: cache-protoc
  with:
    path: /opt/protoc
    key: ${{ runner.os }}-protoc-21.5
- name: Install protoc
  if: steps.cache-protoc.outputs.cache-hit != 'true'
  run: |
    curl -L -o /tmp/protoc.zip https://github.com/protocolbuffers/protobuf/releases/download/v21.5/protoc-21.5-linux-x86_64.zip
    sudo unzip /tmp/protoc.zip -d /opt/protoc

cache-hit output tells you whether the restore actually matched the exact key (not just a restore-key fallback). Skip the download if it did.
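One way to keep the key and the download URL from drifting apart is to drive both from a single variable. This is a job-level sketch: PROTOC_VERSION and the ~/tools/protoc path are my own names, and the home-directory path is an assumption chosen so that neither the restore nor the unzip needs sudo.

```yaml
env:
  PROTOC_VERSION: "21.5"   # hypothetical variable; bump it to invalidate the cache

steps:
  - uses: actions/cache@v3
    id: cache-protoc
    with:
      path: ~/tools/protoc
      # No restore-keys on purpose: a different version is never a usable fallback.
      key: ${{ runner.os }}-protoc-${{ env.PROTOC_VERSION }}

  - name: Install protoc
    if: steps.cache-protoc.outputs.cache-hit != 'true'
    run: |
      curl -fsSL -o /tmp/protoc.zip \
        "https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-x86_64.zip"
      mkdir -p "$HOME/tools/protoc"
      unzip -q /tmp/protoc.zip -d "$HOME/tools/protoc"

  - name: Put protoc on PATH
    run: echo "$HOME/tools/protoc/bin" >> "$GITHUB_PATH"
```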

BuildKit registry cache for Docker

actions/cache@v3 doesn’t help with Docker image layers: those live in the builder’s own storage, not in a directory you can hand to path. For Docker, you use BuildKit’s built-in cache mechanism, ideally with a registry backend.

jobs:
  image:
    runs-on: ubuntu-22.04
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v3

      - uses: docker/setup-buildx-action@v2

      - uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - uses: docker/build-push-action@v3
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}/billing:${{ github.sha }}
          cache-from: type=registry,ref=ghcr.io/${{ github.repository }}/billing:buildcache
          cache-to: type=registry,ref=ghcr.io/${{ github.repository }}/billing:buildcache,mode=max
          platforms: linux/amd64,linux/arm64

What cache-from / cache-to do:

After the build, BuildKit pushes layer cache to a dedicated :buildcache tag in the registry.
On the next build, BuildKit pulls that cache and reuses layers when their inputs haven’t changed.
mode=max caches all intermediate layers (including build stages). Without it, only the final layers cache.

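This works across runners and across branches. The registry holds the cache; any runner that can pull from it can reuse it. Fork pull requests are the exception for writes: by default their GITHUB_TOKEN is read-only, so they can’t push to :buildcache, and they can read it only if they are allowed to pull the package.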

For our 14 MB Go service (see the Dockerfile post):

Cold build: ~4 minutes
Warm cache: ~35 seconds

Same Dockerfile, same code. Difference is the registry-backed BuildKit cache.

Alternative: GitHub Actions Cache backend for BuildKit

BuildKit also supports type=gha:

cache-from: type=gha
cache-to: type=gha,mode=max

This uses GitHub’s free Actions Cache instead of your container registry. Trade-offs:

Pro: free, no registry storage cost, doesn’t pollute your registry with :buildcache tags
Con: caches are limited per-repo (~10 GB total), evicted aggressively, sometimes flaky
Con: only available within Actions; useless if you want to share cache with developer laptops

For most teams in 2022, type=gha is fine. For very large images or shared cache between Actions and laptops, registry backend is better.
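One detail worth knowing if a repo builds more than one image with type=gha: as far as I know, the backend writes under a single default scope, so builds overwrite each other’s cache unless each gets its own scope. A sketch, with billing as the scope name and ./services/billing as a hypothetical context path:

```yaml
- uses: docker/build-push-action@v3
  with:
    context: ./services/billing   # hypothetical path
    push: true
    tags: ghcr.io/${{ github.repository }}/billing:${{ github.sha }}
    # scope keeps this image's cache separate from other images in the repo
    cache-from: type=gha,scope=billing
    cache-to: type=gha,mode=max,scope=billing
```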

Cache miss debugging

When a cache miss surprises you, two diagnostics:

- run: gh cache list --key ${{ runner.os }}-go-
  env:
    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}

Lists actual cache entries by prefix. Often reveals “I expected key X but Y was saved” patterns.

- name: Cache hit?
  run: echo "cache-hit=${{ steps.cache.outputs.cache-hit }}"

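cache-hit is 'true' only on an exact-key match. On a restore-key fallback it is 'false', and on a complete miss it can come back as an empty string, so compare against 'true' rather than 'false'. Useful when you need to know if you can skip a heavy step.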
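Another check, sketched here: print the pieces the key is built from. hashFiles returns an empty string when no file matches the pattern, which yields a key like Linux-go- that looks valid and never changes. The step has to run after checkout, otherwise there is nothing to hash.

```yaml
- name: Show cache key inputs
  run: |
    echo "os=${{ runner.os }}"
    # An empty value here means the glob matched nothing in the workspace.
    echo "lock-hash=${{ hashFiles('**/go.sum') }}"
```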

The 10 GB rule

GitHub’s per-repo Actions Cache limit is 10 GB (as of Feb 2022). Caches that haven’t been accessed for 7 days are evicted. If you hit the cap, GitHub evicts entries until the repo is back under the limit, least recently accessed first.

In practice:

10 GB is enough for one Go monorepo’s Go cache + Docker BuildKit cache.
Not enough for “every PR branch caches everything.”
Manage via gh cache delete in scheduled workflows, or rely on natural eviction.
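A scheduled cleanup along those lines could look like the sketch below. The workflow name, the cron schedule, and the choice to delete everything are assumptions on my part. It needs a gh release that ships the cache subcommand, plus actions: write on the token.

```yaml
name: cache-cleanup   # hypothetical workflow
on:
  schedule:
    - cron: '0 3 * * 0'   # Sundays 03:00 UTC
  workflow_dispatch:

permissions:
  actions: write   # required to delete caches

jobs:
  cleanup:
    runs-on: ubuntu-22.04
    steps:
      - name: List, then delete all caches
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GH_REPO: ${{ github.repository }}   # no checkout, so tell gh which repo
        run: |
          gh cache list --limit 100
          gh cache delete --all
```

Deleting everything is blunt: the next run on every branch starts cold. To keep the main-branch caches, pass a specific cache key or ID to gh cache delete instead of --all.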

A combined CI workflow

Pulling it together for a service with both Go code and a Docker build:

jobs:
  test:
    runs-on: ubuntu-22.04
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-go@v3
        with:
          go-version: '1.17'
          cache: true   # built-in Go module + build cache
      - run: go test -race -count=1 ./...

  image:
    runs-on: ubuntu-22.04
    needs: test
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v3
      - uses: docker/setup-buildx-action@v2
      - uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v3
        with:
          context: .
          push: true
          tags: ghcr.io/${{ github.repository }}/billing:${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

Tests run on every PR, images push only on main. Both caches are warm within a few runs.

Common Pitfalls

Cache key without lockfile hash. A static key like ${{ runner.os }}-deps matches forever. Cache silently never updates when you change dependencies. Always hash the lockfile.

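Caching node_modules without also caching ~/.npm. A restored node_modules can carry symlinks and platform-specific binaries that don’t match the current runner or lockfile, and when you fall back to a clean install there is no download cache to make it fast. The safe pattern is both, with key on package-lock.json.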

Forgetting mode=max on BuildKit cache-to. Without it, only the final-stage layers cache. Intermediate build stages (the heavy ones) re-run every time.

Treating restore-keys as required. Often you don’t want it. Forced full key match is fine for cases where partial restoration would corrupt state (Postgres data directories, custom toolchain installs).

Hitting the 10 GB cap silently. GitHub doesn’t error when you hit the cap; it just evicts. If you have many large caches, audit periodically.

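Cache poisoning across forks. If your CI runs on pull_request, fork runs can restore caches from the base branch, but anything they save is scoped to that pull request and is not visible to the base branch or to other PRs. If they could write to the shared scope, a malicious fork could poison your cache. GitHub’s defaults are right; don’t override.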

Wrapping Up

Cache configuration is the boring middle of CI work, and where most CI time gets lost. actions/cache@v3 with lockfile-hashed keys for language deps; BuildKit registry or GHA cache for Docker. Most warm-cache CI runs should be under 90 seconds with this setup. Friday: matrix builds and parallel test sharding — when your test suite gets too big for one job to finish in five minutes.
