Deploying Docker Images from GitHub Actions to Staging


February 28, 2022 · 6 min read · by Muhammad Amal · programming

TL;DR: On merge to main, build the image, push it to GHCR under both a SHA tag and a fixed staging tag, then deploy via kubectl by immutable digest. Use OIDC for cloud auth instead of long-lived secrets. Use environment-protected jobs for prod. Wired right, “merged” to “running in staging” takes about 4 minutes.

Last post of February. Tying together everything from this month: containerization from January, GitHub Actions CI, registry caching, matrix parallelism. The thing that turns it from “CI” into “CI/CD” is the last D: Deploy.

This is the deploy pipeline we’re running for staging in early 2022. Production is similar but with manual approval and extra protections we’ll cover separately. Staging auto-deploys on every push to main.

The full deploy workflow

# .github/workflows/deploy-billing-staging.yml
name: deploy-billing-staging

on:
  push:
    branches: [main]
    paths:
      - 'cmd/billing/**'
      - 'internal/billing/**'
      - 'internal/shared/**'
      - 'Dockerfile.billing'

concurrency:
  group: deploy-billing-staging
  cancel-in-progress: false   # don't cancel mid-deploy

jobs:
  build:
    runs-on: ubuntu-22.04
    permissions:
      contents: read
      packages: write
    outputs:
      image-sha: ${{ steps.build.outputs.digest }}   # manifest digest (sha256:...) of the pushed image
    steps:
      - uses: actions/checkout@v3

      - uses: docker/setup-buildx-action@v2

      - uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - uses: docker/metadata-action@v4
        id: meta
        with:
          images: ghcr.io/${{ github.repository_owner }}/billing
          tags: |
            type=sha,format=long
            type=raw,value=staging

      - uses: docker/build-push-action@v3
        id: build   # lets the job output above read this step's digest
        with:
          context: .
          file: Dockerfile.billing
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            VERSION=${{ github.ref_name }}
            COMMIT=${{ github.sha }}
          platforms: linux/amd64

  deploy:
    runs-on: ubuntu-22.04
    needs: build
    environment: staging
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v3

      - uses: azure/setup-kubectl@v3
        with:
          version: v1.23.5

      - uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: arn:aws:iam::123456789012:role/gh-actions-staging
          aws-region: ap-southeast-1

      - run: aws eks update-kubeconfig --name staging-cluster --region ap-southeast-1

      - name: Deploy to staging
        run: |
          kubectl -n billing set image deploy/billing \
            billing=ghcr.io/${{ github.repository_owner }}/billing@${{ needs.build.outputs.image-sha }}
          kubectl -n billing rollout status deploy/billing --timeout=5m

      - name: Smoke test
        run: |
          ENDPOINT=https://billing-staging.example.com
          for i in 1 2 3; do
            if curl -fsS --max-time 5 "$ENDPOINT/readyz"; then
              echo "Smoke test passed"
              exit 0
            fi
            sleep 5
          done
          echo "Smoke test failed"
          exit 1

Two jobs: build produces the image, deploy applies it. Sequential via needs. About 4 minutes end-to-end on warm cache.

What’s happening, walked through

Path-filtered trigger. The workflow only runs when files under cmd/billing/**, internal/billing/**, internal/shared/**, or Dockerfile.billing change. A push to main that only touches the notifications service doesn’t trigger billing’s deploy.
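To predict locally whether a given change would trigger this workflow, the filter can be approximated in shell. This is a minimal sketch, not how GitHub evaluates `paths:` internally: the function name `billing_paths_changed` is made up for illustration, and it only mirrors the four patterns listed in the workflow.

```shell
# billing_paths_changed: read changed file paths on stdin (one per line) and
# succeed if any of them falls under the billing workflow's path filters.
# Hypothetical helper; in a `case` pattern `*` also matches `/`, so
# `cmd/billing/*` covers nested files the same way `cmd/billing/**` does.
billing_paths_changed() {
  local path
  while IFS= read -r path; do
    case "$path" in
      cmd/billing/*|internal/billing/*|internal/shared/*|Dockerfile.billing)
        return 0 ;;
    esac
  done
  return 1
}

# Example: check the files touched by the most recent commit.
# git diff --name-only HEAD~1 HEAD | billing_paths_changed && echo "would deploy billing"
```

Note that a single push can contain several commits, so comparing only the last commit is an approximation of what the trigger sees.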

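concurrency without cancel-in-progress. Critical for deploys. If a second deploy queues while the first is mid-rollout, you want it to wait, not cancel the first one. Set cancel-in-progress: false, or omit it, since false is the default. One caveat: a concurrency group holds at most one pending run, so if a third push arrives while the second is still waiting, the second is cancelled and the newest takes its place. For deploys that is usually fine, because only the latest commit needs to reach staging.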

docker/metadata-action@v4 for tags. Generates the image tags from one definition: type=sha,format=long produces a sha-<long-sha> tag; type=raw,value=staging produces a fixed staging tag. The image is pushed under both tags in one build. The deploy itself pins the image by the @sha256:... digest that docker/build-push-action reports for the push, which keeps it immutable, while the friendly :staging tag stays around for humans to reference.

OIDC for AWS auth. This is the big 2022 win. aws-actions/configure-aws-credentials@v1 exchanges a GitHub OIDC token for short-lived AWS STS credentials, which is why the deploy job requests the id-token: write permission. No AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY secrets. The IAM role you assume is configured to trust GitHub OIDC for your specific repo/branch/environment.

# Terraform sketch of the IAM trust policy
data "aws_iam_policy_document" "trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    principals {
      type        = "Federated"
      identifiers = ["arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com"]
    }
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:yourorg/yourrepo:environment:staging"]
    }
  }
}

Short-lived creds, scoped to a specific environment. No secret to rotate. No “an engineer copied AWS keys to Slack three years ago” risk.
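The value in the `sub` condition has to match what GitHub puts in the token exactly, and a mismatch is a common reason the role assumption is rejected. As a rough aid, the sketch below prints the default subject formats for the two cases that matter here. The helper name `oidc_subject` is hypothetical, and it assumes the repository has not customized its subject claim template.

```shell
# oidc_subject REPO KIND NAME: print the default `sub` claim GitHub issues.
#   KIND=environment -> repo:<owner>/<repo>:environment:<name>
#   KIND=branch      -> repo:<owner>/<repo>:ref:refs/heads/<name>
# Hypothetical helper for illustration; assumes the default claim format.
oidc_subject() {
  local repo="$1" kind="$2" name="$3"
  case "$kind" in
    environment) printf 'repo:%s:environment:%s\n' "$repo" "$name" ;;
    branch)      printf 'repo:%s:ref:refs/heads/%s\n' "$repo" "$name" ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
}

# Example: the value the trust policy above should contain.
# oidc_subject yourorg/yourrepo environment staging
```

Because the deploy job declares an environment, its token carries the environment form. A trust policy conditioned on the branch form would not match that job.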

environment: staging. Declaring the environment lets you configure required reviewers (off for staging, on for prod), environment-scoped secrets, and a deployment URL shown in the GitHub UI. It also sets the sub claim in the OIDC token, which is what the trust policy above matches on. Even when you don’t need approvals, it’s worth declaring environments for the visibility and history.

Immutable image tags for kubectl set image. Setting the image by @sha256:... digest (or :sha-<long-sha> tag) means the deploy is reproducible. If you set :staging, you’re at the mercy of whatever happened to be :staging when the rollout was triggered — fine for humans but a footgun in automation.
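One way to enforce this in automation is a small guard that refuses to deploy anything that is not pinned by digest. The sketch below is an assumption about how such a check might look; `is_immutable_ref` is a made-up name, and it is deliberately stricter than the text above in that it accepts only `@sha256:` references, not SHA tags.

```shell
# is_immutable_ref REF: succeed if REF pins an image by sha256 digest
# (name@sha256:<64 lowercase hex chars>); mutable tags such as :staging or
# :latest fail. Hypothetical guard, stricter than allowing :sha-<long-sha>.
is_immutable_ref() {
  printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}

# Example use before `kubectl set image` (IMAGE_REF is an assumed variable):
# is_immutable_ref "$IMAGE_REF" || { echo "refusing mutable ref: $IMAGE_REF" >&2; exit 1; }
```

Running the check as the first line of the deploy step turns an accidental `:staging` or `:latest` reference into a failed job instead of a nondeterministic rollout.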

Rollout status + smoke test. rollout status blocks until the new pods are ready (or fails on timeout). Then a smoke curl confirms the actual endpoint serves traffic. Catches the case where pods are healthy by their probe but the route hasn’t propagated yet.
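The retry loop in the smoke test can be factored into a reusable helper, sketched below. The `retry` function is not part of the original workflow, and the rollback in the usage comment is an optional extra that the staging workflow above does not perform; `kubectl rollout undo` returns the Deployment to its previous revision.

```shell
# retry ATTEMPTS DELAY CMD...: run CMD until it succeeds, at most ATTEMPTS
# times, sleeping DELAY seconds between failed attempts.
# Hypothetical helper; returns 0 on the first success, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage in the deploy job (endpoint and resource names from the workflow above):
# retry 3 5 curl -fsS --max-time 5 https://billing-staging.example.com/readyz \
#   || { kubectl -n billing rollout undo deploy/billing; exit 1; }
```

Whether to roll back automatically is a judgment call. On staging, leaving the failed revision in place can make debugging easier.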

Variations by deployment target

The above is EKS. The pattern adapts:

Cloud Run:

- uses: google-github-actions/auth@v0
  with:
    workload_identity_provider: projects/123/locations/global/workloadIdentityPools/gh/providers/gh-provider
    service_account: [email protected]

- uses: google-github-actions/deploy-cloudrun@v0
  with:
    service: billing-staging
    image: ghcr.io/${{ github.repository_owner }}/billing@${{ needs.build.outputs.image-sha }}
    region: asia-southeast2

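Same OIDC-based auth, same immutable image, different deploy command. One caveat: Cloud Run pulls images from Artifact Registry (or Container Registry), not directly from GHCR, so in practice the image has to be pushed or mirrored there and the image: value pointed at that copy.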

Plain SSH (e.g., a VPS running Docker Compose):

- uses: appleboy/[email protected]
  with:
    host: ${{ secrets.STAGING_HOST }}
    username: deploy
    key: ${{ secrets.STAGING_SSH_KEY }}
    script: |
      cd /opt/stack
      docker pull ghcr.io/${{ github.repository_owner }}/billing@${{ needs.build.outputs.image-sha }}
      sed -i "s|image:.*billing.*|image: ghcr.io/${{ github.repository_owner }}/billing@${{ needs.build.outputs.image-sha }}|" docker-compose.yml
      docker compose up -d billing

Lower-tech but works. SSH key as secret is still required (OIDC doesn’t help here). Rotate annually.

Production = staging + protections

Production differs from staging by:

on: workflow_dispatch only (no auto-deploy on push)
environment: production with required-reviewer protection
Possibly a “promote from staging” pattern that retags the staging image as production rather than rebuilding
Stricter rollback steps in the workflow

deploy-production:
  runs-on: ubuntu-22.04
  environment: production   # requires approval
  permissions:
    packages: write   # needed to push the :production tag to GHCR
  steps:
    - uses: docker/login-action@v2
      with:
        registry: ghcr.io
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Promote staging image to production
      run: |
        IMAGE=ghcr.io/${{ github.repository_owner }}/billing
        # Resolve :staging to its manifest digest (not the config blob digest)
        STAGING_DIGEST=$(docker buildx imagetools inspect "$IMAGE:staging" \
          --format '{{json .Manifest}}' | jq -r '.digest')
        docker buildx imagetools create \
          --tag "$IMAGE:production" \
          "$IMAGE@${STAGING_DIGEST}"

No rebuild — the exact image bytes that passed staging are what runs in prod. Higher confidence, faster deploy.
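The "exact same bytes" claim can be checked after the promotion by comparing the digests behind the two tags. The following is a sketch under assumptions: the function names are invented, and `resolve_digest` relies on `docker buildx imagetools inspect --format` (available in buildx 0.8 and later) plus `jq` being installed on the runner.

```shell
# resolve_digest REF: print the manifest digest a tag currently points to.
# Assumes docker buildx >= 0.8 (for --format) and jq are available.
resolve_digest() {
  docker buildx imagetools inspect "$1" --format '{{json .Manifest}}' | jq -r '.digest'
}

# verify_promotion IMAGE: succeed only if IMAGE:staging and IMAGE:production
# resolve to the same non-empty digest. Hypothetical helper for illustration.
verify_promotion() {
  local image="$1" staging production
  staging=$(resolve_digest "$image:staging") || return 1
  production=$(resolve_digest "$image:production") || return 1
  if [ -n "$staging" ] && [ "$staging" = "$production" ]; then
    echo "promotion verified: $staging"
  else
    echo "digest mismatch: staging=$staging production=$production" >&2
    return 1
  fi
}

# Example, run after the promote step:
# verify_promotion "ghcr.io/yourorg/billing"
```

A mismatch usually means a new staging build landed between resolving the digest and checking it, which is one more reason to promote by digest and not by tag.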

Common Pitfalls

Deploying with the latest tag. Same problem as the staging tag in automation: nondeterministic. Always deploy by digest or by SHA-tag.

Long-lived cloud secrets in Actions secrets. AWS access keys, GCP service account JSON, etc. — these used to be the standard pattern. They’ve aged into a security smell. OIDC removes the need.

cancel-in-progress: true on deploy. Cancels mid-rollout. Half-deployed services are unpleasant. Default false for deploys.

No smoke test after rollout. “Kubernetes says the pods are ready” doesn’t always equal “the app actually works.” A 3-second curl after rollout has saved me from many “deploy succeeded but service down” pages.

Branch protections that block deploys you actually want. environment: production with required reviewer is good. Adding “require status checks” that depend on CI from a different branch is a recipe for “deploy stuck waiting for a check that will never run.”

Letting deploy logs leak secrets. Some debug steps print env vars. Use the ::add-mask:: workflow command on any sensitive value before it can reach the log, or stop logging it.
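Values stored as Actions secrets are masked automatically, but a value generated at runtime is not. A minimal sketch of registering one with the runner follows; `mask_value` is a made-up wrapper around the `::add-mask::` workflow command, and `TOKEN` in the usage comment is a hypothetical variable.

```shell
# mask_value VALUE: emit the GitHub Actions workflow command that registers
# VALUE as a secret, so the runner replaces it with *** in later log output.
# Hypothetical wrapper; does nothing for an empty value.
mask_value() {
  [ -n "$1" ] || return 0
  printf '::add-mask::%s\n' "$1"
}

# Example inside a `run:` step (TOKEN is an assumed runtime-generated value):
# mask_value "$TOKEN"
# echo "token is $TOKEN"   # the value is masked in the log from here on
```

The mask only applies to output printed after the command runs, so call it before the first line that could echo the value.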

Wrapping Up

This deploy workflow ties together the month: a Docker image built with the right Dockerfile, pushed via registry-cached BuildKit, and rolled out via kubectl with OIDC-authed cloud access. End-to-end: code merged to main → running in staging in about 4 minutes. End-of-month retro (no separate post; this is it): February covered the Postgres half and the CI/CD half. March pivots to Rust and memory safety for backend APIs. Same pattern: theme per month, three articles per week, on the same blog you’re reading this on.
