Docker Compose for CI, Ephemeral Stacks per Test Run

July 25, 2022 · 5 min read · by Muhammad Amal programming

TL;DR — In CI: each test run gets an isolated stack via COMPOSE_PROJECT_NAME=ci-${GITHUB_RUN_ID}. Use --wait to block until healthchecks pass. Always teardown with down -v in a cleanup step. Parallel runs are safe because each project is independent.

After resource limits, Compose’s other big use case: spinning up integration test stacks in CI. Same compose file, different lifecycle. The patterns differ.

The core idea

In dev, you keep stacks running for hours. In CI, you bring up a stack, run tests, tear it down. Per CI run = per stack.

# docker-compose.ci.yml
services:
  postgres:
    image: postgres:14-alpine
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app_test
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d app_test"]
      interval: 2s
      retries: 30

  api:
    build:
      context: .
      target: prod
    depends_on:
      postgres: { condition: service_healthy }
    environment:
      DATABASE_URL: postgres://app:app@postgres/app_test
    ports:
      - "8080"   # random host port

Note:

Healthcheck interval: 2s (faster than dev — CI needs quick boot)
Random host port via "8080" (no host port number specified) — avoids collisions in parallel CI runs
target: prod builds the actual production image being tested

Project name for isolation

COMPOSE_PROJECT_NAME=ci-${GITHUB_RUN_ID} docker compose -f docker-compose.ci.yml up -d --wait

Sets a unique project name per run. Result:

Network: ci-12345_default
Volumes: ci-12345_postgres-data (only if the file declares a named postgres-data volume; the example above does not)
Containers: ci-12345-postgres-1, ci-12345-api-1

Two CI runs on the same machine don’t collide. Tear down one without affecting the other.
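GITHUB_RUN_ID is already a safe project name, but on other CI systems the unique value is often a branch name or build tag. Compose requires project names to contain only lowercase letters, digits, dashes, and underscores, and to start with a lowercase letter or digit. A minimal sketch of a sanitizer follows; ci_project_name is a hypothetical helper, not part of Compose:

```shell
# Hypothetical helper: build a Compose-safe project name from CI values.
# Joins all arguments with dashes, lowercases them, and replaces every
# character outside [a-z0-9_-] with a dash. The "ci-" prefix guarantees
# the name starts with a lowercase letter.
ci_project_name() {
  raw=$(printf '%s-' "$@")
  raw=${raw%-}   # drop the trailing dash added by printf
  printf 'ci-%s\n' "$raw" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9_-]/-/g'
}

# Usage (the branch name and build number are made-up values):
#   export COMPOSE_PROJECT_NAME=$(ci_project_name "Feature/Login" 12345)
```

Two branches that differ only in case or punctuation would map to the same name, so keeping a run or build number in the arguments is the safer choice.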

The --wait flag (v2.1+) makes up block until every service is running and, where a healthcheck is defined, healthy. Essential for CI: tests can start as soon as the command returns.
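--wait can only wait on health for services that define a healthcheck. The api service in the file above does not, so Compose considers it ready once it is running. One way to cover that gap from the host is a small retry loop. This is a sketch, not a Compose feature: wait_for is a hypothetical helper, and the /healthz path in the usage line is an assumed endpoint.

```shell
# Hypothetical helper: retry a command until it succeeds.
# usage: wait_for <attempts> <delay-seconds> <command...>
# Returns 0 on the first success, 1 once all attempts have failed.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while :; do
    # Run the probe quietly; stop as soon as it succeeds.
    "$@" >/dev/null 2>&1 && return 0
    # Give up when the attempt budget is used up.
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Usage (assumes curl on the host and a /healthz route on the API):
#   wait_for 30 2 curl -fsS "http://localhost:$PORT/healthz"
```

Adding a healthcheck to the api service is the cleaner fix where the image allows it, because --wait and service_healthy then cover it directly.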

A full GitHub Actions job

jobs:
  integration:
    runs-on: ubuntu-22.04
    env:
      COMPOSE_PROJECT_NAME: ci-${{ github.run_id }}

    steps:
      - uses: actions/checkout@v3

      - uses: docker/setup-buildx-action@v2

      - name: Bring up stack
        run: |
          docker compose -f docker-compose.ci.yml up -d --wait --quiet-pull
          docker compose -f docker-compose.ci.yml ps

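      - name: Get API port
        id: api-port
        run: |
          PORT=$(docker compose -f docker-compose.ci.yml port api 8080 | cut -d: -f2)
          echo "port=$PORT" >> "$GITHUB_OUTPUT"

      - name: Run integration tests
        env:
          API_URL: http://localhost:${{ steps.api-port.outputs.port }}
        run: |
          npm run test:integration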

      - name: Show logs on failure
        if: failure()
        run: docker compose -f docker-compose.ci.yml logs

      - name: Teardown
        if: always()
        run: docker compose -f docker-compose.ci.yml down -v

Five key bits:

COMPOSE_PROJECT_NAME set at job level — used by all compose calls
--wait blocks until healthchecks green
--quiet-pull reduces log noise
logs on failure for debugging
down -v on always() — runs even if tests failed. Volumes go away too.

Getting random ports

Compose maps to a random host port; you need to know which:

api:
  ports: ["8080"]    # no host port specified

docker compose -f docker-compose.ci.yml port api 8080
# Output: 0.0.0.0:49237

In GH Actions:

- name: Get API port
  id: api-port
  run: |
    PORT=$(docker compose -f docker-compose.ci.yml port api 8080 | cut -d: -f2)
    echo "port=$PORT" >> $GITHUB_OUTPUT

Now ${{ steps.api-port.outputs.port }} is available downstream.
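The cut -d: -f2 approach assumes an IPv4 address such as 0.0.0.0:49237. If the command ever prints an IPv6 form like [::]:49237, the second colon-separated field is empty. A small sketch that takes everything after the last colon instead; port_from_addr is a hypothetical helper name:

```shell
# Hypothetical helper: print the port from a "host:port" address.
# Strips the longest prefix ending in ":", so it handles both
# "0.0.0.0:49237" and "[::]:49237".
port_from_addr() {
  addr=$1
  printf '%s\n' "${addr##*:}"
}

# Usage (requires a running stack):
#   PORT=$(port_from_addr "$(docker compose -f docker-compose.ci.yml port api 8080)")
```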

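Alternative: publish to an ephemeral host port on the host’s loopback address and read the assigned port back via Docker’s API. Or, when each job has its own VM (as on GitHub-hosted runners), just publish a single known port per service.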

Test runner inside or outside the stack?

Two patterns:

Pattern A — tests run on host, hit containers via port:

docker compose up -d --wait
API_PORT=$(docker compose port api 8080 | cut -d: -f2)   # api publishes a random host port
API_URL=http://localhost:$API_PORT npm run test:integration
docker compose down -v

Pros: tests have host access, easy to debug.

Cons: needs Node (or whatever) installed on the host alongside Docker.
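Run as a plain script, the Pattern A sequence exits with the status of down -v, which hides a failing test run. With set -e it skips the teardown instead. A minimal sketch of a wrapper that always tears down and still returns the test status follows. run_with_stack and the COMPOSE override variable are hypothetical names, introduced so the function can be exercised without Docker:

```shell
# COMPOSE can be overridden (e.g. with a stub) for dry runs and tests.
COMPOSE=${COMPOSE:-docker compose}

# Hypothetical helper: bring the stack up, run the given command,
# always tear the stack down, and return the command's exit status.
run_with_stack() {
  status=0
  $COMPOSE up -d --wait || status=$?
  if [ "$status" -eq 0 ]; then
    # Only run the tests when the stack came up cleanly.
    "$@" || status=$?
  fi
  # Teardown runs whether up or the tests failed.
  $COMPOSE down -v
  return "$status"
}

# Usage:
#   run_with_stack npm run test:integration
```

In GitHub Actions the separate if: always() teardown step plays the same role; the wrapper is mainly useful for local runs and other CI systems.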

Pattern B — tests run inside a container:

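test-runner:
  build:
    context: .
    target: test
  depends_on:
    api: { condition: service_healthy }   # requires a healthcheck on api
  command: ["npm", "run", "test:integration"]
  network_mode: "service:api"    # share api's network namespace

# --exit-code-from propagates the test runner's exit code to the CI step
docker compose up --abort-on-container-exit --exit-code-from test-runner test-runner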

Pros: tests run in containerland; no host deps.

Cons: more setup; debugging is harder.

For most CI: Pattern A is simpler and faster.

Parallel CI runs on the same host

Two PRs trigger CI simultaneously, both run on the same self-hosted runner:

COMPOSE_PROJECT_NAME=ci-12345
COMPOSE_PROJECT_NAME=ci-12346

Different project names = different networks + volumes + containers. They coexist. Each stack maps its services to different random host ports. Both clean up independently.

Caveat: image builds aren’t naturally isolated. If both runs build target: prod, the BuildKit cache is shared. Usually fine; in edge cases (build poisoning) it’s worth isolating builds too.

Test data setup

The migrator pattern from the healthchecks post works for CI too:

migrator:
  build:
    context: .
    target: migrator
  depends_on:
    postgres: { condition: service_healthy }
  command: ["./migrate", "up"]

seeder:
  build:
    context: .
    target: seeder
  depends_on:
    migrator: { condition: service_completed_successfully }
  command: ["./seed", "test-data"]

api:
  depends_on:
    seeder: { condition: service_completed_successfully }

Each test run gets fresh data. No bleed-over between runs.

Cleanup is critical

CI runners run thousands of jobs. Leaked volumes and networks accumulate fast. Always:

- name: Teardown
  if: always()        # critical — runs on failure too
  run: docker compose -f docker-compose.ci.yml down -v

Periodically (cron job or scheduled workflow):

docker volume prune -f
docker network prune -f
docker image prune -af --filter "until=24h"

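This matters on self-hosted runners. GitHub-hosted runners are ephemeral, so leftovers are less of an issue there. Note that on Docker 23.0 and later, docker volume prune only removes anonymous volumes unless you pass --all.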
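The prune commands are host-wide. When a single run has leaked resources (for example, the job was cancelled before teardown), a narrower cleanup can target one project through the com.docker.compose.project label that Compose puts on its containers, networks, and volumes. A sketch under those assumptions follows; remove_project_leftovers and the DOCKER override variable are hypothetical names:

```shell
# DOCKER can be overridden (e.g. with a stub) for dry runs and tests.
DOCKER=${DOCKER:-docker}

# Hypothetical helper: remove everything labelled with one Compose project.
# usage: remove_project_leftovers <project-name>
remove_project_leftovers() {
  filter="label=com.docker.compose.project=$1"
  # Containers first (running or stopped), since they hold the
  # networks and volumes in use.
  for id in $($DOCKER ps -aq --filter "$filter"); do
    $DOCKER rm -f "$id"
  done
  for net in $($DOCKER network ls -q --filter "$filter"); do
    $DOCKER network rm "$net"
  done
  for vol in $($DOCKER volume ls -q --filter "$filter"); do
    $DOCKER volume rm "$vol"
  done
}

# Usage:
#   remove_project_leftovers ci-12345
```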

Common Pitfalls

Forgetting --wait. Tests start before stack is ready. Flaky failures.

Hardcoded host ports. Two parallel runs collide. Use random.

Same project name across runs. Containers and volumes from old run interfere with new. Always unique.

No teardown step. Volumes pile up. Eventually disk full.

Building images during up. Slow CI. Build separately (docker compose build) then up, or use cached BuildKit.

Test failures swallow logs. Always print logs on failure for debugging.

Tests that assume specific hostnames. Hardcoding postgres:5432 works if tests run inside compose network; not from host. Use compose-routed URLs or localhost+random-port.

Wrapping Up

Per-CI-run ephemeral stacks: unique project name + --wait + always teardown. Same compose file as dev (or close to it) + a CI-specific override. Wednesday: Compose vs Kubernetes for local dev — the comparison many teams revisit.

