Testing gRPC Services in Go with testcontainers and bufconn

March 30, 2023 · 8 min read · by Muhammad Amal · programming

TL;DR: bufconn runs a real gRPC server in-process over an in-memory connection, which is perfect for handler tests. testcontainers-go spins up real Postgres, Redis, and Kafka in Docker for integration tests. Contract tests (buf breaking, schema diff in CI) catch the API breakages your unit tests can’t.

There’s a comfortable lie in microservice testing: mocking everything makes tests fast. The lie is that fast tests that mock the boundaries don’t actually test the boundaries — and the boundaries are where bugs live. Network code, serialization, database semantics, gRPC interceptors that mutate context: these need real stacks under them at some level of the pyramid.

This is the last post in the March series. We’ve covered gRPC basics, streaming, concurrency, context, interceptors, pooling, and observability. Testing is where all of this comes together — or fails to.

I’ll lay out the four layers I actually run in CI, with code, and the trade-offs that drive when to use each.

Layer 1: Pure Unit Tests

The cheapest, fastest, smallest tests. They test a single function with no external dependencies. For a gRPC service, this is mostly business logic: pricing, validation, state transitions.

func TestComputeTotal(t *testing.T) {
    cases := []struct {
        name     string
        items    []*billingv1.LineItem
        expected int64
    }{
        {"empty", nil, 0},
        {"single", []*billingv1.LineItem{{Quantity: 2, UnitAmountCents: 500}}, 1000},
        {"multiple", []*billingv1.LineItem{
            {Quantity: 1, UnitAmountCents: 1000},
            {Quantity: 3, UnitAmountCents: 250},
        }, 1750},
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            got := computeTotal(tc.items)
            if got != tc.expected {
                t.Errorf("got %d, want %d", got, tc.expected)
            }
        })
    }
}

Nothing exotic. Table-driven, no mocks, no setup. If you can’t write your business logic in functions like this, you’ve coupled it too tightly to the transport. Refactor before adding test infrastructure.
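The test above calls computeTotal without showing it. Here is a minimal sketch of what such a function might look like, assuming a local LineItem struct as a stand-in for the generated billingv1.LineItem (only the two fields from the test table are modeled, and the real implementation may differ):

```go
package main

import "fmt"

// LineItem is a stand-in for the generated billingv1.LineItem message.
// Only the two fields used by the test table are modeled here.
type LineItem struct {
	Quantity        int64
	UnitAmountCents int64
}

// computeTotal sums quantity * unit price over all items, in cents.
// A nil or empty slice yields 0; nil entries are skipped.
func computeTotal(items []*LineItem) int64 {
	var total int64
	for _, it := range items {
		if it == nil {
			continue
		}
		total += it.Quantity * it.UnitAmountCents
	}
	return total
}

func main() {
	items := []*LineItem{
		{Quantity: 1, UnitAmountCents: 1000},
		{Quantity: 3, UnitAmountCents: 250},
	}
	fmt.Println(computeTotal(items)) // prints 1750
}
```

The function takes plain data and returns plain data, with no context, request wrapper, or repository in sight. That is what keeps the table-driven test free of setup.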

Layer 2: Handler Tests with bufconn

When you want to test the gRPC layer — interceptors, context propagation, status code mapping — you need a server. You don’t need a network. google.golang.org/grpc/test/bufconn gives you an in-memory net.Listener that’s perfect for this.

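import (
    "context"
    "net"
    "testing"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/grpc/test/bufconn"
    // plus this project's billingv1, recovery, and auth packages
)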

func newTestServer(t *testing.T) billingv1.InvoiceServiceClient {
    t.Helper()
    lis := bufconn.Listen(1024 * 1024)
    srv := grpc.NewServer(
        grpc.ChainUnaryInterceptor(
            recovery.UnaryServerInterceptor(),
            auth.UnaryServerInterceptor(testVerifier),
        ),
    )
    billingv1.RegisterInvoiceServiceServer(srv, &invoiceServer{
        repo: newInMemoryRepo(),
    })
    go func() {
        if err := srv.Serve(lis); err != nil {
            t.Logf("serve: %v", err)
        }
    }()
    t.Cleanup(func() {
        srv.GracefulStop()
    })

    conn, err := grpc.Dial("bufnet",
        grpc.WithContextDialer(func(ctx context.Context, _ string) (net.Conn, error) {
            return lis.DialContext(ctx)
        }),
        grpc.WithTransportCredentials(insecure.NewCredentials()),
    )
    if err != nil {
        t.Fatal(err)
    }
    t.Cleanup(func() { conn.Close() })
    return billingv1.NewInvoiceServiceClient(conn)
}

Now your tests look like real gRPC calls:

func TestGetInvoice_Unauthenticated(t *testing.T) {
    client := newTestServer(t)
    _, err := client.GetInvoice(context.Background(), &billingv1.GetInvoiceRequest{
        InvoiceId: "inv_001",
    })
    if status.Code(err) != codes.Unauthenticated {
        t.Errorf("got code %v, want Unauthenticated", status.Code(err))
    }
}

This exercises:

Real interceptor chain
Real protobuf marshaling
Real context propagation
Real status code translation

What it skips:

Real network behavior (timeouts, partial reads)
Cross-process boundaries
TLS

For most handler-level tests, that’s the right trade. The tests run in milliseconds, no Docker, no flakiness.
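The helper above injects newInMemoryRepo(), which the post doesn’t show. A minimal sketch of what such a repository might look like follows. The Invoice struct, the Put and Get methods, and ErrNotFound are all assumptions made for illustration, not the service’s actual types:

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Invoice is a hypothetical domain type; the real service would likely
// store its own model or the generated billingv1.Invoice.
type Invoice struct {
	ID         string
	TotalCents int64
}

// ErrNotFound is the error a handler would map to codes.NotFound.
var ErrNotFound = errors.New("invoice not found")

// inMemoryRepo is a map guarded by a mutex, so concurrent handlers are safe.
type inMemoryRepo struct {
	mu       sync.RWMutex
	invoices map[string]Invoice
}

func newInMemoryRepo() *inMemoryRepo {
	return &inMemoryRepo{invoices: make(map[string]Invoice)}
}

// Put stores the invoice, overwriting any previous version with the same ID.
func (r *inMemoryRepo) Put(inv Invoice) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.invoices[inv.ID] = inv
}

// Get returns the stored invoice, or ErrNotFound if the ID is unknown.
func (r *inMemoryRepo) Get(id string) (Invoice, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	inv, ok := r.invoices[id]
	if !ok {
		return Invoice{}, ErrNotFound
	}
	return inv, nil
}

func main() {
	repo := newInMemoryRepo()
	repo.Put(Invoice{ID: "inv_001", TotalCents: 1750})

	inv, err := repo.Get("inv_001")
	fmt.Println(inv.TotalCents, err) // prints 1750 <nil>

	_, err = repo.Get("missing")
	fmt.Println(errors.Is(err, ErrNotFound)) // prints true
}
```

Because each call to newTestServer builds a fresh repository, handler tests start from empty state and don’t need cleanup between them.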

Layer 3: Integration Tests with testcontainers

When you want to test the parts that hit real systems — database queries, message brokers, downstream services — you need real systems. testcontainers-go makes this manageable.

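import (
    "context"
    "testing"
    "time"

    "github.com/jackc/pgx/v5/pgxpool"
    "github.com/testcontainers/testcontainers-go"
    "github.com/testcontainers/testcontainers-go/modules/postgres"
    "github.com/testcontainers/testcontainers-go/wait"
)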

func newTestPool(t *testing.T) *pgxpool.Pool {
    t.Helper()
    ctx := context.Background()
    container, err := postgres.RunContainer(ctx,
        testcontainers.WithImage("postgres:15.2-alpine"),
        postgres.WithDatabase("test"),
        postgres.WithUsername("test"),
        postgres.WithPassword("test"),
        testcontainers.WithWaitStrategy(
            wait.ForLog("database system is ready to accept connections").
                WithOccurrence(2).
                WithStartupTimeout(30*time.Second),
        ),
    )
    if err != nil {
        t.Fatal(err)
    }
    t.Cleanup(func() {
        _ = container.Terminate(ctx)
    })

    dsn, err := container.ConnectionString(ctx, "sslmode=disable")
    if err != nil {
        t.Fatal(err)
    }
    pool, err := pgxpool.New(ctx, dsn)
    if err != nil {
        t.Fatal(err)
    }
    t.Cleanup(pool.Close)

    if err := runMigrations(ctx, pool); err != nil {
        t.Fatal(err)
    }
    return pool
}

The WaitStrategy is important. Postgres logs “ready to accept connections” twice — once during init, once after init scripts run. Connecting on the first occurrence will sometimes succeed and sometimes fail. WithOccurrence(2) waits for the real ready state.

For tests that touch multiple dependencies (Postgres + Redis + a downstream gRPC service), you spin up each as a container. The runtime cost is real — a test suite with full integration tests takes minutes, not seconds — so I gate them with a build tag:

//go:build integration
// +build integration

package billing_test

// ... test code ...

Run with go test -tags=integration ./... in CI, omit from local dev. Or run a smaller suite locally and the full suite in CI; the right split depends on your team.

Sharing Containers Across Tests

Starting Postgres for every test function is slow. Share the container across the test package with TestMain:

var testPool *pgxpool.Pool

func TestMain(m *testing.M) {
    ctx := context.Background()
    container, pool, err := startPostgres(ctx)
    if err != nil {
        log.Fatal(err)
    }
    testPool = pool
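    code := m.Run()
    pool.Close() // release connections before the database goes away
    _ = container.Terminate(ctx)
    os.Exit(code) // os.Exit skips deferred calls, hence the explicit cleanup above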
}

func TestInvoiceCreate(t *testing.T) {
    t.Cleanup(func() {
        if _, err := testPool.Exec(context.Background(), "TRUNCATE invoices"); err != nil {
            t.Errorf("truncate invoices: %v", err)
        }
    })
    // ... use testPool ...
}

Each test cleans up its data; the container persists across tests in the package. Trade-off: tests in the same package share state, so order-independence matters more.

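For full isolation, give each test its own container; that also makes t.Parallel() an option, though it needs care because every parallel test then pays the container startup cost. For most projects, package-level sharing plus per-test truncation is the right trade.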

Layer 4: Contract Tests

The kind people skip and regret. Your proto file is an API contract. Breaking changes — removed fields, changed types, renamed methods — silently break clients. Contract testing catches this in CI.

buf is the tool. It lints your protos and detects breaking changes.

# buf.yaml
version: v1
breaking:
  use:
    - FILE
lint:
  use:
    - DEFAULT

In CI:

buf lint
buf breaking --against '.git#branch=main'

The breaking check compares your branch’s protos against main and fails on any incompatible change. With the FILE rule set configured above, that means changes that would break generated code as well as the wire format. Renames, type changes, field number reuses: all caught.

Pair this with the protoc-gen-validate ecosystem if you want field-level validation generated from proto annotations. The validation rules become part of the contract too.

Test Doubles for Downstream Services

When your service calls a downstream gRPC service in tests, you have three options:

1. Mock the client interface. Generate mocks with mockery or gomock. Fast, simple, doesn’t exercise the network.
2. Fake server via bufconn. Run a stub server that implements just enough of the downstream interface. More realistic, still in-process.
3. Real downstream in a container. If the downstream service is stable and containerizable, run it.

I default to (2) for cross-service tests. Mocking the client misses serialization issues; running the real downstream is heavy. A bufconn-hosted fake gives you a real wire path with controllable behavior.

type fakeBillingServer struct {
    billingv1.UnimplementedInvoiceServiceServer
    invoices map[string]*billingv1.Invoice
}

func (s *fakeBillingServer) GetInvoice(
    ctx context.Context,
    req *billingv1.GetInvoiceRequest,
) (*billingv1.Invoice, error) {
    inv, ok := s.invoices[req.GetInvoiceId()]
    if !ok {
        return nil, status.Error(codes.NotFound, "not found")
    }
    return inv, nil
}

Wire it into a bufconn server the same way as the handler-test server in Layer 2. The test setup is verbose; abstract it into a helper.

Common Pitfalls

The collection from running these for years:

Mocking gRPC clients to the point of testing nothing. If the mock returns whatever the test expects, the test passes regardless of whether your real client wiring works. Mix in some bufconn-based tests.
Not running migrations in tests. A test pool against an empty database tells you nothing. Run your migrations in the test setup.
Forgetting t.Cleanup. Container leaks, pool leaks, goroutine leaks. Always register cleanup with t.Cleanup — it runs even if the test fails.
Shared state across tests without ordering guarantees. If test A inserts and test B expects it gone, you have a flake when they run in different orders. Truncate aggressively.
Tests that depend on real time. time.Now() in business logic plus tests checking exact timestamps is a recipe for flake. Inject a clock.
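No timeout on integration tests. A hanging container test can hold a CI runner until go test’s default 10-minute timeout fires, or for hours if that limit has been raised or disabled. Set -timeout 5m and pick a sensible per-test timeout.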
Running breaking-change checks against the wrong base. buf breaking --against must compare to a stable reference (main, last release tag). Comparing to the previous commit on a feature branch doesn’t catch the breaking change you introduced earlier in the branch.
Not testing failure paths. Happy-path tests are easy. Real bugs live in timeouts, partial failures, retries. Add a chaos test with simulated latency and errors via interceptor.
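The “inject a clock” advice above can be sketched in a few lines. The Clock interface, fixedClock, and isOverdue are hypothetical names chosen for illustration; the point is only that business logic asks a clock for the time instead of calling time.Now() directly:

```go
package main

import (
	"fmt"
	"time"
)

// Clock abstracts time.Now so tests can control the current time.
type Clock interface {
	Now() time.Time
}

// realClock is what production code would use.
type realClock struct{}

func (realClock) Now() time.Time { return time.Now() }

// fixedClock always returns the same instant, which makes tests deterministic.
type fixedClock struct{ t time.Time }

func (c fixedClock) Now() time.Time { return c.t }

// Compile-time check that both implementations satisfy Clock.
var (
	_ Clock = realClock{}
	_ Clock = fixedClock{}
)

// isOverdue is hypothetical business logic: an invoice is overdue once the
// clock has moved strictly past its due time.
func isOverdue(clock Clock, dueAt time.Time) bool {
	return clock.Now().After(dueAt)
}

func main() {
	due := time.Date(2023, time.March, 30, 12, 0, 0, 0, time.UTC)

	before := fixedClock{t: due.Add(-time.Hour)}
	after := fixedClock{t: due.Add(time.Hour)}

	fmt.Println(isOverdue(before, due)) // prints false
	fmt.Println(isOverdue(after, due))  // prints true
}
```

With the clock pinned, a test can assert exact outcomes at the boundary (one hour before, one hour after) without sleeping or tolerating drift.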

Wrapping Up

Test pyramids are tired advice, but the shape is right: lots of fast unit tests, a layer of in-process handler tests, a smaller set of integration tests with real dependencies, and a thin contract-test layer that catches API breakage. The Go ecosystem in 2023 makes each of these tractable — testing, bufconn, testcontainers-go, buf. Pick the layer that matches the risk you’re guarding against and don’t reach higher than you need to.

That closes out this March series. Across these posts we built a mental model of a production Go microservice end to end — schema, transport, concurrency, lifecycle, observability, testing. There’s plenty more (deployment, security, schema evolution), but those will get their own months. Read the testcontainers-go docs if you’re new to it; it’s worth a couple of hours of reading before adopting widely.

