Postgres Multi-Column Indexes, Column Order Is Almost Everything

Postgres article cover illustration on a gradient background

February 18, 2022 · 7 min read · by Muhammad Amal programming

TL;DR — Multi-column B-tree indexes work for queries that filter on a leftmost prefix of the index columns. Put the most selective equality column first, range columns last, and use INCLUDE to add columns the query needs but doesn’t filter on. Wrong column order = useless index.

Last indexing post of the month. Multi-column indexes (also called composite or compound indexes) are one of those things that work great when you understand the rule and quietly fail when you don’t. The rule is “leftmost prefix” and most “my index isn’t being used” tickets resolve to a violation of it.

By the end of this post, you should be able to look at a composite index and tell which queries it will help, and look at a query and design the composite that fits it.

The leftmost-prefix rule

A B-tree index on (a, b, c) can be used by any query that filters on:

a
a, b
a, b, c

It cannot be used by queries that filter on:

b alone
c alone
b, c

The reason is structural. The index is sorted by a first, then within each a value by b, then within each (a, b) pair by c. You can binary-search down the tree if you know a. You can’t binary-search if you only know b — every value of a would need to be visited.

A concrete example

Pretend we run a multi-tenant SaaS with an orders table:

CREATE TABLE orders (
  id          bigserial PRIMARY KEY,
  tenant_id   bigint NOT NULL,
  customer_id bigint NOT NULL,
  status      text NOT NULL,
  total_cents bigint NOT NULL,
  created_at  timestamptz NOT NULL DEFAULT now()
);

The dashboard runs this query thousands of times a day:

SELECT id, total_cents, created_at
FROM orders
WHERE tenant_id = 7
  AND customer_id = 12345
  AND status = 'paid'
ORDER BY created_at DESC
LIMIT 50;

Which composite index?

Wrong: `(status, customer_id, tenant_id)`

CREATE INDEX orders_idx_wrong ON orders (status, customer_id, tenant_id);

The query filters by all three, so the index can be used. But:

status has 3 values (pending, paid, refunded). Filtering by status = 'paid' cuts the index to ~50% of rows. Then it scans within that for matching customer_id. Lots of index pages read.
The natural sort order of the index doesn’t match ORDER BY created_at — no created_at in the index.

Right: `(tenant_id, customer_id, created_at DESC)` with status filter applied after

CREATE INDEX orders_tenant_customer_created_idx
  ON orders (tenant_id, customer_id, created_at DESC);

Now:

tenant_id is the most selective equality. One tenant out of thousands → very narrow slice of the index.
customer_id further narrows within that tenant.
created_at DESC matches the ORDER BY exactly — no separate sort step.
status = 'paid' is applied as a filter after, but it’s filtering a tiny set already.

EXPLAIN shows Index Scan with Rows Removed by Filter: small. Optimal.

Better: include status in the partial index

CREATE INDEX orders_tenant_customer_paid_created_idx
  ON orders (tenant_id, customer_id, created_at DESC)
  WHERE status = 'paid';

Combines the partial-index trick from the previous post with multi-column. Now even the status filter is built into the index. Fastest possible.

The four sub-rules for column ordering

The leftmost-prefix rule plus four corollaries that make composite indexes useful:

1. Most-selective equality first. Columns where each unique value narrows to few rows belong at the front. tenant_id (1 in 10K rows) before status (1 in 3 rows).

2. Range / inequality columns last. Once the planner hits a range condition (created_at > X), it can use the index up to that column but not for any column after. A composite (a, b, c) queried as WHERE a = X AND b > Y AND c = Z uses the index for a and b, then ignores c.

3. ORDER BY columns at the end, in matching order. If the query’s ORDER BY matches the trailing columns of the index, the planner can skip the sort. (tenant_id, created_at DESC) for ORDER BY created_at DESC saves a Sort node.

4. Use INCLUDE for non-key columns the query needs. Postgres 11+ has covering indexes via INCLUDE:

CREATE INDEX orders_idx
  ON orders (tenant_id, customer_id)
  INCLUDE (total_cents, created_at);

total_cents and created_at are stored in the index leaf pages but not used for filtering/sorting. The planner can return rows from the index without touching the heap — an “index-only scan.” Faster when the query only needs the included columns.

The “covering index” trick

Combined with INCLUDE, you get index-only scans:

CREATE INDEX orders_cover_idx
  ON orders (tenant_id, customer_id, created_at DESC)
  INCLUDE (id, total_cents)
  WHERE status = 'paid';

-- This query is now index-only — never touches the heap
SELECT id, total_cents, created_at
FROM orders
WHERE tenant_id = 7 AND customer_id = 12345 AND status = 'paid'
ORDER BY created_at DESC
LIMIT 50;

EXPLAIN shows Index Only Scan instead of Index Scan + Heap Fetches. Substantial perf win, especially for hot queries.

Caveat: index-only scan requires the visibility map to be up to date. Means VACUUM has to have run recently. If your EXPLAIN shows Heap Fetches: N > 0 despite an index-only-capable plan, run VACUUM on the table.

A small “is my composite right” checklist

Before creating a composite index, run through:

Does the leftmost column of the index appear with = in the query’s WHERE?
Are the subsequent columns also filtered (with = or range)?
If there’s an ORDER BY, does it match the trailing columns?
Are there columns the query returns but doesn’t filter on? (→ INCLUDE)
Is there a constant predicate (e.g. status = 'paid') that’s always present? (→ partial)

If you can check 3–4 of these, you’ve got the right composite. If only 1, you might be over-indexing.

Index combination — when multiple single-column indexes win

Surprise: Postgres can combine multiple single-column indexes via Bitmap Index Scan. So if your queries have very different shape combinations, sometimes 3 single-column indexes serve the workload better than 5 composites.

EXPLAIN ANALYZE
SELECT * FROM orders WHERE tenant_id = 7 AND status = 'paid';

-- Plan:
-- BitmapAnd
--   -> Bitmap Index Scan on orders_tenant_id_idx
--   -> Bitmap Index Scan on orders_status_idx
-- -> Bitmap Heap Scan on orders

Two B-tree indexes, planner combines results via BitmapAnd. Not as fast as a perfect composite, but a useful fallback when your query shapes are too varied to enumerate.

Common Pitfalls

Putting low-cardinality columns first. (status, tenant_id) with 3 status values forces the planner to scan ~33% of the index just to find your tenant. Almost always worse than (tenant_id, status).

Ignoring ORDER BY in the composite. A separate Sort step on millions of rows is expensive. If your query always sorts the same way, trail the index with that sort column.

Using composite indexes where a partial would do. If every important query has the same constant predicate (WHERE deleted_at IS NULL), a partial index achieves the same thing with much less storage.

Building 10 composites “to cover all cases.” Indexes have write cost. 3 well-designed indexes that cover 95% of queries usually beat 10 narrow indexes that each match exactly one query.

Forgetting INCLUDE exists. Adding the returned columns to the key portion of the index works but makes the index larger and sorts on those columns (useless work). INCLUDE adds them only to the leaf pages.

Assuming column order in CREATE INDEX is symmetric. It’s not. (a, b) is a different index than (b, a). The first helps queries on a or a, b; the second helps queries on b or b, a.

Wrapping Up

Multi-column indexes are the workhorse of OLTP performance. Get the column order right, throw in a partial predicate when there’s a constant filter, use INCLUDE for the returned columns, and most “slow query” problems go away. Next post (Mon Feb 21): we pivot from databases to CI/CD with GitHub Actions for Go monorepos . Half the month done, half to go.

The leftmost-prefix rule

A concrete example

Wrong: (status, customer_id, tenant_id)

Right: (tenant_id, customer_id, created_at DESC) with status filter applied after

Better: include status in the partial index

The four sub-rules for column ordering

The “covering index” trick

A small “is my composite right” checklist

Index combination — when multiple single-column indexes win

Common Pitfalls

Wrapping Up

Related posts

Postgres Partial Indexes That Make Slow Queries Disappear

Postgres Index Types, B-tree, GIN, GiST, BRIN, When Each One Wins

Postgres Bulk Loading, COPY and Prepared Statements

Postgres Autovacuum, When Defaults Fail and How to Fix Them

Tuning Postgres Memory, shared_buffers, work_mem, effective_cache_size

Reading the Postgres Query Planner, A Practical EXPLAIN Walkthrough

Postgres Performance Tuning, My 2022 Checklist

Reading pg_stat_io and Modern Postgres Internals

Let’s Start a Project

Wrong: `(status, customer_id, tenant_id)`

Right: `(tenant_id, customer_id, created_at DESC)` with status filter applied after