Postgres Multi-Column Indexes, Column Order Is Almost Everything
TL;DR — Multi-column B-tree indexes work for queries that filter on a leftmost prefix of the index columns. Put the most selective equality column first, range columns last, and use
INCLUDEto add columns the query needs but doesn’t filter on. Wrong column order = useless index.
Last indexing post of the month. Multi-column indexes (also called composite or compound indexes) are one of those things that work great when you understand the rule and quietly fail when you don’t. The rule is “leftmost prefix” and most “my index isn’t being used” tickets resolve to a violation of it.
By the end of this post, you should be able to look at a composite index and tell which queries it will help, and look at a query and design the composite that fits it.
The leftmost-prefix rule
A B-tree index on (a, b, c) can be used by any query that filters on:
aa, ba, b, c
It cannot be used by queries that filter on:
balonecaloneb, c
The reason is structural. The index is sorted by a first, then within each a value by b, then within each (a, b) pair by c. You can binary-search down the tree if you know a. You can’t binary-search if you only know b — every value of a would need to be visited.
A concrete example
Pretend we run a multi-tenant SaaS with an orders table:
CREATE TABLE orders (
id bigserial PRIMARY KEY,
tenant_id bigint NOT NULL,
customer_id bigint NOT NULL,
status text NOT NULL,
total_cents bigint NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
The dashboard runs this query thousands of times a day:
SELECT id, total_cents, created_at
FROM orders
WHERE tenant_id = 7
AND customer_id = 12345
AND status = 'paid'
ORDER BY created_at DESC
LIMIT 50;
Which composite index?
Wrong: (status, customer_id, tenant_id)
CREATE INDEX orders_idx_wrong ON orders (status, customer_id, tenant_id);
The query filters by all three, so the index can be used. But:
statushas 3 values (pending,paid,refunded). Filtering bystatus = 'paid'cuts the index to ~50% of rows. Then it scans within that for matchingcustomer_id. Lots of index pages read.- The natural sort order of the index doesn’t match
ORDER BY created_at— nocreated_atin the index.
Right: (tenant_id, customer_id, created_at DESC) with status filter applied after
CREATE INDEX orders_tenant_customer_created_idx
ON orders (tenant_id, customer_id, created_at DESC);
Now:
tenant_idis the most selective equality. One tenant out of thousands → very narrow slice of the index.customer_idfurther narrows within that tenant.created_at DESCmatches theORDER BYexactly — no separate sort step.status = 'paid'is applied as a filter after, but it’s filtering a tiny set already.
EXPLAIN shows Index Scan with Rows Removed by Filter: small. Optimal.
Better: include status in the partial index
CREATE INDEX orders_tenant_customer_paid_created_idx
ON orders (tenant_id, customer_id, created_at DESC)
WHERE status = 'paid';
Combines the partial-index trick from the previous post with multi-column. Now even the status filter is built into the index. Fastest possible.
The four sub-rules for column ordering
The leftmost-prefix rule plus four corollaries that make composite indexes useful:
1. Most-selective equality first. Columns where each unique value narrows to few rows belong at the front. tenant_id (1 in 10K rows) before status (1 in 3 rows).
2. Range / inequality columns last. Once the planner hits a range condition (created_at > X), it can use the index up to that column but not for any column after. A composite (a, b, c) queried as WHERE a = X AND b > Y AND c = Z uses the index for a and b, then ignores c.
3. ORDER BY columns at the end, in matching order. If the query’s ORDER BY matches the trailing columns of the index, the planner can skip the sort. (tenant_id, created_at DESC) for ORDER BY created_at DESC saves a Sort node.
4. Use INCLUDE for non-key columns the query needs. Postgres 11+ has covering indexes via INCLUDE:
CREATE INDEX orders_idx
ON orders (tenant_id, customer_id)
INCLUDE (total_cents, created_at);
total_cents and created_at are stored in the index leaf pages but not used for filtering/sorting. The planner can return rows from the index without touching the heap — an “index-only scan.” Faster when the query only needs the included columns.
The “covering index” trick
Combined with INCLUDE, you get index-only scans:
CREATE INDEX orders_cover_idx
ON orders (tenant_id, customer_id, created_at DESC)
INCLUDE (id, total_cents)
WHERE status = 'paid';
-- This query is now index-only — never touches the heap
SELECT id, total_cents, created_at
FROM orders
WHERE tenant_id = 7 AND customer_id = 12345 AND status = 'paid'
ORDER BY created_at DESC
LIMIT 50;
EXPLAIN shows Index Only Scan instead of Index Scan + Heap Fetches. Substantial perf win, especially for hot queries.
Caveat: index-only scan requires the visibility map to be up to date. Means VACUUM has to have run recently. If your EXPLAIN shows Heap Fetches: N > 0 despite an index-only-capable plan, run VACUUM on the table.
A small “is my composite right” checklist
Before creating a composite index, run through:
- Does the leftmost column of the index appear with
=in the query’s WHERE? - Are the subsequent columns also filtered (with
=or range)? - If there’s an ORDER BY, does it match the trailing columns?
- Are there columns the query returns but doesn’t filter on? (→ INCLUDE)
- Is there a constant predicate (e.g.
status = 'paid') that’s always present? (→ partial)
If you can check 3–4 of these, you’ve got the right composite. If only 1, you might be over-indexing.
Index combination — when multiple single-column indexes win
Surprise: Postgres can combine multiple single-column indexes via Bitmap Index Scan. So if your queries have very different shape combinations, sometimes 3 single-column indexes serve the workload better than 5 composites.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE tenant_id = 7 AND status = 'paid';
-- Plan:
-- BitmapAnd
-- -> Bitmap Index Scan on orders_tenant_id_idx
-- -> Bitmap Index Scan on orders_status_idx
-- -> Bitmap Heap Scan on orders
Two B-tree indexes, planner combines results via BitmapAnd. Not as fast as a perfect composite, but a useful fallback when your query shapes are too varied to enumerate.
Common Pitfalls
Putting low-cardinality columns first. (status, tenant_id) with 3 status values forces the planner to scan ~33% of the index just to find your tenant. Almost always worse than (tenant_id, status).
Ignoring ORDER BY in the composite. A separate Sort step on millions of rows is expensive. If your query always sorts the same way, trail the index with that sort column.
Using composite indexes where a partial would do. If every important query has the same constant predicate (WHERE deleted_at IS NULL), a partial index achieves the same thing with much less storage.
Building 10 composites “to cover all cases.” Indexes have write cost. 3 well-designed indexes that cover 95% of queries usually beat 10 narrow indexes that each match exactly one query.
Forgetting INCLUDE exists. Adding the returned columns to the key portion of the index works but makes the index larger and sorts on those columns (useless work). INCLUDE adds them only to the leaf pages.
Assuming column order in CREATE INDEX is symmetric. It’s not. (a, b) is a different index than (b, a). The first helps queries on a or a, b; the second helps queries on b or b, a.
Wrapping Up
Multi-column indexes are the workhorse of OLTP performance. Get the column order right, throw in a partial predicate when there’s a constant filter, use INCLUDE for the returned columns, and most “slow query” problems go away. Next post (Mon Feb 21): we pivot from databases to CI/CD with GitHub Actions for Go monorepos. Half the month done, half to go.