background-shape
Postgres Index Types, B-tree, GIN, GiST, BRIN, When Each One Wins
February 14, 2022 · 6 min read · by Muhammad Amal programming

TL;DR — B-tree for equality + range on scalar columns (90% of cases). GIN for full-text search, JSONB containment, array overlap. GiST for geometry, ranges, similarity. BRIN for huge tables with naturally ordered data (time-series, logs). Hash only for equality with no need for range. Pick the wrong type and your “index” makes things slower.

We’re shifting from server-side tuning to indexing for the rest of the month. Indexing is where most application-level Postgres wins live. The hard part isn’t creating indexes — CREATE INDEX is one line. The hard part is knowing which type to reach for. The defaults work for most queries; the non-defaults are essential for the queries the defaults can’t help.

This post tours all five Postgres 14 index types with real query examples. By the end you should be able to look at a query and pick the right index without guessing.

B-tree — the default, and right almost always

The index type you get when you write CREATE INDEX without specifying one. It indexes a single (or composite) scalar column, supports equality and range, and is what the planner reaches for first.

CREATE INDEX orders_customer_id_idx ON orders (customer_id);
CREATE INDEX orders_created_at_idx  ON orders (created_at);

What B-tree is good for:

  • =, <, <=, >, >=, BETWEEN, IN
  • IS NULL, IS NOT NULL (Postgres 12+ uses partial-index-style logic)
  • ORDER BY and GROUP BY on the indexed column
  • Pattern match LIKE 'prefix%' (anchored to start)

What B-tree is not good for:

  • LIKE '%suffix' (unanchored) — use a text_pattern_ops reverse index or pg_trgm
  • Full-text search — use GIN
  • JSONB containment (@>) — use GIN
  • Geometric / range types — use GiST

The vast majority of indexes you’ll ever create are B-tree. If you’re not sure, start with B-tree, then look at the EXPLAIN plan, then switch if needed.

GIN — for “what’s inside” queries

Generalized Inverted Index. Built for cases where each row has many “values” inside it (words in a document, keys in a JSONB blob, elements in an array). GIN indexes the inverse relationship: value → list of rows containing it.

Full-text search:

ALTER TABLE articles ADD COLUMN search tsvector
  GENERATED ALWAYS AS (to_tsvector('english', title || ' ' || body)) STORED;

CREATE INDEX articles_search_idx ON articles USING GIN (search);

-- Query
SELECT * FROM articles WHERE search @@ plainto_tsquery('english', 'postgres tuning');

JSONB containment:

CREATE INDEX events_data_idx ON events USING GIN (data);

-- Find events where data contains {"user_id": "abc"}
SELECT * FROM events WHERE data @> '{"user_id": "abc"}';

Array overlap:

CREATE INDEX posts_tags_idx ON posts USING GIN (tags);

-- Find posts tagged 'postgres' or 'indexing'
SELECT * FROM posts WHERE tags && ARRAY['postgres', 'indexing'];

GIN trade-offs:

  • Larger than B-tree, sometimes 2–5×
  • Slower to insert/update because of fastupdate machinery
  • Reads are very fast for the queries it supports

Use GIN when your queries match its operators (@@, @>, &&, ?). For everything else, B-tree.

GiST — for similarity and geometry

Generalized Search Tree. The one most people forget exists. Good for indexing types where “close to” is meaningful: geometric points, ranges, similarity strings.

Range types:

CREATE INDEX bookings_period_idx ON bookings USING GIST (period);

-- Find any booking overlapping a given time range
SELECT * FROM bookings WHERE period && tstzrange('2022-02-14', '2022-02-15');

Geometry (with PostGIS):

CREATE INDEX shops_location_idx ON shops USING GIST (geo);

-- Find shops within 5 km
SELECT * FROM shops
WHERE ST_DWithin(geo, ST_MakePoint(106.8, -6.2)::geography, 5000);

Trigram similarity (with pg_trgm):

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX customers_name_trgm_idx ON customers USING GIST (name gist_trgm_ops);

-- Find customers with names "similar to" Muhamad
SELECT * FROM customers WHERE name % 'Muhamad';

GiST trade-offs:

  • Slower to read than GIN for full-text-style queries
  • Faster to update than GIN
  • Generalises to many data types (it’s an “infrastructure” index, not a type-specific one)

Rule of thumb: GIN if updates are infrequent and reads are heavy; GiST if both happen often.

BRIN — for huge tables with sorted data

Block Range Index. Stores the min/max value for each block range (default: 128 pages = 1 MB on disk). The smallest possible index — kilobytes for a multi-GB table. The trade-off: it only helps if your data is naturally clustered by the indexed column.

CREATE INDEX events_created_at_brin
  ON events USING BRIN (created_at);

When BRIN wins:

  • Time-series data (events, logs) where rows are inserted roughly in timestamp order
  • Append-only tables
  • Tables with sequential primary keys queried by range

When BRIN loses:

  • Random-order data — BRIN becomes nearly useless
  • Heavy updates that scatter rows across blocks

A real example: a 500 GB events table where queries always filter by created_at range. BRIN index is 2 MB. B-tree on the same column would be 50 GB. BRIN takes longer per query than B-tree but is orders of magnitude cheaper to maintain and store.

Hash — almost never

Postgres has a Hash index type:

CREATE INDEX users_email_hash ON users USING HASH (email);

Supports only equality (=). No range, no ORDER BY, no anything else. B-tree supports equality too, and also everything else. There is almost no scenario where Hash is the right answer in 2022. Skip.

(Pre-Postgres-10 Hash indexes weren’t WAL-logged and were unsafe. They’re fixed now, but the use case still hasn’t materialized.)

Decision tree

The cheat sheet I keep in my head:

  1. Is the column a scalar, queried with =, <, >, BETWEEN, ORDER BY? → B-tree.
  2. Is the column JSONB, tsvector, or array, queried with @>, @@, &&? → GIN.
  3. Is the column a range, geometry, or trigram-similarity text? → GiST.
  4. Is the table huge and the data naturally ordered (timestamps, sequential IDs)? → BRIN.
  5. Anything else? → B-tree, until proven otherwise.

Verifying your index is being used

Don’t trust that you got it right. Verify:

EXPLAIN (ANALYZE, BUFFERS)
SELECT ... WHERE indexed_column = ...;

If the plan shows Seq Scan instead of Index Scan / Bitmap Index Scan, the planner doesn’t think your index helps. Reasons:

  • Stats are stale → ANALYZE table_name;
  • Selectivity is too low (>10% of rows match) → seq scan is genuinely faster
  • Wrong operator class — e.g., LIKE '%suffix' won’t use a normal B-tree
  • Column type mismatch — WHERE int_col = '42' (string) doesn’t always use the index

Common Pitfalls

Indexing every column “just in case.” Each index makes every INSERT/UPDATE slower and bloats backups. Index queries you actually run, not queries you imagine.

Using GIN where you needed GiST or vice versa. Specifically: GIN for arrays where the workload is heavily updating those arrays will hurt. GiST handles updates better at the cost of slightly slower reads.

Forgetting USING GIN / USING GIST. Default is B-tree. A CREATE INDEX on a tsvector column without USING GIN will succeed but not help your @@ queries.

Hash indexes in 2022. No.

BRIN on tables with random insert order. Useless. Verify with EXPLAIN first.

Not analyzing after creating an index. Postgres knows the index exists but the planner doesn’t have stats on its selectivity until you ANALYZE. Always analyze after creating an index on a hot table.

Wrapping Up

Five index types, four of them useful. Pick by query shape, not by gut. Next post (Wed Feb 16): partial indexes — the most underused optimisation in Postgres, and the one most likely to make a slow query go away with one line.