Indexes and Performance

A database can satisfy a WHERE clause in two ways: scan every row, or use an index. Indexes are the single most important tool for making queries fast.

What an Index Is

An index is a separate data structure, maintained by the database, that lets it find rows by a column value without scanning the whole table. Most relational databases use a B-tree — a balanced tree with high fan-out that keeps lookup cost at O(log n) even for tables with billions of rows. Hash indexes (O(1) lookup but no range scans) and other types exist, but B-tree is the default.

When to Add an Index

Add an index on:

Columns used in WHERE clauses
Foreign-key columns (joins use them constantly)
Columns you frequently ORDER BY

The code example shows the conceptual speedup: looking up an email by hashing is constant time, while scanning the array is proportional to its length.

The Cost of Indexes

Indexes are not free. Every INSERT, UPDATE, or DELETE has to update every index on that table, which slows writes. They also use disk space. A table with many rarely-used indexes is wasteful — drop the ones you do not need.

Reading a Query Plan

EXPLAIN (or EXPLAIN ANALYZE) shows how the database will execute your query. The two operators you need to recognize are Seq Scan ("read every row") and Index Scan ("use the index"). If a query against a large table is slow, run EXPLAIN first — almost always you will find a missing index or a WHERE clause the planner could not optimize.

Try It Yourself

Time a SELECT against a 1-million-row table with and without an index on the filter column
Use EXPLAIN on a query and identify which operator is doing the most work
Drop an index and observe how INSERT performance improves