Indexes and Performance
A database can satisfy a WHERE clause in two ways: scan every row, or use an index. Indexes are the single most important tool for making queries fast.
What an Index Is
An index is a separate data structure, maintained by the database, that lets it find rows by a column value without scanning the whole table. Most relational databases use a B-tree — a balanced tree with high fan-out that keeps lookup cost at O(log n) even for tables with billions of rows. Hash indexes (O(1) lookup but no range scans) and other types exist, but B-tree is the default.
When to Add an Index
Add an index on:
- Columns used in
WHEREclauses - Foreign-key columns (joins use them constantly)
- Columns you frequently
ORDER BY
The code example shows the conceptual speedup: looking up an email by hashing is constant time, while scanning the array is proportional to its length.
The Cost of Indexes
Indexes are not free. Every INSERT, UPDATE, or DELETE has to update every index on that table, which slows writes. They also use disk space. A table with many rarely-used indexes is wasteful — drop the ones you do not need.
Reading a Query Plan
EXPLAIN (or EXPLAIN ANALYZE) shows how the database will execute your query. The two operators you need to recognize are Seq Scan ("read every row") and Index Scan ("use the index"). If a query against a large table is slow, run EXPLAIN first — almost always you will find a missing index or a WHERE clause the planner could not optimize.
Try It Yourself
- Time a
SELECTagainst a 1-million-row table with and without an index on the filter column - Use
EXPLAINon a query and identify which operator is doing the most work - Drop an index and observe how
INSERTperformance improves