Databasesintermediate13 min read

SQL Database Indexing: A Practical Guide

Learn how database indexes work, the difference between B-tree and other index types, how to create composite indexes correctly, and why some queries don't use indexes.

What is a Database Index?

A database index is a separate data structure that the database maintains alongside your table to make certain queries faster. Think of it like the index at the back of a book: instead of reading every page to find a topic, you look it up in the index and jump directly to the right page.

Indexes speed up reads at the cost of writes (the index must be updated when data changes) and storage (the index takes additional disk space).

Use `EXPLAIN` (MySQL/PostgreSQL) or `EXPLAIN ANALYZE` (PostgreSQL) to see whether your query is using an index:

-- Check if a query uses an index
EXPLAIN ANALYZE
SELECT * FROM orders WHERE user_id = 42;

-- Output will show:
-- 'Index Scan using orders_user_id_idx' → index used ✓
-- 'Seq Scan on orders' → full table scan, no index ✗

How B-Tree Indexes Work

The default index type in PostgreSQL, MySQL, and SQLite is the B-tree (balanced tree). It's a tree structure where:

  • Every path from root to leaf is the same length (balanced)
  • Each node contains sorted keys and pointers to child nodes
  • Leaf nodes contain the actual index values and pointers to table rows

To find a value, the database starts at the root and descends through comparisons until reaching a leaf — an O(log n) operation versus O(n) for a full table scan.

B-trees support: - Equality lookups — `WHERE email = 'alice@example.com'` - Range queries — `WHERE age BETWEEN 25 AND 35` - Sorting — `ORDER BY created_at` can use a B-tree index to avoid a sort step - Prefix matching — `WHERE name LIKE 'Ali%'` (leading wildcard only)

Creating and Dropping Indexes

Basic index operations:

-- Standard index
CREATE INDEX idx_orders_user_id ON orders(user_id);

-- Unique index (also enforces uniqueness constraint)
CREATE UNIQUE INDEX idx_users_email ON users(email);

-- Partial index — only indexes rows matching a condition
-- Much smaller than a full index, faster for filtered queries
CREATE INDEX idx_orders_pending ON orders(created_at)
WHERE status = 'pending';

-- Create concurrently — no table lock (PostgreSQL only)
-- Takes longer, but other queries can run during creation
CREATE INDEX CONCURRENTLY idx_events_type ON events(type);

-- Drop an index
DROP INDEX IF EXISTS idx_orders_user_id;

-- List indexes on a table (PostgreSQL)
\d orders

Composite Indexes and Column Order

A composite (multi-column) index covers multiple columns. Column order is critical.

An index on `(a, b, c)` can be used for queries that filter on: - `a` alone - `a` and `b` - `a`, `b`, and `c`

But not for queries filtering on `b` alone, `c` alone, or `b` and `c`. This is the leftmost prefix rule.

For maximum benefit, put the most selective column first (the one that filters out the most rows). If 90% of your orders are `status = 'completed'` but each user has a unique `user_id`, put `user_id` first.

-- Query: find all open orders for a specific user
SELECT * FROM orders
WHERE user_id = 42 AND status = 'open';

-- Good index: user_id first (high selectivity)
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- This index also helps:
SELECT * FROM orders WHERE user_id = 42;  -- uses leftmost prefix

-- But NOT:
SELECT * FROM orders WHERE status = 'open';  -- status alone can't use the index

When Indexes Are Not Used

Indexes can be bypassed in ways that are easy to miss:

  • Function on indexed column — `WHERE LOWER(email) = 'alice@example.com'` doesn't use an index on `email`. Fix: create a function-based index or store the lowercased value.
  • Leading wildcard — `WHERE name LIKE '%alice%'` can't use a B-tree index; `LIKE 'alice%'` can.
  • Implicit type coercion — comparing a varchar column to an integer causes a type cast that bypasses the index.
  • `NOT IN` and `!=` — the query planner often chooses a seq scan because negation can match most rows.
  • NULL handling — `WHERE col IS NULL` may or may not use an index depending on the database and whether NULLs are indexed.
-- ✗ Won't use index on email
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';

-- ✓ Fix: create a function-based index
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'alice@example.com';  -- now uses index

-- ✗ Leading wildcard — full scan
SELECT * FROM products WHERE name LIKE '%laptop%';

-- ✓ Trailing wildcard — uses index
SELECT * FROM products WHERE name LIKE 'laptop%';

Formatting and Reviewing Queries with DevForge

Before analyzing a slow query with `EXPLAIN ANALYZE`, it helps to have it properly formatted. The DevForge SQL Formatter indents complex queries with JOINs, subqueries, and CTEs, making it much easier to understand what the query is doing and where indexes might help.

The SQL Escape tool is useful when constructing test queries with literal values during performance testing, ensuring special characters in test data don't accidentally break the query syntax.

Try it on DevForge

Free online tools related to this tutorial — no signup required.

Related Tutorials