Postgres Index Optimization: A Complete Guide to B-tree, GIN, and Composite Indexes
When your Postgres database starts slowing down under load, the culprit is often missing or poorly configured indexes. In this comprehensive guide, you’ll learn how to diagnose slow queries using EXPLAIN ANALYZE and implement the right indexing strategy for your workload.
The Performance Crisis: Why Indexes Matter
Let’s start with a real-world scenario. You’ve built a SaaS application on Supabase, and everything runs smoothly in development. But as you scale to 10,000 users, queries that took 50ms now take 3 seconds. Your application is grinding to a halt.
The problem? Sequential scans on large tables. Without proper indexes, Postgres must read every single row to find matches. With indexes, it can jump directly to the relevant data.
Performance Impact:
- Without index: 3,200ms for 100,000 rows
- With B-tree index: 12ms (266x faster)
- With optimized composite index: 3ms (1,066x faster)
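These numbers are illustrative, but you can check whether your own tables suffer from the same problem. Postgres tracks per-table scan counters in pg_stat_user_tables; a quick sketch of how one might spot seq-scan-heavy tables:
-- Tables with the most rows read via sequential scans
SELECT relname, seq_scan, seq_tup_read, idx_scan
FROM pg_stat_user_tables
ORDER BY seq_tup_read DESC
LIMIT 10;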
Understanding EXPLAIN ANALYZE: Your Diagnostic Tool
Before creating indexes, you need to understand what’s actually slow. EXPLAIN ANALYZE is your X-ray vision into query execution.
Basic EXPLAIN ANALYZE Syntax
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'user@example.com';
Reading EXPLAIN ANALYZE Output
Let’s break down a real output:
-- Query WITHOUT index
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 12345 AND status = 'pending';
-- Output:
Seq Scan on orders (cost=0.00..18334.00 rows=5 width=120) (actual time=145.234..2847.123 rows=5 loops=1)
Filter: ((user_id = 12345) AND (status = 'pending'))
Rows Removed by Filter: 999995
Planning Time: 0.234 ms
Execution Time: 2847.456 ms
Key Metrics to Watch:
- Seq Scan - Sequential scan (BAD for large tables)
- actual time - Real execution time in milliseconds
- rows - Estimated vs actual rows
- Rows Removed by Filter - Wasted work (999,995 rows scanned but discarded!)
- Execution Time - Total query time
This query scanned 1 million rows to return 5 results. That’s a 99.9995% waste of resources.
EXPLAIN ANALYZE with Index
-- After creating index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 12345 AND status = 'pending';
-- Output:
Index Scan using idx_orders_user_status on orders (cost=0.42..8.45 rows=5 width=120) (actual time=0.123..0.234 rows=5 loops=1)
Index Cond: ((user_id = 12345) AND (status = 'pending'))
Planning Time: 0.134 ms
Execution Time: 0.289 ms
Results:
- Execution time: 2847ms → 0.29ms (9,817x faster!)
- Method changed from Seq Scan to Index Scan
- Zero rows removed by filter (perfect targeting)
Index Types in Postgres: Choosing the Right Tool
Postgres offers multiple index types. Choosing the wrong type is like using a hammer to tighten a screw - it won’t work well.
1. B-tree Indexes: The Default Workhorse
Best For:
- Equality comparisons (=)
- Range queries (<, >, BETWEEN)
- Sorting (ORDER BY)
- Pattern matching with left-anchored patterns (LIKE 'prefix%')
Use Cases:
- User lookups by ID or email
- Date range filtering
- Numeric comparisons
- Foreign key relationships
Creating B-tree Indexes
-- Single column index
CREATE INDEX idx_users_email ON users(email);
-- Multiple indexes for different query patterns
CREATE INDEX idx_orders_created_at ON orders(created_at);
CREATE INDEX idx_orders_total ON orders(total_amount);
-- Index with expression
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
B-tree Performance Example
-- Before index
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'john@example.com';
-- Seq Scan: 1,234ms for 500,000 rows
-- Create index
CREATE INDEX idx_users_email ON users(email);
-- After index
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'john@example.com';
-- Index Scan: 0.234ms (5,274x faster)
2. GIN Indexes: Full-Text Search and Arrays
Best For:
- Full-text search
- JSONB queries
- Array containment (@>, <@)
- Text search with tsvector
Use Cases:
- Searching product descriptions
- Filtering by tags (array columns)
- JSONB attribute queries
- Multi-language text search
Creating GIN Indexes
-- JSONB index
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
-- Array index
CREATE INDEX idx_posts_tags ON posts USING GIN (tags);
-- Full-text search index
CREATE INDEX idx_articles_search ON articles USING GIN (to_tsvector('english', content));
-- GIN index with the jsonb_path_ops operator class (smaller and faster than
-- the default, but it supports only containment (@>) queries)
CREATE INDEX idx_users_preferences ON users USING GIN (preferences jsonb_path_ops);
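To exercise the array index above, containment queries use the @> and <@ operators (assuming posts.tags is a text[] column, as implied above):
-- Posts tagged with both 'postgres' and 'performance'
SELECT * FROM posts WHERE tags @> ARRAY['postgres', 'performance'];
-- Posts whose tags all fall within a given set
SELECT * FROM posts WHERE tags <@ ARRAY['postgres', 'performance', 'indexing'];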
GIN Performance Example
-- Product search in JSONB metadata
-- Without index
EXPLAIN ANALYZE
SELECT * FROM products
WHERE metadata @> '{"category": "electronics", "brand": "Apple"}';
-- Seq Scan: 4,567ms
-- Create GIN index
CREATE INDEX idx_products_metadata ON products USING GIN (metadata);
-- With index
EXPLAIN ANALYZE
SELECT * FROM products
WHERE metadata @> '{"category": "electronics", "brand": "Apple"}';
-- Bitmap Index Scan using idx_products_metadata: 23ms (198x faster)
Full-Text Search with GIN
-- Create tsvector column for better performance
ALTER TABLE articles ADD COLUMN search_vector tsvector;
-- Populate search vector (coalesce guards against NULL title or content,
-- which would otherwise make the whole concatenation NULL)
UPDATE articles
SET search_vector = to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''));
-- Create GIN index on the stored column (named differently from the
-- expression index created earlier)
CREATE INDEX idx_articles_search_vector ON articles USING GIN (search_vector);
-- Create trigger to keep it updated
CREATE TRIGGER articles_search_update
BEFORE INSERT OR UPDATE ON articles
FOR EACH ROW EXECUTE FUNCTION
tsvector_update_trigger(search_vector, 'pg_catalog.english', title, content);
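-- On Postgres 12+, a generated column is a simpler alternative to the trigger
-- (a variant shown for reference - use it INSTEAD of the manual column and
-- trigger above, not alongside them):
-- ALTER TABLE articles ADD COLUMN search_vector tsvector
--   GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, '') || ' ' || coalesce(content, ''))) STORED;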
-- Query with full-text search
EXPLAIN ANALYZE
SELECT * FROM articles
WHERE search_vector @@ to_tsquery('english', 'postgres & performance');
-- Index Scan: 12ms for complex text search across 100,000 articles
3. Composite Indexes: Multiple Columns, One Index
Best For:
- Queries filtering on multiple columns
- Queries with specific column order in WHERE clause
- Covering indexes (index-only scans)
Column Order Matters!
- Put high-selectivity columns first
- Lead with equality-filtered columns; put range and sort columns last
- Consider your most common queries (the leftmost-prefix sketch below shows why)
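The reason order matters is the leftmost-prefix rule: a composite index can serve any query that filters on a leading prefix of its columns, but not one that skips the first column. A minimal sketch, assuming an index on orders(user_id, created_at):
-- Given: CREATE INDEX ON orders(user_id, created_at);
SELECT * FROM orders WHERE user_id = 42;                               -- can use the index (leading column)
SELECT * FROM orders WHERE user_id = 42 AND created_at > '2024-01-01'; -- can use the index (full prefix)
SELECT * FROM orders WHERE created_at > '2024-01-01';                  -- skips user_id: typically a Seq Scan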
Creating Composite Indexes
-- Basic composite index
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Three-column composite
CREATE INDEX idx_events_user_type_date ON events(user_id, event_type, created_at);
-- Composite with INCLUDE for covering index
CREATE INDEX idx_users_email_include ON users(email) INCLUDE (first_name, last_name);
-- Partial composite index (filtered)
CREATE INDEX idx_active_orders ON orders(user_id, created_at)
WHERE status = 'active';
Composite Index: Column Order Impact
-- Query pattern: Filter by user_id, then sort by date
SELECT * FROM orders
WHERE user_id = 12345
ORDER BY created_at DESC
LIMIT 10;
-- GOOD: Matches query pattern
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Index Scan: 0.234ms
-- BAD: Wrong column order
CREATE INDEX idx_orders_date_user ON orders(created_at, user_id);
-- Seq Scan: 234ms (index exists but can't be used efficiently)
Covering Indexes: Index-Only Scans
-- Query needs email, first_name, last_name
SELECT first_name, last_name
FROM users
WHERE email = 'user@example.com';
-- Without INCLUDE: Index Scan + Table Lookup
CREATE INDEX idx_users_email ON users(email);
-- Execution: 0.345ms (index scan + heap fetch)
-- With INCLUDE: Index-Only Scan (no table access needed)
CREATE INDEX idx_users_email_include ON users(email)
INCLUDE (first_name, last_name);
-- Execution: 0.089ms (74% faster, no heap access)
-- Verify with EXPLAIN ANALYZE
EXPLAIN ANALYZE
SELECT first_name, last_name
FROM users
WHERE email = 'user@example.com';
-- Output: "Index Only Scan using idx_users_email_include"
Practical Indexing Strategies
Strategy 1: Start with High-Impact Queries
-- Find your slowest queries
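-- (pg_stat_statements ships as an extension: CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
--  Supabase enables it by default; self-hosted Postgres must also add it to shared_preload_libraries)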
SELECT
query,
calls,
total_exec_time,
mean_exec_time,
stddev_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Strategy 2: Index Foreign Keys
-- ALWAYS index foreign keys: Postgres indexes the referenced primary key,
-- but never creates an index on the referencing column automatically
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_order_items_order_id ON order_items(order_id);
CREATE INDEX idx_order_items_product_id ON order_items(product_id);
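Postgres won't warn you about unindexed foreign keys, so it helps to list every FK constraint and cross-check it against your indexes. A sketch using the system catalogs (the cross-check against pg_indexes is left manual):
-- List all foreign key constraints with their definitions
SELECT conrelid::regclass AS table_name,
       conname AS constraint_name,
       pg_get_constraintdef(oid) AS fk_definition
FROM pg_constraint
WHERE contype = 'f'
ORDER BY conrelid::regclass::text;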
Strategy 3: Partial Indexes for Filtered Queries
-- Only index active records
CREATE INDEX idx_active_users ON users(email)
WHERE deleted_at IS NULL;
-- Only index recent orders (a partial-index predicate must be immutable,
-- so use a fixed cutoff date; NOW() is not allowed here)
CREATE INDEX idx_recent_orders ON orders(user_id, created_at)
WHERE created_at > '2024-01-01';
-- Only index specific status
CREATE INDEX idx_pending_orders ON orders(user_id)
WHERE status = 'pending';
Strategy 4: Expression Indexes for Computed Values
-- Case-insensitive email search
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
-- Query using the index
SELECT * FROM users WHERE LOWER(email) = LOWER('User@Example.com');
-- Date part extraction (immutable on timestamp columns; on timestamptz the
-- expression is only STABLE, and Postgres will reject the index)
CREATE INDEX idx_orders_month ON orders(EXTRACT(MONTH FROM created_at));
-- JSONB attribute extraction (the cast expression needs its own parentheses)
CREATE INDEX idx_products_price ON products(((metadata->>'price')::numeric));
Real-World Case Study: E-commerce Order System
Problem: Slow Order Dashboard
Scenario:
- 5 million orders table
- Dashboard query taking 8 seconds
- Filtering by user, status, and date range
-- Original query (SLOW)
EXPLAIN ANALYZE
SELECT
o.id,
o.order_number,
o.created_at,
o.status,
o.total_amount,
u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.user_id = 12345
AND o.status IN ('pending', 'processing')
AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY o.created_at DESC
LIMIT 25;
-- Output:
Hash Join (cost=1234.56..890234.67 rows=250 width=200) (actual time=3456.123..8234.567 rows=25 loops=1)
Hash Cond: (o.user_id = u.id)
-> Seq Scan on orders o (cost=0.00..887654.00 rows=250 width=180) (actual time=2345.234..7890.123 rows=25 loops=1)
Filter: ((user_id = 12345) AND (status = ANY ('{pending,processing}')) AND (created_at >= (now() - '30 days'::interval)))
Rows Removed by Filter: 4999975
-> Hash (cost=1234.00..1234.00 rows=45 width=20) (actual time=12.345..12.345 rows=1 loops=1)
-> Index Scan using users_pkey on users u (cost=0.42..1234.00 rows=45 width=20) (actual time=0.123..12.234 rows=1 loops=1)
Execution Time: 8234.789 ms
Problems Identified:
- Sequential scan on 5M rows orders table
- 4,999,975 rows filtered out (massive waste)
- No index on (user_id, status, created_at)
Solution: Strategic Composite Index
-- Create optimized composite index
CREATE INDEX idx_orders_dashboard ON orders(user_id, status, created_at DESC)
WHERE status IN ('pending', 'processing');
-- Also ensure users.id has an index (should be primary key already)
-- And create covering index if we frequently access these columns
CREATE INDEX idx_orders_dashboard_covering ON orders(user_id, status, created_at DESC)
INCLUDE (order_number, total_amount);
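Note that the covering variant can serve every query the narrower index can, so in practice you would keep only one of the two (see the over-indexing mistake later in this guide):
-- The covering index makes the narrower one redundant
DROP INDEX IF EXISTS idx_orders_dashboard;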
After Optimization
EXPLAIN ANALYZE
SELECT
o.id,
o.order_number,
o.created_at,
o.status,
o.total_amount,
u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.user_id = 12345
AND o.status IN ('pending', 'processing')
AND o.created_at >= NOW() - INTERVAL '30 days'
ORDER BY o.created_at DESC
LIMIT 25;
-- Output:
Nested Loop (cost=0.84..156.23 rows=25 width=200) (actual time=0.234..2.456 rows=25 loops=1)
-> Index Scan using idx_orders_dashboard_covering on orders o (cost=0.42..123.45 rows=25 width=180) (actual time=0.123..1.234 rows=25 loops=1)
Index Cond: ((user_id = 12345) AND (status = ANY ('{pending,processing}')) AND (created_at >= (now() - '30 days'::interval)))
-> Index Scan using users_pkey on users u (cost=0.42..1.31 rows=1 width=20) (actual time=0.012..0.012 rows=1 loops=25)
Index Cond: (id = o.user_id)
Execution Time: 2.567 ms
Results:
- 8,234ms → 2.6ms (roughly 3,200x faster!)
- Method changed from Seq Scan to Index Scan
- Zero rows removed by filter - the index condition does all the targeting
- The INCLUDE columns set up index-only scans for queries that read only indexed columns
Advanced Indexing Techniques
1. Index Maintenance and Monitoring
-- Check index usage statistics
SELECT
schemaname,
tablename,
indexname,
idx_scan as index_scans,
idx_tup_read as tuples_read,
idx_tup_fetch as tuples_fetched
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;
-- Find unused indexes (candidates for removal)
SELECT
schemaname,
tablename,
indexname,
idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexname NOT LIKE '%_pkey';
-- Check index size
SELECT
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
-- Identify bloated indexes
SELECT
schemaname,
tablename,
indexname,
pg_size_pretty(pg_relation_size(indexrelid)) as size,
idx_scan,
CASE WHEN idx_scan = 0 THEN 'UNUSED'
WHEN pg_relation_size(indexrelid) > 100000000 THEN 'LARGE'
ELSE 'OK'
END as status
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;
2. REINDEX to Rebuild Bloated Indexes
-- Reindex a specific index
REINDEX INDEX CONCURRENTLY idx_orders_user_status;
-- Reindex entire table
REINDEX TABLE CONCURRENTLY orders;
-- Reindex database (use carefully)
REINDEX DATABASE CONCURRENTLY your_database_name;
3. Concurrent Index Creation
-- Create index without blocking writes (production-safe)
CREATE INDEX CONCURRENTLY idx_orders_status ON orders(status);
-- Drop index without blocking
DROP INDEX CONCURRENTLY idx_old_index;
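One caveat: CREATE INDEX CONCURRENTLY cannot run inside a transaction block, and a failed build leaves behind an INVALID index that still adds write overhead. You can find such leftovers in pg_index:
-- Find invalid indexes left behind by failed CONCURRENTLY builds
SELECT indexrelid::regclass AS index_name
FROM pg_index
WHERE NOT indisvalid;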
Common Indexing Mistakes to Avoid
Mistake 1: Over-Indexing
-- BAD: Too many overlapping indexes
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_email_status ON users(email, status);
CREATE INDEX idx_users_email_created ON users(email, created_at);
-- GOOD: Strategic composite that serves multiple queries
CREATE INDEX idx_users_email_multi ON users(email, status, created_at);
Why it matters:
- Each index slows down INSERT/UPDATE/DELETE operations
- Indexes consume disk space
- Maintenance overhead increases
Mistake 2: Wrong Column Order in Composite Index
-- Query pattern
SELECT * FROM orders WHERE status = 'pending' AND user_id = 12345;
-- BAD: Low-selectivity column first
CREATE INDEX idx_orders_wrong ON orders(status, user_id);
-- status has low selectivity (only a handful of possible values)
-- GOOD: High-selectivity column first
CREATE INDEX idx_orders_correct ON orders(user_id, status);
-- user_id is highly selective (thousands of unique values); leading with it
-- also lets the index serve queries that filter on user_id alone
Mistake 3: Not Using Partial Indexes
-- BAD: Indexing all rows including deleted
CREATE INDEX idx_users_email ON users(email);
-- Wastes space on soft-deleted users
-- GOOD: Only index active users
CREATE INDEX idx_active_users_email ON users(email)
WHERE deleted_at IS NULL;
-- Smaller, faster, more efficient
Mistake 4: Ignoring Index Maintenance
-- Schedule regular analysis
ANALYZE orders;
-- Enable auto-vacuum (should be default)
ALTER TABLE orders SET (autovacuum_enabled = true);
-- Check for bloat and reindex periodically
REINDEX TABLE CONCURRENTLY orders;
Supabase-Specific Indexing Considerations
1. RLS (Row Level Security) and Indexes
-- RLS policies add their USING predicate to every query, and that predicate
-- CAN use indexes - so create indexes that match your RLS conditions
-- RLS Policy
CREATE POLICY "Users see own orders" ON orders
FOR SELECT USING (user_id = auth.uid());
-- Matching index
CREATE INDEX idx_orders_user_auth ON orders(user_id)
WHERE user_id IS NOT NULL;
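A related Supabase tip: wrapping auth.uid() in a scalar subquery lets the planner evaluate it once per statement instead of once per row, which pairs well with the index above. A sketch of the recommended pattern:
-- Recreate the policy with auth.uid() wrapped in a subquery
DROP POLICY "Users see own orders" ON orders;
CREATE POLICY "Users see own orders" ON orders
FOR SELECT USING (user_id = (SELECT auth.uid()));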
2. Realtime Subscriptions
-- Optimize realtime channel queries
CREATE INDEX idx_messages_room_created ON messages(room_id, created_at DESC)
WHERE deleted_at IS NULL;
3. Full-Text Search with Postgres and Supabase
-- Create search function
CREATE OR REPLACE FUNCTION search_products(search_term TEXT)
RETURNS TABLE (
id UUID,
name TEXT,
description TEXT,
rank REAL
) AS $$
BEGIN
RETURN QUERY
SELECT
p.id,
p.name,
p.description,
ts_rank(p.search_vector, query) as rank
FROM products p,
to_tsquery('english', search_term) query
WHERE p.search_vector @@ query
ORDER BY ts_rank(p.search_vector, query) DESC; -- "ORDER BY rank" is ambiguous with the OUT parameter in plpgsql
END;
$$ LANGUAGE plpgsql;
-- Create index for the search
CREATE INDEX idx_products_search ON products USING GIN (search_vector);
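Calling the function is then a one-liner. Note that to_tsquery expects operator syntax (& for AND, | for OR); websearch_to_tsquery is a more forgiving parser if you want to accept plain user input. The search term below is just an example:
-- Search and rank products
SELECT * FROM search_products('wireless & headphones');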
Performance Testing and Validation
Benchmarking Your Indexes
-- Enable timing
\timing on
-- Test without index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 12345;
-- Create index
CREATE INDEX idx_orders_user ON orders(user_id);
-- Test with index
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE user_id = 12345;
-- Compare results
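-- In the BUFFERS output, lines like "Buffers: shared hit=12 read=3" show pages
-- served from cache (hit) vs disk (read); a good index should cut both counts
-- dramatically compared to the sequential scan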
Load Testing
-- Generate test data
INSERT INTO orders (user_id, status, total_amount, created_at)
SELECT
(random() * 100000)::int as user_id,
(ARRAY['pending', 'processing', 'completed', 'cancelled'])[floor(random() * 4 + 1)::int] as status,
(random() * 1000)::numeric(10,2) as total_amount,
NOW() - (random() * 365 || ' days')::interval as created_at
FROM generate_series(1, 1000000);
-- Run your queries and measure
Index Strategy Checklist
✅ Before Creating an Index:
- Run EXPLAIN ANALYZE on your slow query
- Identify whether it’s doing a Seq Scan on a large table
- Check if relevant columns have high cardinality
- Verify the query pattern is consistent
- Check existing indexes for overlap
✅ When Creating an Index:
- Use CONCURRENTLY in production
- Choose appropriate index type (B-tree, GIN, etc.)
- Order composite index columns by selectivity
- Consider partial indexes for filtered queries
- Use INCLUDE for covering indexes
✅ After Creating an Index:
- Run EXPLAIN ANALYZE to verify index usage
- Monitor index size and scan statistics
- Check impact on write operations
- Schedule periodic REINDEX if needed
- Review unused indexes monthly
Conclusion
Proper indexing can transform your Postgres database from a performance bottleneck into a speed demon. Here’s what we covered:
- EXPLAIN ANALYZE is your diagnostic tool - use it religiously
- B-tree indexes for equality, ranges, and sorting
- GIN indexes for full-text search, JSONB, and arrays
- Composite indexes for multi-column queries (column order matters!)
- Covering indexes eliminate table lookups entirely
- Partial indexes reduce size and improve performance
- Index maintenance prevents bloat and keeps performance high
Remember: Measure first, optimize second, validate always. Every database is different, so use EXPLAIN ANALYZE to guide your indexing decisions.
Next Steps
- Audit your slowest queries with pg_stat_statements
- Run EXPLAIN ANALYZE on each slow query
- Create targeted indexes based on query patterns
- Monitor index usage with pg_stat_user_indexes
- Remove unused indexes to reduce overhead
Your users will thank you when their 8-second queries drop to 8 milliseconds.
Additional Resources
- Official Postgres Indexing Documentation
- Supabase Performance Best Practices
- Understanding EXPLAIN Plans
- GIN Index Internals
Have questions about indexing your specific use case? Drop them in the comments below.