Database Query Optimization: Indexes, Execution Plans & Partitioning

Optimize PostgreSQL performance with proper indexing, EXPLAIN ANALYZE reading, N+1 detection, and partitioning strategies for growing datasets.

ECOSIRE Research and Development Team
March 15, 2026 · 10 min read · 2.2k words

Part of our Performance & Scalability series



A single missing index can turn a 2-millisecond query into a 20-second table scan. As your database grows from thousands to millions of rows, the difference between an optimized and unoptimized query is the difference between a responsive application and one that times out under load. Database optimization delivers the highest return on engineering time of any performance work you can do.

Key Takeaways

  • EXPLAIN ANALYZE is your most powerful diagnostic tool -- learn to read execution plans before optimizing anything
  • Choose index types strategically: B-tree for equality and range, GIN for full-text and JSONB, partial indexes for filtered subsets
  • N+1 queries are the most common performance killer in ORM-based applications -- detect them early with query logging
  • Table partitioning becomes essential when tables exceed 10-50 million rows, reducing query planning time and enabling efficient data lifecycle management

Reading Execution Plans with EXPLAIN ANALYZE

Before optimizing any query, you must understand how PostgreSQL currently executes it. EXPLAIN ANALYZE runs the query and shows the actual execution plan with real timing data.

A basic EXPLAIN ANALYZE output shows you the planner's chosen strategy, the estimated versus actual row counts, and the time spent at each step. The key metrics to focus on are:

  • Seq Scan -- the database reads every row in the table. Acceptable for small tables (under 10,000 rows) but a red flag for larger ones.
  • Index Scan -- the database uses an index to find matching rows efficiently. This is what you want for filtered queries on large tables.
  • Index Only Scan -- the database answers the query entirely from the index without touching the table. The fastest scan type.
  • Nested Loop -- joins tables by scanning the inner table once per row in the outer table. Efficient when the inner scan uses an index.
  • Hash Join -- builds a hash table from one side of the join, then probes it with the other. Efficient for larger result sets.
  • Sort -- an explicit sort step, often for ORDER BY. Watch for sorts that spill to disk (indicated by "Sort Method: external merge").
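To make these node types concrete, here is a minimal sketch of running EXPLAIN ANALYZE against a hypothetical orders table with an index on customer_id. The plan shown in comments is illustrative of the output shape only; actual costs, timings, and row counts depend entirely on your data and hardware.

```sql
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;

-- Illustrative output shape:
--  Index Scan using idx_orders_customer_id on orders
--    (cost=0.43..8.45 rows=3 width=120)
--    (actual time=0.021..0.025 rows=3 loops=1)
--    Index Cond: (customer_id = 42)
--  Planning Time: 0.110 ms
--  Execution Time: 0.049 ms
```

Note the two row counts: "rows=3" in the cost line is the planner's estimate, while "rows=3" in the actual line is what really came back. The gap between them is the first thing to check, as discussed below.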

What to Look For

The most important signal in an execution plan is the gap between estimated and actual rows. When PostgreSQL estimates 10 rows but finds 100,000, it chose the wrong plan. This happens when table statistics are stale -- run ANALYZE on the table to update them.

Watch for sequential scans on large tables, sorts without indexes, and nested loops with sequential scans on the inner table. Each of these patterns indicates a missing index or a query that needs rewriting.


Index Types and When to Use Them

PostgreSQL offers several index types, each optimized for different query patterns. Choosing the right type is critical -- a GIN index on a column that only needs equality checks wastes storage and slows writes without improving reads.

| Index Type | Best For | Example Use Case | Storage Overhead |
|---|---|---|---|
| B-tree (default) | Equality, range, sorting, LIKE prefix | WHERE status = 'active', WHERE created_at > '2026-01-01' | Low to moderate |
| Hash | Equality only (no range) | WHERE uuid = '...' (rare; B-tree usually sufficient) | Low |
| GIN (Generalized Inverted) | Full-text search, JSONB containment, arrays | JSONB containment queries, full-text search, array operations | High |
| GiST (Generalized Search Tree) | Geometric data, range types, nearest-neighbor | Nearest-neighbor spatial queries, range type overlap queries | Moderate |
| BRIN (Block Range Index) | Naturally ordered data (timestamps, sequences) | WHERE created_at BETWEEN '2026-01-01' AND '2026-01-31' on append-only tables | Very low |
| Partial | Filtered subsets of data | WHERE status = 'pending' (index only pending rows) | Low |

B-tree Indexes

B-tree is the default and most versatile index type. It supports equality, range comparisons (less than, greater than, BETWEEN), sorting (ORDER BY), and prefix pattern matching (LIKE 'abc%'). For most columns in WHERE, JOIN, and ORDER BY clauses, a B-tree index is the correct choice.

Composite indexes combine multiple columns into a single B-tree. Column order matters: the index on (status, created_at) efficiently supports queries filtering on status alone or on both status and created_at, but not on created_at alone. Place the most selective column first and the column used for range filtering last.
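A sketch of the column-order rule, using a hypothetical orders table:

```sql
-- Composite index: status first (equality filter), created_at last (range filter).
CREATE INDEX idx_orders_status_created_at
    ON orders (status, created_at);

-- Served by the index: leading column matched, range on the second.
SELECT * FROM orders
WHERE status = 'active' AND created_at > '2026-01-01';

-- Also served: leading column alone.
SELECT * FROM orders WHERE status = 'active';

-- NOT served efficiently: the leading column is not filtered.
SELECT * FROM orders WHERE created_at > '2026-01-01';
```

If the last query is also frequent, it needs its own index on (created_at).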

GIN Indexes

GIN indexes excel at searching within composite values. They are essential for full-text search (tsvector columns), JSONB containment queries, and array overlap queries. GIN indexes are larger and slower to update than B-tree indexes, so use them only where B-tree cannot serve the query pattern.

For JSONB columns that store flexible attributes, a GIN index on the entire column supports any key-based query. For columns where you only query specific keys, a B-tree index on a generated column or expression is more efficient.
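A sketch of both options against a hypothetical products table with a JSONB attributes column:

```sql
-- GIN index on the whole column: supports containment (@>) on any key.
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);

SELECT * FROM products WHERE attributes @> '{"color": "red"}';

-- If only one key is ever queried, a B-tree expression index is
-- smaller and cheaper to maintain:
CREATE INDEX idx_products_color ON products ((attributes->>'color'));

SELECT * FROM products WHERE attributes->>'color' = 'red';
```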

Partial Indexes

Partial indexes only index rows matching a WHERE condition. They are powerful for tables where queries consistently filter for a small subset of data.

For example, if your orders table has 10 million rows but you almost exclusively query active orders (5% of the table), a partial index on (customer_id, created_at) WHERE status = 'active' is 20x smaller than a full index and just as fast for your actual queries.
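The example above can be sketched as:

```sql
-- Index only the ~5% of rows that are active.
CREATE INDEX idx_orders_active_customer
    ON orders (customer_id, created_at)
    WHERE status = 'active';

-- The query's WHERE clause must imply the index predicate for the
-- planner to use the partial index:
SELECT * FROM orders
WHERE status = 'active' AND customer_id = 42
ORDER BY created_at DESC;
```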


Detecting and Fixing N+1 Queries

The N+1 query problem is the most common performance issue in applications using ORMs. It occurs when code loads a list of N records, then executes one additional query per record to load related data, resulting in N+1 total queries instead of 1-2.

How N+1 Queries Happen

Consider loading a list of orders with their customer names. A naive implementation loads the orders list (1 query), then for each order, loads the customer (N queries). With 100 orders, this generates 101 database round trips. At 1ms per query, that is 101ms -- but under concurrent load with connection pool contention, it can easily become 500ms or more.

Detection Methods

  1. Query logging -- enable PostgreSQL query logging temporarily and look for repeated identical queries with different parameter values
  2. ORM-level logging -- Drizzle ORM, Prisma, and TypeORM all support query logging that shows every SQL statement executed
  3. APM tools -- Datadog, New Relic, and Sentry can group queries by endpoint and highlight N+1 patterns automatically
  4. pg_stat_statements -- this PostgreSQL extension tracks query execution statistics and reveals frequently executed identical query templates

Fixing N+1 Queries

The fix depends on your ORM and query pattern:

  • Eager loading -- tell the ORM to load related data in the initial query using JOINs. In Drizzle, use the with option in query builders.
  • Batch loading -- collect all foreign key IDs, then load related records in a single WHERE id IN (...) query. This is the DataLoader pattern.
  • Denormalization -- for read-heavy use cases, store the related data directly on the parent record. Trade write complexity for read performance.
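In SQL terms, the batch-loading and eager-loading fixes look like this (table and column names are hypothetical):

```sql
-- N+1 pattern: this runs once per order, with a different parameter each time.
-- SELECT * FROM customers WHERE id = $1;

-- Batch loading: collect the foreign keys, load them in one round trip.
SELECT * FROM customers WHERE id IN (1, 5, 9, 12);

-- Eager loading: fix it at the source with a JOIN in the initial query.
SELECT o.id, o.total, c.name
FROM orders o
JOIN customers c ON c.id = o.customer_id
ORDER BY o.created_at DESC
LIMIT 100;
```

Either form replaces 101 round trips with one or two.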

Query Rewriting Techniques

Sometimes the query itself needs restructuring, not just better indexes.

Subquery to JOIN Conversion

Correlated subqueries execute once per row in the outer query. Converting them to JOINs allows PostgreSQL to use more efficient join strategies.

Instead of selecting orders with a subquery that looks up the latest order date per customer, rewrite it as a JOIN with a derived table or a window function. The JOIN version allows PostgreSQL to choose between nested loop, hash join, and merge join based on data distribution.
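A sketch of that rewrite, assuming a hypothetical orders table:

```sql
-- Correlated subquery: re-executed for every row of the outer query.
SELECT * FROM orders o
WHERE o.created_at = (
    SELECT MAX(o2.created_at)
    FROM orders o2
    WHERE o2.customer_id = o.customer_id
);

-- JOIN against a derived table: the planner is free to choose
-- nested loop, hash join, or merge join.
SELECT o.*
FROM orders o
JOIN (
    SELECT customer_id, MAX(created_at) AS latest
    FROM orders
    GROUP BY customer_id
) last ON last.customer_id = o.customer_id
      AND last.latest = o.created_at;
```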

Common Table Expressions (CTEs)

In PostgreSQL 12 and later, CTEs are inlined by default, meaning the optimizer can push predicates into them. Use CTEs for readability without worrying about performance fences. For cases where you explicitly want materialization (to prevent re-execution of expensive subqueries), add the MATERIALIZED keyword.
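A minimal sketch of both behaviors:

```sql
-- Inlined by default (PostgreSQL 12+): the planner can push the
-- status filter down into the CTE's scan.
WITH recent AS (
    SELECT * FROM orders
    WHERE created_at > now() - interval '30 days'
)
SELECT * FROM recent WHERE status = 'pending';

-- Explicit materialization: the CTE is computed once and reused,
-- at the cost of losing predicate pushdown.
WITH recent AS MATERIALIZED (
    SELECT * FROM orders
    WHERE created_at > now() - interval '30 days'
)
SELECT * FROM recent WHERE status = 'pending';
```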

Window Functions vs GROUP BY

When you need both detail rows and aggregates, window functions avoid the need for a self-join or subquery. Calculating a running total, ranking within groups, or comparing each row to the group average are all more efficient with window functions than with correlated subqueries.
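All three patterns mentioned above fit in a single pass with window functions (hypothetical orders table):

```sql
SELECT
    id,
    customer_id,
    total,
    -- Running total per customer, ordered by time.
    SUM(total) OVER (PARTITION BY customer_id ORDER BY created_at) AS running_total,
    -- Each row compared to its group average.
    AVG(total)  OVER (PARTITION BY customer_id) AS customer_avg,
    -- Ranking within the group.
    RANK()      OVER (PARTITION BY customer_id ORDER BY total DESC) AS rank_in_customer
FROM orders;
```

No self-join or correlated subquery is needed; the table is scanned once.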


Table Partitioning Strategies

When tables grow beyond 10-50 million rows, even well-indexed queries slow down due to index depth, vacuum overhead, and planner complexity. Partitioning divides a large table into smaller physical chunks while maintaining a single logical table interface.

Partition Types

| Strategy | Mechanism | Best For |
|---|---|---|
| Range partitioning | Partition by value ranges (date ranges, ID ranges) | Time-series data, logs, orders by date |
| List partitioning | Partition by discrete values | Multi-tenant data by organization_id, orders by region |
| Hash partitioning | Partition by hash of a column | Even distribution when no natural range or list key exists |

Range Partitioning by Date

The most common pattern is monthly partitioning by a timestamp column. Each month's data lives in its own partition. Queries that filter by date automatically scan only the relevant partitions (partition pruning).
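A sketch of declarative monthly partitioning (PostgreSQL 10+), using a hypothetical events table:

```sql
CREATE TABLE events (
    id         BIGINT GENERATED ALWAYS AS IDENTITY,
    created_at TIMESTAMPTZ NOT NULL,
    payload    JSONB,
    -- Unique constraints on a partitioned table must include the partition key.
    PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);

CREATE TABLE events_2026_01 PARTITION OF events
    FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE events_2026_02 PARTITION OF events
    FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');

-- Partition pruning: only events_2026_01 is scanned.
SELECT count(*) FROM events
WHERE created_at >= '2026-01-15' AND created_at < '2026-01-20';

-- Data lifecycle: dropping a month is instant.
DROP TABLE events_2026_01;
```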

Benefits of time-based partitioning:

  • Query performance -- queries for recent data only scan recent partitions
  • Maintenance -- VACUUM and ANALYZE run faster on smaller partitions
  • Data lifecycle -- dropping old partitions is instant compared to deleting millions of rows
  • Backup efficiency -- back up only recent partitions for point-in-time recovery

Partitioning Considerations

Partitioning adds complexity. Every query must include the partition key in its WHERE clause for partition pruning to work. Unique constraints must include the partition key. Foreign keys referencing partitioned tables have limitations. Start partitioning only when you have measured that table size is causing performance degradation.


PostgreSQL Configuration Tuning

Default PostgreSQL configuration is conservative, designed to run on minimal hardware. Production workloads benefit from tuning key parameters.

| Parameter | Default | Recommended (16GB RAM server) | Purpose |
|---|---|---|---|
| shared_buffers | 128MB | 4GB (25% of RAM) | In-memory cache for table and index data |
| effective_cache_size | 4GB | 12GB (75% of RAM) | Planner hint for OS file cache availability |
| work_mem | 4MB | 64MB | Memory per sort/hash operation (careful with concurrency) |
| maintenance_work_mem | 64MB | 1GB | Memory for VACUUM, CREATE INDEX, ALTER TABLE |
| random_page_cost | 4.0 | 1.1 (SSD storage) | Cost estimate for random I/O (lower for SSD) |
| effective_io_concurrency | 1 | 200 (SSD storage) | Concurrent I/O operations for bitmap heap scans |
| max_connections | 100 | 200 (with PgBouncer) | Use connection pooling to keep this reasonable |

These settings must be tuned for your specific hardware and workload. Monitor pg_stat_bgwriter, pg_stat_activity, and pg_stat_user_tables to validate that changes improve performance.
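The table's values can be applied with ALTER SYSTEM, which writes them to postgresql.auto.conf. Most take effect on reload; shared_buffers requires a restart. Treat the numbers as a starting point for a dedicated 16GB SSD server, not a recommendation for your hardware.

```sql
ALTER SYSTEM SET shared_buffers = '4GB';           -- restart required
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET work_mem = '64MB';
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM SET random_page_cost = 1.1;
ALTER SYSTEM SET effective_io_concurrency = 200;

SELECT pg_reload_conf();                           -- apply reloadable settings
```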


Frequently Asked Questions

How many indexes should a table have?

There is no fixed limit, but each index slows down INSERT, UPDATE, and DELETE operations because the index must be maintained. A good rule of thumb is to create indexes for columns that appear in WHERE, JOIN ON, and ORDER BY clauses of your most frequent queries. Use pg_stat_user_indexes to find unused indexes that can be dropped.
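A sketch of the unused-index check, sorted so the largest candidates for removal appear first:

```sql
-- Indexes never scanned since statistics were last reset.
SELECT schemaname, relname, indexrelname,
       idx_scan,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY pg_relation_size(indexrelid) DESC;
```

Before dropping anything, confirm the index is not enforcing a unique constraint and that statistics cover a representative period of traffic.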

Should I use UUID or integer primary keys for performance?

Integer primary keys (BIGSERIAL) are faster for joins and indexing because they are smaller (8 bytes vs 16 bytes) and naturally ordered. UUIDs provide global uniqueness without coordination, which matters for distributed systems. For most applications, use UUIDs for external-facing identifiers and integers for internal joins.

When should I switch from a single database to read replicas?

When your read workload exceeds 70-80% of your database's capacity, or when reporting queries compete with transactional queries for resources. Read replicas handle the read load while the primary focuses on writes. This is typically needed at 5,000-10,000 concurrent users for a typical web application.

How do I handle slow queries in production without downtime?

Create indexes with the CONCURRENTLY option to avoid locking the table. Use pg_stat_statements to identify the slowest queries. Deploy query optimizations behind feature flags. For schema changes that rewrite tables, use tools like pg_repack to reorganize tables without locking.
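The first two steps can be sketched as (index and table names are hypothetical):

```sql
-- Build the index without blocking writes.
-- Note: cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY idx_orders_customer_id ON orders (customer_id);

-- Top queries by cumulative execution time, from pg_stat_statements
-- (the extension must be in shared_preload_libraries and created in the DB).
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```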


What's Next

Database optimization is the foundation of platform performance. Start by enabling pg_stat_statements, identify your slowest queries, and work through them systematically with EXPLAIN ANALYZE. Add missing indexes, fix N+1 patterns, and consider partitioning for your largest tables.

For the broader performance picture, see our pillar guide on scaling your business platform from startup to enterprise. To learn about the next layer of optimization, read our guide on caching strategies with Redis, CDN, and HTTP caching.

ECOSIRE provides expert database optimization for PostgreSQL-backed platforms including Odoo ERP and custom applications. Contact us for a database performance audit.


Published by ECOSIRE — helping businesses scale with AI-powered solutions across Odoo ERP, Shopify eCommerce, and OpenClaw AI.


Written by

ECOSIRE Research and Development Team

Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.
