Database Scaling Strategies: Read Replicas, Sharding, and Beyond
Database performance is the most common bottleneck when web applications scale. Application servers scale horizontally with minimal effort, but databases resist horizontal scaling. The strategies you choose for database scaling determine whether your application serves 100 users or 100,000 with acceptable performance.
This guide covers the complete spectrum of database scaling strategies, from simple optimizations that delay the need for scaling to advanced techniques like horizontal sharding.
Key Takeaways
- Optimize queries and add indexes before adding infrastructure --- this alone resolves a large share of database performance issues
- Read replicas are the lowest-risk scaling strategy and a good fit for most read-heavy workloads
- Connection pooling is mandatory once your application runs more than 10 instances
- Horizontal sharding is a last resort that introduces significant application complexity
The Scaling Ladder
Scale in this order. Each step is cheaper and less risky than the next:
Step 1: Query Optimization (Free)
Before adding infrastructure, ensure your existing database is performing optimally.
-- Find slow queries in PostgreSQL
SELECT
calls,
mean_exec_time::numeric(10,2) AS avg_ms,
total_exec_time::numeric(10,2) AS total_ms,
query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
Common optimizations:
- Add missing indexes for frequently filtered columns
- Replace `SELECT *` with specific column lists
- Use `EXPLAIN ANALYZE` to identify sequential scans on large tables
- Add composite indexes for multi-column WHERE clauses
- Implement keyset pagination instead of `OFFSET`
-- Bad: OFFSET pagination (scans all skipped rows)
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
-- Good: Keyset pagination (index-only scan)
SELECT * FROM orders
WHERE created_at < '2026-03-15T10:00:00Z'
ORDER BY created_at DESC
LIMIT 20;
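At the application layer, the keyset cursor is simply the sort value of the last row on the previous page. A minimal sketch, assuming a `buildOrdersPage` helper (an illustrative name, not a library API) that produces parameterized SQL:

```typescript
// Keyset pagination: the cursor is the created_at of the last row already
// shown; null means "first page". Parameters are positional ($1, $2).
function buildOrdersPage(cursor: string | null, pageSize = 20) {
  if (cursor === null) {
    return {
      text: 'SELECT * FROM orders ORDER BY created_at DESC LIMIT $1',
      params: [pageSize],
    };
  }
  return {
    text:
      'SELECT * FROM orders WHERE created_at < $1 ' +
      'ORDER BY created_at DESC LIMIT $2',
    params: [cursor, pageSize],
  };
}
```

The caller passes the last row's `created_at` back in on the next request, so every page is an index scan regardless of how deep the user paginates.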
Step 2: Vertical Scaling ($)
Increase CPU, RAM, and storage on your existing database server. This buys time and requires zero application changes.
| Instance Size | vCPU | RAM | Connections | Monthly Cost (RDS) |
|---|---|---|---|---|
| db.t3.medium | 2 | 4 GB | 100 | $65 |
| db.r6g.large | 2 | 16 GB | 200 | $175 |
| db.r6g.xlarge | 4 | 32 GB | 400 | $350 |
| db.r6g.2xlarge | 8 | 64 GB | 800 | $700 |
Vertical scaling roughly doubles in cost at each size step. Beyond 64 GB of RAM and 8 vCPUs, it typically becomes cost-prohibitive relative to the alternatives below.
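The sizing decision from the table above can be mechanized. A minimal sketch (the sizes and prices mirror the table and are indicative, not current AWS pricing):

```typescript
// Pick the smallest instance from the table that meets a RAM requirement.
const instances = [
  { name: 'db.t3.medium', ramGb: 4, monthlyUsd: 65 },
  { name: 'db.r6g.large', ramGb: 16, monthlyUsd: 175 },
  { name: 'db.r6g.xlarge', ramGb: 32, monthlyUsd: 350 },
  { name: 'db.r6g.2xlarge', ramGb: 64, monthlyUsd: 700 },
];

function smallestInstanceFor(ramGbNeeded: number) {
  // Instances are sorted ascending, so the first match is the cheapest fit.
  return instances.find((i) => i.ramGb >= ramGbNeeded) ?? null;
}
```

A `null` result means no size in the ladder fits, which is the signal to move to the next scaling step rather than keep buying hardware.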
Step 3: Connection Pooling ($)
PostgreSQL spawns a backend process per connection, so hundreds of mostly idle application connections waste memory and CPU. A pooler such as PgBouncer multiplexes many client connections onto a small number of server connections:
Application (50 pods x 20 connections = 1,000 connections)
|
v
PgBouncer (25 database connections, transaction pooling)
|
v
PostgreSQL (25 active connections, manageable)
PgBouncer configuration:
[databases]
app = host=db.example.com port=5432 dbname=production
[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
pool_mode = transaction
default_pool_size = 25
max_client_conn = 1000
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
Step 4: Read Replicas ($$)
Read replicas handle SELECT queries, offloading 60-90% of database load from the primary.
Architecture:
Write queries --> Primary database
                       |
               Replication (async)
                       |
                 +-----+-----+
                 |           |
Read queries --> Replica 1   Replica 2
Application-level routing (Drizzle ORM example):
import { drizzle } from 'drizzle-orm/node-postgres';
import { eq, desc } from 'drizzle-orm';
import { Pool } from 'pg';
import { orders } from './schema'; // your Drizzle table definitions
const primaryPool = new Pool({ connectionString: process.env.DATABASE_URL });
const replicaPool = new Pool({ connectionString: process.env.DATABASE_REPLICA_URL });
export const primaryDb = drizzle(primaryPool);
export const replicaDb = drizzle(replicaPool);
type OrderInput = typeof orders.$inferInsert;
// Write operations use primaryDb
export async function createOrder(data: OrderInput) {
return primaryDb.insert(orders).values(data).returning();
}
// Read operations use replicaDb
export async function getOrders(organizationId: string) {
return replicaDb.select().from(orders)
.where(eq(orders.organizationId, organizationId))
.orderBy(desc(orders.createdAt));
}
Replication lag considerations: Async replication introduces a delay (typically 10-100ms). Immediately after a write, reading from the replica may return stale data. Use the primary for reads that follow writes in the same user flow.
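One common way to implement "use the primary for reads that follow writes" is sticky routing: after a user writes, send that user's reads to the primary for a short window longer than typical lag. A minimal sketch (`ReadWriteRouter` is an illustrative name, not a library API; the clock is injectable for testing):

```typescript
type Db = 'primary' | 'replica';

// Route reads to the primary for a short window after each user's write,
// so users always see their own writes despite async replication lag.
class ReadWriteRouter {
  private lastWriteAt = new Map<string, number>();

  constructor(
    private stickinessMs = 1000, // should exceed typical replication lag
    private now: () => number = () => Date.now(),
  ) {}

  recordWrite(userId: string): void {
    this.lastWriteAt.set(userId, this.now());
  }

  pickForRead(userId: string): Db {
    const t = this.lastWriteAt.get(userId);
    // Recent writer: read from the primary to avoid stale replica data.
    if (t !== undefined && this.now() - t < this.stickinessMs) return 'primary';
    return 'replica';
  }
}
```

In a real service, `recordWrite` would be called by the write path (or keyed by session) and `pickForRead` would select which pool to hand to the data layer.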
Step 5: Caching ($$)
Redis caching eliminates repeated database queries entirely.
async getProduct(id: string): Promise<Product | null> {
const cacheKey = `product:${id}`;
// Check cache first
const cached = await this.redis.get(cacheKey);
if (cached) return JSON.parse(cached);
// Cache miss: query the database
const [product] = await this.db.select().from(products)
.where(eq(products.id, id))
.limit(1);
if (!product) return null; // avoid caching "undefined" for missing rows
// Cache for 5 minutes
await this.redis.setex(cacheKey, 300, JSON.stringify(product));
return product;
}
Cache invalidation strategy: Invalidate on write. When a product is updated, delete the cache key and let the next read repopulate it. Use a cache-aside pattern (the application reads and writes the cache explicitly) rather than write-through (every database write synchronously updates the cache).
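The invalidate-on-write half of the pattern is short. A minimal sketch, assuming an `updateProduct` helper and dependency-injected `db` and `cache` handles (illustrative names, not a real API):

```typescript
interface Cache {
  del(key: string): Promise<number>;
}

interface ProductStore {
  update(id: string, patch: object): Promise<void>;
}

// Write to the source of truth first, then drop the stale cache entry.
// The next getProduct() call repopulates the cache (cache-aside).
async function updateProduct(
  db: ProductStore,
  cache: Cache,
  id: string,
  patch: object,
): Promise<void> {
  await db.update(id, patch);
  await cache.del(`product:${id}`);
}
```

Ordering matters: deleting the key after the database commit means a concurrent reader can at worst cache the old value briefly, never an uncommitted one.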
Step 6: Horizontal Sharding ($$$)
Sharding distributes data across multiple database instances based on a shard key.
| Sharding Strategy | Description | Best For |
|---|---|---|
| Hash-based | Hash the shard key, distribute evenly | Even data distribution |
| Range-based | Assign ranges to shards (e.g., A-M, N-Z) | Time-series, geographic data |
| Tenant-based | One shard per tenant/organization | Multi-tenant SaaS |
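The hash-based strategy in the table reduces to a stable hash of the shard key modulo the shard count. A minimal sketch, assuming `organization_id` as the shard key (the FNV-1a hash here is illustrative; any stable hash works):

```typescript
// 32-bit FNV-1a: a simple, stable string hash.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Map a tenant to one of N shards. All rows for one tenant land on the
// same shard, so tenant-scoped queries never cross shards.
function shardFor(organizationId: string, shardCount: number): number {
  return fnv1a(organizationId) % shardCount;
}
```

Note that plain modulo makes resharding expensive (changing `shardCount` remaps most keys); production systems usually add a directory table or consistent hashing on top.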
When to shard:
- Single database exceeds 1 TB and growing
- Write throughput exceeds what a single primary can handle
- Vertical scaling costs exceed $2,000/month with no headroom
When NOT to shard:
- You have not exhausted steps 1-5
- Your data fits in a single 500 GB database
- Cross-shard queries are common in your application
PostgreSQL-Specific Optimizations
Partitioning (Before Sharding)
PostgreSQL table partitioning splits large tables into smaller physical tables while maintaining a single logical table:
-- Partition orders by month
CREATE TABLE orders (
id UUID NOT NULL,
organization_id UUID NOT NULL,
created_at TIMESTAMP NOT NULL,
total DECIMAL(10,2),
-- In PostgreSQL, a partitioned table's primary key must include the partition key
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2026_01 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE orders_2026_02 PARTITION OF orders
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE orders_2026_03 PARTITION OF orders
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
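Monthly partitions must be created ahead of time, which teams typically automate. A minimal sketch of a DDL generator for the pattern above (`monthlyPartitionDdl` is an illustrative helper you might call from a scheduled job):

```typescript
// Generate the CREATE TABLE ... PARTITION OF statement for one month,
// matching the naming convention orders_YYYY_MM used above.
function monthlyPartitionDdl(table: string, year: number, month: number): string {
  const pad = (n: number) => String(n).padStart(2, '0');
  const from = `${year}-${pad(month)}-01`;
  // Handle the December -> January rollover for the upper bound.
  const next = month === 12 ? { y: year + 1, m: 1 } : { y: year, m: month + 1 };
  const to = `${next.y}-${pad(next.m)}-01`;
  return (
    `CREATE TABLE ${table}_${year}_${pad(month)} PARTITION OF ${table}\n` +
    `  FOR VALUES FROM ('${from}') TO ('${to}');`
  );
}
```

Running this a month or two ahead (e.g. from cron) avoids the failure mode where inserts arrive for a month that has no partition yet.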
Partitioning can improve time-range query performance dramatically on large tables (often an order of magnitude or more) because PostgreSQL scans only the relevant partitions.
Vacuuming and Maintenance
Updates and deletes leave dead tuples behind; left unchecked, they bloat tables and slow scans.
-- Check table bloat
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup,
round(n_dead_tup::numeric / greatest(n_live_tup, 1) * 100, 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
Configure autovacuum aggressively for high-write tables:
ALTER TABLE orders SET (
autovacuum_vacuum_threshold = 100,
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_threshold = 50,
autovacuum_analyze_scale_factor = 0.02
);
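To see what those settings mean in practice: autovacuum fires when dead tuples exceed `threshold + scale_factor * live tuples`. A minimal sketch of that formula (with the settings above, a table of 1,000,000 live rows vacuums after roughly 50,100 dead tuples instead of the default ~200,100):

```typescript
// Dead-tuple count at which autovacuum triggers for a table:
// autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * live tuples.
function autovacuumTriggersAt(
  threshold: number,
  scaleFactor: number,
  liveTuples: number,
): number {
  return threshold + scaleFactor * liveTuples;
}
```

This is why lowering the scale factor matters far more than the threshold on large, high-write tables: the scale factor term dominates.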
Monitoring Database Performance
Track these metrics to understand when and how to scale:
| Metric | Tool | Alert Threshold |
|---|---|---|
| Query latency (P95) | pg_stat_statements | >500ms |
| Active connections | pg_stat_activity | >80% of max |
| Cache hit ratio | pg_stat_database | <95% |
| Replication lag | pg_stat_replication | >1 second |
| Table bloat | pg_stat_user_tables | >20% dead tuples |
| Disk I/O wait | iostat / CloudWatch | >20ms |
A cache hit ratio below 95% is the strongest indicator that you need more memory. Increasing shared_buffers and effective_cache_size is often cheaper and faster than adding read replicas.
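The ratio itself comes from two counters in `pg_stat_database`: `blks_hit` (pages served from shared buffers) and `blks_read` (pages read from disk). A minimal sketch of the computation:

```typescript
// Buffer cache hit ratio from pg_stat_database counters.
// Below ~0.95, consider increasing memory before adding replicas.
function cacheHitRatio(blksHit: number, blksRead: number): number {
  const total = blksHit + blksRead;
  // A brand-new database has no reads yet; treat that as a perfect ratio.
  return total === 0 ? 1 : blksHit / total;
}
```

The equivalent SQL is `SELECT blks_hit::float / nullif(blks_hit + blks_read, 0) FROM pg_stat_database WHERE datname = current_database()`.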
Query Performance Tracking
-- Enable pg_stat_statements (postgresql.conf, requires a restart):
-- shared_preload_libraries = 'pg_stat_statements'
-- Then, in the database: CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find the top 10 most time-consuming queries
SELECT
queryid,
calls,
mean_exec_time::numeric(10,2) AS avg_ms,
total_exec_time::numeric(10,2) AS total_ms,
rows,
query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Review the top 10 queries weekly. Optimizing even a single frequently executed query can reduce overall database load by 10-30%.
Frequently Asked Questions
How do we know when it is time to scale?
Monitor three metrics: query latency P95 (alert at 500ms), connection utilization (alert at 80%), and CPU utilization (alert at 70% sustained). If you are hitting these thresholds regularly, move to the next step on the scaling ladder. Do not pre-optimize --- scale when the data tells you to.
Read replicas or caching --- which first?
Start with caching. Redis caching is simpler to implement, eliminates more load (cache hits skip the database entirely), and costs less. Add read replicas when your cache hit rate is already above 80% but the primary database is still under pressure from cache misses and write operations.
How does database scaling work with Odoo?
Odoo uses PostgreSQL exclusively. Start with query optimization (Odoo generates complex queries for reporting). Add PgBouncer for connection pooling when you exceed 50 concurrent users. Use read replicas for reporting queries (configure Odoo's --db-replica option). ECOSIRE provides Odoo performance optimization including database tuning.
Is managed database (RDS/Cloud SQL) worth the premium?
Yes, for most businesses. Managed databases handle automated backups, patching, failover, and monitoring. The 30-40% cost premium over self-managed PostgreSQL is offset by the engineering time you save. The exception is large-scale deployments where the cost premium on a large instance exceeds the cost of a part-time DBA.
What Comes Next
Database scaling is one component of a broader infrastructure scaling strategy. Combine it with CDN optimization for static assets, Kubernetes auto-scaling for application pods, and load testing to validate your scaling decisions under realistic conditions.
Contact ECOSIRE for database optimization consulting, or see our DevOps guide for the full infrastructure roadmap.
Published by ECOSIRE -- helping businesses scale data infrastructure with confidence.
Written by
ECOSIRE Research and Development Team
Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.