Database Scaling Strategies: Read Replicas, Sharding, and Beyond
Database performance is the most common bottleneck when web applications scale. Application servers scale horizontally with minimal effort, but databases resist horizontal scaling. The strategies you choose for database scaling determine whether your application serves 100 users or 100,000 with acceptable performance.
This guide covers the complete spectrum of database scaling strategies, from simple optimizations that delay the need for scaling to advanced techniques like horizontal sharding.
Key Takeaways
- Optimize queries and add indexes before adding infrastructure --- this alone resolves a large share of database performance issues
- Read replicas are the lowest-risk scaling strategy and a good fit for most read-heavy workloads
- Connection pooling is mandatory once your application runs more than 10 instances
- Horizontal sharding is a last resort that introduces significant application complexity
The Scaling Ladder
Scale in this order. Each step is cheaper and less risky than the next:
Step 1: Query Optimization (Free)
Before adding infrastructure, ensure your existing database is performing optimally.
-- Find slow queries in PostgreSQL
SELECT
calls,
mean_exec_time::numeric(10,2) AS avg_ms,
total_exec_time::numeric(10,2) AS total_ms,
query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 20;
Common optimizations:
- Add missing indexes for frequently filtered columns
- Replace `SELECT *` with specific column lists
- Use `EXPLAIN ANALYZE` to identify sequential scans on large tables
- Add composite indexes for multi-column WHERE clauses
- Implement keyset pagination instead of `OFFSET`
-- Bad: OFFSET pagination (scans all skipped rows)
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
-- Good: Keyset pagination (index-only scan)
SELECT * FROM orders
WHERE created_at < '2026-03-15T10:00:00Z'
ORDER BY created_at DESC
LIMIT 20;
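At the application layer, the keyset cursor is simply the sort value of the last row on the previous page. A minimal sketch, assuming a `buildOrdersPage` helper (an illustrative name, not a library API) that produces parameterized SQL:

```typescript
// Keyset pagination: the cursor is the created_at of the last row already
// shown; null means "first page". Parameters are positional ($1, $2).
function buildOrdersPage(cursor: string | null, pageSize = 20) {
  if (cursor === null) {
    return {
      text: 'SELECT * FROM orders ORDER BY created_at DESC LIMIT $1',
      params: [pageSize],
    };
  }
  return {
    text:
      'SELECT * FROM orders WHERE created_at < $1 ' +
      'ORDER BY created_at DESC LIMIT $2',
    params: [cursor, pageSize],
  };
}
```

The caller passes the last row's `created_at` back in on the next request, so every page is an index scan regardless of how deep the user paginates.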
Step 2: Vertical Scaling ($)
Increase CPU, RAM, and storage on your existing database server. This buys time and requires zero application changes.
| Instance Size | vCPU | RAM | Connections | Monthly Cost (RDS) |
|---|---|---|---|---|
| db.t3.medium | 2 | 4 GB | 100 | $65 |
| db.r6g.large | 2 | 16 GB | 200 | $175 |
| db.r6g.xlarge | 4 | 32 GB | 400 | $350 |
| db.r6g.2xlarge | 8 | 64 GB | 800 | $700 |
Vertical scaling roughly doubles in cost at each size step. Beyond 64 GB of RAM and 8 vCPUs, it typically becomes cost-prohibitive relative to the alternatives below.
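The sizing decision from the table above can be mechanized. A minimal sketch (the sizes and prices mirror the table and are indicative, not current AWS pricing):

```typescript
// Pick the smallest instance from the table that meets a RAM requirement.
const instances = [
  { name: 'db.t3.medium', ramGb: 4, monthlyUsd: 65 },
  { name: 'db.r6g.large', ramGb: 16, monthlyUsd: 175 },
  { name: 'db.r6g.xlarge', ramGb: 32, monthlyUsd: 350 },
  { name: 'db.r6g.2xlarge', ramGb: 64, monthlyUsd: 700 },
];

function smallestInstanceFor(ramGbNeeded: number) {
  // Instances are sorted ascending, so the first match is the cheapest fit.
  return instances.find((i) => i.ramGb >= ramGbNeeded) ?? null;
}
```

A `null` result means no size in the ladder fits, which is the signal to move to the next scaling step rather than keep buying hardware.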
Step 3: Connection Pooling ($)
PostgreSQL spawns a backend process per connection, so hundreds of mostly idle application connections waste memory and CPU. A pooler such as PgBouncer multiplexes many client connections onto a small number of server connections:
Application (50 pods x 20 connections = 1,000 connections)
|
v
PgBouncer (25 database connections, transaction pooling)
|
v
PostgreSQL (25 active connections, manageable)
PgBouncer configuration:
[databases]
app = host=db.example.com port=5432 dbname=production
[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0
auth_type = md5
pool_mode = transaction
default_pool_size = 25
max_client_conn = 1000
min_pool_size = 5
reserve_pool_size = 5
reserve_pool_timeout = 3
Step 4: Read Replicas ($$)
Read replicas handle SELECT queries, offloading 60-90% of database load from the primary.
Architecture:
Write queries --> Primary database
                       |
               Replication (async)
                       |
                 +-----+-----+
                 |           |
Read queries --> Replica 1   Replica 2
Application-level routing (Drizzle ORM example):
import { drizzle } from 'drizzle-orm/node-postgres';
import { eq, desc } from 'drizzle-orm';
import { Pool } from 'pg';
import { orders } from './schema'; // your Drizzle table definitions
const primaryPool = new Pool({ connectionString: process.env.DATABASE_URL });
const replicaPool = new Pool({ connectionString: process.env.DATABASE_REPLICA_URL });
export const primaryDb = drizzle(primaryPool);
export const replicaDb = drizzle(replicaPool);
type OrderInput = typeof orders.$inferInsert;
// Write operations use primaryDb
export async function createOrder(data: OrderInput) {
return primaryDb.insert(orders).values(data).returning();
}
// Read operations use replicaDb
export async function getOrders(organizationId: string) {
return replicaDb.select().from(orders)
.where(eq(orders.organizationId, organizationId))
.orderBy(desc(orders.createdAt));
}
Replication lag considerations: Async replication introduces a delay (typically 10-100ms). Immediately after a write, reading from the replica may return stale data. Use the primary for reads that follow writes in the same user flow.
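One common way to implement "use the primary for reads that follow writes" is sticky routing: after a user writes, send that user's reads to the primary for a short window longer than typical lag. A minimal sketch (`ReadWriteRouter` is an illustrative name, not a library API; the clock is injectable for testing):

```typescript
type Db = 'primary' | 'replica';

// Route reads to the primary for a short window after each user's write,
// so users always see their own writes despite async replication lag.
class ReadWriteRouter {
  private lastWriteAt = new Map<string, number>();

  constructor(
    private stickinessMs = 1000, // should exceed typical replication lag
    private now: () => number = () => Date.now(),
  ) {}

  recordWrite(userId: string): void {
    this.lastWriteAt.set(userId, this.now());
  }

  pickForRead(userId: string): Db {
    const t = this.lastWriteAt.get(userId);
    // Recent writer: read from the primary to avoid stale replica data.
    if (t !== undefined && this.now() - t < this.stickinessMs) return 'primary';
    return 'replica';
  }
}
```

In a real service, `recordWrite` would be called by the write path (or keyed by session) and `pickForRead` would select which pool to hand to the data layer.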
Step 5: Caching ($$)
Redis caching eliminates repeated database queries entirely.
async getProduct(id: string): Promise<Product | null> {
const cacheKey = `product:${id}`;
// Check cache first
const cached = await this.redis.get(cacheKey);
if (cached) return JSON.parse(cached);
// Cache miss: query the database
const [product] = await this.db.select().from(products)
.where(eq(products.id, id))
.limit(1);
if (!product) return null; // avoid caching "undefined" for missing rows
// Cache for 5 minutes
await this.redis.setex(cacheKey, 300, JSON.stringify(product));
return product;
}
Cache invalidation strategy: Invalidate on write. When a product is updated, delete the cache key and let the next read repopulate it. Use a cache-aside pattern (the application reads and writes the cache explicitly) rather than write-through (every database write synchronously updates the cache).
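The invalidate-on-write half of the pattern is short. A minimal sketch, assuming an `updateProduct` helper and dependency-injected `db` and `cache` handles (illustrative names, not a real API):

```typescript
interface Cache {
  del(key: string): Promise<number>;
}

interface ProductStore {
  update(id: string, patch: object): Promise<void>;
}

// Write to the source of truth first, then drop the stale cache entry.
// The next getProduct() call repopulates the cache (cache-aside).
async function updateProduct(
  db: ProductStore,
  cache: Cache,
  id: string,
  patch: object,
): Promise<void> {
  await db.update(id, patch);
  await cache.del(`product:${id}`);
}
```

Ordering matters: deleting the key after the database commit means a concurrent reader can at worst cache the old value briefly, never an uncommitted one.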
Step 6: Horizontal Sharding ($$$)
Sharding distributes data across multiple database instances based on a shard key.
| Sharding Strategy | Description | Best For |
|---|---|---|
| Hash-based | Hash the shard key, distribute evenly | Even data distribution |
| Range-based | Assign ranges to shards (e.g., A-M, N-Z) | Time-series, geographic data |
| Tenant-based | One shard per tenant/organization | Multi-tenant SaaS |
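The hash-based strategy in the table reduces to a stable hash of the shard key modulo the shard count. A minimal sketch, assuming `organization_id` as the shard key (the FNV-1a hash here is illustrative; any stable hash works):

```typescript
// 32-bit FNV-1a: a simple, stable string hash.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Map a tenant to one of N shards. All rows for one tenant land on the
// same shard, so tenant-scoped queries never cross shards.
function shardFor(organizationId: string, shardCount: number): number {
  return fnv1a(organizationId) % shardCount;
}
```

Note that plain modulo makes resharding expensive (changing `shardCount` remaps most keys); production systems usually add a directory table or consistent hashing on top.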
When to shard:
- Single database exceeds 1 TB and growing
- Write throughput exceeds what a single primary can handle
- Vertical scaling costs exceed $2,000/month with no headroom
When NOT to shard:
- You have not exhausted steps 1-5
- Your data fits in a single 500 GB database
- Cross-shard queries are common in your application
PostgreSQL-Specific Optimizations
Partitioning (Before Sharding)
PostgreSQL table partitioning splits large tables into smaller physical tables while maintaining a single logical table:
-- Partition orders by month
CREATE TABLE orders (
id UUID NOT NULL,
organization_id UUID NOT NULL,
created_at TIMESTAMP NOT NULL,
total DECIMAL(10,2),
-- In PostgreSQL, a partitioned table's primary key must include the partition key
PRIMARY KEY (id, created_at)
) PARTITION BY RANGE (created_at);
CREATE TABLE orders_2026_01 PARTITION OF orders
FOR VALUES FROM ('2026-01-01') TO ('2026-02-01');
CREATE TABLE orders_2026_02 PARTITION OF orders
FOR VALUES FROM ('2026-02-01') TO ('2026-03-01');
CREATE TABLE orders_2026_03 PARTITION OF orders
FOR VALUES FROM ('2026-03-01') TO ('2026-04-01');
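Monthly partitions must be created ahead of time, which teams typically automate. A minimal sketch of a DDL generator for the pattern above (`monthlyPartitionDdl` is an illustrative helper you might call from a scheduled job):

```typescript
// Generate the CREATE TABLE ... PARTITION OF statement for one month,
// matching the naming convention orders_YYYY_MM used above.
function monthlyPartitionDdl(table: string, year: number, month: number): string {
  const pad = (n: number) => String(n).padStart(2, '0');
  const from = `${year}-${pad(month)}-01`;
  // Handle the December -> January rollover for the upper bound.
  const next = month === 12 ? { y: year + 1, m: 1 } : { y: year, m: month + 1 };
  const to = `${next.y}-${pad(next.m)}-01`;
  return (
    `CREATE TABLE ${table}_${year}_${pad(month)} PARTITION OF ${table}\n` +
    `  FOR VALUES FROM ('${from}') TO ('${to}');`
  );
}
```

Running this a month or two ahead (e.g. from cron) avoids the failure mode where inserts arrive for a month that has no partition yet.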
Partitioning can improve time-range query performance dramatically on large tables (often an order of magnitude or more) because PostgreSQL scans only the relevant partitions.
Vacuuming and Maintenance
Updates and deletes leave dead tuples behind; left unchecked, they bloat tables and slow scans.
-- Check table bloat
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup,
round(n_dead_tup::numeric / greatest(n_live_tup, 1) * 100, 2) AS dead_pct
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC;
Configure autovacuum aggressively for high-write tables:
ALTER TABLE orders SET (
autovacuum_vacuum_threshold = 100,
autovacuum_vacuum_scale_factor = 0.05,
autovacuum_analyze_threshold = 50,
autovacuum_analyze_scale_factor = 0.02
);
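To see what those settings mean in practice: autovacuum fires when dead tuples exceed `threshold + scale_factor * live tuples`. A minimal sketch of that formula (with the settings above, a table of 1,000,000 live rows vacuums after roughly 50,100 dead tuples instead of the default ~200,100):

```typescript
// Dead-tuple count at which autovacuum triggers for a table:
// autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * live tuples.
function autovacuumTriggersAt(
  threshold: number,
  scaleFactor: number,
  liveTuples: number,
): number {
  return threshold + scaleFactor * liveTuples;
}
```

This is why lowering the scale factor matters far more than the threshold on large, high-write tables: the scale factor term dominates.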
Monitoring Database Performance
Track these metrics to understand when and how to scale:
| Metric | Tool | Alert Threshold |
|---|---|---|
| Query latency (P95) | pg_stat_statements | >500ms |
| Active connections | pg_stat_activity | >80% of max |
| Cache hit ratio | pg_stat_database | <95% |
| Replication lag | pg_stat_replication | >1 second |
| Table bloat | pg_stat_user_tables | >20% dead tuples |
| Disk I/O wait | iostat / CloudWatch | >20ms |
A cache hit ratio below 95% is the strongest indicator that you need more memory. Increasing shared_buffers and effective_cache_size is often cheaper and faster than adding read replicas.
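The ratio itself comes from two counters in `pg_stat_database`: `blks_hit` (pages served from shared buffers) and `blks_read` (pages read from disk). A minimal sketch of the computation:

```typescript
// Buffer cache hit ratio from pg_stat_database counters.
// Below ~0.95, consider increasing memory before adding replicas.
function cacheHitRatio(blksHit: number, blksRead: number): number {
  const total = blksHit + blksRead;
  // A brand-new database has no reads yet; treat that as a perfect ratio.
  return total === 0 ? 1 : blksHit / total;
}
```

The equivalent SQL is `SELECT blks_hit::float / nullif(blks_hit + blks_read, 0) FROM pg_stat_database WHERE datname = current_database()`.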
Query Performance Tracking
-- Enable pg_stat_statements (postgresql.conf, requires a restart):
-- shared_preload_libraries = 'pg_stat_statements'
-- Then, in the database: CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Find the top 10 most time-consuming queries
SELECT
queryid,
calls,
mean_exec_time::numeric(10,2) AS avg_ms,
total_exec_time::numeric(10,2) AS total_ms,
rows,
query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
Review the top 10 queries weekly. Optimizing even a single frequently executed query can reduce overall database load by 10-30%.
Frequently Asked Questions
How do we know when it is time to scale?
Monitor three metrics: query latency P95 (alert at 500ms), connection utilization (alert at 80%), and CPU utilization (alert at 70% sustained). If you are hitting these thresholds regularly, move to the next step on the scaling ladder. Do not pre-optimize --- scale when the data tells you to.
Read replicas or caching --- which first?
Start with caching. Redis caching is simpler to implement, eliminates more load (cache hits skip the database entirely), and costs less. Add read replicas when your cache hit rate is already above 80% but the primary database is still under pressure from cache misses and write operations.
How does database scaling work with Odoo?
Odoo uses PostgreSQL exclusively. Start with query optimization (Odoo generates complex queries for reporting). Add PgBouncer for connection pooling when you exceed 50 concurrent users. Use read replicas for reporting queries (configure Odoo's --db-replica option). ECOSIRE provides Odoo performance optimization including database tuning.
Is managed database (RDS/Cloud SQL) worth the premium?
Yes, for most businesses. Managed databases handle automated backups, patching, failover, and monitoring. The 30-40% cost premium over self-managed PostgreSQL is offset by the engineering time you save. The exception is large-scale deployments where the cost premium on a large instance exceeds the cost of a part-time DBA.
What Comes Next
Database scaling is one component of a broader infrastructure scaling strategy. Combine it with CDN optimization for static assets, Kubernetes auto-scaling for application pods, and load testing to validate your scaling decisions under realistic conditions.
Contact ECOSIRE for database optimization consulting, or see our DevOps guide for the full infrastructure roadmap.
Published by ECOSIRE -- helping businesses scale data infrastructure with confidence.
Written by
ECOSIRE Research and Development Team
Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.