Database Scaling Strategies | Pillai Infotech LLP

Q: When should I shard my database?

Only after exhausting vertical scaling, read replicas, caching, and query optimization — and you have a genuine write throughput or storage problem (typically > 1-2TB).

Q: Can I scale without application changes?

Vertical scaling and connection pooling require minimal/no code changes. Caching and sharding always require application modifications.

Q: Redis or Memcached for caching?

Redis in almost all cases — it supports data structures, persistence, pub/sub. Memcached only for pure key-value at extreme scale.

Q: How do I handle database scaling with microservices?

Each service owns its database (natural domain sharding). Cross-service data flows through APIs and events, not database joins.

In This Guide

1. Signs You Need to Scale
2. Vertical Scaling — The Quick Fix
3. Read Replicas — 80% of Your Reads, Offloaded
4. Connection Pooling — The Overlooked Bottleneck
5. Caching Layers — Don't Hit the Database at All
6. Sharding — The Nuclear Option
7. Query Optimization — Free Performance
8. The Scaling Playbook — What to Do When
9. Frequently Asked Questions

Every database hits a wall. The question isn't if — it's when, and whether you'll be ready. We've watched startups go from "our PostgreSQL handles everything" to "why are queries taking 30 seconds?" in a matter of weeks once they hit product-market fit. The good news: database scaling is a solved problem. The bad news: most teams reach for the wrong solution first.

This guide walks through every scaling strategy in the order you should consider them — from free optimizations to architectural overhauls. Most teams never need to go past step 5.

1. Signs You Need to Scale

Before you redesign anything, confirm you actually have a scaling problem — not a bad query problem.

Symptom	Likely Cause	First Action
Single query takes > 5s	Missing index or bad query	EXPLAIN ANALYZE
All queries slow during peak	CPU/memory saturated	Vertical scale or read replicas
Connection errors / timeouts	Connection pool exhausted	Connection pooler (PgBouncer)
Write throughput plateaued	Single-writer bottleneck	Partitioning, then sharding
Storage growing > 500GB	Data retention / archival needed	Partitioning + cold storage
Replication lag > 1s	Replica overloaded	Add replicas or split workloads

Our Rule of Thumb: If a single EXPLAIN ANALYZE + index addition can solve it, don't architect. We've seen teams spend weeks designing a sharding strategy when a composite index would've fixed the problem in 10 minutes.

2. Vertical Scaling — The Quick Fix

Vertical scaling means giving your existing database a bigger machine — more CPU, RAM, faster SSDs. It's unglamorous, but it works, and it's the simplest scaling strategy.

Cloud Provider	Max Instance	vCPUs	RAM	Cost/Month
AWS RDS	db.r7g.16xlarge	64	512 GB	~$8,500
GCP Cloud SQL	db-perf-optimized	128	864 GB	~$12,000
Azure Flexible	E96ds_v5	96	672 GB	~$9,200

When to vertical scale: Your CPU or RAM is consistently above 70% utilization, you're under 1TB data, and you haven't optimized queries yet. Vertical scaling buys you time to implement proper horizontal scaling.

When to stop: When doubling machine size gives less than 30% improvement, you've hit the vertical ceiling. Time for horizontal strategies.

3. Read Replicas — 80% of Your Reads, Offloaded

Most applications are read-heavy — often 80-90% reads vs 10-20% writes. Read replicas let you distribute that read load across multiple database copies while writes go to a single primary.

How Read Replicas Work

Application
    │
    ├── WRITE (INSERT/UPDATE/DELETE)
    │       └── Primary Database
    │               │
    │               ├── Async replication ──→ Replica 1 (analytics queries)
    │               ├── Async replication ──→ Replica 2 (API read traffic)
    │               └── Async replication ──→ Replica 3 (search/reporting)
    │
    └── READ (SELECT)
            └── Load Balancer → Replica 1, 2, or 3

Database	Max Replicas	Replication Type	Typical Lag
PostgreSQL	Unlimited (practical: 5-10)	Streaming (async/sync)	< 100ms
MySQL	Unlimited (practical: 5-10)	Binary log (async/semi-sync)	< 100ms
Aurora PostgreSQL	15	Storage-level replication	< 20ms
Cloud SQL (GCP)	20	Managed async	< 100ms

Application-Level Read/Write Splitting

// Database router — Laravel example
class DatabaseRouter {
    public function select($query) {
        // Route reads to replica
        return DB::connection('replica')->select($query);
    }

    public function insert($query, $bindings) {
        // Route writes to primary
        return DB::connection('primary')->insert($query, $bindings);
    }
}

// Django — built-in router
DATABASES = {
    'default': {'HOST': 'primary.db.internal'},
    'replica': {'HOST': 'replica.db.internal'},
}
DATABASE_ROUTERS = ['myapp.routers.ReadReplicaRouter']

// Spring Boot — @Transactional(readOnly = true) routes to replica

Watch Out — Replication Lag: If a user writes data and immediately reads it from a replica, they might see stale data. Route "read-after-write" queries to the primary for a few seconds, or use synchronous replication (at the cost of write latency).

4. Connection Pooling — The Overlooked Bottleneck

Each database connection consumes memory (5-10 MB in PostgreSQL). Serverless architectures and microservices create thousands of connections. A connection pooler sits between your app and database, multiplexing many app connections onto fewer database connections.

Pooler	Database	Mode	Best For
PgBouncer	PostgreSQL	Transaction pooling	High-connection serverless
Pgcat	PostgreSQL	Transaction + sharding	Multi-tenant sharded setups
ProxySQL	MySQL	Query routing + pooling	Read/write split + failover
Supavisor	PostgreSQL	Tenant-aware pooling	Multi-tenant SaaS

PgBouncer Configuration

[databases]
myapp = host=primary.db.internal port=5432 dbname=myapp

[pgbouncer]
listen_port = 6432
listen_addr = 0.0.0.0

; Transaction pooling — connection returned after each transaction
pool_mode = transaction

; 20 actual database connections serve 2,000 app connections
default_pool_size = 20
max_client_conn = 2000

; Reserve connections for admin
reserve_pool_size = 5
reserve_pool_timeout = 3

5. Caching Layers — Don't Hit the Database at All

The fastest database query is the one you never make. Caching stores frequently accessed data in memory (Redis, Memcached) so your application skips the database entirely for repeat reads.

Cache Pattern	How It Works	Best For	Staleness Risk
Cache-Aside	App checks cache → miss → query DB → store in cache	General purpose, user profiles	TTL-dependent
Write-Through	Write to cache + DB simultaneously	Data that's read immediately after write	Low
Write-Behind	Write to cache → async flush to DB	High-write workloads (counters, analytics)	Data loss risk on crash
Read-Through	Cache itself fetches from DB on miss	CDN patterns, materialized views	TTL-dependent

Cache-Aside Pattern — Redis Example

async function getUser(userId) {
    const cacheKey = `user:${userId}`;

    // 1. Check cache
    const cached = await redis.get(cacheKey);
    if (cached) return JSON.parse(cached);

    // 2. Cache miss — query database
    const user = await db.query('SELECT * FROM users WHERE id = $1', [userId]);

    // 3. Store in cache with TTL (5 minutes)
    await redis.setex(cacheKey, 300, JSON.stringify(user));

    return user;
}

// Invalidate on write
async function updateUser(userId, data) {
    await db.query('UPDATE users SET ... WHERE id = $1', [userId, ...data]);
    await redis.del(`user:${userId}`);  // Delete, don't update — avoids race conditions
}

Cache Invalidation Is Hard: "There are only two hard things in computer science: cache invalidation and naming things." Delete the cache key on write — don't try to update it. Use short TTLs (1-5 min) for data that changes frequently. For our Redis patterns deep-dive, we cover advanced invalidation strategies.

6. Sharding — The Nuclear Option

Sharding splits your data across multiple database instances. Each shard holds a subset of data. It's the most powerful scaling technique — and the most complex to implement and operate.

Sharding Strategy	How It Works	Pros	Cons
Hash-Based	shard = hash(key) % N	Even distribution	Resharding requires data migration
Range-Based	Shard by date, ID range, region	Range queries stay on one shard	Hotspots (recent data shard)
Directory-Based	Lookup table maps key → shard	Flexible, can rebalance easily	Lookup table is a SPOF
Tenant-Based	Each tenant/customer on own shard	Natural isolation, easy compliance	Uneven load if tenants vary

Consistent Hashing — Minimizing Resharding Pain

// Naive hashing — adding a shard moves ALL data
shard = hash(user_id) % 4   // user 12345 → shard 1
shard = hash(user_id) % 5   // user 12345 → shard 0 (DIFFERENT!)

// Consistent hashing — adding a shard moves ~1/N data
class ConsistentHash {
    constructor(nodes, virtualNodes = 150) {
        this.ring = new SortedMap();
        nodes.forEach(node => {
            for (let i = 0; i < virtualNodes; i++) {
                const hash = md5(`${node}:${i}`);
                this.ring.set(hash, node);
            }
        });
    }

    getShard(key) {
        const hash = md5(key);
        // Find first node clockwise on the ring
        const entry = this.ring.firstAfter(hash) || this.ring.first();
        return entry.value;
    }
}

What Sharding Breaks

Before you shard, understand what you lose:

JOINs across shards — queries spanning multiple shards require scatter-gather, which is slow
Transactions — ACID guarantees don't extend across database instances without distributed transactions (2PC)
Unique constraints — uniqueness is per-shard, not global. Need a separate uniqueness service
Aggregations — COUNT, SUM, AVG require querying all shards and merging results
Schema migrations — every shard needs the migration, and they can't all run simultaneously

Consider Managed Sharding First: Before building your own sharding layer, evaluate Citus (PostgreSQL), Vitess (MySQL), CockroachDB, or YugabyteDB. These handle shard routing, rebalancing, and distributed queries for you. DIY sharding is a multi-year investment — we've seen teams underestimate the operational overhead by 5-10x.

7. Query Optimization — Free Performance

Before scaling infrastructure, optimize what you have. We've seen 10-100x improvements from query optimization alone.

Index Strategy — The Biggest Win

-- BEFORE: Full table scan (120 seconds on 50M rows)
SELECT * FROM orders WHERE customer_id = 12345 AND status = 'active';

-- EXPLAIN shows: Seq Scan on orders (cost=0.00..1847293.00 rows=50000000)

-- FIX: Composite index matching the WHERE clause
CREATE INDEX idx_orders_customer_status ON orders(customer_id, status);

-- AFTER: Index scan (2ms)
-- EXPLAIN shows: Index Scan using idx_orders_customer_status (cost=0.56..8.72 rows=15)

-- RULE: Column order in composite index matters!
-- Put equality conditions first, range conditions last
CREATE INDEX idx_orders_date ON orders(status, customer_id, created_at DESC);
-- Works for: WHERE status = 'active' AND customer_id = 123 AND created_at > '2026-01-01'

Optimization	Impact	Effort	When to Use
Add missing indexes	10-1000x	Low	Always first
Rewrite N+1 → JOIN/batch	5-50x	Medium	ORM-heavy code
Pagination (cursor-based)	2-10x	Low	OFFSET > 10,000
Partitioning (by date)	2-20x	Medium	Tables > 100M rows with date queries
Materialized views	10-100x	Low	Complex aggregation dashboards
SELECT only needed columns	1.5-5x	Low	Wide tables, SELECT *

8. The Scaling Playbook — What to Do When

Follow this decision tree based on your current scale. Each step should be fully implemented before moving to the next.

Stage	Scale	Strategy	Effort	Headroom Gained
0	Any	Query optimization + indexing	Hours	10-100x for bad queries
1	< 1TB	Vertical scaling (bigger instance)	Minutes	2-4x
2	Read-heavy	Read replicas + connection pooling	Days	3-10x reads
3	Repeat reads	Redis/Memcached caching layer	Days	10-50x for cached data
4	> 100M rows	Table partitioning	Days	2-20x for partition-pruned queries
5	> 1TB / write-heavy	Sharding (or NewSQL: CockroachDB, Vitess)	Months	Near-linear horizontal scaling

What We've Learned Building Scaled Systems: The most common mistake is jumping to sharding too early. We've consulted for teams running sharded PostgreSQL at 50GB of data — all the complexity, none of the benefit. Exhaust vertical scaling, read replicas, and caching first. You'd be surprised how far a well-optimized single PostgreSQL instance can go — we've seen single instances handle 50,000 transactions/second comfortably with proper indexing and connection pooling.

Frequently Asked Questions

When should I shard my database?

Only after you've exhausted vertical scaling, read replicas, caching, and query optimization — and you have a genuine write throughput problem or your data exceeds what a single instance can store (typically > 1-2TB). Most applications never need sharding. If you're unsure, you probably don't need it yet.

Can I scale without application changes?

Vertical scaling and managed read replicas (like Aurora) require minimal code changes. Connection pooling (PgBouncer) requires zero code changes. Caching and sharding always require application-level modifications. The more impactful the strategy, the more code changes it needs.

Redis or Memcached for caching?

Redis in almost all cases. It supports data structures (sorted sets, lists, hashes), persistence, pub/sub, and Lua scripting. Memcached is simpler and slightly faster for pure key-value caching, but the feature gap has widened. Pick Memcached only if you need multi-threaded performance at extreme scale and don't need data structures.

How do I handle database scaling with microservices?

Each microservice owns its database — this is natural sharding by domain (orders service has orders DB, users service has users DB). Cross-service queries go through APIs, not database joins. Use eventual consistency via events (Kafka) for data that spans services. This pattern scales well but adds complexity to transactions.

Should I use a managed database or self-hosted?

Managed (RDS, Cloud SQL, Aurora) unless you have a dedicated DBA team. The operational overhead of backups, failover, patching, and monitoring is substantial. Managed databases cost 20-40% more but save hundreds of engineering hours. Self-host only if you need specific configurations that managed services don't support, or at massive scale where the cost premium becomes prohibitive.

🗄️

Pillai Infotech LLP

We design and implement database scaling strategies for growing applications — from query optimization to multi-region sharding. Let's scale your database.

Redis Caching Patterns: Beyond Simple Key-Value → PostgreSQL vs MySQL: Database Comparison for 2026 → NoSQL Databases Guide: MongoDB, Redis, Cassandra, and DynamoDB →

Database Scaling Strategies: Sharding, Replication, and Caching

In This Guide

1. Signs You Need to Scale

2. Vertical Scaling — The Quick Fix

3. Read Replicas — 80% of Your Reads, Offloaded

How Read Replicas Work

Application-Level Read/Write Splitting

4. Connection Pooling — The Overlooked Bottleneck

PgBouncer Configuration

5. Caching Layers — Don't Hit the Database at All

Cache-Aside Pattern — Redis Example

6. Sharding — The Nuclear Option

Consistent Hashing — Minimizing Resharding Pain

What Sharding Breaks

7. Query Optimization — Free Performance

Index Strategy — The Biggest Win

8. The Scaling Playbook — What to Do When

Frequently Asked Questions

When should I shard my database?

Can I scale without application changes?

Redis or Memcached for caching?

How do I handle database scaling with microservices?

Should I use a managed database or self-hosted?

Pillai Infotech LLP

Related Articles

Database Scaling Strategies: Sharding, Replication, and Caching

In This Guide

1. Signs You Need to Scale

2. Vertical Scaling — The Quick Fix

3. Read Replicas — 80% of Your Reads, Offloaded

How Read Replicas Work

Application-Level Read/Write Splitting

4. Connection Pooling — The Overlooked Bottleneck

PgBouncer Configuration

5. Caching Layers — Don't Hit the Database at All

Cache-Aside Pattern — Redis Example

6. Sharding — The Nuclear Option

Consistent Hashing — Minimizing Resharding Pain

What Sharding Breaks

7. Query Optimization — Free Performance

Index Strategy — The Biggest Win

8. The Scaling Playbook — What to Do When

Frequently Asked Questions

When should I shard my database?

Can I scale without application changes?

Redis or Memcached for caching?

How do I handle database scaling with microservices?

Should I use a managed database or self-hosted?

Pillai Infotech LLP

Related Articles

Book a Free Consultation

Your Details

Pick a 30-min Slot

Thank You!