Database Migration Strategies | Pillai Infotech LLP

Q: Is zero-downtime migration really possible?

Yes, using expand-contract for schema changes and dual-write + shadow reads for engine migrations. The tradeoff is increased complexity and duration.

Q: How do I migrate a 500GB database?

Initial bulk load with COPY/pg_dump, then CDC (Debezium/AWS DMS) for ongoing changes. Cut over once CDC catches up and consistency is verified.

Q: What's the biggest migration mistake teams make?

Not testing with production-volume data. A 2-second dev migration can lock tables for 45 minutes on 100M rows.

Q: Should I use an ORM migration tool or raw SQL?

ORM tools for simple schema changes. Raw SQL for complex migrations — ORM-generated SQL often includes unnecessary locks.

Q: How do I handle foreign keys during migration?

Migrate parent tables first, then child tables. Add foreign keys as NOT VALID first, backfill, then VALIDATE separately.

In This Guide

1. Types of Database Migrations
2. Schema Migrations — The Expand-Contract Pattern
3. Engine Migrations — Changing Databases Entirely
4. The Dual-Write Pattern
5. Shadow Database Strategy
6. Large-Scale Data Migration
7. Rollback Strategies
8. Migration Checklist
9. Frequently Asked Questions

Database migrations are among the riskiest operations in software engineering. A bad migration can take down your application, corrupt data, or create inconsistencies that take weeks to untangle. We've been through enough of these — both smooth and painful — to know what works and what doesn't.

This guide covers every migration scenario: schema changes that don't lock your tables, engine swaps that don't lose data, and rollback plans that actually work when things go wrong. The common thread: never do anything you can't undo.

1. Types of Database Migrations

Migration Type	Example	Risk Level	Typical Downtime
Schema change	Add column, rename table	Low–Medium	0 (if done right)
Data transformation	Backfill new column, split table	Medium	0 (batched)
Version upgrade	PostgreSQL 15 → 16	Medium	Minutes (replica promotion)
Engine migration	MySQL → PostgreSQL	High	0 (dual-write), minutes (cutover)
Cloud migration	Self-hosted → RDS/Cloud SQL	High	0 (with DMS/replication)

2. Schema Migrations — The Expand-Contract Pattern

The expand-contract pattern is the gold standard for zero-downtime schema changes. Instead of modifying in place, you expand (add new), migrate data, then contract (remove old).

Example: Renaming a Column (the safe way)

-- WRONG: This breaks all existing queries instantly
ALTER TABLE users RENAME COLUMN name TO full_name;  -- 💥 Downtime

-- RIGHT: Expand-Contract (3 deployments, 0 downtime)

-- Deploy 1: EXPAND — add new column
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
-- Backfill (batched to avoid locks):
UPDATE users SET full_name = name WHERE id BETWEEN 1 AND 10000;
UPDATE users SET full_name = name WHERE id BETWEEN 10001 AND 20000;
-- ...continue in batches

-- Deploy 2: MIGRATE — app writes to both, reads from new
-- Application code:
-- INSERT INTO users (name, full_name, ...) VALUES ($1, $1, ...);
-- SELECT full_name FROM users WHERE ...;

-- Deploy 3: CONTRACT — remove old column (after verifying all reads use new)
ALTER TABLE users DROP COLUMN name;

Dangerous Schema Operations (and Safe Alternatives)

Operation	Risk	Safe Alternative
ADD COLUMN with DEFAULT	Table lock (MySQL < 8.0)	ADD COLUMN, then backfill in batches
ADD NOT NULL constraint	Full table scan to validate	Add CHECK constraint with NOT VALID, then VALIDATE separately
CREATE INDEX	Table lock during build	CREATE INDEX CONCURRENTLY (PostgreSQL)
Change column type	Table rewrite	Add new column, backfill, swap reads, drop old
DROP COLUMN	Breaks code still reading it	Remove all reads first, then drop in separate deploy
RENAME TABLE	Breaks all queries	Create new table, dual-write, migrate reads, drop old

Tools That Help: gh-ost (GitHub) and pt-online-schema-change (Percona) for MySQL — they create shadow tables and swap atomically. pgroll and pg_repack for PostgreSQL. Rails has strong_migrations gem that blocks dangerous migrations. Django has django-pg-zero-downtime-migrations.

3. Engine Migrations — Changing Databases Entirely

Switching from MySQL to PostgreSQL, or from PostgreSQL to a NoSQL database, is a major undertaking. Here's the proven four-phase approach.

The Four Phases

Phase 1: PREPARE (2-4 weeks)
├── Audit all queries — identify incompatible syntax
├── Build abstraction layer (Repository pattern)
├── Set up new database with equivalent schema
├── Build data validation scripts (row counts, checksums)
└── Create rollback plan

Phase 2: DUAL-WRITE (1-2 weeks)
├── Write to BOTH old and new databases
├── Read from OLD database only
├── Verify new database has consistent data
└── Fix any sync discrepancies

Phase 3: SHADOW-READ (1-2 weeks)
├── Read from BOTH databases
├── Compare results, log mismatches
├── Primary reads still from old database
└── Fix all mismatches before proceeding

Phase 4: CUTOVER (1 day)
├── Switch reads to new database
├── Keep dual-writes for rollback window (24-48 hours)
├── Monitor error rates, latency, data consistency
└── Disable old database writes after rollback window

4. The Dual-Write Pattern

Dual-write means every write goes to both the old and new database. It's the foundation of zero-downtime engine migrations, but it has subtle pitfalls.

Dual-Write Implementation

class OrderRepository {
    constructor(
        private oldDb: MySQLConnection,
        private newDb: PostgreSQLConnection,
        private migrationPhase: 'dual-write' | 'shadow-read' | 'cutover'
    ) {}

    async createOrder(order: Order): Promise<Order> {
        // Always write to old (source of truth during migration)
        const result = await this.oldDb.insert('orders', order);

        // Write to new database — async, don't block on failure
        try {
            await this.newDb.insert('orders', this.transformForNewDb(order));
        } catch (error) {
            // Log for reconciliation, DON'T fail the request
            logger.error('Dual-write to new DB failed', { orderId: order.id, error });
            await this.reconciliationQueue.add({ table: 'orders', id: order.id });
        }

        return result;
    }

    async getOrder(id: string): Promise<Order> {
        if (this.migrationPhase === 'cutover') {
            return this.newDb.query('SELECT * FROM orders WHERE id = $1', [id]);
        }

        const oldResult = await this.oldDb.query('SELECT * FROM orders WHERE id = ?', [id]);

        if (this.migrationPhase === 'shadow-read') {
            // Read from new DB too, compare results, log mismatches
            const newResult = await this.newDb.query('SELECT * FROM orders WHERE id = $1', [id]);
            if (!this.deepEqual(oldResult, newResult)) {
                logger.warn('Shadow read mismatch', { id, old: oldResult, new: newResult });
            }
        }

        return oldResult;  // Always return from old during migration
    }
}

Dual-Write Pitfalls:

Ordering issues — if write to old succeeds but new fails, then retries, you can get out-of-order writes. Use idempotent writes.
Increased latency — writing to two databases doubles write time. Make the secondary write async.
Transaction consistency — you can't have ACID transactions across two databases. Accept eventual consistency during migration.

5. Shadow Database Strategy

An alternative to dual-write: use Change Data Capture (CDC) to replicate from old to new. This is less invasive — no application code changes for the write path.

Application ──→ Old Database (MySQL)
                    │
                    │  CDC (Debezium / DMS)
                    ▼
              Shadow Database (PostgreSQL)
                    │
                    │  Validation job (compares row counts, checksums)
                    ▼
              Dashboard: "99.97% consistent, 12 rows divergent"

When consistency reaches 100% and stays there for 48 hours → safe to cut over

Approach	Pros	Cons	Best For
Dual-Write	Full control, works with any DB	App code changes, consistency risk	Different DB engines
CDC (Debezium)	No app changes, real-time sync	Setup complexity, transform limitations	Same schema, different engine
AWS DMS	Managed, handles schema conversion	AWS lock-in, limited customization	Cloud migrations

6. Large-Scale Data Migration

Moving terabytes of data without locking tables or consuming all your database's resources requires careful batching.

Batched Data Backfill

#!/bin/bash
# Batched migration — 1000 rows at a time with throttling

BATCH_SIZE=1000
LAST_ID=0
TOTAL=0

while true; do
    # Migrate next batch
    COUNT=$(psql -t -A -c "
        INSERT INTO new_orders (id, customer_id, amount, status, created_at)
        SELECT id, customer_id, amount, status, created_at
        FROM old_orders
        WHERE id > $LAST_ID
        ORDER BY id
        LIMIT $BATCH_SIZE
        ON CONFLICT (id) DO NOTHING
        RETURNING 1
    " | wc -l)

    # Update cursor
    LAST_ID=$(psql -t -A -c "
        SELECT COALESCE(MAX(id), $LAST_ID) FROM new_orders WHERE id > $LAST_ID
    ")

    TOTAL=$((TOTAL + COUNT))
    echo "Migrated $TOTAL rows (last_id: $LAST_ID)"

    # No more rows — done
    if [ "$COUNT" -lt "$BATCH_SIZE" ]; then
        echo "Migration complete: $TOTAL rows"
        break
    fi

    # Throttle — sleep between batches to avoid overwhelming the database
    sleep 0.1
done

Data Volume	Recommended Approach	Estimated Time
< 1 GB	pg_dump / mysqldump	Minutes
1–100 GB	Batched INSERT with throttling	Hours
100 GB – 1 TB	Parallel workers + CDC for ongoing changes	Hours to days
> 1 TB	Initial bulk load (COPY) + CDC for delta sync	Days

7. Rollback Strategies

Every migration needs a tested rollback plan. The best rollback is one you never need — but the second-best is one that's been practiced.

Migration Type	Rollback Strategy	Rollback Time	Data Loss Risk
Add column	Drop column (safe if not read yet)	Seconds	None
Engine migration	Keep dual-writes active, switch reads back	Minutes (config change)	None (both DBs have data)
Version upgrade	Promote old replica back to primary	Minutes	Writes since cutover
Data transformation	Keep old column until verified, revert reads	Seconds (deploy revert)	None (old data preserved)

Our Rollback Rule: Never delete old data, columns, or tables until at least one full business cycle (usually one week) after migration. Keep the old path available. The cost of extra storage is trivial compared to the cost of not being able to roll back. We've seen teams delete the old column on Friday, discover a bug on Monday, and spend the next week reconstructing data.

8. Migration Checklist

Use this checklist for any non-trivial database migration. Skip items at your own risk.

Pre-Migration

Document rollback plan — specific commands, not "revert if needed"
Test migration on staging with production-size data
Measure migration duration on staging (multiply by 1.5x for production)
Notify on-call team and stakeholders of migration window
Take a backup (and verify you can restore from it)
Set up monitoring dashboards for error rates and latency

During Migration

Monitor database CPU, connections, replication lag
Watch application error rates in real-time
Run data validation queries continuously
Have rollback command ready to execute

Post-Migration

Verify row counts match (old vs new)
Run checksums on critical tables
Test all critical application flows end-to-end
Keep old database running for rollback window (minimum 48 hours)
Schedule cleanup of old resources after rollback window

Frequently Asked Questions

Is zero-downtime migration really possible?

Yes, for most migration types. Schema changes use expand-contract. Engine migrations use dual-write + shadow reads. The tradeoff is increased complexity and migration duration — what could be a 5-minute maintenance window becomes a multi-week gradual rollout. For most production systems, the extra complexity is worth it.

How do I migrate a 500GB database?

Initial bulk load with COPY or pg_dump (hours, can run during low traffic), then CDC (Debezium or AWS DMS) for ongoing changes. Once CDC catches up and consistency is verified, cut over. The initial load is the slow part — CDC keeps it in sync after that.

What's the biggest migration mistake teams make?

Not testing with production-volume data. A migration that takes 2 seconds on a dev database with 1,000 rows can lock your table for 45 minutes on a production table with 100 million rows. Always test on a staging environment with realistic data volumes before touching production.

Should I use an ORM migration tool or raw SQL?

ORM migration tools (Django migrations, Alembic, Rails migrations) are fine for simple schema changes. For complex migrations — data transformations, engine switches, performance-sensitive operations — write raw SQL. ORM-generated SQL often includes unnecessary locks or suboptimal patterns. Always review the generated SQL before running in production.

How do I handle foreign keys during migration?

Migrate parent tables first, then child tables. If using expand-contract, add the new foreign key as NOT VALID first (PostgreSQL), backfill data, then VALIDATE CONSTRAINT separately. In MySQL, temporarily disable foreign key checks during bulk loads (SET FOREIGN_KEY_CHECKS=0), but re-enable and validate immediately after.

🗄️

Pillai Infotech LLP

We plan and execute database migrations for production systems — from schema changes to full engine swaps. Let's plan your migration.

PostgreSQL vs MySQL: Database Comparison for 2026 → Database Scaling Strategies: Sharding, Replication, and Caching → Data Engineering Fundamentals: Building Modern Data Pipelines →

Database Migration Strategies: Zero-Downtime Approaches

In This Guide

1. Types of Database Migrations

2. Schema Migrations — The Expand-Contract Pattern

Example: Renaming a Column (the safe way)

Dangerous Schema Operations (and Safe Alternatives)

3. Engine Migrations — Changing Databases Entirely

The Four Phases

4. The Dual-Write Pattern

Dual-Write Implementation

5. Shadow Database Strategy

6. Large-Scale Data Migration

Batched Data Backfill

7. Rollback Strategies

8. Migration Checklist

Pre-Migration

During Migration

Post-Migration

Frequently Asked Questions

Is zero-downtime migration really possible?

How do I migrate a 500GB database?

What's the biggest migration mistake teams make?

Should I use an ORM migration tool or raw SQL?

How do I handle foreign keys during migration?

Pillai Infotech LLP

Related Articles

Database Migration Strategies: Zero-Downtime Approaches

In This Guide

1. Types of Database Migrations

2. Schema Migrations — The Expand-Contract Pattern

Example: Renaming a Column (the safe way)

Dangerous Schema Operations (and Safe Alternatives)

3. Engine Migrations — Changing Databases Entirely

The Four Phases

4. The Dual-Write Pattern

Dual-Write Implementation

5. Shadow Database Strategy

6. Large-Scale Data Migration

Batched Data Backfill

7. Rollback Strategies

8. Migration Checklist

Pre-Migration

During Migration

Post-Migration

Frequently Asked Questions

Is zero-downtime migration really possible?

How do I migrate a 500GB database?

What's the biggest migration mistake teams make?

Should I use an ORM migration tool or raw SQL?

How do I handle foreign keys during migration?

Pillai Infotech LLP

Related Articles

Book a Free Consultation

Your Details

Pick a 30-min Slot

Thank You!