Zero-Downtime Database Migrations with Drizzle ORM
Database migrations are the most dangerous operation in production deployments. A migration that locks a table — even for 200ms — causes connection timeouts that cascade into 500 errors that cascade into alert storms at 2 AM. Most ORM migration tutorials show you the "easy" path: shut down the app, run the migration, restart. That is not zero-downtime; that is scheduled maintenance.
Real zero-downtime migrations require a different mental model: your schema must be backward-compatible with the currently-running application code throughout the deployment process. This means the old code and new code must both work simultaneously during the rollout window. The expand-contract pattern makes this possible. Drizzle ORM's SQL-first approach gives you the control to implement it correctly.
Key Takeaways
- Never run drizzle-kit push in production — generate and review SQL, then apply with control
- The expand-contract pattern: add (expand) → deploy new code → remove old (contract)
- Adding a nullable column is safe; adding NOT NULL without a default is dangerous
- Renaming a column requires: add new column → backfill → update code → remove old column (4 steps)
- Use PostgreSQL CREATE INDEX CONCURRENTLY to avoid table locks during index creation
- Test migrations on a production-sized copy of your database before applying to production
- Always have a rollback plan: for every migration, write and test the rollback SQL
- Schema changes that require table rewrites (ALTER TYPE on a large table) need maintenance windows
Drizzle ORM Schema Setup
// packages/db/src/schema/contacts.ts
import { pgTable, uuid, varchar, text, timestamp, pgEnum, index } from 'drizzle-orm/pg-core';
export const contactStatusEnum = pgEnum('contact_status', ['active', 'inactive', 'archived']);
export const contacts = pgTable('contacts', {
id: uuid('id').primaryKey().defaultRandom(),
organizationId: uuid('organization_id').notNull(),
name: varchar('name', { length: 255 }).notNull(),
email: varchar('email', { length: 255 }),
phone: varchar('phone', { length: 50 }),
status: contactStatusEnum('status').default('active').notNull(),
notes: text('notes'),
createdAt: timestamp('created_at', { withTimezone: true }).defaultNow().notNull(),
updatedAt: timestamp('updated_at', { withTimezone: true }).defaultNow().notNull(),
}, (table) => ({
orgIdIdx: index('contacts_org_id_idx').on(table.organizationId),
emailIdx: index('contacts_email_idx').on(table.email),
statusIdx: index('contacts_status_idx').on(table.status),
}));
Generate migrations:
# Generate SQL migration (never auto-apply to production)
npx drizzle-kit generate
# Review the generated SQL before applying
cat drizzle/0001_add_contact_status.sql
# Apply to development database
npx drizzle-kit migrate
# For production: apply via your deployment pipeline
The Expand-Contract Pattern
The core principle: every schema change is split into multiple backward-compatible deployments.
Phase 1: Expand (Add Without Breaking)
Deploy schema changes that are additive only — new columns with defaults, new tables, new indexes.
-- Migration: 0010_expand_add_company.sql
-- Safe to apply while old code is running: nullable columns, no table rewrite
ALTER TABLE contacts
ADD COLUMN IF NOT EXISTS company_id UUID,
ADD COLUMN IF NOT EXISTS company_name VARCHAR(255);
-- Old code ignores these columns; new code uses them
-- Both versions coexist during rolling deployment
// Drizzle schema after expansion
export const contacts = pgTable('contacts', {
// ... existing columns ...
companyId: uuid('company_id'), // nullable — old code ignores it
companyName: varchar('company_name', { length: 255 }), // nullable — safe
});
Phase 2: Deploy New Application Code
Deploy the new app version that reads and writes both old and new columns. During rolling deployment, old instances (no company support) and new instances (company support) run simultaneously — both work because the column is nullable.
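As a sketch of the dual-compatible write path (hypothetical helper, not from the original code): new instances populate the expanded columns when data is available, while rows written by old instances simply leave them null.

```typescript
// Hypothetical dual-write helper used during the rollout window.
// Old app instances never set companyId/companyName; because both
// columns are nullable, rows from either version remain valid.
interface ContactInput {
  name: string;
  email?: string;
  companyId?: string;
  companyName?: string;
}

function buildContactRow(input: ContactInput) {
  return {
    name: input.name,
    email: input.email ?? null,
    // Expanded columns: default to null so old readers are unaffected
    companyId: input.companyId ?? null,
    companyName: input.companyName ?? null,
  };
}

// A row from old-style input is still valid after the expand phase
const legacyRow = buildContactRow({ name: 'Acme Contact' });
console.log(legacyRow.companyName); // null
```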
Phase 3: Backfill
Populate the new column for existing rows in small batches to avoid table locks:
-- Migration: 0011_backfill_company_name.sql
-- Run in small batches to avoid long-held locks.
-- Caveat: a DO block executes as a single transaction, so row locks are held
-- until the block commits. For true lock release between batches, drive the
-- batches from an external script, one statement per batch.
DO $$
DECLARE
batch_size INTEGER := 1000;
rows_updated INTEGER;
BEGIN
LOOP
UPDATE contacts
SET company_name = 'Unknown'
WHERE id IN (
SELECT id FROM contacts
WHERE company_name IS NULL
ORDER BY id
LIMIT batch_size
);
GET DIAGNOSTICS rows_updated = ROW_COUNT;
EXIT WHEN rows_updated = 0;
-- No OFFSET needed: updated rows no longer match company_name IS NULL,
-- so each iteration automatically picks up the next batch
-- Brief pause between batches to reduce I/O pressure
PERFORM pg_sleep(0.1);
END LOOP;
END $$;
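The batches can also be driven from a script, issuing each UPDATE as its own statement so row locks are released between batches. A sketch with a hypothetical helper (the table and column names come from the example above; the helper itself is not from the original code):

```typescript
// Hypothetical batch-SQL builder for an external backfill driver.
// Each generated statement updates at most `batchSize` rows; because
// updated rows no longer match the IS NULL predicate, executing the
// same statement repeatedly walks through the whole table.
function backfillBatchSql(
  table: string,
  column: string,
  literal: string, // trusted SQL literal (a constant), never user input
  batchSize: number,
): string {
  return (
    `UPDATE ${table} SET ${column} = ${literal} ` +
    `WHERE id IN (SELECT id FROM ${table} ` +
    `WHERE ${column} IS NULL ORDER BY id LIMIT ${batchSize})`
  );
}

const sql = backfillBatchSql('contacts', 'company_name', `'Unknown'`, 1000);
console.log(sql.includes('LIMIT 1000')); // true
```

A driver would loop, executing this statement (each in its own transaction) until the affected row count reaches zero, optionally sleeping between batches.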
Phase 4: Contract (Remove Old Column)
After all application code is updated and backfill is complete, remove the old column or constraint.
Safe Migration Patterns
Adding a NOT NULL Column
-- WRONG — will fail if table has rows (no default, no nullable)
ALTER TABLE contacts ADD COLUMN tier VARCHAR(20) NOT NULL;
-- CAUTION — on PostgreSQL 10 and earlier this rewrites and locks the whole
-- table; since PostgreSQL 11 a constant default is stored as metadata and
-- the operation is nearly instant (volatile defaults still force a rewrite)
ALTER TABLE contacts ADD COLUMN tier VARCHAR(20) NOT NULL DEFAULT 'free';
-- CORRECT on any version — three-step approach
-- Step 1: Add nullable column
ALTER TABLE contacts ADD COLUMN tier VARCHAR(20);
-- Step 2: Backfill existing rows (batch this on large tables, as above)
UPDATE contacts SET tier = 'free' WHERE tier IS NULL;
-- Step 3 (next deployment): Add the NOT NULL constraint. Note: SET NOT NULL
-- scans the table under an ACCESS EXCLUSIVE lock; on very large tables add a
-- CHECK (tier IS NOT NULL) NOT VALID constraint first, VALIDATE it, then
-- SET NOT NULL (PostgreSQL 12+ uses the validated check to skip the scan)
ALTER TABLE contacts ALTER COLUMN tier SET NOT NULL;
ALTER TABLE contacts ALTER COLUMN tier SET DEFAULT 'free';
Renaming a Column (4-Step Process)
-- Step 1: Add the new column
ALTER TABLE contacts ADD COLUMN full_name VARCHAR(255);
-- Step 2: Backfill + keep in sync with a trigger
UPDATE contacts SET full_name = name;
CREATE OR REPLACE FUNCTION sync_full_name() RETURNS trigger AS $$
BEGIN
NEW.full_name := NEW.name;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER contacts_sync_full_name
BEFORE INSERT OR UPDATE OF name ON contacts
FOR EACH ROW EXECUTE FUNCTION sync_full_name();
-- Step 3: Deploy new code that keeps writing name (the trigger mirrors it
-- into full_name) and reads full_name with a fallback to name
-- Step 4 (next deployment, after all instances updated):
DROP TRIGGER contacts_sync_full_name ON contacts;
DROP FUNCTION sync_full_name();
ALTER TABLE contacts DROP COLUMN name;
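During step 3, read paths should tolerate rows written by either version. A minimal sketch (hypothetical helper, field names matching the example above):

```typescript
// Hypothetical read helper for the transition window: prefer the new
// column, fall back to the old one for rows not yet backfilled.
interface ContactRow {
  name?: string | null;      // old column, still present during transition
  full_name?: string | null; // new column, may be null pre-backfill
}

function displayName(row: ContactRow): string {
  return row.full_name ?? row.name ?? '';
}

console.log(displayName({ name: 'Ada Lovelace' }));      // "Ada Lovelace"
console.log(displayName({ full_name: 'Grace Hopper' })); // "Grace Hopper"
```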
Changing an Enum Type
PostgreSQL has no ALTER TYPE ... DROP VALUE: removing an enum value requires creating a replacement type and rewriting the column. For safe enum changes:
-- Adding a new enum value is fast and non-blocking, but before PostgreSQL 12
-- it cannot run inside a transaction block; keep it in its own migration
-- if your migrator wraps files in transactions
ALTER TYPE contact_status ADD VALUE IF NOT EXISTS 'pending';
-- Removing an enum value requires a full table rewrite
-- Use the expand-contract pattern: add a new enum type, migrate, drop old
CREATE TYPE contact_status_new AS ENUM ('active', 'inactive', 'archived', 'pending');
-- Migrate data: drop the old-type default first (otherwise the cast fails
-- because the default still has the old type); note this rewrite takes an
-- ACCESS EXCLUSIVE lock on the table
ALTER TABLE contacts ALTER COLUMN status DROP DEFAULT;
ALTER TABLE contacts
ALTER COLUMN status TYPE contact_status_new
USING status::text::contact_status_new;
ALTER TABLE contacts ALTER COLUMN status SET DEFAULT 'active';
DROP TYPE contact_status;
ALTER TYPE contact_status_new RENAME TO contact_status;
Index Creation Without Locking
Regular CREATE INDEX takes a SHARE lock that blocks all writes (reads continue) for the duration. On a large table, this can take minutes.
-- WRONG — locks writes during index creation
CREATE INDEX contacts_email_idx ON contacts(email);
-- CORRECT — builds without blocking writes
-- (note: CONCURRENTLY cannot run inside a transaction block)
CREATE INDEX CONCURRENTLY IF NOT EXISTS contacts_email_idx
ON contacts(email)
WHERE email IS NOT NULL; -- Partial index for better performance
-- Drizzle note: drizzle-kit does not generate CONCURRENTLY by default
-- Edit the generated SQL migration to add CONCURRENTLY before applying
In your Drizzle migration file, manually add CONCURRENTLY after generation:
-- drizzle/0012_add_email_index.sql (edited after generation)
CREATE INDEX CONCURRENTLY IF NOT EXISTS contacts_email_idx
ON contacts (email)
WHERE email IS NOT NULL;
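Since the CONCURRENTLY edit is manual, it is easy to forget on the next migration. One guard (a sketch of a CI lint, not a drizzle-kit feature) is to flag any generated migration that creates an index without CONCURRENTLY:

```typescript
// Hypothetical CI lint: returns true when the SQL creates an index
// (including UNIQUE indexes) without CONCURRENTLY.
function hasNonConcurrentIndex(sql: string): boolean {
  return /CREATE\s+(UNIQUE\s+)?INDEX(?!\s+CONCURRENTLY)/i.test(sql);
}

console.log(hasNonConcurrentIndex('CREATE INDEX foo ON t(c);'));                            // true
console.log(hasNonConcurrentIndex('CREATE INDEX CONCURRENTLY IF NOT EXISTS foo ON t(c);')); // false
```

Run it over every new file in drizzle/ during CI and fail the build on a match; small tables can be allow-listed case by case.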
Drizzle Migration in Production CI/CD
Never use drizzle-kit push in production: it applies schema changes directly, with no reviewable migration file. Use drizzle-kit migrate with generated SQL files that are version-controlled:
# scripts/deploy-production.sh
set -euo pipefail
echo "=== Running database migrations ==="
# drizzle-kit migrate is a no-op when no migrations are pending,
# so it is safe to run unconditionally; fail the deploy on error
if npx drizzle-kit migrate; then
echo "Migrations applied successfully"
else
echo "Migration failed; aborting deployment" >&2
exit 1
fi
// Programmatic migration in main.ts (NestJS)
import { join } from 'node:path';
import postgres from 'postgres';
import { drizzle } from 'drizzle-orm/postgres-js';
import { migrate } from 'drizzle-orm/postgres-js/migrator';
async function runMigrations() {
const migrationClient = postgres(process.env.DATABASE_URL!, { max: 1 });
const db = drizzle(migrationClient);
await migrate(db, {
migrationsFolder: join(__dirname, '..', '..', '..', 'drizzle'),
});
await migrationClient.end();
}
// In bootstrap(), before app.listen():
if (process.env.RUN_MIGRATIONS === 'true') {
await runMigrations();
}
Rollback Strategies
Every migration must have a companion rollback migration written and tested before deployment:
-- drizzle/0013_add_company_id.sql (forward migration)
ALTER TABLE contacts ADD COLUMN IF NOT EXISTS company_id UUID;
CREATE INDEX CONCURRENTLY IF NOT EXISTS contacts_company_id_idx
ON contacts(company_id);
-- drizzle/rollbacks/0013_rollback_add_company_id.sql
DROP INDEX CONCURRENTLY IF EXISTS contacts_company_id_idx;
ALTER TABLE contacts DROP COLUMN IF EXISTS company_id;
For emergency rollback:
# Emergency rollback script
# Run the rollback SQL, then redeploy the previous app version
psql "$DATABASE_URL" < drizzle/rollbacks/0013_rollback_add_company_id.sql
git checkout HEAD~1
pnpm build && pm2 restart ecosystem.config.cjs --update-env
Migration Testing Strategy
// packages/db/src/tests/migration.spec.ts
import { describe, it, expect, beforeAll } from 'vitest';
import { migrate } from 'drizzle-orm/postgres-js/migrator';
import postgres from 'postgres';
import { drizzle } from 'drizzle-orm/postgres-js';
describe('Database Migrations', () => {
let sql: ReturnType<typeof postgres>;
beforeAll(async () => {
sql = postgres(process.env.TEST_DATABASE_URL!);
});
it('all migrations apply cleanly on a fresh database', async () => {
const db = drizzle(sql);
await expect(migrate(db, { migrationsFolder: './drizzle' }))
.resolves.not.toThrow();
});
it('migrations are idempotent (can be applied twice safely)', async () => {
const db = drizzle(sql);
await migrate(db, { migrationsFolder: './drizzle' });
// Running again should be a no-op, not an error
await expect(migrate(db, { migrationsFolder: './drizzle' }))
.resolves.not.toThrow();
});
});
Dangerous Operations Requiring Maintenance Windows
Some operations cannot be made zero-downtime on very large tables (hundreds of millions of rows):
| Operation | Risk | Mitigation |
|---|---|---|
| ALTER TYPE USING on large column | Full table rewrite, long lock | Schedule maintenance window |
| ADD COLUMN NOT NULL without default | Immediate fail if rows exist | Use three-step expand-contract |
| CREATE INDEX (without CONCURRENTLY) | Write lock for duration | Always use CONCURRENTLY |
| ALTER TABLE ... RENAME TO | Instant, but breaks old code | Only rename after all code references the new name |
| DROP TABLE | Irreversible | Always have a backup; 30-day soft-delete policy |
| VACUUM FULL | Table lock for duration | Use regular VACUUM + schedule FULL during maintenance |
Frequently Asked Questions
What is the difference between drizzle-kit push and drizzle-kit migrate?
push directly applies your current schema to the database without generating migration files — useful for development where you do not care about migration history. migrate applies the SQL files in your drizzle/ directory in order, tracking which have been applied. Always use migrate in production so you have a controlled, reviewable, version-controlled history of every schema change.
How does Drizzle handle the drizzle_migrations tracking table?
Drizzle creates a drizzle_migrations (or __drizzle_migrations) table that records which migration files have been applied and when. Before applying a migration, it checks this table. Do not manually insert or delete rows from this table — it is Drizzle's state tracking mechanism. If you need to revert to a previous migration state, run your rollback SQL and manually delete the corresponding row.
How do I test migrations against a production-sized database?
Restore a sanitized (PII-removed) copy of your production database to a staging server. Apply the migration there and measure: wall clock time, lock wait time (from pg_stat_activity), table bloat (from pg_stat_user_tables), and impact on query performance (from EXPLAIN ANALYZE). If the migration takes more than 5 seconds on staging, redesign it.
Can I use transactions for database migrations?
Most DDL in PostgreSQL is transactional — you can wrap ALTER TABLE, CREATE TABLE, CREATE INDEX (but NOT CREATE INDEX CONCURRENTLY) in a transaction and roll back if any step fails. Drizzle wraps each migration file in a transaction by default. For migrations that use CONCURRENTLY, you must split them into a separate migration file that runs outside a transaction.
How do I handle Drizzle enum comparisons in TypeScript?
Drizzle types an enum column as a union of string literals ('active' | 'inactive' | 'archived'), so comparisons like contact.status === 'active' and where clauses like where(eq(contacts.status, 'active')) type-check without casts. Casts only become tempting when the compared value arrives as a plain string (a request body or query parameter); in that case, validate and narrow the string to the union type rather than asserting with as, which can hide genuinely invalid values at runtime.
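For the plain-string case, a small type guard narrows the value safely. A sketch (the literal union mirrors the pgEnum defined earlier; the helper names are illustrative):

```typescript
// The values mirror the contact_status pgEnum from the schema above.
const CONTACT_STATUSES = ['active', 'inactive', 'archived'] as const;
type ContactStatus = (typeof CONTACT_STATUSES)[number];

// Type guard: narrows an arbitrary string to ContactStatus, instead of
// asserting with `as ContactStatus` and risking an invalid value.
function isContactStatus(value: string): value is ContactStatus {
  return (CONTACT_STATUSES as readonly string[]).includes(value);
}

const input: string = 'active';
if (isContactStatus(input)) {
  // Here `input` is typed as ContactStatus and can be passed to
  // eq(contacts.status, input) without a cast
  console.log(input); // "active"
}
```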
Next Steps
Zero-downtime database migrations require discipline — but the alternative (downtime, data corruption, emergency rollbacks at 3 AM) is far worse. The expand-contract pattern, CREATE INDEX CONCURRENTLY, and proper rollback preparation make schema changes a routine, safe operation.
ECOSIRE manages database schemas for NestJS applications with Drizzle ORM, PostgreSQL 17, and a zero-downtime deployment pipeline tested across 65+ schema files and 300+ migrations. Explore our backend engineering services to learn how we handle database operations safely at scale.
Written by
ECOSIRE Research and Development Team
Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.