Introduction

While database migrations have traditionally been imperative (writing step-by-step migration scripts), declarative solutions are gaining traction. pgschema differentiates itself from other declarative tools by:
  • Comprehensive Postgres Support: Handles virtually all schema-level database objects - tables, indexes, views, functions, procedures, triggers, policies, types, and more. Thoroughly tested against Postgres versions 14 through 17.
  • Schema-Level Focus: Designed for real-world Postgres usage patterns, from single-schema applications to multi-tenant architectures.
  • Terraform-Like Plan-Review-Apply Workflow: Generate a detailed migration plan before execution, review changes in human-readable, SQL, or JSON formats, then apply with confidence. No surprises in production.
  • Concurrent Change Detection: Built-in fingerprinting detects whether the database has changed between planning and execution, and refuses to apply a stale plan.
  • Online DDL Support: Automatically uses PostgreSQL’s non-blocking strategies to minimize downtime during schema changes.
  • Adaptive Transaction: Intelligently wraps migrations in transactions when possible, with automatic handling of operations like concurrent index creation that require special treatment.
  • Modular Schema Organization: Supports multi-file schema definitions for better team collaboration and ownership.
  • Dependency Management: Automatically resolves complex dependencies between database objects using topological sorting, ensuring operations execute in the correct order.
  • No Shadow Database Required: Unlike other declarative tools, it works directly with your schema files and target database - no temporary databases, no extra infrastructure.

Comprehensive Postgres Support

pgschema supports a wide range of Postgres features, including:
  • Tables, columns, and constraints (primary keys, foreign keys, unique constraints, check constraints)
  • Indexes (including partial, functional, and concurrent indexes)
  • Views and materialized views
  • Functions and stored procedures
  • Custom types and domains
  • Schemas and permissions
  • Row-level security (RLS) policies
  • Triggers and sequences
  • Comments on database objects
The tool is thoroughly tested against Postgres versions 14 through 17 with extensive test suites.
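
To make that list concrete, here is a small, self-contained snippet of the kind of desired-state SQL pgschema can manage; the object names are illustrative, not taken from pgschema's documentation:
-- Illustrative desired-state snippet (hypothetical object names)
CREATE TYPE user_role AS ENUM ('user', 'admin');

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email varchar(100) NOT NULL UNIQUE,
    role user_role DEFAULT 'user'
);

CREATE INDEX idx_users_role ON users (role);

CREATE VIEW admins AS
    SELECT id, email FROM users WHERE role = 'admin';

ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY users_self ON users
    USING (email = current_setting('app.current_email', true));

COMMENT ON TABLE users IS 'Application user accounts';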

Schema-Level Migration

Unlike many migration tools that operate at the database level, pgschema is designed to work at the schema level.

Why Schema-Level?

Single Schema Simplicity: Most Postgres applications use only the default public schema. For these cases, schema-level operations eliminate unnecessary complexity while providing all the functionality needed for effective schema management.

Multi-Tenant Architecture Support: For applications using multiple schemas, the predominant pattern is schema-per-tenant architecture, where each customer or tenant gets their own schema within the same database. The schema-level approach enables tenant schema reconciliation: ensuring all tenant schemas stay in sync with the canonical schema definition.
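
For readers unfamiliar with the pattern, schema-per-tenant simply means each tenant's objects live in a dedicated schema inside the same database; a minimal sketch with illustrative tenant names:
-- Schema-per-tenant sketch (illustrative)
CREATE SCHEMA tenant_123;
CREATE SCHEMA tenant_456;

-- The same canonical table definition exists in every tenant schema
CREATE TABLE tenant_123.users (id SERIAL PRIMARY KEY, email text NOT NULL);
CREATE TABLE tenant_456.users (id SERIAL PRIMARY KEY, email text NOT NULL);

-- Application sessions scope their queries to a single tenant
SET search_path TO tenant_123;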

Schema-Agnostic Migrations

The tool intelligently handles schema qualifiers to create portable, schema-agnostic dumps and migrations:
# Dump from 'tenant_123' schema
pgschema dump \
  --host localhost \
  --user postgres \
  --db myapp \
  --schema tenant_123 \
  > schema.sql
# Plan applying the same schema definition to 'tenant_456'
pgschema plan \
  --host localhost \
  --user postgres \
  --db myapp \
  --schema tenant_456 \
  --file schema.sql
Here’s what a schema-level dump looks like - notice how it creates schema-agnostic SQL:
--
-- pgschema database dump
--

--
-- Name: users; Type: TABLE; Schema: -; Owner: -
--

CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    username varchar(100) NOT NULL,
    email varchar(100) NOT NULL,
    role public.user_role DEFAULT 'user',
    status public.status DEFAULT 'active',
    created_at timestamp DEFAULT now()
);

--
-- Name: posts; Type: TABLE; Schema: -; Owner: -
--

CREATE TABLE IF NOT EXISTS posts (
    id SERIAL PRIMARY KEY,
    user_id integer REFERENCES users(id),
    title varchar(200) NOT NULL,
    content text,
    created_at timestamp DEFAULT now()
);
The tool automatically:
  • Strips schema qualifiers from table and index names (notice Schema: - in comments)
  • Removes schema prefixes from object references (users instead of tenant_123.users)
  • Preserves cross-schema type references where needed (public.user_role for shared types)
  • Creates portable DDL that can be applied to any target schema

Declarative Workflow

This workflow mirrors Terraform’s approach and fits GitOps nicely:

1. Dump: Extract Current Schema

pgschema dump \
  --host localhost \
  --user postgres \
  --db myapp \
  > schema.sql
This extracts your database schema into a clean, readable SQL file that represents the current state.

2. Edit: Define Desired State

Edit the dumped schema file to reflect your desired changes. Instead of writing ALTER TABLE statements, you write CREATE TABLE definitions to specify the desired end state.
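
For example, to add a name column to the users table from the dump above, you would edit its CREATE TABLE definition rather than write an ALTER statement; the unchanged columns are exactly those from the earlier dump:
-- schema.sql (edited desired state)
CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    username varchar(100) NOT NULL,
    email varchar(100) NOT NULL,
    name varchar(100),                -- newly added column
    role public.user_role DEFAULT 'user',
    status public.status DEFAULT 'active',
    created_at timestamp DEFAULT now()
);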

3. Plan: Preview Changes

pgschema plan \
  --host localhost \
  --user postgres \
  --db myapp \
  --file schema.sql \
  --output-human \
  --output-json plan.json \
  --output-sql plan.sql
This compares your desired schema (from the file) with the current database state and generates a migration plan. Here is where the tool differs from Terraform: the plan supports multiple output formats.
The human-readable format is well suited for reviewing changes during development:
Plan: 1 to alter.

Summary by type:
  tables: 1 to alter

Tables:
  ~ users

Transaction: true

DDL to be executed:
--------------------------------------------------

ALTER TABLE users ADD COLUMN name varchar(100);

4. Apply: Execute Changes

pgschema apply \
  --plan plan.json \
  --host localhost \
  --user postgres \
  --db myapp
This executes the planned changes safely, with built-in protections against concurrent schema modifications and proper transaction handling.

Concurrent Change Detection

The separation between the plan and apply phases is a key safety feature, but it also introduces a potential risk: what if the database changes between planning and execution? That window could lead to applying an outdated or conflicting migration. The tool solves this with fingerprinting - a cryptographic check that verifies the database state you planned against is the same state you are applying changes to.

How Fingerprinting Works

During Plan Generation: When you run pgschema plan, it calculates a cryptographic fingerprint of the current database schema state and embeds it in the plan file:
{
  "version": "1.0.0",
  "pgschema_version": "1.0.0",
  "created_at": "2025-08-12T17:44:43+08:00",
  "source_fingerprint": {
    "hash": "965b1131737c955e24c7f827c55bd78e4cb49a75adfd04229e0ba297376f5085"
  },
  "diffs": [...]
}
During Apply Execution: Before executing any changes, pgschema apply recalculates the current database fingerprint and compares it with the fingerprint stored in the plan. If the two do not match, the apply is aborted.

Safety in Action

This fingerprinting prevents dangerous scenarios:
# Team member A generates plan
pgschema plan \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --file schema.sql \
  --output-json plan.json

# Time passes... Team member B makes emergency schema change

# Team member A tries to apply original plan
pgschema apply \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --plan plan.json

# ❌ Error: schema fingerprint mismatch detected - the database schema has changed since the plan was generated.
#
#          schema fingerprint mismatch - expected: 965b1131737c955e, actual: abc123456789abcd

Recovering from a Fingerprint Mismatch

When a fingerprint mismatch occurs, the right approach is to reestablish the baseline with the current database state:
# 1. Get the latest schema from the current database
pgschema dump \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  > current-schema.sql

# 2. Merge your intended changes with the current state
#    and save the result as updated-schema.sql

# 3. Plan the changes against the updated baseline
pgschema plan \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --file updated-schema.sql \
  --output-json new-plan.json

# 4. Apply with the fresh plan
pgschema apply \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --plan new-plan.json
This fingerprinting mechanism maintains the integrity of the Plan-Review-Apply workflow, ensuring that concurrent modifications don’t lead to unexpected or conflicting database changes.

Online DDL Operations

The tool automatically uses PostgreSQL's non-blocking features to minimize downtime during schema changes, including concurrent index creation, NOT VALID constraint patterns, and safe NOT NULL additions.

Example - Concurrent Index Creation: Index operations automatically use CREATE INDEX CONCURRENTLY to avoid blocking table writes:
-- Generated migration for index addition
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email_status ON users (email, status DESC);

-- pgschema:wait - Built-in progress monitoring
SELECT 
    COALESCE(i.indisvalid, false) as done,
    CASE 
        WHEN p.blocks_total > 0 THEN p.blocks_done * 100 / p.blocks_total
        ELSE 0
    END as progress
FROM pg_class c
LEFT JOIN pg_index i ON c.oid = i.indexrelid
LEFT JOIN pg_stat_progress_create_index p ON c.oid = p.index_relid
WHERE c.relname = 'idx_users_email_status';
The pgschema:wait directive blocks migration execution, polls the database to monitor progress, and automatically continues when operations complete. For comprehensive details, see the Online DDL documentation.
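
The NOT VALID pattern mentioned above is the standard Postgres technique for adding constraints to large tables without long locks: the constraint is added without scanning existing rows, then validated in a separate step that does not block writes. A sketch of what such a two-step migration can look like (constraint name is hypothetical; pgschema's exact output may differ):
-- Step 1: add the foreign key without validating existing rows (only a brief lock)
ALTER TABLE posts
    ADD CONSTRAINT posts_user_id_fkey
    FOREIGN KEY (user_id) REFERENCES users (id) NOT VALID;

-- Step 2: validate existing rows without blocking concurrent writes
ALTER TABLE posts VALIDATE CONSTRAINT posts_user_id_fkey;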

Adaptive Transaction

The system automatically determines transaction boundaries based on the SQL operations in your migration plan. Each diff in the plan contains a can_run_in_transaction field that indicates whether that particular SQL can run inside a transaction.
plan.json snippet
  "diffs": [
    ...
    {
      "sql": "ALTER TABLE department\nADD CONSTRAINT department_dept_name_key UNIQUE (dept_name);",
      "type": "table.constraint",
      "operation": "create",
      "path": "public.department.department_dept_name_key",
      "source": {
        "schema": "public",
        "table": "department",
        "name": "department_dept_name_key",
        "type": "UNIQUE",
        "columns": [
          {
            "name": "dept_name",
            "position": 1
          }
        ]
      },
      "can_run_in_transaction": true <--------- transaction indication
    },
    ...
  ]

Smart Transaction Wrapping

  • If all diffs can run in a transaction: The entire plan is wrapped in a single transaction for maximum atomicity
  • If any diff cannot run in a transaction: Each diff runs in its own transaction to maintain isolation
At the schema level, among all the DDL operations pgschema supports, only CREATE INDEX CONCURRENTLY cannot run inside a transaction.
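
To make the two modes concrete, here is an illustrative sketch (not literal pgschema output) of how execution differs; the first two statements are taken from the plan examples above, while the index name is hypothetical:
-- All diffs transaction-safe: the whole plan runs inside one transaction
BEGIN;
ALTER TABLE users ADD COLUMN name varchar(100);
ALTER TABLE department ADD CONSTRAINT department_dept_name_key UNIQUE (dept_name);
COMMIT;

-- Plan contains a non-transactional diff: each statement runs in its own
-- implicit transaction instead
ALTER TABLE users ADD COLUMN name varchar(100);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_name ON users (name);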

Modular Schema Organization

For large applications and teams, managing your database schema as a single monolithic file becomes unwieldy. The modular approach transforms your schema into an organized structure that enables better collaboration, clearer ownership, and easier maintenance.

Breaking Down the Monolith

Instead of a single large schema file, the tool can organize your database objects into logical, manageable files with the --multi-file option:
# Initialize modular structure from existing database
mkdir -p schema
pgschema dump \
  --host localhost \
  --db myapp \
  --user postgres \
  --multi-file \
  --file schema/main.sql

# Examine the generated structure
tree schema/
This creates an organized directory structure. The main.sql file serves as the entry point, containing Postgres \i (include) directives that reference individual component files.
schema/
├── main.sql                    # Entry point with include directives
├── types/
│   ├── user_status.sql
│   ├── order_status.sql
│   └── address.sql
├── domains/
│   ├── email_address.sql
│   └── positive_integer.sql
├── sequences/
│   ├── global_id_seq.sql
│   └── order_number_seq.sql
├── tables/
│   ├── users.sql
│   └── orders.sql
├── functions/
│   ├── update_timestamp.sql
│   ├── get_user_count.sql
│   └── get_order_count.sql
├── procedures/
│   ├── cleanup_orders.sql
│   └── update_status.sql
└── views/
    ├── user_summary.sql
    └── order_details.sql
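
Continuing the example, main.sql itself is just a thin entry point; a sketch of what the generated file might look like, using the files from the tree above (exact formatting and path handling may differ):
-- schema/main.sql (illustrative)
\i types/user_status.sql
\i types/order_status.sql
\i types/address.sql
\i domains/email_address.sql
\i domains/positive_integer.sql
\i sequences/global_id_seq.sql
\i sequences/order_number_seq.sql
\i tables/users.sql
\i tables/orders.sql
\i functions/update_timestamp.sql
\i functions/get_user_count.sql
\i functions/get_order_count.sql
\i procedures/cleanup_orders.sql
\i procedures/update_status.sql
\i views/user_summary.sql
\i views/order_details.sql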

Team Collaboration Benefits

This modular approach enables powerful team collaboration patterns.

Granular Ownership: Use GitHub's CODEOWNERS to assign different teams to different parts of your schema:
# .github/CODEOWNERS
schema/tables/users*           @user-team
schema/tables/orders*          @orders-team  
schema/tables/products*        @inventory-team
schema/views/user_*            @user-team
schema/views/order_*           @orders-team
schema/functions/*inventory*   @inventory-team
Independent Development: Teams can work on separate schema components without conflicts, test their changes in isolation, then combine them for coordinated production deployments.

Focused Code Reviews: Instead of reviewing massive schema changes, reviewers can focus on the specific files relevant to their domain expertise.

Same Workflow, Better Organization

The declarative workflow remains identical - you’re still defining desired state and letting the tool generate the migration plan:
# Teams edit their respective schema files to define desired state
# Example: Add new column to users table
vim schema/tables/users.sql

# Generate migration plan from modular schema
pgschema plan \
  --host localhost \
  --db myapp \
  --user postgres \
  --file schema/main.sql \
  --output-human \
  --output-json plan.json \
  --output-sql plan.sql

# Apply the coordinated changes
pgschema apply \
  --host localhost \
  --db myapp \
  --user postgres \
  --plan plan.json
The multi-file approach scales from small teams working on focused components to large organizations managing complex schemas with hundreds of objects, while maintaining the same simple, declarative workflow.

Dependency Management

As schemas grow in complexity, manually ordering DDL statements becomes increasingly error-prone. A single schema file with hundreds of objects becomes unmanageable: you are constantly scrolling and reordering statements to satisfy dependencies.

The modular schema approach introduces even greater challenges. With schema files spread across directories, tracking dependencies between objects becomes nearly impossible. A trigger in triggers/audit.sql might depend on a function in functions/timestamps.sql, which itself references types defined in types/custom.sql.

Database objects often depend on each other: triggers need functions, views reference tables, and foreign keys require the referenced tables to exist first. The tool automatically resolves these dependencies using topological sorting, ensuring operations execute in the correct order regardless of how your schema files are organized.

Automatic Dependency Resolution

Consider this modular schema example, where you're adding a table along with a trigger that depends on a function:
-- tables/users.sql
CREATE TABLE public.users (
    id serial PRIMARY KEY,
    name text NOT NULL,
    email text UNIQUE,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp DEFAULT CURRENT_TIMESTAMP
);

-- triggers/update_users_modified_time.sql
CREATE TRIGGER update_users_modified_time
    BEFORE UPDATE ON public.users
    FOR EACH ROW
    EXECUTE FUNCTION public.update_modified_time();

-- functions/update_modified_time.sql
CREATE OR REPLACE FUNCTION public.update_modified_time()
RETURNS trigger AS $$
BEGIN
    NEW.updated_at = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
The topological sort algorithm ensures that:
  • Tables are created before their indexes, constraints, and triggers
  • Functions are created before triggers that use them
  • Referenced tables exist before foreign key constraints
  • Views are created after all their dependent tables and functions
This automatic dependency management eliminates manual ordering errors, so migrations never fail simply because statements were written in the wrong order.
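
For the files above, the sorted plan would create the function before the trigger even though the files could be listed in any order; the resulting execution order looks roughly like this (illustrative, derived from the example definitions):
-- 1. Table first: it has no dependencies on other new objects
CREATE TABLE public.users (
    id serial PRIMARY KEY,
    name text NOT NULL,
    email text UNIQUE,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp DEFAULT CURRENT_TIMESTAMP
);

-- 2. Function next, because the trigger references it
CREATE OR REPLACE FUNCTION public.update_modified_time()
RETURNS trigger AS $$
BEGIN
    NEW.updated_at = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- 3. Trigger last, once both the table and the function exist
CREATE TRIGGER update_users_modified_time
    BEFORE UPDATE ON public.users
    FOR EACH ROW
    EXECUTE FUNCTION public.update_modified_time();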

No Shadow Database Required

Unlike many declarative schema migration tools, this approach doesn’t require a separate “shadow” or “dev” database to compute schema differences. Other tools typically create a temporary database, apply your desired schema to it, then compare this shadow database with your target database to generate migrations. This shadow database approach has several drawbacks:
  • Counterintuitive Configuration: Most developers stumble on the additional shadow database flags and need to read documentation to understand what they’re used for
  • Complex Setup: Requires permissions to create temporary databases and to clean them up afterwards
  • Additional Infrastructure: Demands extra database resources, maintenance, and can cause inconsistencies with different Postgres versions or extensions

IR Normalization

This tool eliminates these complexities by working directly with your schema files and target database through an Intermediate Representation (IR) system:
 SQL Files          Database System Catalogs
 (Desired)                 (Current)
     |                         |
     | Parse                   | Query
     v                         v
                 IR
             (Normalized)
                 |
                 v
            Diff Engine
                 |
                 v
           Migration Plan
1. SQL Parsing: Your desired-state SQL files are parsed and converted into a normalized Intermediate Representation.
2. Database Introspection: The target database schema is introspected through Postgres system catalogs (pg_class, pg_attribute, pg_constraint, etc.) and converted into the same normalized IR format.
3. Direct Comparison: The diff engine compares these two normalized IR structures directly, identifying what changes are needed without requiring any temporary databases.
4. Migration Generation: The differences are converted into properly ordered DDL statements that transform the current state into the desired state.
The IR normalization ensures that regardless of whether schema information comes from SQL parsing or database introspection, it is represented consistently for accurate comparison, enabling precise migrations without the overhead and complexity of shadow databases.
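
As an illustration of step 2, all of the information pgschema needs is available from the system catalogs; a simplified query in that spirit (not the tool's actual query) lists the columns of ordinary tables in a schema:
-- Simplified catalog introspection (illustrative, not pgschema's actual query)
SELECT c.relname    AS table_name,
       a.attname    AS column_name,
       format_type(a.atttypid, a.atttypmod) AS data_type,
       a.attnotnull AS not_null
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname = 'public'
  AND c.relkind = 'r'          -- ordinary tables only
  AND a.attnum > 0             -- skip system columns
  AND NOT a.attisdropped
ORDER BY c.relname, a.attnum;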

Acknowledgements

pgschema wouldn’t exist without these amazing open source projects:
  • sqlc - For generating code to extract schema from target databases
  • pg_query_go - Go bindings for libpg_query, enabling reliable Postgres SQL parsing. In fact, sqlc also uses pg_query_go under the hood! It’s the secret weapon in the Postgres ecosystem that allows tools to build near-perfect Postgres compatibility.
  • testcontainers - Enabling comprehensive integration testing with real Postgres instances
Design reference came from:
  • Terraform - For figuring out the right Plan-Review-Apply workflow that developers love
  • pg-schema-diff - Reading GitHub issues to understand use cases and studying the codebase. For example, I knew sqlc beforehand, but when I saw pg-schema-diff uses sqlc to fetch schema information, it immediately clicked.
And the real inspiration came from hundreds of customer conversations over 4+ years building Bytebase. Developers kept telling us: “Why can’t database migrations be as simple as Terraform?” Well, now they can be. Special thanks to Claude Code for sweating millions of tokens to make this project possible. Turns out AI is pretty good at writing database migration tools – who knew? 🤖
Ready to give it a try?