Introduction

While database migrations have traditionally been imperative (writing step-by-step migration scripts), declarative solutions are gaining traction. pgschema differentiates itself from other declarative tools by:
  • Comprehensive Postgres Support: Handles virtually all schema-level database objects - tables, indexes, views, functions, procedures, triggers, policies, types, and more. Thoroughly tested against Postgres versions 14 through 17.
  • Schema-Level Focus: Designed for real-world Postgres usage patterns, from single-schema applications to multi-tenant architectures.
  • Terraform-Like Plan-Review-Apply Workflow: Generate a detailed migration plan before execution, review changes in human-readable, SQL, or JSON formats, then apply with confidence. No surprises in production.
  • Concurrent Change Detection: Built-in fingerprinting detects whether the database has changed between planning and execution, and refuses to apply a stale plan.
  • Online DDL Support: Automatically uses PostgreSQL’s non-blocking strategies to minimize downtime during schema changes.
  • Adaptive Transaction: Intelligently wraps migrations in transactions when possible, with automatic handling of operations like concurrent index creation that require special treatment.
  • Modular Schema Organization: Supports multi-file schema definitions for better team collaboration and ownership.
  • Dependency Management: Automatically resolves complex dependencies between database objects using topological sorting, ensuring operations execute in the correct order.
  • No Shadow Database Required: Unlike other declarative tools, it works directly with your schema files and target database - no temporary databases, no extra infrastructure.

Comprehensive Postgres Support

pgschema supports a wide range of Postgres features, including:
  • Tables, columns, and constraints (primary keys, foreign keys, unique constraints, check constraints)
  • Indexes (including partial, functional, and concurrent indexes)
  • Views and materialized views
  • Functions and stored procedures
  • Custom types and domains
  • Schemas and permissions
  • Row-level security (RLS) policies
  • Triggers and sequences
  • Comments on database objects
The tool is thoroughly tested against Postgres versions 14 through 17 with extensive test suites.
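
To make that list concrete, here is a small, self-contained snippet of the kind of desired-state SQL pgschema can manage; the object names are illustrative, not taken from pgschema's documentation:
-- Illustrative desired-state snippet (hypothetical object names)
CREATE TYPE user_role AS ENUM ('user', 'admin');

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email varchar(100) NOT NULL UNIQUE,
    role user_role DEFAULT 'user'
);

CREATE INDEX idx_users_role ON users (role);

CREATE VIEW admins AS
    SELECT id, email FROM users WHERE role = 'admin';

ALTER TABLE users ENABLE ROW LEVEL SECURITY;
CREATE POLICY users_self ON users
    USING (email = current_setting('app.current_email', true));

COMMENT ON TABLE users IS 'Application user accounts';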

Schema-Level Migration

Unlike many migration tools that operate at the database level, pgschema is designed to work at the schema level.

Why Schema-Level?

Single Schema Simplicity: Most Postgres applications use only the default public schema. For these cases, schema-level operations eliminate unnecessary complexity while providing all the functionality needed for effective schema management.

Multi-Tenant Architecture Support: For applications using multiple schemas, the predominant pattern is schema-per-tenant architecture, where each customer or tenant gets their own schema within the same database. The schema-level approach enables tenant schema reconciliation: ensuring all tenant schemas stay in sync with the canonical schema definition.
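
For readers unfamiliar with the pattern, schema-per-tenant simply means each tenant's objects live in a dedicated schema inside the same database; a minimal sketch with illustrative tenant names:
-- Schema-per-tenant sketch (illustrative)
CREATE SCHEMA tenant_123;
CREATE SCHEMA tenant_456;

-- The same canonical table definition exists in every tenant schema
CREATE TABLE tenant_123.users (id SERIAL PRIMARY KEY, email text NOT NULL);
CREATE TABLE tenant_456.users (id SERIAL PRIMARY KEY, email text NOT NULL);

-- Application sessions scope their queries to a single tenant
SET search_path TO tenant_123;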

Schema-Agnostic Migrations

The tool intelligently handles schema qualifiers to create portable, schema-agnostic dumps and migrations:
# Dump from 'tenant_123' schema
pgschema dump \
  --host localhost \
  --user postgres \
  --db myapp \
  --schema tenant_123 \
  > schema.sql
# Plan applying the same schema definition to 'tenant_456'
pgschema plan \
  --host localhost \
  --user postgres \
  --db myapp \
  --schema tenant_456 \
  --file schema.sql
Here’s what a schema-level dump looks like - notice how it creates schema-agnostic SQL:
--
-- pgschema database dump
--

--
-- Name: users; Type: TABLE; Schema: -; Owner: -
--

CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    username varchar(100) NOT NULL,
    email varchar(100) NOT NULL,
    role public.user_role DEFAULT 'user',
    status public.status DEFAULT 'active',
    created_at timestamp DEFAULT now()
);

--
-- Name: posts; Type: TABLE; Schema: -; Owner: -
--

CREATE TABLE IF NOT EXISTS posts (
    id SERIAL PRIMARY KEY,
    user_id integer REFERENCES users(id),
    title varchar(200) NOT NULL,
    content text,
    created_at timestamp DEFAULT now()
);
The tool automatically:
  • Strips schema qualifiers from table and index names (notice Schema: - in comments)
  • Removes schema prefixes from object references (users instead of tenant_123.users)
  • Preserves cross-schema type references where needed (public.user_role for shared types)
  • Creates portable DDL that can be applied to any target schema

Declarative Workflow

This workflow mirrors Terraform’s approach and fits GitOps nicely:

1. Dump: Extract Current Schema

pgschema dump \
  --host localhost \
  --user postgres \
  --db myapp \
  > schema.sql
This extracts your database schema into a clean, readable SQL file that represents the current state.

2. Edit: Define Desired State

Edit the dumped schema file to reflect your desired changes. Instead of writing ALTER TABLE statements, you write CREATE TABLE definitions to specify the desired end state.
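
For example, to add a name column to the users table from the dump above, you would edit its CREATE TABLE definition rather than write an ALTER statement; the unchanged columns are exactly those from the earlier dump:
-- schema.sql (edited desired state)
CREATE TABLE IF NOT EXISTS users (
    id SERIAL PRIMARY KEY,
    username varchar(100) NOT NULL,
    email varchar(100) NOT NULL,
    name varchar(100),                -- newly added column
    role public.user_role DEFAULT 'user',
    status public.status DEFAULT 'active',
    created_at timestamp DEFAULT now()
);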

3. Plan: Preview Changes

pgschema plan \
  --host localhost \
  --user postgres \
  --db myapp \
  --file schema.sql \
  --output-human \
  --output-json plan.json \
  --output-sql plan.sql
This compares your desired schema (from the file) with the current database state and generates a migration plan. Here is where the tool differs from Terraform: the plan supports multiple output formats.
The human-readable format is well suited for reviewing changes during development:
Plan: 1 to alter.

Summary by type:
  tables: 1 to alter

Tables:
  ~ users

Transaction: true

DDL to be executed:
--------------------------------------------------

ALTER TABLE users ADD COLUMN name varchar(100);

4. Apply: Execute Changes

pgschema apply \
  --plan plan.json \
  --host localhost \
  --user postgres \
  --db myapp
This executes the planned changes safely, with built-in protections against concurrent schema modifications and proper transaction handling.

Concurrent Change Detection

The separation between the plan and apply phases is a key safety feature, but it also introduces a potential risk: what if the database changes between planning and execution? That window could lead to applying an outdated or conflicting migration. The tool solves this with fingerprinting - a cryptographic check that verifies the database state you planned against is the same state you are applying changes to.

How Fingerprinting Works

During Plan Generation: When you run pgschema plan, it calculates a cryptographic fingerprint of the current database schema state and embeds it in the plan file:
{
  "version": "1.0.0",
  "pgschema_version": "1.0.0",
  "created_at": "2025-08-12T17:44:43+08:00",
  "source_fingerprint": {
    "hash": "965b1131737c955e24c7f827c55bd78e4cb49a75adfd04229e0ba297376f5085"
  },
  "diffs": [...]
}
During Apply Execution: Before executing any changes, pgschema apply recalculates the current database fingerprint and compares it with the fingerprint stored in the plan. If the two do not match, the apply is aborted.

Safety in Action

This fingerprinting prevents dangerous scenarios:
# Team member A generates plan
pgschema plan \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --file schema.sql \
  --output-json plan.json

# Time passes... Team member B makes emergency schema change

# Team member A tries to apply original plan
pgschema apply \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --plan plan.json

# ❌ Error: schema fingerprint mismatch detected - the database schema has changed since the plan was generated.
#
#          schema fingerprint mismatch - expected: 965b1131737c955e, actual: abc123456789abcd

Recovering from a Fingerprint Mismatch

When a fingerprint mismatch occurs, the right approach is to reestablish the baseline with the current database state:
# 1. Get the latest schema from the current database
pgschema dump \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  > current-schema.sql

# 2. Merge your intended changes with the current state
#    and save the result as updated-schema.sql

# 3. Plan the changes against the updated baseline
pgschema plan \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --file updated-schema.sql \
  --output-json new-plan.json

# 4. Apply with the fresh plan
pgschema apply \
  --host prod.db.com \
  --db myapp \
  --user postgres \
  --plan new-plan.json
This fingerprinting mechanism maintains the integrity of the Plan-Review-Apply workflow, ensuring that concurrent modifications don’t lead to unexpected or conflicting database changes.

Online DDL Operations

The tool automatically uses PostgreSQL's non-blocking features to minimize downtime during schema changes, including concurrent index creation, NOT VALID constraint patterns, and safe NOT NULL additions.

Example - Concurrent Index Creation: Index operations automatically use CREATE INDEX CONCURRENTLY to avoid blocking table writes:
-- Generated migration for index addition
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_email_status ON users (email, status DESC);

-- pgschema:wait - Built-in progress monitoring
SELECT 
    COALESCE(i.indisvalid, false) as done,
    CASE 
        WHEN p.blocks_total > 0 THEN p.blocks_done * 100 / p.blocks_total
        ELSE 0
    END as progress
FROM pg_class c
LEFT JOIN pg_index i ON c.oid = i.indexrelid
LEFT JOIN pg_stat_progress_create_index p ON c.oid = p.index_relid
WHERE c.relname = 'idx_users_email_status';
The pgschema:wait directive blocks migration execution, polls the database to monitor progress, and automatically continues when operations complete. For comprehensive details, see the Online DDL documentation.
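
The NOT VALID pattern mentioned above is the standard Postgres technique for adding constraints to large tables without long locks: the constraint is added without scanning existing rows, then validated in a separate step that does not block writes. A sketch of what such a two-step migration can look like (constraint name is hypothetical; pgschema's exact output may differ):
-- Step 1: add the foreign key without validating existing rows (only a brief lock)
ALTER TABLE posts
    ADD CONSTRAINT posts_user_id_fkey
    FOREIGN KEY (user_id) REFERENCES users (id) NOT VALID;

-- Step 2: validate existing rows without blocking concurrent writes
ALTER TABLE posts VALIDATE CONSTRAINT posts_user_id_fkey;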

Adaptive Transaction

The system automatically determines transaction boundaries based on the SQL operations in your migration plan. Each diff in the plan contains a can_run_in_transaction field that indicates whether that particular SQL can run inside a transaction.
plan.json snippet
  "diffs": [
    ...
    {
      "sql": "ALTER TABLE department\nADD CONSTRAINT department_dept_name_key UNIQUE (dept_name);",
      "type": "table.constraint",
      "operation": "create",
      "path": "public.department.department_dept_name_key",
      "source": {
        "schema": "public",
        "table": "department",
        "name": "department_dept_name_key",
        "type": "UNIQUE",
        "columns": [
          {
            "name": "dept_name",
            "position": 1
          }
        ]
      },
      "can_run_in_transaction": true <--------- transaction indication
    },
    ...
  ]

Smart Transaction Wrapping

  • If all diffs can run in a transaction: The entire plan is wrapped in a single transaction for maximum atomicity
  • If any diff cannot run in a transaction: Each diff runs in its own transaction to maintain isolation
At the schema level, among all the DDL operations pgschema supports, only CREATE INDEX CONCURRENTLY cannot run inside a transaction.
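
To make the two modes concrete, here is an illustrative sketch (not literal pgschema output) of how execution differs; the first two statements are taken from the plan examples above, while the index name is hypothetical:
-- All diffs transaction-safe: the whole plan runs inside one transaction
BEGIN;
ALTER TABLE users ADD COLUMN name varchar(100);
ALTER TABLE department ADD CONSTRAINT department_dept_name_key UNIQUE (dept_name);
COMMIT;

-- Plan contains a non-transactional diff: each statement runs in its own
-- implicit transaction instead
ALTER TABLE users ADD COLUMN name varchar(100);
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_users_name ON users (name);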

Modular Schema Organization

For large applications and teams, managing your database schema as a single monolithic file becomes unwieldy. The modular approach transforms your schema into an organized structure that enables better collaboration, clearer ownership, and easier maintenance.

Breaking Down the Monolith

Instead of a single large schema file, the tool can organize your database objects into logical, manageable files with the --multi-file option:
# Initialize modular structure from existing database
mkdir -p schema
pgschema dump \
  --host localhost \
  --db myapp \
  --user postgres \
  --multi-file \
  --file schema/main.sql

# Examine the generated structure
tree schema/
This creates an organized directory structure. The main.sql file serves as the entry point, containing Postgres \i (include) directives that reference individual component files.
schema/
├── main.sql                    # Entry point with include directives
├── types/
│   ├── user_status.sql
│   ├── order_status.sql
│   └── address.sql
├── domains/
│   ├── email_address.sql
│   └── positive_integer.sql
├── sequences/
│   ├── global_id_seq.sql
│   └── order_number_seq.sql
├── tables/
│   ├── users.sql
│   └── orders.sql
├── functions/
│   ├── update_timestamp.sql
│   ├── get_user_count.sql
│   └── get_order_count.sql
├── procedures/
│   ├── cleanup_orders.sql
│   └── update_status.sql
└── views/
    ├── user_summary.sql
    └── order_details.sql
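
Continuing the example, main.sql itself is just a thin entry point; a sketch of what the generated file might look like, using the files from the tree above (exact formatting and path handling may differ):
-- schema/main.sql (illustrative)
\i types/user_status.sql
\i types/order_status.sql
\i types/address.sql
\i domains/email_address.sql
\i domains/positive_integer.sql
\i sequences/global_id_seq.sql
\i sequences/order_number_seq.sql
\i tables/users.sql
\i tables/orders.sql
\i functions/update_timestamp.sql
\i functions/get_user_count.sql
\i functions/get_order_count.sql
\i procedures/cleanup_orders.sql
\i procedures/update_status.sql
\i views/user_summary.sql
\i views/order_details.sql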

Team Collaboration Benefits

This modular approach enables powerful team collaboration patterns.

Granular Ownership: Use GitHub's CODEOWNERS to assign different teams to different parts of your schema:
# .github/CODEOWNERS
schema/tables/users*           @user-team
schema/tables/orders*          @orders-team  
schema/tables/products*        @inventory-team
schema/views/user_*            @user-team
schema/views/order_*           @orders-team
schema/functions/*inventory*   @inventory-team
Independent Development: Teams can work on separate schema components without conflicts, test their changes in isolation, then combine them for coordinated production deployments.

Focused Code Reviews: Instead of reviewing massive schema changes, reviewers can focus on the specific files relevant to their domain expertise.

Same Workflow, Better Organization

The declarative workflow remains identical - you’re still defining desired state and letting the tool generate the migration plan:
# Teams edit their respective schema files to define desired state
# Example: Add new column to users table
vim schema/tables/users.sql

# Generate migration plan from modular schema
pgschema plan \
  --host localhost \
  --db myapp \
  --user postgres \
  --file schema/main.sql \
  --output-human \
  --output-json plan.json \
  --output-sql plan.sql

# Apply the coordinated changes
pgschema apply \
  --host localhost \
  --db myapp \
  --user postgres \
  --plan plan.json
The multi-file approach scales from small teams working on focused components to large organizations managing complex schemas with hundreds of objects, while maintaining the same simple, declarative workflow.

Dependency Management

As schemas grow in complexity, manually ordering DDL statements becomes increasingly error-prone. A single schema file with hundreds of objects becomes unmanageable: you are constantly scrolling and reordering statements to satisfy dependencies.

The modular schema approach introduces even greater challenges. With schema files spread across directories, tracking dependencies between objects becomes nearly impossible. A trigger in triggers/audit.sql might depend on a function in functions/timestamps.sql, which itself references types defined in types/custom.sql.

Database objects often depend on each other: triggers need functions, views reference tables, and foreign keys require the referenced tables to exist first. The tool automatically resolves these dependencies using topological sorting, ensuring operations execute in the correct order regardless of how your schema files are organized.

Automatic Dependency Resolution

Consider this modular schema example, where you're adding a table along with a trigger that depends on a function:
-- tables/users.sql
CREATE TABLE public.users (
    id serial PRIMARY KEY,
    name text NOT NULL,
    email text UNIQUE,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp DEFAULT CURRENT_TIMESTAMP
);

-- triggers/update_users_modified_time.sql
CREATE TRIGGER update_users_modified_time
    BEFORE UPDATE ON public.users
    FOR EACH ROW
    EXECUTE FUNCTION public.update_modified_time();

-- functions/update_modified_time.sql
CREATE OR REPLACE FUNCTION public.update_modified_time()
RETURNS trigger AS $$
BEGIN
    NEW.updated_at = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
The topological sort algorithm ensures that:
  • Tables are created before their indexes, constraints, and triggers
  • Functions are created before triggers that use them
  • Referenced tables exist before foreign key constraints
  • Views are created after all their dependent tables and functions
This automatic dependency management eliminates manual ordering errors, so migrations never fail simply because statements were written in the wrong order.
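
For the files above, the sorted plan would create the function before the trigger even though the files could be listed in any order; the resulting execution order looks roughly like this (illustrative, derived from the example definitions):
-- 1. Table first: it has no dependencies on other new objects
CREATE TABLE public.users (
    id serial PRIMARY KEY,
    name text NOT NULL,
    email text UNIQUE,
    created_at timestamp DEFAULT CURRENT_TIMESTAMP,
    updated_at timestamp DEFAULT CURRENT_TIMESTAMP
);

-- 2. Function next, because the trigger references it
CREATE OR REPLACE FUNCTION public.update_modified_time()
RETURNS trigger AS $$
BEGIN
    NEW.updated_at = CURRENT_TIMESTAMP;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- 3. Trigger last, once both the table and the function exist
CREATE TRIGGER update_users_modified_time
    BEFORE UPDATE ON public.users
    FOR EACH ROW
    EXECUTE FUNCTION public.update_modified_time();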

No Shadow Database Required

Unlike many declarative schema migration tools, this approach doesn’t require a separate “shadow” or “dev” database to compute schema differences. Other tools typically create a temporary database, apply your desired schema to it, then compare this shadow database with your target database to generate migrations. This shadow database approach has several drawbacks:
  • Counterintuitive Configuration: Most developers stumble on the additional shadow database flags and need to read documentation to understand what they’re used for
  • Complex Setup: Requires permissions to create temporary databases and to clean them up afterwards
  • Additional Infrastructure: Demands extra database resources, maintenance, and can cause inconsistencies with different Postgres versions or extensions

IR Normalization

This tool eliminates these complexities by working directly with your schema files and target database through an Intermediate Representation (IR) system:
 SQL Files          Database System Catalogs
 (Desired)                 (Current)
     |                         |
     | Parse                   | Query
     v                         v
                 IR
             (Normalized)
                 |
                 v
            Diff Engine
                 |
                 v
           Migration Plan
1. SQL Parsing: Your desired-state SQL files are parsed and converted into a normalized Intermediate Representation.
2. Database Introspection: The target database schema is introspected through Postgres system catalogs (pg_class, pg_attribute, pg_constraint, etc.) and converted into the same normalized IR format.
3. Direct Comparison: The diff engine compares these two normalized IR structures directly, identifying what changes are needed without requiring any temporary databases.
4. Migration Generation: The differences are converted into properly ordered DDL statements that transform the current state into the desired state.
The IR normalization ensures that regardless of whether schema information comes from SQL parsing or database introspection, it is represented consistently for accurate comparison, enabling precise migrations without the overhead and complexity of shadow databases.
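
As an illustration of step 2, all of the information pgschema needs is available from the system catalogs; a simplified query in that spirit (not the tool's actual query) lists the columns of ordinary tables in a schema:
-- Simplified catalog introspection (illustrative, not pgschema's actual query)
SELECT c.relname    AS table_name,
       a.attname    AS column_name,
       format_type(a.atttypid, a.atttypmod) AS data_type,
       a.attnotnull AS not_null
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname = 'public'
  AND c.relkind = 'r'          -- ordinary tables only
  AND a.attnum > 0             -- skip system columns
  AND NOT a.attisdropped
ORDER BY c.relname, a.attnum;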

Acknowledgements

pgschema wouldn’t exist without these amazing open source projects:
  • sqlc - For generating code to extract schema from target databases
  • pg_query_go - Go bindings for libpg_query, enabling reliable Postgres SQL parsing. In fact, sqlc also uses pg_query_go under the hood! It’s the secret weapon in the Postgres ecosystem that allows tools to build near-perfect Postgres compatibility.
  • testcontainers - Enabling comprehensive integration testing with real Postgres instances
Design reference came from:
  • Terraform - For figuring out the right Plan-Review-Apply workflow that developers love
  • pg-schema-diff - Reading GitHub issues to understand use cases and studying the codebase. For example, I knew sqlc beforehand, but when I saw pg-schema-diff uses sqlc to fetch schema information, it immediately clicked.
And the real inspiration came from hundreds of customer conversations over 4+ years building Bytebase. Developers kept telling us: “Why can’t database migrations be as simple as Terraform?” Well, now they can be. Special thanks to Claude Code for sweating millions of tokens to make this project possible. Turns out AI is pretty good at writing database migration tools – who knew? 🤖
Ready to give it a try?