Skip to content

Database Changes

A complete reference for making database changes in Mantle projects using Aurora DSQL. This guide covers the end-to-end workflow, DSQL-specific constraints, validation tooling, and common failure modes.

Agent doing a migration? Start with the Migration Playbook for Agents — it's a decision tree + per-scenario recipes. This page is the reference; the playbook is the how-to.

The 5-Step Workflow

Every database schema change follows this sequence:

1. Edit schema  →  2. Generate  →  3. Validate  →  4. Deploy  →  5. Permissions
bash
# 1. Edit src/db/entities/*.ts (Drizzle schema DSL)
# 2. Generate migration SQL
mantle db generate

# 3. Validate before applying
mantle check migrations        # journal ↔ SQL file consistency
mantle db check-dsql           # DSQL compatibility classification
mantle db migrate --dry-run    # preview against mock schema (no mutations)

# 4. Deploy (runs MigrateDSQL Lambda automatically)
mantle deploy --stage staging

# 5. Verify permissions (applied automatically by MigrateDSQL)
mantle db apply-permissions    # manual re-run if needed

When to Use mantle db generate vs Hand-Writing

ScenarioApproach
Adding/renaming tables or columnsmantle db generate
Adding indexes or constraintsmantle db generate
Data backfill (INSERT/UPDATE existing rows)Hand-write SQL
Multi-step migration (add column → backfill → drop old)Hand-write SQL
Conditional DDL (IF NOT EXISTS guards)Hand-write SQL

After hand-writing, you must manually add a journal entry to migrations/meta/_journal.json:

json
{
  "idx": 3,
  "version": "7",
  "when": 1746835200000,
  "tag": "0004_my_migration",
  "breakpoints": true
}

The tag field must match the SQL filename without extension: 0004_my_migration.sql"tag": "0004_my_migration".

The Journal Requirement

Every .sql file in migrations/ must have a corresponding entry in migrations/meta/_journal.json. The reverse also holds: every journal entry must reference an existing SQL file.

Why this matters: The Mantle migration runner tracks applied migrations by matching journal entries against the database. An orphaned SQL file (no journal entry) will never be applied. A missing SQL file (journal entry exists) will cause a runtime error.

Enforcement:

bash
mantle check migrations

This command validates bidirectional consistency between SQL files and journal entries. It runs automatically in mantle ci and should be included in any pre-deploy CI pipeline.

DSQL Constraints Reference

Aurora DSQL is PostgreSQL-compatible but has significant differences. All migrations must account for these:

ConstraintImpactWorkaround
No ALTER COLUMN TYPECannot change column types in-placeAdd new column → backfill → use new column
No DROP COLUMNCannot remove columnsLeave deprecated columns in place
No SET DEFAULT or ADD COLUMN ... DEFAULTCannot alter column defaults, and ADD COLUMN ... DEFAULT N is rejected on populated tablesMigration runner automatically routes to table recreation — no manual workaround needed
No SERIAL / sequencesNo auto-incrementing integersUse text with gen_random_uuid()
One DDL per transactionMultiple DDL in one txn failsRunner executes each statement in its own transaction
No synchronous index creationCREATE INDEX blocksAuto-converted to CREATE INDEX ASYNC by runner
No Row Level SecurityENABLE RLS failsStripped silently by runner
No PL/pgSQL DO blocksAnonymous code blocks failStripped silently by runner
UNIQUE CONSTRAINT required for onConflictUNIQUE INDEX won't workUse .unique() in Drizzle schema (creates CONSTRAINT)
GRANT ON <nonexistent_table> succeeds silentlyPermissions appear applied but have no effectMigrateDSQL now runs validatePermissions after applyPermissions — gaps logged as warnings. See DSQL GRANT Behavior

Confirmed DSQL Behaviors (Reference)

The following behaviors are confirmed via live testing (LP migration stress test, May 2026). They are not all documented in AWS' DSQL reference and matter when planning a migration.

BehaviorDetail
ALTER TABLE ADD COLUMN <nullable>Accepted as metadata-only ALTER (~80ms).
ALTER TABLE ADD COLUMN ... NOT NULLRejected. Classifier routes to recreation.
ALTER TABLE ADD COLUMN ... DEFAULT NRejected on populated tables. Classifier routes to recreation.
ALTER COLUMN ... SET DATA TYPERejected. Classifier routes to recreation.
ALTER COLUMN ... SET NOT NULLRejected. Classifier routes to recreation. DSQL also rejects the recreation if any row would be NULL after — that's the safety net.
DROP COLUMNRejected. Classifier routes to recreation.
DROP TABLEAccepted. Permitted operation.
RENAME COLUMNAccepted as metadata-only ALTER.
CREATE INDEX (synchronous)Auto-rewritten to CREATE INDEX ASYNC by sanitizer. Physical build is asynchronous — track via sys.jobs.
CREATE UNIQUE INDEXSame auto-async behavior. Stored as UNIQUE INDEX in pg_indexes.
ADD CONSTRAINT ... UNIQUERejected as ALTER. Classifier routes to recreation. Stored as UNIQUE INDEX in pg_indexes after recreation (no pg_constraint row).
PRIMARY KEYAuto-extended to INCLUDE (all non-key columns) for covering-index optimization.
Index method btreeTransparently mapped to btree_index in pg_indexes output.
Foreign keysNot enforced. DSQL has no FK enforcement.
Row Level Security (ENABLE RLS)Stripped silently by sanitizer (unsupported).
DO $$ ... $$ PL/pgSQL blocksStripped silently by sanitizer (unsupported).
CREATE SEQUENCEStripped silently by sanitizer (unsupported).
One DDL per transactionDSQL allows at most one DDL per transaction. Runner executes each statement in its own transaction.
GRANT ON <nonexistent_table>Silently succeeds. No error. Grant is recorded but has no effect. Verify with validatePermissions log (gaps array) or pg_class.relacl post-migration.

For per-scenario recipes that exercise these behaviors, see the Migration Playbook for Agents.

DSQL GRANT Behavior

Critical gotcha: Aurora DSQL silently succeeds when you GRANT permissions on a table that doesn't exist. No error is raised. The grant is recorded but has no effect since the target table doesn't exist.

This means:

  • If migrations fail partway (some tables created, others not), the subsequent permissions step will appear to succeed for ALL tables
  • Lambda functions will get permission denied at runtime for tables that were never actually created
  • The MigrateDSQL Lambda runs migrations BEFORE permissions specifically to avoid this — but if a deploy is interrupted between steps, re-deploy to fix

Diagnosis: If a Lambda gets permission denied after a deploy:

bash
# Check if the table actually exists
mantle db check-dsql

# Re-run the full migration + permissions sequence
aws lambda invoke --function-name <stage>-<app>-MigrateDSQL --payload '{}' response.json

Deploy-time Permission Validation

MigrateDSQL runs validatePermissions automatically after applyPermissions on every deploy. This is the safety net for the DSQL silent-GRANT problem.

validatePermissions cross-references every GRANT statement in your permissions files against information_schema.tables. For each grant targeting a table that does not exist, it records a gap and re-applies the grant (idempotent — harmless if the table does exist by the time it runs).

Warning, not failure: Gaps are logged as structured warnings, not Lambda failures. This is intentional — some deploy orderings legitimately have permissions pointing to tables that exist by the time queries run (e.g., post-recreation flows). A warning tells you to investigate; it does not block the deploy.

Warning log format:

json
{
  "level": "WARN",
  "message": "DSQL permission validation found gaps (GRANTs targeting non-existent tables)",
  "gapCount": 2,
  "gaps": [
    {"table": "sessions", "role": "app_role", "operations": ["SELECT", "INSERT"], "file": "permissions.sql"}
  ]
}

When to act on a gap warning:

  • If the table should exist → check that the migration ran successfully (see Migration not applied after deploy)
  • If the table is intentionally deferred → no action needed; the next deploy will re-apply the grant once the table exists
  • If the table was dropped → remove the corresponding GRANT from the permissions file

Permission Verification Recipes

pg_class.relacl — the canonical permission check

information_schema.role_table_grants is a PostgreSQL view that filters by current-user visibility: it only shows grants the connecting user can see. From a non-owner connection (e.g., the DevTools MCP connecting as lambda_dev_tools), you will get false negatives — grants on other roles' tables are silently filtered out.

The canonical permission verification query uses pg_class.relacl directly:

sql
SELECT
  c.relname AS table_name,
  c.relacl AS acl
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'
  AND c.relkind = 'r'
  AND c.relname = '<your_table>';

ACL letter cheat sheet:

LetterPrivilege
rSELECT
wUPDATE
aINSERT
dDELETE
DTRUNCATE
xREFERENCES
tTRIGGER
*GRANT OPTION (suffix on another letter)

Example: lambda_books_update=rw/admin means lambda_books_update has SELECT + UPDATE, granted by admin.

IAM ↔ PG role mappings

sql
SELECT pg_role, iam_arn FROM sys.iam_pg_role_mappings WHERE pg_role LIKE 'lambda_%';

Each Lambda role has exactly one row here. If a deployed Lambda is getting permission denied and pg_class.relacl shows the right grants, check that the IAM↔PG mapping exists.

Migration history audit

sql
SELECT id, tag, hash, created_at
FROM drizzle.__drizzle_migrations
ORDER BY id DESC
LIMIT 10;

The hash should match sha256(migrations/<tag>.sql). A hash mismatch indicates a file edit after application — the runner will try to re-apply the changed file, which usually fails.

Validation Workflow

Before deploying migration changes, validate at three levels:

1. Journal Consistency

bash
mantle check migrations

Ensures every SQL file has a journal entry and vice versa. Catches the most common developer error (adding a SQL file without updating the journal, or using drizzle-kit generate in a non-interactive terminal where it silently produces nothing).

2. DSQL Compatibility

bash
mantle db check-dsql

Classifies each statement in your migration files:

  • OK — Fully compatible with DSQL
  • INDEX — Will be rewritten to CREATE INDEX ASYNC
  • STRIP — Unsupported, will be skipped (RLS, DO blocks, sequences)
  • RECREATION — Requires table recreation (ALTER COLUMN TYPE, ADD COLUMN with NOT NULL or DEFAULT on populated tables)

3. Dry-Run Migration

bash
mantle db migrate --dry-run

Creates a preview schema and replays all migrations against it. Reports per-statement classification without acquiring locks or mutating the live database.

4. Local Clone Testing (Optional)

bash
mantle db clone --stage staging

Clones the remote DSQL database to a local Docker PostgreSQL container (port 5433). Run migrations against real data locally before deploying to staging.

Terraform Deploy Trigger

The terraform_data.rerun_migration resource is the single mechanism that controls when MigrateDSQL is re-invoked on deploy. It hashes:

  1. The MigrateDSQL Lambda bundle (build/lambdas/MigrateDSQL/index.mjs)
  2. The permissions SQL file (permissions/permissions.sql)
  3. All files in the migrations folder (build/lambdas/MigrateDSQL/migrations/**/*)

Any change to migration files, the journal, or permissions triggers automatic re-invocation at apply time — after Lambda code updates are deployed. This ensures the latest bundle always runs migrations.

Important: This resource runs at apply time, not plan time. If the Lambda returns a FunctionError or runtime error, the local-exec script exits 1 and Terraform marks the apply as failed. You will see the error in the Terraform output — not "Apply complete!" with a hidden failure.

This is generated by mantle generate infra — never edit the trigger hash or local-exec script manually in instance .tf files. Run mantle generate infra to regenerate.

Squashing Migration History

After a migration history accumulates enough that the audit trail is noise (typically 15+ migrations with evolutionary churn), consolidate to a single 0000_baseline.sql representing current schema state. Staging only. Production consolidation requires explicit approval per release engineering policy.

The procedure relies on three framework features:

  1. mantle db clone --stage <stage> — clones the live DSQL schema into a local Docker PostgreSQL container by replaying all migrations
  2. pg_dump --schema-only — exports the resulting schema from the local clone
  3. A way to mark the baseline as "already applied" in __drizzle_migrations — required so the next deploy does not attempt to re-run the baseline against an already-populated schema

Procedure

bash
# 1. Generate consolidated SQL via clone + pg_dump
npx mantle db clone --stage staging
pg_dump --schema-only -h localhost -p 5433 -U postgres postgres > migrations/base/tables.sql

# 2. Hand-review tables.sql:
#    - No DSQL-incompatible patterns reintroduced (sequences, etc.)
#    - All tables present
#    - All indexes and constraints present
#    - Consistent ordering/formatting

# 3. Copy current permissions
cp permissions/permissions.sql migrations/base/permissions.sql

# 4. Archive old migrations
mkdir -p migrations/archive
mv migrations/[0-9]*.sql migrations/archive/
mv migrations/meta/_journal.json migrations/meta/_journal.pre-consolidation.json

# 5. Promote baseline
mv migrations/base/tables.sql migrations/0000_baseline.sql
mv migrations/base/permissions.sql permissions/permissions.sql

# 6. Create new journal — single baseline entry
cat > migrations/meta/_journal.json <<EOF
{
  "version": "7",
  "dialect": "postgresql",
  "entries": [
    {
      "idx": 0,
      "version": "7",
      "when": $(date +%s)000,
      "tag": "0000_baseline",
      "breakpoints": true
    }
  ]
}
EOF

# 7. Mark baseline as already applied (see "Marking baseline applied" below)
#    See "Marking baseline applied" below

# 8. Verify
npx mantle check migrations    # one journal entry, one SQL file
npx mantle deploy --stage staging
#    Expected: MigrateDSQL reports "Database is up to date" or applies 0 new migrations

Marking baseline applied

Use the framework CLI command — mantle db mark-applied — which computes the baseline hash with the same SHA-256 algorithm the runner uses and inserts the tracking row without executing the SQL:

bash
# Conservative: add a tracking row, preserve existing history (recommended default).
# The post-consolidation __drizzle_migrations table will have all pre-consolidation
# rows plus the new baseline row. Drift but harmless — the runner iterates files,
# not tracking rows.
npx mantle db mark-applied --file migrations/0000_baseline.sql

# Full consolidation: TRUNCATE existing tracking rows first, then INSERT the
# baseline. Destructive — requires --yes to confirm. After this, the tracking
# table contains exactly one row.
npx mantle db mark-applied --file migrations/0000_baseline.sql --reset --yes

Idempotent — if the file's hash is already in the tracking table, the command is a no-op.

DSQL endpoint + region are auto-detected from infra/.terraform (state file) when available; otherwise pass --endpoint + --region explicitly.

Fallback: manual SQL

If the CLI command isn't available for some reason (e.g., older framework version), connect to DSQL with admin credentials and execute:

sql
-- Replace the entire migration tracking history with a single baseline entry
TRUNCATE drizzle.__drizzle_migrations;
INSERT INTO drizzle.__drizzle_migrations (hash, tag, created_at)
VALUES (
  '<sha256 of migrations/0000_baseline.sql>',  -- compute with: shasum -a 256 migrations/0000_baseline.sql
  '0000_baseline',
  EXTRACT(epoch FROM now()) * 1000
);

After either approach, the next deploy will hash the baseline file, find a matching record in __drizzle_migrations, and skip it.

Verification

After consolidation, validate that:

  • The schema matches pre-consolidation state (no drift introduced by the dump → re-import roundtrip)
  • All Lambda CRUD operations still work (sample one handler per domain)
  • A new no-op migration generates as 0001_* (sequential from the baseline)
  • __drizzle_migrations contains exactly one row
sql
-- Confirm single tracking record
SELECT id, tag, hash, created_at FROM drizzle.__drizzle_migrations;
-- Expect 1 row: 0000_baseline

-- Confirm all expected tables
SELECT tablename FROM pg_tables WHERE schemaname = 'public' ORDER BY tablename;

If anything is missing or wrong, restore from the archive:

bash
# Roll back to pre-consolidation state
rm migrations/0000_baseline.sql
mv migrations/archive/*.sql migrations/
mv migrations/meta/_journal.pre-consolidation.json migrations/meta/_journal.json
git checkout permissions/permissions.sql  # if committed; else restore manually
# Then restore __drizzle_migrations rows from a backup or by replaying the archived migrations against a temp DSQL

Troubleshooting

Migration not applied after deploy

Symptom: New table doesn't exist in staging despite SQL file being in migrations/.

Causes:

  1. Missing journal entry → Run mantle check migrations to detect
  2. Stale terraform trigger → Re-run mantle generate infra to regenerate migrate_dsql.tf
  3. Build didn't include migrations → Check assets.json = ["migrations", "permissions"] exists in the MigrateDSQL handler directory
  4. Lambda bundle not rebuilt before deploy → Always run mantle build before mantle deploy

Migration failure now fails the deploy

Symptom: Previous deploys showed "Apply complete!" even when a migration Lambda invocation failed. This is fixed.

The terraform_data.rerun_migration local-exec script now checks for FunctionError in the AWS CLI response and errorType in the response payload, exiting 1 on either. Terraform will report the apply as failed and print the Lambda error to stderr.

If you see a migration failure during deploy:

bash
# 1. Fix the failing migration (schema.ts + migration SQL + journal entry)
# 2. Rebuild
npx mantle build
# 3. Re-deploy — trigger hash changes will force re-invocation of the fixed bundle
AWS_PROFILE=mantle-<instance> npx mantle deploy --stage staging

If the deploy was previously deadlocked (old data source mechanism), see the recovery procedure below.

Deploy deadlock recovery (legacy data source mechanism)

If you are running a migrate_dsql.tf generated before this fix was applied (identifiable by a data "aws_lambda_invocation" "run_migration" block), a failed migration can cause a permanent deadlock where the plan re-invokes the broken Lambda before code updates apply.

Recovery steps:

bash
# 1. Remove the failing migration (schema.ts + migration file + journal entry)
rm migrations/NNNN_failed.sql
# Edit migrations/meta/_journal.json to remove the entry
npx mantle check migrations  # must show clean

# 2. Rebuild and force-update Lambda code directly (bypass TF)
npx mantle build
cd build/lambdas/MigrateDSQL && zip -qr /tmp/recovery.zip . && cd ../../..
AWS_PROFILE=mantle-<instance> aws lambda update-function-code \
  --function-name <stage>-MigrateDSQL \
  --zip-file fileb:///tmp/recovery.zip

# 3. Verify Lambda runs clean (no pending migrations)
AWS_PROFILE=mantle-<instance> aws lambda invoke \
  --function-name <stage>-MigrateDSQL \
  --payload '{"source":"manual-recovery"}' \
  --cli-binary-format raw-in-base64-out /tmp/recovery-result.json
cat /tmp/recovery-result.json

# 4. Regenerate infra and re-deploy to apply the fix and sync TF state
npx mantle generate infra
AWS_PROFILE=mantle-<instance> npx mantle deploy --stage staging

Recreation pipeline aborted mid-step

Symptom: A migration classified as RECREATION failed partway through. The DSQL state could be one of several intermediate states (temp table created, data partly copied, original dropped but temp not renamed, etc.).

The recreation pipeline executes in this order:

  1. CREATE TABLE <name>_recreate_temp — temp table with the new schema
  2. INSERT INTO <name>_recreate_temp SELECT ... FROM <name> — copy data
  3. DROP TABLE <name> — remove the original
  4. ALTER TABLE <name>_recreate_temp RENAME TO <name> — promote temp

Recovery depends on where it failed:

StateSymptomsRecovery
Temp table created, data not copied<table>_recreate_temp exists; original intactDROP TABLE <table>_recreate_temp then redeploy
Temp table created, data copied, original not droppedBoth tables exist; data duplicatedManual decision: keep original (DROP temp) or keep temp (DROP original + RENAME temp)
Original dropped, temp not renamed<table> does not exist; <table>_recreate_temp has the dataALTER TABLE <table>_recreate_temp RENAME TO <table>
Original dropped, temp does not existDATA LOSS — restore from DSQL backupRestore from backup
sql
-- Inspect intermediate state
SELECT tablename FROM pg_tables
WHERE schemaname = 'public' AND tablename LIKE '%recreate_temp%';

Once the intermediate state is resolved, run mantle check migrations and redeploy. The migration runner's idempotent path handles the rest.

The __drizzle_migrations table will reflect the failed attempt — if the recreation aborted BEFORE the runner recorded success, no row is added. If it aborted AFTER (e.g., between RENAME and the record INSERT, which is rare), you'll have a tracking row for an incomplete migration; purge it manually.

Permission denied after deploy

Symptom: Lambda gets pgCode 28000 or permission denied for table.

Causes:

  1. Missing Step 3 of C29 → Run mantle db apply-permissions
  2. Table didn't exist when permissions were applied (see DSQL GRANT Behavior) → Re-invoke MigrateDSQL
  3. Missing @RequiresTable annotation → Add it, run mantle generate permissions

Journal/SQL mismatch

Symptom: mantle check migrations reports orphaned or missing files.

Fix:

  • Orphaned SQL (no journal entry): Either delete the SQL file or add a journal entry manually
  • Missing SQL (journal entry exists): Either create the SQL file or remove the journal entry
  • Run mantle db generate to let Drizzle reconcile from schema

drizzle-kit generate produces nothing

Cause: Requires an interactive terminal (process.stdout.isTTY). CI environments and piped shells won't work.

Fix: Run mantle db generate in a real terminal session, not in CI or via automation.

mantle db generate fails with "Please install latest version of drizzle-orm"

Symptom: drizzle-kit errors out before doing any work, saying drizzle-orm cannot be found.

Cause: drizzle-kit is an instance devDep, but it cannot find drizzle-orm via Node's module resolution when drizzle-orm is only available transitively through a link:'d @mantleframework/database. New instances scaffolded via create-mantle-app get drizzle-orm automatically. Pre-existing instances need to add it manually.

Fix: Add drizzle-orm to the instance's devDependencies:

json
{
  "devDependencies": {
    "drizzle-kit": "^0.31.10",
    "drizzle-orm": "^0.45.2"
  }
}

Then run pnpm install (or npm install). The version should track the framework's pnpm catalog (mantle/pnpm-workspace.yaml) to avoid drift; mantle check versions will warn if they diverge significantly.

mantle db generate fails with "requires a TTY" in agent/CI environments

Symptom:

Error: drizzle-kit generate requires a TTY for new-table and rename prompts.
Agent-safe workflow:
  npx mantle db generate --custom --name <descriptive-name>

Cause: drizzle-kit's tablesResolver triggers interactive prompts for new-table and rename detection. In non-TTY environments (agents, CI, piped shells), mantle db generate now fails fast with a clear error instead of propagating a confusing drizzle-kit stacktrace.

Fix: Use the agent-safe workflow:

bash
# 1. Generate an empty migration file with a descriptive name
npx mantle db generate --custom --name add_sessions_table

# 2. Write the SQL manually into the generated file
# (see hand-writing recipes below for common patterns)

# 3. Add the journal entry to migrations/meta/_journal.json (required)

The --custom flag bypasses drizzle-kit's interactive resolver and produces an empty .sql file ready for hand-writing. See When to Use mantle db generate vs Hand-Writing for SQL patterns.

Known Client-Side Issues

EventBuffer Retry Exhaustion (iOS OMD)

The EventBuffer in the iOS OfflineMediaDownloader analytics client has a data-loss bug:

EventBuffer.flush() → events.removeAll() → sendWithRetry() → [FAILURE] → events LOST

events.removeAll() executes before sendWithRetry() completes. If the send fails after all retries are exhausted, the events are permanently lost — they've already been removed from the buffer.

Impact: Analytics events may be silently dropped during network interruptions or API errors.

Future fix: Move removeAll() after successful send confirmation, or persist failed batches to disk for retry on next app launch.