fix: strip pgaudit logging flags from target during DMS migration #298

Merged
jhrv merged 8 commits into main from fix/disable-pgaudit-for-migration on Feb 18, 2026
@jhrv (Contributor) commented Feb 16, 2026

Background

@mortenlj confirmed in #240 that audit logging breaks DMS migration. All three test databases (in dev-nais-dev/basseng) failed with "Internal error" when the migration job started.

Root cause

DefineInstance() uses DeepCopy(), which copies all database flags from the source to the target, including:

  • pgaudit.log=write,ddl,role
  • pgaudit.log_parameter=on

These flags activate the pgaudit hooks on the target instance. When DMS sets up pglogical replication and runs its internal DDL/write operations, pgaudit logs them, which causes "Internal error" and fails the migration job.

Solution

Same pattern as the HA/PITR fix in #294:

  1. CreateInstance(): strips pgaudit.log and pgaudit.log_parameter from the target instance spec
  2. PrepareTargetInstance(): strips the same flags at the CNRM level (safety net)
  3. ValidateSourceInstance(): logs a warning that pgaudit is temporarily disabled and that the user must re-run nais postgres enable-audit after migration

Important: We keep cloudsql.enable_pgaudit=on on the target so that the pgaudit shared library stays loaded. This is required for CREATE EXTENSION pgaudit from the source database dump to work during the DMS restore.
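The flag stripping in this description can be sketched roughly as below. This is only an illustration, not the actual code in this PR: the `DatabaseFlag` type and the `stripFlags` helper are hypothetical stand-ins for whatever the instance spec really uses.

```go
package main

import "fmt"

// DatabaseFlag is a hypothetical stand-in for a name/value database flag
// on an instance spec; the real type in the migrator may differ.
type DatabaseFlag struct {
	Name  string
	Value string
}

// stripFlags returns a new slice without any flag whose name is listed in
// remove. The input slice is not modified.
func stripFlags(flags []DatabaseFlag, remove ...string) []DatabaseFlag {
	drop := make(map[string]struct{}, len(remove))
	for _, name := range remove {
		drop[name] = struct{}{}
	}
	kept := make([]DatabaseFlag, 0, len(flags))
	for _, f := range flags {
		if _, found := drop[f.Name]; !found {
			kept = append(kept, f)
		}
	}
	return kept
}

func main() {
	// Flags as they would arrive on the target after a deep copy of the source.
	copied := []DatabaseFlag{
		{Name: "cloudsql.enable_pgaudit", Value: "on"},
		{Name: "pgaudit.log", Value: "write,ddl,role"},
		{Name: "pgaudit.log_parameter", Value: "on"},
	}
	// Remove only the two logging flags; cloudsql.enable_pgaudit is kept.
	target := stripFlags(copied, "pgaudit.log", "pgaudit.log_parameter")
	for _, f := range target {
		fmt.Printf("%s=%s\n", f.Name, f.Value)
	}
}
```

Returning a new slice rather than filtering in place keeps the source flags untouched, which matters if the same slice is still referenced by the source spec.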

After migration

Once the target has been promoted and the original app spec is deployed by naiserator, pgaudit.log and pgaudit.log_parameter come back automatically. The user must then run:

nais postgres enable-audit --context <context> --namespace <namespace> <app>

This is already documented in nais-docs:

"When using cloud-sql-migrator to migrate will result in a completely new instance... the pgaudit extension will need to be created again using the nais cli."

Verification

The test databases in dev-nais-dev/basseng from @mortenlj can be used to verify that the fix works.

pgaudit.log and pgaudit.log_parameter flags on the target instance cause
DMS migration jobs to fail with 'Internal error' at startup. The pgaudit
hooks log DMS internal replication operations, which interferes with the
migration process.

This fix strips pgaudit.log and pgaudit.log_parameter from the target
instance spec during migration, while keeping cloudsql.enable_pgaudit so
the shared library stays loaded and the pgaudit extension can be restored
from the source database dump.

After migration and promotion, the original app spec (with all flags) is
re-applied by naiserator. Users must re-run 'nais postgres enable-audit'
to re-create the extension and per-user config on the new instance.

Resolves #240

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jhrv jhrv requested review from Muni10 and mortenlj February 16, 2026 09:16
jhrv and others added 7 commits February 17, 2026 07:43
Testing revealed that keeping cloudsql.enable_pgaudit=on on the target
still causes DMS 'Internal error' - even just loading the shared library
interferes with DMS replication.

Two-pronged approach:
1. Strip ALL pgaudit flags from target (including cloudsql.enable_pgaudit)
2. Drop pgaudit extension from source databases before migration starts,
   so the DMS dump does not contain CREATE EXTENSION pgaudit

After migration, users must re-run 'nais postgres enable-audit'.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The postgres user may not own the pgaudit extension in the app database
(it's typically owned by the app user). Use ALTER EXTENSION OWNER TO
postgres before DROP to ensure we can remove it regardless of owner.

Also make the drop a hard error instead of a warning since DMS validates
that extensions match between source and target.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

In Cloud SQL, the postgres user is not a true superuser and cannot
drop extensions owned by other users. The pgaudit extension in the
app database is typically owned by the app user, so we must connect
as the app user (who has the password from PrepareSourceDatabase)
to drop it.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The pgaudit hooks on the source instance also interfere with DMS
pglogical replication. Strip pgaudit flags from both source and
target instances during PrepareSourceInstance.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

The 'nais postgres enable-audit' command sets per-user database config:
ALTER USER app_user IN DATABASE db SET pgaudit.log TO 'none'

This is included in the pg_dump and causes pg_restore to fail on the
target with 'role does not exist' because the source app user
(e.g. audit-test-src) doesn't exist on the target instance.

Reset this setting before dropping the extension so the DMS dump
is completely clean of pgaudit references.
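A rough sketch of the cleanup order described in this commit message: reset the per-user setting first, then drop the extension. The helper names, the `IF EXISTS` clause, and the example database name are assumptions for illustration, not taken from the actual change; which role runs the statements is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps a PostgreSQL identifier in double quotes and doubles any
// embedded double quote, so names such as audit-test-src are safe to use.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// pgauditCleanupStatements (hypothetical helper) returns the SQL to run on a
// source database, in order: the per-user pgaudit.log setting is reset
// before the extension is dropped.
func pgauditCleanupStatements(appUser, database string) []string {
	return []string{
		fmt.Sprintf("ALTER ROLE %s IN DATABASE %s RESET pgaudit.log",
			quoteIdent(appUser), quoteIdent(database)),
		// IF EXISTS is an assumption here so the step is safe to re-run.
		"DROP EXTENSION IF EXISTS pgaudit",
	}
}

func main() {
	// "app-db" is a made-up database name for the example.
	for _, stmt := range pgauditCleanupStatements("audit-test-src", "app-db") {
		fmt.Println(stmt + ";")
	}
}
```

Keeping the statements in a single ordered slice makes the ordering explicit, which is the point of this commit: the reset has to happen while the extension is still present.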

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Removing cloudsql.enable_pgaudit from the source in PrepareSourceInstance
(step 11) causes the shared library to be unloaded. When step 12 then
tries to connect to install pglogical, PostgreSQL fails because the
pgaudit extension references a library that is no longer loaded. Step 12b
(which would DROP the extension) never gets to run.

Fix: keep pgaudit flags on the source so the shared library stays loaded
until step 12b can DROP EXTENSION pgaudit and reset per-user settings.
The target already has pgaudit flags stripped (since commit d079777).

Tested end-to-end: setup + promote both complete successfully on a
source instance with pgaudit enabled (including per-user pgaudit.log
settings).
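The end state after these commits (all pgaudit flags stripped from the target, flags left in place on the source until the extension has been dropped) could be expressed as a small per-instance policy. This is a sketch under assumptions: the `Role` type, `isPgauditFlag`, `flagsDuringMigration`, and the `max_connections` example flag are hypothetical and do not come from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// Role says which side of the migration an instance is on (hypothetical type).
type Role int

const (
	Source Role = iota
	Target
)

// isPgauditFlag reports whether a flag name belongs to pgaudit, including the
// cloudsql.enable_pgaudit flag that loads the shared library.
func isPgauditFlag(name string) bool {
	return name == "cloudsql.enable_pgaudit" || strings.HasPrefix(name, "pgaudit.")
}

// flagsDuringMigration returns the flag names an instance keeps while the
// migration runs. The source keeps everything so the pgaudit library stays
// loaded until the extension is dropped; the target loses all pgaudit flags.
func flagsDuringMigration(role Role, names []string) []string {
	kept := make([]string, 0, len(names))
	for _, n := range names {
		if role == Target && isPgauditFlag(n) {
			continue
		}
		kept = append(kept, n)
	}
	return kept
}

func main() {
	names := []string{
		"cloudsql.enable_pgaudit",
		"pgaudit.log",
		"pgaudit.log_parameter",
		"max_connections", // unrelated flag, included only as an example
	}
	fmt.Println("source:", flagsDuringMigration(Source, names))
	fmt.Println("target:", flagsDuringMigration(Target, names))
}
```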
@jhrv jhrv merged commit c165080 into main Feb 18, 2026
4 checks passed
@jhrv jhrv deleted the fix/disable-pgaudit-for-migration branch February 18, 2026 08:26