Skip to content

fix(gc): drop nil bind args so VACUUM actually runs on Postgres#1602

Merged
looplj merged 3 commits into
looplj:unstablefrom
suixinio:fix/gc-vacuum-postgres-args
May 5, 2026
Merged

fix(gc): drop nil bind args so VACUUM actually runs on Postgres#1602
looplj merged 3 commits into
looplj:unstablefrom
suixinio:fix/gc-vacuum-postgres-args

Conversation

@suixinio

@suixinio suixinio commented May 5, 2026

Copy link
Copy Markdown
Contributor

Summary

  • The post-cleanup VACUUM call passes its bind args via (ctx, sql, nil, nil) to the embedded *sql.DB.ExecContext (variadic args ...any), so two nils are forwarded as bind parameters.
  • pgx v5 rejects this for a parameter-less statement with mismatched param and argument count, and the auto-VACUUM has been silently failing every night on every Postgres deployment since feat: auto vacuum after gc #823 (2026-02-11).
  • SQLite swallowed the extra nils, which is why the bug went unnoticed in the SQLite-only test path.
  • Fix: drop the trailing nil, nil so the call is invoked as sqlDriver.ExecContext(ctx, vacuumSQL) (the standard *sql.DB.ExecContext form, no bind args).

Production confirmation on a 13 GB Postgres deployment after the fix:

```
INFO Starting database VACUUM operation dialect=postgres vacuum_full=false
INFO Database VACUUM completed successfully duration=0.97s command=VACUUM
```

Before the fix the same trigger emitted:

```
ERROR Failed to run VACUUM after cleanup
error="failed to execute VACUUM: mismatched param and argument count"
```

Note for users with vacuum_full=true

This bug effectively masked VACUUM FULL for ~3 months. The fix re-enables the configured behavior, so any Postgres deployment that has explicitly set vacuum_full=true will start taking ACCESS EXCLUSIVE locks per table. Default remains false; no action needed for default deployments.

Test plan

  • go test ./internal/server/gc/... -run TestWorker_runVacuum -v (SQLite, always runs)
  • AXONHUB_TEST_PG_DSN=... go test ./internal/server/gc/... -run TestWorker_runVacuum_Postgres -v (Postgres 18 + pgx v5; covers both VACUUM and VACUUM FULL)
  • Reverse-verified: reverting the one-line fix makes only the Postgres test fail, with the exact production error message — confirming the regression test catches the bug
  • Deployed image to a 13 GB production Postgres deployment, manually triggered GC, verified Database VACUUM completed successfully in logs

The previous call passed `(ctx, sql, nil, nil)` to the embedded
*sql.DB.ExecContext (variadic args), so pgx received two nil bind
parameters for a parameter-less statement and rejected it with
"mismatched param and argument count". As a result auto-VACUUM
silently failed every night on Postgres deployments.

Tests cover both dialects; the Postgres path is env-gated on
AXONHUB_TEST_PG_DSN since the project has no in-process PG harness.
The SQLite driver swallowed the extra nils, so only the Postgres
test reproduces the regression.
Copilot AI review requested due to automatic review settings May 5, 2026 11:28

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a bug in the runVacuum function by removing unnecessary nil arguments from the ExecContext call, which previously caused parameter mismatch errors in PostgreSQL. Additionally, a new test file vacuum_test.go has been introduced, providing comprehensive coverage for vacuum operations across SQLite and PostgreSQL environments, as well as verifying behavior when the feature is disabled. I have no feedback to provide.

@greptile-apps

greptile-apps Bot commented May 5, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

  • Removes spurious nil, nil bind-parameter arguments from the ExecContext call in runVacuum, which caused pgx v5 to reject parameter-less VACUUM/VACUUM FULL statements with mismatched param and argument count, silently breaking nightly auto-vacuum on all Postgres deployments since feat: auto vacuum after gc #823.
  • Adds a new vacuum_test.go covering the disabled path, SQLite (both plain and VACUUM FULL fallback), and an optional real-Postgres path gated on AXONHUB_TEST_PG_DSN.

Confidence Score: 5/5

Safe to merge — minimal, targeted one-line fix with regression tests and production-verified confirmation.

No P0 or P1 issues found in the changed code. The core fix is correct and well-scoped. Remaining notes (sqlDB pool teardown in tests) were already raised in previous review threads and do not affect production behaviour.

No files require special attention beyond the pre-existing sqlDB cleanup note in vacuum_test.go.

Important Files Changed

Filename Overview
internal/server/gc/gc.go One-line fix: drops nil, nil from sqlDriver.ExecContext so no spurious bind args are forwarded to pgx v5 for a parameter-less VACUUM statement. Change is minimal and correct.
internal/server/gc/vacuum_test.go New test file covering disabled, SQLite, and optional Postgres paths; sqlDB returned from newPostgresEntClient is not explicitly closed (underlying pool may leak if entsql.OpenDB doesn't take ownership), as previously flagged.

Sequence Diagram

sequenceDiagram
    participant GC as GC Worker
    participant Driver as *entsql.Driver
    participant DB as *sql.DB (pgx v5)

    GC->>Driver: ExecContext(ctx, "VACUUM") [FIXED]
    Driver->>DB: ExecContext(ctx, "VACUUM")
    DB-->>Driver: ok
    Driver-->>GC: nil error

    note over GC,DB: Before fix (broken)
    GC--xDriver: ExecContext(ctx, "VACUUM", nil, nil)
    Driver--xDB: ExecContext(ctx, "VACUUM", nil, nil)
    DB-->>Driver: "mismatched param and argument count"
    Driver-->>GC: error
Loading

Reviews (3): Last reviewed commit: "test(gc): drop unused SQLite client from..." | Re-trigger Greptile

Comment thread internal/server/gc/vacuum_test.go
Comment on lines +74 to +83
func newPostgresEntClient(t *testing.T, dsn string) *ent.Client {
t.Helper()

sqlDB, err := sql.Open("pgx", dsn)
require.NoError(t, err)

require.NoError(t, sqlDB.PingContext(context.Background()))

return ent.NewClient(ent.Driver(entsql.OpenDB("postgres", sqlDB)))
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Underlying *sql.DB pool may not be closed after each subtest

Each call to newPostgresEntClient opens a new *sql.DB connection pool. Only client.Close() is deferred at the call-site — if entsql.OpenDB does not take ownership of the underlying *sql.DB, the pool is never drained until the process exits. Adding an explicit t.Cleanup(func() { _ = sqlDB.Close() }) inside this helper makes teardown unambiguous and avoids leaking connections on a shared CI Postgres instance.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes Postgres GC auto-vacuum failures by ensuring the VACUUM statement is executed with no variadic bind arguments, which pgx v5 otherwise interprets as parameters and rejects for parameter-less SQL.

Changes:

  • Update runVacuum to call ExecContext(ctx, vacuumSQL) without forwarding trailing nil arguments.
  • Add regression tests covering SQLite (VACUUM + VACUUM FULL behavior) and an opt-in real Postgres test gated by AXONHUB_TEST_PG_DSN.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
internal/server/gc/gc.go Fixes VACUUM execution on Postgres by removing unintended variadic args that pgx treats as bind parameters.
internal/server/gc/vacuum_test.go Adds regression coverage for the VACUUM execution path, including an opt-in Postgres integration-style test.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

suixinio added 2 commits May 5, 2026 11:35
Move the entsql alias into its own alias section so the file
matches the project's gci `custom-order: standard, default, blank,
dot, alias, localmodule` configuration.
runVacuum returns before touching w.Ent when VacuumEnabled is
false, so allocating an in-memory SQLite client only added a
schema migration round-trip without exercising any new path.
@looplj looplj merged commit ac92db8 into looplj:unstable May 5, 2026
4 checks passed
looplj pushed a commit that referenced this pull request May 8, 2026
* fix(gc): drop nil bind args from VACUUM exec call

The previous call passed `(ctx, sql, nil, nil)` to the embedded
*sql.DB.ExecContext (variadic args), so pgx received two nil bind
parameters for a parameter-less statement and rejected it with
"mismatched param and argument count". As a result auto-VACUUM
silently failed every night on Postgres deployments.

Tests cover both dialects; the Postgres path is env-gated on
AXONHUB_TEST_PG_DSN since the project has no in-process PG harness.
The SQLite driver swallowed the extra nils, so only the Postgres
test reproduces the regression.

* style(gc): fix gci import grouping in vacuum_test

Move the entsql alias into its own alias section so the file
matches the project's gci `custom-order: standard, default, blank,
dot, alias, localmodule` configuration.

* test(gc): drop unused SQLite client from disabled-path test

runVacuum returns before touching w.Ent when VacuumEnabled is
false, so allocating an in-memory SQLite client only added a
schema migration round-trip without exercising any new path.
yoke233 pushed a commit to yoke233/axonhub that referenced this pull request May 26, 2026
…lj#1602)

* fix(gc): drop nil bind args from VACUUM exec call

The previous call passed `(ctx, sql, nil, nil)` to the embedded
*sql.DB.ExecContext (variadic args), so pgx received two nil bind
parameters for a parameter-less statement and rejected it with
"mismatched param and argument count". As a result auto-VACUUM
silently failed every night on Postgres deployments.

Tests cover both dialects; the Postgres path is env-gated on
AXONHUB_TEST_PG_DSN since the project has no in-process PG harness.
The SQLite driver swallowed the extra nils, so only the Postgres
test reproduces the regression.

* style(gc): fix gci import grouping in vacuum_test

Move the entsql alias into its own alias section so the file
matches the project's gci `custom-order: standard, default, blank,
dot, alias, localmodule` configuration.

* test(gc): drop unused SQLite client from disabled-path test

runVacuum returns before touching w.Ent when VacuumEnabled is
false, so allocating an in-memory SQLite client only added a
schema migration round-trip without exercising any new path.

(cherry picked from commit ac92db8)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants