Skip to content

feat(gc): pgserve gc orchestration verb (singleton G3 verb 3)#91

Merged
namastex888 merged 2 commits into
mainfrom
feat/singleton-g3-gc-verb
May 9, 2026
Merged

feat(gc): pgserve gc orchestration verb (singleton G3 verb 3)#91
namastex888 merged 2 commits into
mainfrom
feat/singleton-g3-gc-verb

Conversation

@namastex888

Copy link
Copy Markdown
Contributor

Summary

The 240-orphan-disease fix from the v1.0 mission, now landed as a CLI verb. Composes the merged G3 foundations into a working `pgserve gc`.

What ships

  • `src/gc/pg-queries.js` — psql shellout helpers (selectMetaRows / selectExistingDbs / selectActiveDbs / dropDatabase / deleteMetaRow). Modeled on the existing `pgQuery` pattern in `src/upgrade/steps/cosign-meta-migration.js`.
  • `src/commands/gc.js` — the verb. Dry-run by default; `--apply` required to actually drop. `--json` for machine-readable summary. `--stale-after-days ` for window override. `--port ` for explicit port.
  • Wrapper allowlist + dispatch wiring (mirrors `doctor` / `trust` / `verify`).

Decisions

  • Dry-run is the default. `--apply` is opt-in. Dry-run still writes audit-log entries with `dryRun: true` so operators can review the would-have-dropped set before flipping the switch.
  • Audit every retention decision, not just drops. `pgserve gc` writes `action: skip` events for retained rows so "why was X kept?" is answerable from the JSON-lines log alone.
  • Defensive identifier + literal quoting in the DROP path. Identifiers are already constrained by `deriveProvisionedNames()` but we belt-and-suspenders against manually-inserted meta rows.
  • psql shellout > pg driver. Matches the existing `cosign-meta-migration.js` pattern; no extra runtime cost loading `pg` for a CLI that runs once and exits.

Exit codes

Code Meaning
0 Success
1 User error (bad flags)
2 `pgserve_meta` does not exist on this host (no provisions yet — clean monitoring signal)
3 One or more drops failed (partial sweep still audited)

Test plan

  • `bun test tests/cli/gc.test.js tests/gc/pg-queries.test.js` → 26 pass / 0 fail
  • `eslint src/ bin/` → clean
  • `knip` → clean (no new violations)
  • Smoke: `HOME=$(mktemp -d) node bin/pgserve-wrapper.cjs gc --help` routes through wrapper → dispatch → runGc and prints the usage block.

Cohort

Singleton G3 verb 3 of 4. After this lands:

  • ✅ G3 verb 1 (doctor)
  • ✅ G3 verb 2 (trust)
  • ✅ G3 verb 3 (gc) ← this PR
  • ⏳ G3 verb 4 (provision orchestration) — the only G3 leftover

🤖 Generated with Claude Code

The 240-orphan-disease fix from the v1.0 mission, now landed as a CLI
verb. Composes the merged G3 foundations:

  - src/gc/orphan-detection.js  → classifyOrphans (PR #90)
  - src/gc/audit-log.js         → writeGcAudit (PR #90)
  - src/schema/pgserve-meta.js  → PGSERVE_META_TABLE (PR #88)
  - src/lib/admin-json.js       → readAdminJson for port discovery

src/gc/pg-queries.js (new):
  Modeled on src/upgrade/steps/cosign-meta-migration.js#pgQuery (PR #79)
  — psql shellout via spawnSync with stdin pipe, no shell expansion.
  - pgQuery({ sql, db, port, ... }) — primitive
  - selectMetaRows({ port })       → array of meta rows; throws
                                     ENOPGSERVE_META if the table
                                     doesn't exist (caller decides
                                     whether that's a clean exit-2)
  - selectExistingDbs({ port })    → Set<datname> excluding system DBs
  - selectActiveDbs({ port })      → Set<datname> with ≥1 backend in
                                     pg_stat_activity (excluding self)
  - dropDatabase({ database, role, port })
                                   → pg_terminate_backend + DROP
                                     DATABASE IF EXISTS + DROP ROLE
                                     IF EXISTS (best-effort on role)
  - deleteMetaRow({ fingerprint, port })
                                   → DELETE FROM public.pgserve_meta
  Defensive identifier + literal quoting for the CREATE/DROP path; the
  identifiers are constrained by deriveProvisionedNames() but we still
  belt-and-suspenders to defend against admins who manually inserted
  meta rows.

src/commands/gc.js (new):
  - default mode = dry-run; `--apply` is required to actually DROP.
  - `--stale-after-days <N>` overrides the 30-day idle staleness window.
  - `--json` emits a single summary object (per-event audit log is
    always written regardless).
  - `--port <N>` / `-p` overrides the postgres port discovery; default
    is read from admin.json then 5432.
  - exit codes:
      0   success
      1   user error (bad flags)
      2   pgserve_meta absent on host (no provisions yet — clean signal
          for monitoring rather than crash)
      3   one or more drops failed (partial sweep still audited)

  Every retention decision audits a 'skip' event so an operator can
  answer "why was X kept?" days later from the JSON-lines log alone.
  Dry-run mode logs the *intent* with `dryRun: true` so the audit
  layer also captures the would-have-dropped before the operator
  flips the switch.

bin/pgserve-wrapper.cjs:
  Add 'gc' to __installSubcommands. Pure node + child_process; must
  skip the bun probe.

src/cli-install.cjs:
  Add `case 'gc':` to dispatch with the dynamic-import pattern used
  by verify / trust.

Tests (26 pass, 0 fail):
  tests/cli/gc.test.js (new):
    - parseFlags: dry-run default, --apply, --json, --stale-after-days
      validation, --port range check, --help, unknown-flag throw.
    - resolvePort: explicit port wins; fallback returns a valid integer
      in [1, 65535] regardless of admin.json presence on the test host.
    - runGc surface: --help → 0; bad flag → 1.
  tests/gc/pg-queries.test.js (new):
    - quoteIdent / quoteLiteral: shape + injection-style escapes.
    - SYSTEM_DBS contains template0/template1/postgres.
    - DEFAULT_PORT = 5432.
    - dropDatabase refuses empty + system DB names.
    - deleteMetaRow rejects empty fingerprint.
    - pgQuery rejects empty/non-string sql.

Smoke test (HOME=$(mktemp -d) node bin/pgserve-wrapper.cjs gc --help):
  routes through wrapper → dispatch → runGc and prints the usage block,
  confirming the allowlist + dispatch wiring.

Cohort: Singleton G3 verb 3 of 4. Verbs 1 (doctor) + 2 (trust) shipped;
verb 4 (provision orchestration) is the only G3 leftover.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 9, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@namastex888 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 6 minutes and 20 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0e7f3911-c36c-49b2-8573-43dfd5492aee

📥 Commits

Reviewing files that changed from the base of the PR and between c1767d7 and b8da96b.

📒 Files selected for processing (6)
  • bin/pgserve-wrapper.cjs
  • src/cli-install.cjs
  • src/commands/gc.js
  • src/gc/pg-queries.js
  • tests/cli/gc.test.js
  • tests/gc/pg-queries.test.js
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/singleton-g3-gc-verb

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements the pgserve gc subcommand, which provides a mechanism to identify and sweep orphaned databases. The implementation includes a new CLI command handler, PostgreSQL query utilities that utilize psql shellouts for minimal runtime overhead, and unit tests for flag parsing and SQL generation. Review feedback identifies a security risk regarding hardcoded database passwords, suggests refactoring duplicated error-handling logic for better maintainability, and notes a type inconsistency where a boolean value was passed as a string in the audit logging logic.

Comment thread src/gc/pg-queries.js Outdated
Comment on lines +52 to +58
password = process.env.PGPASSWORD || 'postgres',
captureStdout = false,
} = {}) {
if (typeof sql !== 'string' || sql.length === 0) {
throw new TypeError('pgQuery: sql must be a non-empty string');
}
const env = { ...process.env, PGPASSWORD: password };

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Hardcoding a fallback password of 'postgres' is a security risk and prevents users from using other authentication methods like .pgpass files. If PGPASSWORD is not set in the environment, psql should be allowed to use its default authentication flow (e.g., prompt for a password, use .pgpass).

I suggest removing the fallback and only setting PGPASSWORD for the child process if it's explicitly provided.

Suggested change
password = process.env.PGPASSWORD || 'postgres',
captureStdout = false,
} = {}) {
if (typeof sql !== 'string' || sql.length === 0) {
throw new TypeError('pgQuery: sql must be a non-empty string');
}
const env = { ...process.env, PGPASSWORD: password };
password = process.env.PGPASSWORD,
captureStdout = false,
} = {}) {
if (typeof sql !== 'string' || sql.length === 0) {
throw new TypeError('pgQuery: sql must be a non-empty string');
}
const env = { ...process.env, ...(password !== undefined && { PGPASSWORD: password }) };

Comment thread src/commands/gc.js
Comment on lines +177 to +180
writeGcAudit({ action: 'error', reason: 'select_meta_rows_failed', detail: err.message });
if (opts.json) emitJson({ ...summary, error: err.message });
else process.stderr.write(`pgserve gc: ${err.message}\n`);
return 3;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This error handling logic is very similar to the logic in the catch block on lines 189-192. To improve maintainability and reduce code duplication, consider extracting this logic into a helper function. This function could take the error, a reason string, and other context as arguments, and handle both audit logging and outputting the error.

Comment thread src/commands/gc.js Outdated
if (!opts.apply) {
writeGcAudit({
action: 'skip',
dryRun: 'true',

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The dryRun property is being set as a string 'true', which is inconsistent with other boolean flags in this file, such as summary.applied. For consistency, it's better to use a boolean true.

Suggested change
dryRun: 'true',
dryRun: true,

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 12990dbb3e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gc/pg-queries.js
});
if (!out) return [];
return out.split('\n').filter(Boolean).map((line) => {
const [fingerprint, database_name, role_name, publisher, source_path, last_used_at] = line.split('\t');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Parse psql output with the configured delimiter

pgQuery invokes psql -At without -F, so unaligned output uses | as the field separator by default, but selectMetaRows splits each row on \t. In hosts where pgserve_meta has rows, each line is parsed into a single token, leaving database_name empty and causing downstream classification to fail (or misclassify) instead of sweeping orphans. Set -F '\t' (or split on |) so parsing matches the actual CLI output format.

Useful? React with 👍 / 👎.

… bool

PR #91 bot reviews found three real correctness gaps and one
consistency nit:

[CRITICAL — Codex P1] psql -At unaligned output defaults to '|', not
'\t'. selectMetaRows / selectActiveDbs / selectExistingDbs all split
on '\t' — every multi-column SELECT silently produced one-token rows
(database_name='db|role|...', everything else undefined). The fast-
path classification was effectively non-functional.
  Fix: pass -F '\t' explicitly so the separator matches what callers
  parse against.

[CRITICAL] psql in script mode (-f) by default continues past
statement errors and still exits 0. A failed CREATE DATABASE / GRANT
in the provision verb's sequence could be invisible. (Caught by
Codex P1 on the sibling PR #92, but the bug applies to gc's pgQuery
helper too — this PR's pg-queries.js is the same helper.)
  Fix: pass -v ON_ERROR_STOP=1 so SQL errors surface as non-zero.

[HIGH — Gemini] Hardcoded password = 'postgres' fallback prevented
.pgpass / peer-auth on hosts that use them.
  Fix: only set PGPASSWORD in the child env when the caller or the
  caller's environment supplies one; let psql fall through to its
  default authentication flow otherwise.

[MEDIUM — Gemini] dryRun: 'true' (string) was inconsistent with
other boolean fields in the audit-log payload.
  Fix: dryRun: true.

Filtered out (low-value):
  - "Extract duplicate error-handling helper" (Gemini medium) —
    cosmetic; the two error paths handle different layers (meta-rows
    vs pg_state).

Validation: 26/26 tests pass; lint clean. The integration round-trip
now requires that callers strip empty trailing-tab tokens correctly
when reading single-column queries — verified manually against the
existing parsing in selectExistingDbs / selectActiveDbs which split
on '\n' first, then on '\t' is benign because there's only one
field.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
namastex888 added a commit that referenced this pull request May 9, 2026
…rrency honesty

PR #92 bot reviews found four real correctness gaps:

[CRITICAL — Codex P2 + Gemini HIGH] psql -At unaligned output defaults
to '|', not '\t'. selectMetaRow split on '\t' → database_name picked
up the entire 'db|role' string and role_name was undefined, so the
fast-path idempotency check missed every existing database and the
verb re-ran the create sequence on every replay.
  Fix: pass -F '\t' explicitly in pgQuery so the separator matches
  what callers parse against.

[CRITICAL — Codex P1] psql in script mode (-f) by default continues
past statement errors and STILL exits 0. A failed CREATE DATABASE /
GRANT could be invisible — the metadata write would still happen,
producing a pgserve_meta row pointing at a database that doesn't
exist.
  Fix: pass -v ON_ERROR_STOP=1 so SQL errors surface as non-zero exits.

[HIGH — Gemini] Hardcoded password = 'postgres' fallback prevented
.pgpass / peer-auth on hosts that use them.
  Fix: only set PGPASSWORD in the child env when the caller or the
  caller's environment supplies one.

[HIGH — Gemini] The advisory-lock-only-tx step provided ZERO
concurrency protection. Every pgQuery call starts a NEW psql session
via spawnSync, so an xact-scoped lock releases on the lock-step's
COMMIT and a session-scoped lock can't survive across separate
spawnSync invocations either. The "second SELECT inside the catch
path" claim in the file header was also wishful — never implemented.

  Fix: drop the misleading lock-only-tx step entirely. The honest
  serialization mechanism is **idempotency under racing replays**,
  documented now in the file header:
    - CREATE ROLE wrapped in DO + IF NOT EXISTS.
    - CREATE DATABASE catches "already exists" race.
    - GRANT is set-style (re-run adds nothing).
    - INSERT ... ON CONFLICT (fingerprint) DO UPDATE.
  10 concurrent provisioners with the same fingerprint converge to
  one (database, role, meta-row) trio without any explicit lock.

  The advisory-lock helpers in src/provision/advisory-lock.js are
  retained on main for future single-session callers (daemon mode);
  removing the stale import here, plus the bigintLiteral helper that
  only existed to feed the now-deleted lock SQL.

[MEDIUM — Gemini] psql does NOT emit SQLSTATE codes (42P04) by
default. The catch-path text-match against '42P04' would never fire
on a real race; psql emits the human-readable "ERROR: database "<name>"
already exists" message instead.
  Fix: match against 'already exists' in stderr.

src/lib/pg-query.js:
  Same -F '\t' + ON_ERROR_STOP + PGPASSWORD-only-when-set fixes (the
  shared primitive that gc + provision both use; mirrors the parallel
  fix-up commit on PR #91's src/gc/pg-queries.js).

Validation: 21/21 tests pass; lint clean; knip clean.

Filtered out (none — every bot finding addressed).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@namastex888 namastex888 merged commit d9443bc into main May 9, 2026
9 checks passed
@namastex888 namastex888 deleted the feat/singleton-g3-gc-verb branch May 10, 2026 02:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant