Skip to content

feat(doctor): pgserve doctor read-only V1 (singleton G3 first verb)#86

Merged
namastex888 merged 3 commits into
mainfrom
feat/singleton-g3-doctor
May 9, 2026
Merged

feat(doctor): pgserve doctor read-only V1 (singleton G3 first verb)#86
namastex888 merged 3 commits into
mainfrom
feat/singleton-g3-doctor

Conversation

@namastex888

@namastex888 namastex888 commented May 8, 2026

Copy link
Copy Markdown
Contributor

Singleton wish G3 — first of the four CLI verbs (provision/gc/trust/doctor). Read-only V1; --fix tiered modes deferred to a follow-up per SHARED-DESIGN §3.2.

What ships

src/commands/doctor.js runs 7 checks against the local install:

# Check What it validates
1 version_blocklist current version vs BLOCKED_VERSIONS (G5)
2 admin_json_exists ~/.autopg/admin.json present, mode 0600
3 admin_json_shape supervisor enum + port + socketDir present
4 supervisor_liveness dispatches based on admin.json.supervisor:
• pm2 → pm2 jlist for autopg-server entry online
• systemd-user → systemctl --user is-active autopg.service
• launchd → launchctl list dev.automagik.autopg
• external → operator owns supervision (PASS, no probe)
5 runtime_json <socketDir>/runtime.json present + alive; cohort-contract guard: NO supervisor field
6 uds_reachable connect to <socketDir>/.s.PGSQL.<port>
7 tcp_reachable connect to localhost:<port>

Output modes

  • default: human-readable with severity tags + remediation hints
  • --json: structured { ok, summary, findings } for tool consumption
  • --fix / --fix --aggressive: rejected with exit 64 (not implemented in V1)

Exit codes

Code Meaning
0 All checks PASS
1 At least one FAIL
2 At least one WARN, no FAILs
64 --fix requested (not implemented in V1)

Felipe constraint honored

Felipe directive 2026-05-08: doctor reports passively; never suggests swapping supervisor tiers. Confirmed in supervisor_liveness — findings call out the active mode but never recommend changing it.

Validation

  • bun test tests/cli/doctor.test.js12 pass / 0 fail
  • bun run lint → clean
  • bun run deadcode → clean

Cohort

Singleton G3 verb 1 of 4. trust + gc + provision come in follow-up PRs (each ~150-300 LOC, separately shippable).

--fix tiered modes (Cat 1/2/3 mutations) tracked as a separate follow-up — locked in SHARED-DESIGN §3.2 but implementation needs its own design pass.

Summary by CodeRabbit

  • New Features

    • Added pgserve doctor: a passive, read-only diagnostics command that checks config, supervisor liveness, runtime shape, and socket/database reachability; supports human or JSON output. Exits 0/2/1 for PASS/WARN/FAIL; --fix is rejected with a specific error exit.
  • Tests

    • Added tests validating command outputs, exit codes, severity values, and selected check behaviors.

pgserve-singleton-no-proxy wish, Group 3. First of the four CLI verbs
(provision/gc/trust/doctor) — read-only V1.

What ships:
- src/commands/doctor.js (NEW): runs 7 checks against the local install:
  1. version_blocklist     — current version vs BLOCKED_VERSIONS (G5)
  2. admin_json_exists     — file present at ~/.autopg/admin.json mode 0600
  3. admin_json_shape      — supervisor enum + port + socketDir present
  4. supervisor_liveness   — dispatches based on admin.json.supervisor:
       pm2          → 'pm2 jlist' for autopg-server entry online
       systemd-user → 'systemctl --user is-active autopg.service'
       launchd      → 'launchctl list dev.automagik.autopg'
       external     → 'operator owns supervision' (PASS, no probe)
  5. runtime_json          — <socketDir>/runtime.json present + alive,
                             cohort-contract guard: NO 'supervisor' field
  6. uds_reachable         — connect to <socketDir>/.s.PGSQL.<port>
  7. tcp_reachable         — connect to localhost:<port>

- src/cli-install.cjs: 'doctor' subcommand routing (dynamic-import pattern,
  matches 'uninstall').

- tests/cli/doctor.test.js (NEW): 12 tests covering exit-code logic,
  SEVERITY enum, --fix refusal (returns 64 — not implemented in V1),
  --json output mode, default-mode runs end-to-end.

Output modes:
- default: human-readable with severity tags + remediation hints
- --json: structured { ok, summary, findings } for tool consumption
- --fix / --fix --aggressive: rejected with exit 64; tiered Cat 1/2/3
  modes deferred to a follow-up per SHARED-DESIGN §3.2

Felipe directive 2026-05-08: doctor reports passively; never suggests
swapping supervisor tiers. Confirmed in supervisor_liveness check —
findings call out the active mode but never recommend changing it.

Validation:
- bun test tests/cli/doctor.test.js → 12 pass / 0 fail
- bun run lint: clean
- bun run deadcode: clean

Cohort: singleton G3 verb #1 of 4. trust + gc + provision come in
follow-up PRs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented May 8, 2026

Copy link
Copy Markdown

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d3add0b9-a516-4d24-b3ce-29211827b623

📥 Commits

Reviewing files that changed from the base of the PR and between 8fe1030 and 03a9733.

📒 Files selected for processing (1)
  • src/commands/doctor.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/commands/doctor.js

📝 Walkthrough

Walkthrough

This PR implements the pgserve doctor health-check command, a read-only diagnostic tool that validates pgserve version, admin.json state, supervisor liveness, runtime discovery, and socket reachability. It routes through the existing CLI dispatcher, defines a Finding data model with PASS/WARN/FAIL severities, runs a suite of probes, and outputs human-readable or JSON reports with computed exit codes.

Changes

Doctor Health-Check Command

Layer / File(s) Summary
CLI Routing
bin/pgserve-wrapper.cjs, src/cli-install.cjs
Adds doctor subcommand to wrapper skip-list and wires it into the CLI dispatcher to dynamically import and execute runDoctor().
Finding Model & Exit Logic
src/commands/doctor.js
Defines SEVERITY enum (PASS/WARN/FAIL), Finding shape with id/severity/detail/hint, and exitCodeFor() function (FAIL→1, WARN→2, PASS→0).
Imports & Helper Utilities
src/commands/doctor.js
Adds Node imports, version detection, and probe timeout helpers; implements pm2EntryOnline(), systemdUnitActive(), and launchdJobLoaded().
Version & Configuration Checks
src/commands/doctor.js
Implements checkVersionNotBlocked(), checkAdminJsonExists() (existence and 0600 permissions), and checkAdminJsonShape() (validates supervisor/port/socketDir fields).
Supervisor & Runtime Checks
src/commands/doctor.js
Implements checkSupervisorLiveness() (probes pm2/systemd-user/launchd/external) and checkRuntimeJson() (validates socketDir/runtime.json and live-discovery invariants).
Socket Reachability Checks
src/commands/doctor.js
Implements checkUdsReachable() (Unix socket existence/connect) and checkTcpReachable() (127.0.0.1 TCP connect with timeout).
Command Orchestration & Output
src/commands/doctor.js
Implements runChecks() orchestration, summary(), emitHuman() and emitJson(), runDoctor() entry rejecting --fix with exit code 64, and exports __testInternals.
Tests
tests/cli/doctor.test.js
Bun test suite validating exit code logic, SEVERITY immutability, runDoctor() entry behavior (including --fix rejection), and shape-level tests for selected internal checks.

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant Doctor
  participant Supervisors
  participant Runtime
  participant Sockets
  participant Exit
  CLI->>Doctor: invoke runDoctor()
  Doctor->>Supervisors: checkSupervisorLiveness()
  Doctor->>Runtime: checkRuntimeJson()
  Doctor->>Sockets: checkUdsReachable() / checkTcpReachable()
  Doctor-->>CLI: return findings
  CLI->>Exit: compute exitCodeFor(findings) and exit
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • namastexlabs/pgserve#66: Both PRs modify CLI routing and the wrapper's skip-list to add new subcommands that bypass the bun health probe.
  • namastexlabs/pgserve#23: Related wrapper/bun pre-flight probe changes that interact with the doctor subcommand registration.

Suggested labels

stable

Poem

🐰 I hop with a stethoscope, checking sockets and config deep,

version, admin, supervisors — no secrets to keep,
PASS, WARN, or FAIL, each finding in sight,
JSON or human, I report through the night,
A tiny rabbit doctor keeping pgserve right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main addition: a read-only V1 of the pgserve doctor command, which is the primary change across all modified/added files.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/singleton-g3-doctor

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8cb81af1e2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/cli-install.cjs
// active supervisor + postmaster reachability + admin.json /
// runtime.json health. --fix tiered modes deferred to a follow-up
// (SHARED-DESIGN §3.2).
return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code)));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Route doctor through wrapper subcommand allowlist

dispatch() now handles doctor, but this branch is unreachable from normal CLI usage because bin/pgserve-wrapper.cjs only forwards subcommands listed in __installSubcommands, and doctor is not in that set. In that common path, pgserve doctor falls through to bin/postgres-server.js (bun route) and exits with help/error instead of running diagnostics, so the new command is effectively broken for users invoking the installed binary.

Useful? React with 👍 / 👎.

Comment thread src/commands/doctor.js Outdated
Comment on lines +305 to +306
while (!connected && Date.now() - start < 500) {
execSync('true'); // yield event loop without sleep

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Replace blocking UDS probe loop with async connect handling

This loop blocks the Node event loop while waiting for connected to flip, but net.Socket emits 'connect' asynchronously on that same event loop. As a result, on a healthy Unix socket the callback cannot run until after the loop exits, so uds_reachable can be reported as FAIL even when the socket is accepting connections. Any host with a live postmaster can hit this false negative in checkUdsReachable.

Useful? React with 👍 / 👎.

Comment thread src/commands/doctor.js Outdated
Comment on lines +160 to +164
return check(
'admin_json_shape',
`admin.json.supervisor = "${supervisor}"`,
SEVERITY.PASS,
`port=${admin.port ?? '?'}, socketDir=${admin.socketDir ?? '?'}`,

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate admin.json port and socketDir before PASS

The admin_json_shape check returns PASS as soon as supervisor is valid, but it does not actually enforce that port and socketDir are present/valid. When those fields are missing or malformed, later checks silently fall back to defaults (5432 and resolveSocketDir()), which can mask corrupted admin.json and produce misleading diagnostics. This contradicts the stated shape-validation goal for this check.

Useful? React with 👍 / 👎.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the pgserve doctor command, which performs a suite of health checks on the local installation, including version validation, configuration file integrity, supervisor liveness, and socket reachability. The review feedback identifies several opportunities for improvement: replacing an inefficient synchronous busy-wait loop in the Unix socket check with an asynchronous implementation, reducing redundant file I/O by passing the parsed admin configuration object between functions, and removing unnecessary error handling. Additionally, the reviewer suggests using existing constants for supervisor types to improve maintainability and strengthening the unit tests to provide more meaningful assertions.

Comment thread src/commands/doctor.js
Comment on lines +284 to +315
function checkUdsReachable(admin) {
const socketDir = admin?.socketDir || resolveSocketDir();
const port = admin?.port || 5432;
const sockPath = path.join(socketDir, `.s.PGSQL.${port}`);
if (!fs.existsSync(sockPath)) {
return check(
'uds_reachable',
`Unix socket missing at ${sockPath}`,
SEVERITY.FAIL,
'postmaster has not bound the canonical socket',
'check postmaster logs / supervisor liveness',
);
}
// Probe by connecting; postgres responds even before SSL handshake on a healthy socket.
try {
const sock = new net.Socket();
let connected = false;
sock.on('connect', () => { connected = true; sock.destroy(); });
sock.connect(sockPath);
// Synchronous-ish wait via deasync-free pattern: small timeout, deterministic result.
const start = Date.now();
while (!connected && Date.now() - start < 500) {
execSync('true'); // yield event loop without sleep
}
if (!connected) {
return check('uds_reachable', `Unix socket not accepting at ${sockPath}`, SEVERITY.FAIL);
}
return check('uds_reachable', `Unix socket accepting at ${sockPath}`, SEVERITY.PASS);
} catch (e) {
return check('uds_reachable', `Unix socket probe failed`, SEVERITY.FAIL, e.message);
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The synchronous-ish wait pattern using execSync('true') in a while loop is highly inefficient. It will cause a CPU core to spin at 100% for up to 500ms. This check should be asynchronous, similar to checkTcpReachable, to avoid blocking the event loop and wasting CPU cycles. Note that this change will require you to await the function call in runChecks.

function checkUdsReachable(admin) {
  const socketDir = admin?.socketDir || resolveSocketDir();
  const port = admin?.port || 5432;
  const sockPath = path.join(socketDir, `.s.PGSQL.${port}`);
  if (!fs.existsSync(sockPath)) {
    return Promise.resolve(check(
      'uds_reachable',
      `Unix socket missing at ${sockPath}`,
      SEVERITY.FAIL,
      'postmaster has not bound the canonical socket',
      'check postmaster logs / supervisor liveness',
    ));
  }
  // Probe by connecting; postgres responds even before SSL handshake on a healthy socket.
  return new Promise((resolve) => {
    const sock = new net.Socket();
    let resolved = false;
    const done = (severity, detail) => {
      if (resolved) return;
      resolved = true;
      sock.destroy();
      if (severity === SEVERITY.PASS) {
        resolve(check('uds_reachable', `Unix socket accepting at ${sockPath}`, SEVERITY.PASS));
      } else {
        resolve(check('uds_reachable', `Unix socket not accepting at ${sockPath}`, SEVERITY.FAIL, detail));
      }
    };

    sock.setTimeout(500);
    sock.on('connect', () => done(SEVERITY.PASS));
    sock.on('timeout', () => done(SEVERITY.FAIL, 'connection timed out'));
    sock.on('error', (e) => done(SEVERITY.FAIL, e.message));
    sock.connect(sockPath);
  });
}

Comment thread src/commands/doctor.js Outdated

findings.push(checkSupervisorLiveness(admin));
findings.push(checkRuntimeJson(admin));
findings.push(checkUdsReachable(admin));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

To accommodate the change of checkUdsReachable to an asynchronous function (as suggested in another comment), you must await its result here.

Suggested change
findings.push(checkUdsReachable(admin));
findings.push(await checkUdsReachable(admin));

Comment thread src/commands/doctor.js Outdated
Comment on lines +150 to +158
const validSupervisor = ['pm2', 'systemd-user', 'launchd', 'external'].includes(supervisor);
if (!validSupervisor) {
return check(
'admin_json_shape',
`admin.json.supervisor is invalid: ${JSON.stringify(supervisor)}`,
SEVERITY.FAIL,
'expected one of {pm2, systemd-user, launchd, external}',
'reinstall via `pgserve install` (Tier A) or `autopg service install` (Tier B)',
);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The list of valid supervisor types is hardcoded. This list is already defined and exported as SUPERVISOR_VALUES in src/lib/admin-json.js. To improve maintainability and avoid duplication, you should import and use this constant.

To apply this suggestion, you'll also need to add SUPERVISOR_VALUES to the import from ../lib/admin-json.js at the top of the file:

import { getAdminFilePath, readAdminJson, SUPERVISOR_VALUES } from '../lib/admin-json.js';
Suggested change
const validSupervisor = ['pm2', 'systemd-user', 'launchd', 'external'].includes(supervisor);
if (!validSupervisor) {
return check(
'admin_json_shape',
`admin.json.supervisor is invalid: ${JSON.stringify(supervisor)}`,
SEVERITY.FAIL,
'expected one of {pm2, systemd-user, launchd, external}',
'reinstall via `pgserve install` (Tier A) or `autopg service install` (Tier B)',
);
const validSupervisor = SUPERVISOR_VALUES.includes(supervisor);
if (!validSupervisor) {
return check(
'admin_json_shape',
`admin.json.supervisor is invalid: ${JSON.stringify(supervisor)}`,
SEVERITY.FAIL,
`expected one of {${SUPERVISOR_VALUES.join(', ')}}`,
'reinstall via `pgserve install` (Tier A) or `autopg service install` (Tier B)',
);
}

Comment thread src/commands/doctor.js Outdated
Comment on lines +243 to +248
let runtime;
try {
runtime = readRuntimeJson(socketDir);
} catch {
runtime = null;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The readRuntimeJson function from ../lib/runtime-json.js is designed to never throw, returning null on any failure. Therefore, the try...catch block around its call is redundant and can be removed.

Suggested change
let runtime;
try {
runtime = readRuntimeJson(socketDir);
} catch {
runtime = null;
}
const runtime = readRuntimeJson(socketDir);

Comment thread src/commands/doctor.js
Comment on lines +342 to +357
export async function runChecks() {
const findings = [];
findings.push(checkVersionNotBlocked());
findings.push(checkAdminJsonExists());
findings.push(checkAdminJsonShape());

let admin = null;
try { admin = readAdminJson(); } catch { /* admin_json_shape already reported */ }

findings.push(checkSupervisorLiveness(admin));
findings.push(checkRuntimeJson(admin));
findings.push(checkUdsReachable(admin));
findings.push(await checkTcpReachable(admin));

return findings;
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The readAdminJson() function is called independently within runChecks() and checkAdminJsonShape(), leading to redundant file I/O. To improve efficiency, readAdminJson() should be called only once within runChecks(), and the resulting admin object (or null) should be passed to checkAdminJsonShape() and other checks that need it.

This would involve modifying checkAdminJsonShape to accept the admin object as an argument, like function checkAdminJsonShape(admin) { ... }, and adjusting its internal logic to work with the passed argument instead of calling readAdminJson() itself.

Comment thread tests/cli/doctor.test.js
Comment on lines +88 to +94
test('checkAdminJsonShape handles null admin gracefully', () => {
// We cannot easily mock readAdminJson without bun:test mock infra;
// instead exercise the WARN/FAIL paths through admin_json_exists when
// admin.json may not exist on the test host. This is covered by the
// entry-point test above; here we just confirm the helper exists.
expect(typeof __testInternals.checkAdminJsonShape).toBe('function');
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test only checks for the existence of the checkAdminJsonShape function, which is not a very meaningful assertion. The test can be made more robust by testing the function's behavior with different inputs.

bun:test supports mocking, which would allow you to mock readAdminJson to return various outputs (e.g., null, an object with an invalid shape) and assert that checkAdminJsonShape returns the expected Finding.

Alternatively, if you refactor checkAdminJsonShape to accept the admin object as a parameter (as suggested in another comment), you could test it like this without mocking:

  test('checkAdminJsonShape handles null admin gracefully', () => {
    // Assumes checkAdminJsonShape is refactored to take admin object as argument
    const f = __testInternals.checkAdminJsonShape(null);
    expect(f.id).toBe('admin_json_shape');
    expect(f.severity).toBe(SEVERITY.WARN);
    expect(f.title).toContain('missing or empty');
  });

PR #86 review feedback (real correctness gaps; bot style nits filtered out).

- bin/pgserve-wrapper.cjs: add 'doctor' to __installSubcommands. Without this,
  `pgserve doctor` falls through to the bun route on installed binaries and
  never reaches dispatch() → feature was effectively broken for users.

- src/commands/doctor.js checkUdsReachable: replace `execSync('true')` busy
  wait with Promise + setTimeout/error events. The previous loop spawned a
  child process per iteration and never yielded the libuv event loop, so the
  `connect` callback could not fire → false-negative FAIL on healthy hosts.

- src/commands/doctor.js checkAdminJsonShape: harden the PASS gate. Previously
  PASSed when supervisor was valid even if port/socketDir were missing or
  malformed, masking a corrupted admin.json. Now FAILs when port is not a
  positive integer in [1, 65535] or socketDir is empty/non-string.

- Use SUPERVISOR_VALUES from src/lib/admin-json.js (DRY) and drop the
  redundant try/catch around readRuntimeJson (which never throws).

Filtered out (low-value bot suggestions): single-readAdminJson micro-perf,
mock-based test for checkAdminJsonShape — entry-point test already exercises
every code path on different host states.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cli-install.cjs`:
- Around line 1080-1085: The doctor branch currently calls process.exit inside
the dispatch promise chain; change it to return the Promise of runDoctor's
numeric exit code instead so dispatch returns that Promise and the external
wrapper can centralize exit handling—locate the 'doctor' case in dispatch (the
code that calls import('./commands/doctor.js') and mod.runDoctor(args)) and
remove the .then(... process.exit(code)) call, instead returning the Promise
from mod.runDoctor(args) directly (consistent with how the 'uninstall' case is
handled).

In `@src/commands/doctor.js`:
- Around line 89-94: The remediation message in src/commands/doctor.js (in the
block that calls the function creating the 'version_blocklist' failure)
incorrectly references `pgserve update`; change that text to `pgserve upgrade`
so the hint points to the actual command used by this PR—update the string
ending "install a different version (run `pgserve update` for the latest)" to
use `pgserve upgrade` instead, keeping the rest of the message and parameters
(the 'version_blocklist' identifier, SEVERITY.FAIL, and hit.reason) unchanged.
- Around line 187-214: The three supervisor probe spawnSync calls
(spawnSync('pm2', ...), spawnSync('systemctl', ...), spawnSync('launchctl',
...)) need a timeout to avoid hanging; update each call in the functions that
contain those calls (the pm2 probe block, systemdUnitActive, and
launchdJobLoaded) to include a timeout option (e.g. timeout: 5000) in the
options object passed to spawnSync, leaving existing encoding and stdio settings
and keeping the existing try/catch behavior so timeouts surface via the
catch/returned result.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c4392860-3b8b-485c-b7f1-df29cc7e32ba

📥 Commits

Reviewing files that changed from the base of the PR and between 68751d8 and 8fe1030.

📒 Files selected for processing (4)
  • bin/pgserve-wrapper.cjs
  • src/cli-install.cjs
  • src/commands/doctor.js
  • tests/cli/doctor.test.js

Comment thread src/cli-install.cjs
Comment on lines +1080 to +1085
case 'doctor':
// pgserve-singleton-no-proxy Group 3: read-only V1. Reports the
// active supervisor + postmaster reachability + admin.json /
// runtime.json health. --fix tiered modes deferred to a follow-up
// (SHARED-DESIGN §3.2).
return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code)));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

# First, let's view the specific lines mentioned and surrounding context
sed -n '1075,1095p' src/cli-install.cjs

Repository: namastexlabs/pgserve

Length of output: 1051


🏁 Script executed:

# Let's also look at the dispatch function structure and wrapper
grep -n "function dispatch\|const dispatch\|async function.*dispatch" src/cli-install.cjs | head -20

Repository: namastexlabs/pgserve

Length of output: 112


🏁 Script executed:

# Check how process.exit is called in other cases
grep -n "process.exit" src/cli-install.cjs | head -20

Repository: namastexlabs/pgserve

Length of output: 291


🏁 Script executed:

# Let's see the full dispatch function to understand the pattern
sed -n '1000,1100p' src/cli-install.cjs

Repository: namastexlabs/pgserve

Length of output: 3732


🏁 Script executed:

# Find the wrapper function that calls dispatch
grep -n "\.dispatch\|dispatch(" src/cli-install.cjs | head -30

Repository: namastexlabs/pgserve

Length of output: 528


🏁 Script executed:

# Look for the wrapper or main entry point
head -150 src/cli-install.cjs

Repository: namastexlabs/pgserve

Length of output: 6925


🏁 Script executed:

# Search for where the dispatch result is handled
grep -B5 -A5 "dispatch(subcommand" src/cli-install.cjs

Repository: namastexlabs/pgserve

Length of output: 485


🏁 Script executed:

# Find the actual wrapper/main entry point that calls the dispatch function
grep -n "module.exports\|function main\|async main" src/cli-install.cjs | head -20

Repository: namastexlabs/pgserve

Length of output: 88


🏁 Script executed:

# Look for code after dispatch definition to find how it's called
sed -n '1110,1150p' src/cli-install.cjs

Repository: namastexlabs/pgserve

Length of output: 1274


🏁 Script executed:

# Check the very end of the file for the main entry
tail -50 src/cli-install.cjs

Repository: namastexlabs/pgserve

Length of output: 1456


🏁 Script executed:

# Find where dispatch is called from the wrapper script
grep -r "dispatch(" bin/ src/ | grep -v ".cjs:" | head -20

Repository: namastexlabs/pgserve

Length of output: 46


🏁 Script executed:

# Check the bin directory for the wrapper
ls -la bin/

Repository: namastexlabs/pgserve

Length of output: 391


🏁 Script executed:

# Look at the actual wrapper that calls dispatch
cat bin/postgres-server.js | head -100

Repository: namastexlabs/pgserve

Length of output: 3852


🏁 Script executed:

# Search for how the wrapper invokes dispatch and handles its return value
grep -A10 "\.dispatch\(" bin/postgres-server.js

Repository: namastexlabs/pgserve

Length of output: 88


🏁 Script executed:

# Look at pgserve-wrapper.cjs which is the main wrapper
cat bin/pgserve-wrapper.cjs

Repository: namastexlabs/pgserve

Length of output: 12659


🏁 Script executed:

# Also check autopg-wrapper.cjs
cat bin/autopg-wrapper.cjs

Repository: namastexlabs/pgserve

Length of output: 682


Return runDoctor's exit code directly; don't call process.exit inside dispatch.

The doctor case calls process.exit(code) inside the promise chain (line 1085), which exits the process before the wrapper's promise handler can execute. The dispatcher contract expects to return a Promise that resolves with an exit code so the wrapper can handle exit centralization—see the uninstall case comment above it: "dispatch() returns a Promise here; the wrapper already handles both numeric and Promise returns." Keep the exit behavior in the wrapper.

Suggested fix
    case 'doctor':
      // pgserve-singleton-no-proxy Group 3: read-only V1. Reports the
      // active supervisor + postmaster reachability + admin.json /
      // runtime.json health. --fix tiered modes deferred to a follow-up
      // (SHARED-DESIGN §3.2).
-      return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code)));
+      return import('./commands/doctor.js').then((mod) => mod.runDoctor(args));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case 'doctor':
// pgserve-singleton-no-proxy Group 3: read-only V1. Reports the
// active supervisor + postmaster reachability + admin.json /
// runtime.json health. --fix tiered modes deferred to a follow-up
// (SHARED-DESIGN §3.2).
return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code)));
case 'doctor':
// pgserve-singleton-no-proxy Group 3: read-only V1. Reports the
// active supervisor + postmaster reachability + admin.json /
// runtime.json health. --fix tiered modes deferred to a follow-up
// (SHARED-DESIGN §3.2).
return import('./commands/doctor.js').then((mod) => mod.runDoctor(args));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/cli-install.cjs` around lines 1080 - 1085, The doctor branch currently
calls process.exit inside the dispatch promise chain; change it to return the
Promise of runDoctor's numeric exit code instead so dispatch returns that
Promise and the external wrapper can centralize exit handling—locate the
'doctor' case in dispatch (the code that calls import('./commands/doctor.js')
and mod.runDoctor(args)) and remove the .then(... process.exit(code)) call,
instead returning the Promise from mod.runDoctor(args) directly (consistent with
how the 'uninstall' case is handled).

Comment thread src/commands/doctor.js
Comment thread src/commands/doctor.js Outdated
Reviewer-pr-86 verdict FIX-FIRST. Real findings:

[HIGH] Supervisor probes (pm2 / systemctl / launchctl) called spawnSync
with no timeout. A hung supervisor (stuck pm2 daemon, dbus deadlock on
systemctl, launchctl waiting on a lock) would make `pgserve doctor`
itself hang forever — defeating the "diagnostic tool the operator runs
when something is wrong" use case.

  Fix: cap every probe at SUPERVISOR_PROBE_TIMEOUT_MS = 5000. Healthy
  probes are single-digit ms in practice. Detect timeout via
  spawnHitTimeout(result) — checks both error.code === 'ETIMEDOUT' and
  the {signal: 'SIGTERM', status: null} shape node uses on some
  platforms — and return a FAIL whose reason names the timeout
  threshold so operators get a clear "<probe> timed out after Nms"
  signal.

[LOW] Wrong remediation hint name. version_blocklist FAIL pointed
operators at `pgserve update`, but this branch (and main) only routes
`upgrade` (wrapper allowlist + dispatcher both use 'upgrade'; there is
no 'update' verb).

  Fix: replace `pgserve update` with `pgserve upgrade`.

Validation: bun test tests/cli/doctor.test.js → 12/12 pass; lint clean.

Filtered out (already addressed in fix-up commit 8fe1030):
  - wrapper allowlist + async UDS + admin.json shape + DRY (5 prior
    bot comments now superseded by the fix-up commit on this branch).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@namastex888 namastex888 merged commit c041180 into main May 9, 2026
9 checks passed
namastex888 added a commit that referenced this pull request May 9, 2026
…5 more

PR #93 bot reviews caught 7 findings (1 HIGH + 5 MEDIUM + 1 P2). All
addressed:

[HIGH — Gemini] Decision 7 contradicted reality. README + CHANGELOG +
package.json all show `autopg` and `pgserve` as INTERCHANGEABLE CLI
bins — both ship in package.json bin, both route to the same
dispatcher. Earlier draft incorrectly claimed `autopg` was "never as a
CLI verb." Flipped Decision 7 to match reality: both bins coexist;
new verbs land for both; examples use `pgserve <verb>` for consistency
with PRs #86#92 but `autopg <verb>` is equivalent at runtime. No
rename, no bin removal in this wish.

[MEDIUM — Gemini] `~/.pgserve/` vs `~/.autopg/` inconsistency. Reality
on main: both exist. autopg houses admin.json + settings.json (newer
cohort schema); pgserve houses trust + audit (authored in PRs #87 +
#90 before the split was rationalized). Added Decision 9 documenting
the split + explicitly OUTING migration as scope of a future wish.
G3's per-consumer state goes under `~/.autopg/<sanitized-slug>/...`
to match autopg posture; trust/audit references stay at `~/.pgserve/`
to match shipped code.

[MEDIUM — Gemini] install.sh 404 doesn't help operators with
bookmarked URLs. Replaced "let it 404" plan with a tiny ≤15-line
shim at `install.sh` that prints a deprecation note + new URL on
stderr and exits 0. Acceptance criterion added.

[MEDIUM — Gemini] `<scope>` directory naming was ambiguous —
`@demo/app` could be read as nested. Pinned the sanitization rule to
`src/provision/db-naming.js#sanitizeSlug` (lowercase + non-alnum →
'_' + collapse + trim), so `@demo/app` becomes `demo_app` (one flat
dir).

[MEDIUM — Gemini] State across 3 locations (autopg_meta + admin.json
+ manifest) without a stated source-of-truth invited desync. Pinned
`autopg_meta` as authoritative; admin.json + manifest are derived
caches; `pgserve doctor` reports divergences and `--fix` regenerates
the caches from the table. Documented in the deliverable.

[MEDIUM — Gemini] G5 smoke test used `npx pgserve@latest install` —
that tests the PUBLISHED version, not the changes about to be
released. Switched to local-build install (npm pack + npx local
tarball, or worktree wrapper directly).

[P2 — Codex] G4 validation `grep -E … docs/` passes a directory
without `-r`, returning exit 2 ("Is a directory"). Added `-r` flag.

`genie wish lint autopg-distribution-cutover-finalize`: clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@namastex888 namastex888 deleted the feat/singleton-g3-doctor branch May 10, 2026 02:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant