feat(doctor): pgserve doctor read-only V1 (singleton G3 first verb)#86
Conversation
pgserve-singleton-no-proxy wish, Group 3. First of the four CLI verbs
(provision/gc/trust/doctor) — read-only V1.
What ships:
- src/commands/doctor.js (NEW): runs 7 checks against the local install:
1. version_blocklist — current version vs BLOCKED_VERSIONS (G5)
2. admin_json_exists — file present at ~/.autopg/admin.json mode 0600
3. admin_json_shape — supervisor enum + port + socketDir present
4. supervisor_liveness — dispatches based on admin.json.supervisor:
pm2 → 'pm2 jlist' for autopg-server entry online
systemd-user → 'systemctl --user is-active autopg.service'
launchd → 'launchctl list dev.automagik.autopg'
external → 'operator owns supervision' (PASS, no probe)
5. runtime_json — <socketDir>/runtime.json present + alive,
cohort-contract guard: NO 'supervisor' field
6. uds_reachable — connect to <socketDir>/.s.PGSQL.<port>
7. tcp_reachable — connect to localhost:<port>
- src/cli-install.cjs: 'doctor' subcommand routing (dynamic-import pattern,
matches 'uninstall').
- tests/cli/doctor.test.js (NEW): 12 tests covering exit-code logic,
SEVERITY enum, --fix refusal (returns 64 — not implemented in V1),
--json output mode, default-mode runs end-to-end.
Output modes:
- default: human-readable with severity tags + remediation hints
- --json: structured { ok, summary, findings } for tool consumption
- --fix / --fix --aggressive: rejected with exit 64; tiered Cat 1/2/3
modes deferred to a follow-up per SHARED-DESIGN §3.2
Felipe directive 2026-05-08: doctor reports passively; never suggests
swapping supervisor tiers. Confirmed in supervisor_liveness check —
findings call out the active mode but never recommend changing it.
Validation:
- bun test tests/cli/doctor.test.js → 12 pass / 0 fail
- bun run lint: clean
- bun run deadcode: clean
Cohort: singleton G3 verb #1 of 4. trust + gc + provision come in
follow-up PRs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughThis PR implements the ChangesDoctor Health-Check Command
Sequence Diagram(s)sequenceDiagram
participant CLI
participant Doctor
participant Supervisors
participant Runtime
participant Sockets
participant Exit
CLI->>Doctor: invoke runDoctor()
Doctor->>Supervisors: checkSupervisorLiveness()
Doctor->>Runtime: checkRuntimeJson()
Doctor->>Sockets: checkUdsReachable() / checkTcpReachable()
Doctor-->>CLI: return findings
CLI->>Exit: compute exitCodeFor(findings) and exit
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8cb81af1e2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // active supervisor + postmaster reachability + admin.json / | ||
| // runtime.json health. --fix tiered modes deferred to a follow-up | ||
| // (SHARED-DESIGN §3.2). | ||
| return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code))); |
There was a problem hiding this comment.
Route
doctor through wrapper subcommand allowlist
dispatch() now handles doctor, but this branch is unreachable from normal CLI usage because bin/pgserve-wrapper.cjs only forwards subcommands listed in __installSubcommands, and doctor is not in that set. In that common path, pgserve doctor falls through to bin/postgres-server.js (bun route) and exits with help/error instead of running diagnostics, so the new command is effectively broken for users invoking the installed binary.
Useful? React with 👍 / 👎.
| while (!connected && Date.now() - start < 500) { | ||
| execSync('true'); // yield event loop without sleep |
There was a problem hiding this comment.
Replace blocking UDS probe loop with async connect handling
This loop blocks the Node event loop while waiting for connected to flip, but net.Socket emits 'connect' asynchronously on that same event loop. As a result, on a healthy Unix socket the callback cannot run until after the loop exits, so uds_reachable can be reported as FAIL even when the socket is accepting connections. Any host with a live postmaster can hit this false negative in checkUdsReachable.
Useful? React with 👍 / 👎.
| return check( | ||
| 'admin_json_shape', | ||
| `admin.json.supervisor = "${supervisor}"`, | ||
| SEVERITY.PASS, | ||
| `port=${admin.port ?? '?'}, socketDir=${admin.socketDir ?? '?'}`, |
There was a problem hiding this comment.
Validate
admin.json port and socketDir before PASS
The admin_json_shape check returns PASS as soon as supervisor is valid, but it does not actually enforce that port and socketDir are present/valid. When those fields are missing or malformed, later checks silently fall back to defaults (5432 and resolveSocketDir()), which can mask corrupted admin.json and produce misleading diagnostics. This contradicts the stated shape-validation goal for this check.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Code Review
This pull request introduces the pgserve doctor command, which performs a suite of health checks on the local installation, including version validation, configuration file integrity, supervisor liveness, and socket reachability. The review feedback identifies several opportunities for improvement: replacing an inefficient synchronous busy-wait loop in the Unix socket check with an asynchronous implementation, reducing redundant file I/O by passing the parsed admin configuration object between functions, and removing unnecessary error handling. Additionally, the reviewer suggests using existing constants for supervisor types to improve maintainability and strengthening the unit tests to provide more meaningful assertions.
| function checkUdsReachable(admin) { | ||
| const socketDir = admin?.socketDir || resolveSocketDir(); | ||
| const port = admin?.port || 5432; | ||
| const sockPath = path.join(socketDir, `.s.PGSQL.${port}`); | ||
| if (!fs.existsSync(sockPath)) { | ||
| return check( | ||
| 'uds_reachable', | ||
| `Unix socket missing at ${sockPath}`, | ||
| SEVERITY.FAIL, | ||
| 'postmaster has not bound the canonical socket', | ||
| 'check postmaster logs / supervisor liveness', | ||
| ); | ||
| } | ||
| // Probe by connecting; postgres responds even before SSL handshake on a healthy socket. | ||
| try { | ||
| const sock = new net.Socket(); | ||
| let connected = false; | ||
| sock.on('connect', () => { connected = true; sock.destroy(); }); | ||
| sock.connect(sockPath); | ||
| // Synchronous-ish wait via deasync-free pattern: small timeout, deterministic result. | ||
| const start = Date.now(); | ||
| while (!connected && Date.now() - start < 500) { | ||
| execSync('true'); // yield event loop without sleep | ||
| } | ||
| if (!connected) { | ||
| return check('uds_reachable', `Unix socket not accepting at ${sockPath}`, SEVERITY.FAIL); | ||
| } | ||
| return check('uds_reachable', `Unix socket accepting at ${sockPath}`, SEVERITY.PASS); | ||
| } catch (e) { | ||
| return check('uds_reachable', `Unix socket probe failed`, SEVERITY.FAIL, e.message); | ||
| } | ||
| } |
There was a problem hiding this comment.
The synchronous-ish wait pattern using execSync('true') in a while loop is highly inefficient. It will cause a CPU core to spin at 100% for up to 500ms. This check should be asynchronous, similar to checkTcpReachable, to avoid blocking the event loop and wasting CPU cycles. Note that this change will require you to await the function call in runChecks.
function checkUdsReachable(admin) {
const socketDir = admin?.socketDir || resolveSocketDir();
const port = admin?.port || 5432;
const sockPath = path.join(socketDir, `.s.PGSQL.${port}`);
if (!fs.existsSync(sockPath)) {
return Promise.resolve(check(
'uds_reachable',
`Unix socket missing at ${sockPath}`,
SEVERITY.FAIL,
'postmaster has not bound the canonical socket',
'check postmaster logs / supervisor liveness',
));
}
// Probe by connecting; postgres responds even before SSL handshake on a healthy socket.
return new Promise((resolve) => {
const sock = new net.Socket();
let resolved = false;
const done = (severity, detail) => {
if (resolved) return;
resolved = true;
sock.destroy();
if (severity === SEVERITY.PASS) {
resolve(check('uds_reachable', `Unix socket accepting at ${sockPath}`, SEVERITY.PASS));
} else {
resolve(check('uds_reachable', `Unix socket not accepting at ${sockPath}`, SEVERITY.FAIL, detail));
}
};
sock.setTimeout(500);
sock.on('connect', () => done(SEVERITY.PASS));
sock.on('timeout', () => done(SEVERITY.FAIL, 'connection timed out'));
sock.on('error', (e) => done(SEVERITY.FAIL, e.message));
sock.connect(sockPath);
});
}|
|
||
| findings.push(checkSupervisorLiveness(admin)); | ||
| findings.push(checkRuntimeJson(admin)); | ||
| findings.push(checkUdsReachable(admin)); |
| const validSupervisor = ['pm2', 'systemd-user', 'launchd', 'external'].includes(supervisor); | ||
| if (!validSupervisor) { | ||
| return check( | ||
| 'admin_json_shape', | ||
| `admin.json.supervisor is invalid: ${JSON.stringify(supervisor)}`, | ||
| SEVERITY.FAIL, | ||
| 'expected one of {pm2, systemd-user, launchd, external}', | ||
| 'reinstall via `pgserve install` (Tier A) or `autopg service install` (Tier B)', | ||
| ); |
There was a problem hiding this comment.
The list of valid supervisor types is hardcoded. This list is already defined and exported as SUPERVISOR_VALUES in src/lib/admin-json.js. To improve maintainability and avoid duplication, you should import and use this constant.
To apply this suggestion, you'll also need to add SUPERVISOR_VALUES to the import from ../lib/admin-json.js at the top of the file:
import { getAdminFilePath, readAdminJson, SUPERVISOR_VALUES } from '../lib/admin-json.js';| const validSupervisor = ['pm2', 'systemd-user', 'launchd', 'external'].includes(supervisor); | |
| if (!validSupervisor) { | |
| return check( | |
| 'admin_json_shape', | |
| `admin.json.supervisor is invalid: ${JSON.stringify(supervisor)}`, | |
| SEVERITY.FAIL, | |
| 'expected one of {pm2, systemd-user, launchd, external}', | |
| 'reinstall via `pgserve install` (Tier A) or `autopg service install` (Tier B)', | |
| ); | |
| const validSupervisor = SUPERVISOR_VALUES.includes(supervisor); | |
| if (!validSupervisor) { | |
| return check( | |
| 'admin_json_shape', | |
| `admin.json.supervisor is invalid: ${JSON.stringify(supervisor)}`, | |
| SEVERITY.FAIL, | |
| `expected one of {${SUPERVISOR_VALUES.join(', ')}}`, | |
| 'reinstall via `pgserve install` (Tier A) or `autopg service install` (Tier B)', | |
| ); | |
| } |
| let runtime; | ||
| try { | ||
| runtime = readRuntimeJson(socketDir); | ||
| } catch { | ||
| runtime = null; | ||
| } |
There was a problem hiding this comment.
The readRuntimeJson function from ../lib/runtime-json.js is designed to never throw, returning null on any failure. Therefore, the try...catch block around its call is redundant and can be removed.
| let runtime; | |
| try { | |
| runtime = readRuntimeJson(socketDir); | |
| } catch { | |
| runtime = null; | |
| } | |
| const runtime = readRuntimeJson(socketDir); |
| export async function runChecks() { | ||
| const findings = []; | ||
| findings.push(checkVersionNotBlocked()); | ||
| findings.push(checkAdminJsonExists()); | ||
| findings.push(checkAdminJsonShape()); | ||
|
|
||
| let admin = null; | ||
| try { admin = readAdminJson(); } catch { /* admin_json_shape already reported */ } | ||
|
|
||
| findings.push(checkSupervisorLiveness(admin)); | ||
| findings.push(checkRuntimeJson(admin)); | ||
| findings.push(checkUdsReachable(admin)); | ||
| findings.push(await checkTcpReachable(admin)); | ||
|
|
||
| return findings; | ||
| } |
There was a problem hiding this comment.
The readAdminJson() function is called independently within runChecks() and checkAdminJsonShape(), leading to redundant file I/O. To improve efficiency, readAdminJson() should be called only once within runChecks(), and the resulting admin object (or null) should be passed to checkAdminJsonShape() and other checks that need it.
This would involve modifying checkAdminJsonShape to accept the admin object as an argument, like function checkAdminJsonShape(admin) { ... }, and adjusting its internal logic to work with the passed argument instead of calling readAdminJson() itself.
| test('checkAdminJsonShape handles null admin gracefully', () => { | ||
| // We cannot easily mock readAdminJson without bun:test mock infra; | ||
| // instead exercise the WARN/FAIL paths through admin_json_exists when | ||
| // admin.json may not exist on the test host. This is covered by the | ||
| // entry-point test above; here we just confirm the helper exists. | ||
| expect(typeof __testInternals.checkAdminJsonShape).toBe('function'); | ||
| }); |
There was a problem hiding this comment.
This test only checks for the existence of the checkAdminJsonShape function, which is not a very meaningful assertion. The test can be made more robust by testing the function's behavior with different inputs.
bun:test supports mocking, which would allow you to mock readAdminJson to return various outputs (e.g., null, an object with an invalid shape) and assert that checkAdminJsonShape returns the expected Finding.
Alternatively, if you refactor checkAdminJsonShape to accept the admin object as a parameter (as suggested in another comment), you could test it like this without mocking:
test('checkAdminJsonShape handles null admin gracefully', () => {
// Assumes checkAdminJsonShape is refactored to take admin object as argument
const f = __testInternals.checkAdminJsonShape(null);
expect(f.id).toBe('admin_json_shape');
expect(f.severity).toBe(SEVERITY.WARN);
expect(f.title).toContain('missing or empty');
});PR #86 review feedback (real correctness gaps; bot style nits filtered out). - bin/pgserve-wrapper.cjs: add 'doctor' to __installSubcommands. Without this, `pgserve doctor` falls through to the bun route on installed binaries and never reaches dispatch() → feature was effectively broken for users. - src/commands/doctor.js checkUdsReachable: replace `execSync('true')` busy wait with Promise + setTimeout/error events. The previous loop spawned a child process per iteration and never yielded the libuv event loop, so the `connect` callback could not fire → false-negative FAIL on healthy hosts. - src/commands/doctor.js checkAdminJsonShape: harden the PASS gate. Previously PASSed when supervisor was valid even if port/socketDir were missing or malformed, masking a corrupted admin.json. Now FAILs when port is not a positive integer in [1, 65535] or socketDir is empty/non-string. - Use SUPERVISOR_VALUES from src/lib/admin-json.js (DRY) and drop the redundant try/catch around readRuntimeJson (which never throws). Filtered out (low-value bot suggestions): single-readAdminJson micro-perf, mock-based test for checkAdminJsonShape — entry-point test already exercises every code path on different host states. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/cli-install.cjs`:
- Around line 1080-1085: The doctor branch currently calls process.exit inside
the dispatch promise chain; change it to return the Promise of runDoctor's
numeric exit code instead so dispatch returns that Promise and the external
wrapper can centralize exit handling—locate the 'doctor' case in dispatch (the
code that calls import('./commands/doctor.js') and mod.runDoctor(args)) and
remove the .then(... process.exit(code)) call, instead returning the Promise
from mod.runDoctor(args) directly (consistent with how the 'uninstall' case is
handled).
In `@src/commands/doctor.js`:
- Around line 89-94: The remediation message in src/commands/doctor.js (in the
block that calls the function creating the 'version_blocklist' failure)
incorrectly references `pgserve update`; change that text to `pgserve upgrade`
so the hint points to the actual command used by this PR—update the string
ending "install a different version (run `pgserve update` for the latest)" to
use `pgserve upgrade` instead, keeping the rest of the message and parameters
(the 'version_blocklist' identifier, SEVERITY.FAIL, and hit.reason) unchanged.
- Around line 187-214: The three supervisor probe spawnSync calls
(spawnSync('pm2', ...), spawnSync('systemctl', ...), spawnSync('launchctl',
...)) need a timeout to avoid hanging; update each call in the functions that
contain those calls (the pm2 probe block, systemdUnitActive, and
launchdJobLoaded) to include a timeout option (e.g. timeout: 5000) in the
options object passed to spawnSync, leaving existing encoding and stdio settings
and keeping the existing try/catch behavior so timeouts surface via the
catch/returned result.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c4392860-3b8b-485c-b7f1-df29cc7e32ba
📒 Files selected for processing (4)
bin/pgserve-wrapper.cjssrc/cli-install.cjssrc/commands/doctor.jstests/cli/doctor.test.js
| case 'doctor': | ||
| // pgserve-singleton-no-proxy Group 3: read-only V1. Reports the | ||
| // active supervisor + postmaster reachability + admin.json / | ||
| // runtime.json health. --fix tiered modes deferred to a follow-up | ||
| // (SHARED-DESIGN §3.2). | ||
| return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code))); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's view the specific lines mentioned and surrounding context
sed -n '1075,1095p' src/cli-install.cjsRepository: namastexlabs/pgserve
Length of output: 1051
🏁 Script executed:
# Let's also look at the dispatch function structure and wrapper
grep -n "function dispatch\|const dispatch\|async function.*dispatch" src/cli-install.cjs | head -20Repository: namastexlabs/pgserve
Length of output: 112
🏁 Script executed:
# Check how process.exit is called in other cases
grep -n "process.exit" src/cli-install.cjs | head -20Repository: namastexlabs/pgserve
Length of output: 291
🏁 Script executed:
# Let's see the full dispatch function to understand the pattern
sed -n '1000,1100p' src/cli-install.cjsRepository: namastexlabs/pgserve
Length of output: 3732
🏁 Script executed:
# Find the wrapper function that calls dispatch
grep -n "\.dispatch\|dispatch(" src/cli-install.cjs | head -30Repository: namastexlabs/pgserve
Length of output: 528
🏁 Script executed:
# Look for the wrapper or main entry point
head -150 src/cli-install.cjsRepository: namastexlabs/pgserve
Length of output: 6925
🏁 Script executed:
# Search for where the dispatch result is handled
grep -B5 -A5 "dispatch(subcommand" src/cli-install.cjsRepository: namastexlabs/pgserve
Length of output: 485
🏁 Script executed:
# Find the actual wrapper/main entry point that calls the dispatch function
grep -n "module.exports\|function main\|async main" src/cli-install.cjs | head -20Repository: namastexlabs/pgserve
Length of output: 88
🏁 Script executed:
# Look for code after dispatch definition to find how it's called
sed -n '1110,1150p' src/cli-install.cjsRepository: namastexlabs/pgserve
Length of output: 1274
🏁 Script executed:
# Check the very end of the file for the main entry
tail -50 src/cli-install.cjsRepository: namastexlabs/pgserve
Length of output: 1456
🏁 Script executed:
# Find where dispatch is called from the wrapper script
grep -r "dispatch(" bin/ src/ | grep -v ".cjs:" | head -20Repository: namastexlabs/pgserve
Length of output: 46
🏁 Script executed:
# Check the bin directory for the wrapper
ls -la bin/Repository: namastexlabs/pgserve
Length of output: 391
🏁 Script executed:
# Look at the actual wrapper that calls dispatch
cat bin/postgres-server.js | head -100Repository: namastexlabs/pgserve
Length of output: 3852
🏁 Script executed:
# Search for how the wrapper invokes dispatch and handles its return value
grep -A10 "\.dispatch\(" bin/postgres-server.jsRepository: namastexlabs/pgserve
Length of output: 88
🏁 Script executed:
# Look at pgserve-wrapper.cjs which is the main wrapper
cat bin/pgserve-wrapper.cjsRepository: namastexlabs/pgserve
Length of output: 12659
🏁 Script executed:
# Also check autopg-wrapper.cjs
cat bin/autopg-wrapper.cjsRepository: namastexlabs/pgserve
Length of output: 682
Return runDoctor's exit code directly; don't call process.exit inside dispatch.
The doctor case calls process.exit(code) inside the promise chain (line 1085), which exits the process before the wrapper's promise handler can execute. The dispatcher contract expects to return a Promise that resolves with an exit code so the wrapper can handle exit centralization—see the uninstall case comment above it: "dispatch() returns a Promise here; the wrapper already handles both numeric and Promise returns." Keep the exit behavior in the wrapper.
Suggested fix
case 'doctor':
// pgserve-singleton-no-proxy Group 3: read-only V1. Reports the
// active supervisor + postmaster reachability + admin.json /
// runtime.json health. --fix tiered modes deferred to a follow-up
// (SHARED-DESIGN §3.2).
- return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code)));
+ return import('./commands/doctor.js').then((mod) => mod.runDoctor(args));📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| case 'doctor': | |
| // pgserve-singleton-no-proxy Group 3: read-only V1. Reports the | |
| // active supervisor + postmaster reachability + admin.json / | |
| // runtime.json health. --fix tiered modes deferred to a follow-up | |
| // (SHARED-DESIGN §3.2). | |
| return import('./commands/doctor.js').then((mod) => mod.runDoctor(args).then((code) => process.exit(code))); | |
| case 'doctor': | |
| // pgserve-singleton-no-proxy Group 3: read-only V1. Reports the | |
| // active supervisor + postmaster reachability + admin.json / | |
| // runtime.json health. --fix tiered modes deferred to a follow-up | |
| // (SHARED-DESIGN §3.2). | |
| return import('./commands/doctor.js').then((mod) => mod.runDoctor(args)); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/cli-install.cjs` around lines 1080 - 1085, The doctor branch currently
calls process.exit inside the dispatch promise chain; change it to return the
Promise of runDoctor's numeric exit code instead so dispatch returns that
Promise and the external wrapper can centralize exit handling—locate the
'doctor' case in dispatch (the code that calls import('./commands/doctor.js')
and mod.runDoctor(args)) and remove the .then(... process.exit(code)) call,
instead returning the Promise from mod.runDoctor(args) directly (consistent with
how the 'uninstall' case is handled).
Reviewer-pr-86 verdict FIX-FIRST. Real findings:
[HIGH] Supervisor probes (pm2 / systemctl / launchctl) called spawnSync
with no timeout. A hung supervisor (stuck pm2 daemon, dbus deadlock on
systemctl, launchctl waiting on a lock) would make `pgserve doctor`
itself hang forever — defeating the "diagnostic tool the operator runs
when something is wrong" use case.
Fix: cap every probe at SUPERVISOR_PROBE_TIMEOUT_MS = 5000. Healthy
probes are single-digit ms in practice. Detect timeout via
spawnHitTimeout(result) — checks both error.code === 'ETIMEDOUT' and
the {signal: 'SIGTERM', status: null} shape node uses on some
platforms — and return a FAIL whose reason names the timeout
threshold so operators get a clear "<probe> timed out after Nms"
signal.
[LOW] Wrong remediation hint name. version_blocklist FAIL pointed
operators at `pgserve update`, but this branch (and main) only routes
`upgrade` (wrapper allowlist + dispatcher both use 'upgrade'; there is
no 'update' verb).
Fix: replace `pgserve update` with `pgserve upgrade`.
Validation: bun test tests/cli/doctor.test.js → 12/12 pass; lint clean.
Filtered out (already addressed in fix-up commit 8fe1030):
- wrapper allowlist + async UDS + admin.json shape + DRY (5 prior
bot comments now superseded by the fix-up commit on this branch).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…5 more PR #93 bot reviews caught 7 findings (1 HIGH + 5 MEDIUM + 1 P2). All addressed: [HIGH — Gemini] Decision 7 contradicted reality. README + CHANGELOG + package.json all show `autopg` and `pgserve` as INTERCHANGEABLE CLI bins — both ship in package.json bin, both route to the same dispatcher. Earlier draft incorrectly claimed `autopg` was "never as a CLI verb." Flipped Decision 7 to match reality: both bins coexist; new verbs land for both; examples use `pgserve <verb>` for consistency with PRs #86–#92 but `autopg <verb>` is equivalent at runtime. No rename, no bin removal in this wish. [MEDIUM — Gemini] `~/.pgserve/` vs `~/.autopg/` inconsistency. Reality on main: both exist. autopg houses admin.json + settings.json (newer cohort schema); pgserve houses trust + audit (authored in PRs #87 + #90 before the split was rationalized). Added Decision 9 documenting the split + explicitly OUTING migration as scope of a future wish. G3's per-consumer state goes under `~/.autopg/<sanitized-slug>/...` to match autopg posture; trust/audit references stay at `~/.pgserve/` to match shipped code. [MEDIUM — Gemini] install.sh 404 doesn't help operators with bookmarked URLs. Replaced "let it 404" plan with a tiny ≤15-line shim at `install.sh` that prints a deprecation note + new URL on stderr and exits 0. Acceptance criterion added. [MEDIUM — Gemini] `<scope>` directory naming was ambiguous — `@demo/app` could be read as nested. Pinned the sanitization rule to `src/provision/db-naming.js#sanitizeSlug` (lowercase + non-alnum → '_' + collapse + trim), so `@demo/app` becomes `demo_app` (one flat dir). [MEDIUM — Gemini] State across 3 locations (autopg_meta + admin.json + manifest) without a stated source-of-truth invited desync. Pinned `autopg_meta` as authoritative; admin.json + manifest are derived caches; `pgserve doctor` reports divergences and `--fix` regenerates the caches from the table. Documented in the deliverable. [MEDIUM — Gemini] G5 smoke test used `npx pgserve@latest install` — that tests the PUBLISHED version, not the changes about to be released. Switched to local-build install (npm pack + npx local tarball, or worktree wrapper directly). [P2 — Codex] G4 validation `grep -E … docs/` passes a directory without `-r`, returning exit 2 ("Is a directory"). Added `-r` flag. `genie wish lint autopg-distribution-cutover-finalize`: clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Singleton wish G3 — first of the four CLI verbs (
provision/gc/trust/doctor). Read-only V1;--fixtiered modes deferred to a follow-up per SHARED-DESIGN §3.2.What ships
src/commands/doctor.jsruns 7 checks against the local install:version_blocklistBLOCKED_VERSIONS(G5)admin_json_exists~/.autopg/admin.jsonpresent, mode 0600admin_json_shapesupervisor_livenessadmin.json.supervisor:• pm2 →
pm2 jlistforautopg-serverentry online• systemd-user →
systemctl --user is-active autopg.service• launchd →
launchctl list dev.automagik.autopg• external → operator owns supervision (PASS, no probe)
runtime_json<socketDir>/runtime.jsonpresent + alive; cohort-contract guard: NOsupervisorfielduds_reachable<socketDir>/.s.PGSQL.<port>tcp_reachablelocalhost:<port>Output modes
--json: structured{ ok, summary, findings }for tool consumption--fix/--fix --aggressive: rejected with exit 64 (not implemented in V1)Exit codes
--fixrequested (not implemented in V1)Felipe constraint honored
Felipe directive 2026-05-08: doctor reports passively; never suggests swapping supervisor tiers. Confirmed in
supervisor_liveness— findings call out the active mode but never recommend changing it.Validation
bun test tests/cli/doctor.test.js→ 12 pass / 0 failbun run lint→ cleanbun run deadcode→ cleanCohort
Singleton G3 verb 1 of 4.
trust+gc+provisioncome in follow-up PRs (each ~150-300 LOC, separately shippable).--fixtiered modes (Cat 1/2/3 mutations) tracked as a separate follow-up — locked in SHARED-DESIGN §3.2 but implementation needs its own design pass.Summary by CodeRabbit
New Features
pgserve doctor: a passive, read-only diagnostics command that checks config, supervisor liveness, runtime shape, and socket/database reachability; supports human or JSON output. Exits 0/2/1 for PASS/WARN/FAIL;--fixis rejected with a specific error exit.Tests