Skip to content

fix(doctor): emit accurate --break-lock hint for non-sync stale locks#1553

Open
Sanjays2402 wants to merge 1 commit into
garrytan:masterfrom
Sanjays2402:fix/doctor-stale-cycle-lock-hint
Open

fix(doctor): emit accurate --break-lock hint for non-sync stale locks#1553
Sanjays2402 wants to merge 1 commit into
garrytan:masterfrom
Sanjays2402:fix/doctor-stale-cycle-lock-hint

Conversation

@Sanjays2402

Copy link
Copy Markdown

Summary

Fixes #1534.

doctor's stale_locks check hard-coded gbrain sync --break-lock as the recovery hint for every stale row. But that command, per sync.ts, only handles gbrain-sync:* keys — it silently no-ops on gbrain-cycle (and per-source gbrain-cycle:<source>) locks with Lock gbrain-sync:default is not held (nothing to break)., leaving operators with no shipped path to clear the lock.

Fix

Extracted a small formatStaleLockBreakHint(lockId) helper that branches on the lock-id prefix:

  • gbrain-sync:<src>gbrain sync --break-lock --source <src>
  • gbrain-syncgbrain sync --break-lock
  • anything else (notably gbrain-cycle, gbrain-cycle:<src>, future kinds) → an honest SQL-fallback hint:
    manual clear required (no CLI breaker for <id>): DELETE FROM gbrain_cycle_locks WHERE id = '<id>';

The id is single-quote-escaped so the printed snippet is safe to copy verbatim.

Why this shape instead of shipping a new CLI breaker

The issue suggests also adding gbrain dream --break-lock or gbrain admin break-lock <name>. That's the right next step but it's a CLI-surface decision (PID liveness, force-break semantics, cross-host guards) that's better landed on its own with proper tests. This PR is the minimum honest-hint change so that doctor stops actively misleading operators today.

Risk

Pure formatter change inside the doctor output builder. No DB writes, no new CLI surface, no behavior change for the existing gbrain-sync:* cases. Output text changes for cycle locks only.

Test plan

No unit tests target this string today. Verified the helper output by extracting it into a Node REPL (covers sync, sync-with-source, cycle, cycle-with-source, and single-quote-in-id):

gbrain sync --break-lock --source default
gbrain sync --break-lock
manual clear required (no CLI breaker for gbrain-cycle): DELETE FROM gbrain_cycle_locks WHERE id = 'gbrain-cycle';
manual clear required (no CLI breaker for gbrain-cycle:notes): DELETE FROM gbrain_cycle_locks WHERE id = 'gbrain-cycle:notes';
manual clear required (no CLI breaker for weird's-lock): DELETE FROM gbrain_cycle_locks WHERE id = 'weird''s-lock';

Did not run the full bun test / bun run verify suite locally (no Postgres + Bun on this machine); CI will cover.

The stale_locks check hard-coded 'gbrain sync --break-lock' for every
stale row, but 'gbrain sync --break-lock' only handles 'gbrain-sync:*'
keys. For 'gbrain-cycle' (and per-source 'gbrain-cycle:<source>') locks
the suggested command silently no-ops with 'Lock gbrain-sync:default is
not held', leaving operators with no shipped path to clear the lock.

Branch the hint by lock-id prefix: keep the sync hint for sync locks,
and for unrecognized kinds (cycle, future kinds) print the SQL fallback
DELETE so the message is honest instead of misleading. The DELETE id is
single-quote-escaped to keep the printed snippet safe to copy.

Pure formatter change in the doctor output path; no DB or CLI surface
behavior changes.

Fixes garrytan#1534
@iannwu

iannwu commented Jun 1, 2026

Copy link
Copy Markdown

Confirmed this same failure mode on a real local Postgres brain after upgrading to gbrain 0.41.38.0 (and checked that current origin/master still has the pre-PR hint shape).

What I saw:

stale_locks warn - 1 stale lock(s) detected (ttl_expires_at < NOW()):
  gbrain-cycle (pid 45905 on <host>, age 24h) → gbrain sync --break-lock

Following the hint did not clear the lock because sync --break-lock targets gbrain-sync:<source> locks:

$ gbrain sync --break-lock --max-age 3600
Lock gbrain-sync:default is not held (nothing to break).

I cleared the expired gbrain-cycle row manually with direct SQL after verifying it was stale. This PR's "honest SQL-fallback hint for non-sync locks" would have made the operator path clear and prevented the confusing no-op.

Also agree with the PR body that #1591 is the right follow-up for a real cycle-lock breaker CLI; this PR is still valuable as the minimum correction because current doctor output actively points at the wrong command.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

doctor stale_locks: --break-lock hint hard-codes <code>gbrain sync</code> even when lock is <code>gbrain-cycle</code>

2 participants