v0.42.23.0 feat(jobs): --nice scheduling-priority flag for jobs work/supervisor (#1815) #1820
Merged
OS scheduling-priority primitives for issue #1815:
- niceness.ts: parseNiceValue (whole-string), applyNiceness (re-reads effective in success AND failure paths), getEffectiveNiceness, formatNice.
- worker-registry.ts: live workers self-register pid + requested/effective nice under gbrainPath('workers'); readWorkers prunes ESRCH (keeps EPERM) with a pid-reuse start-time guard.
- supervisor-pid.ts: readSupervisorPid extracted from the copy-pasted PID-file + liveness block.
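For readers who have not opened niceness.ts, the two central primitives could look roughly like the sketch below. This is an illustration, not the code from this PR: the `NiceResult` shape, the `pid` parameter, and the range constants are assumptions. Only `os.setPriority` / `os.getPriority` are real Node APIs.

```typescript
import * as os from "node:os";

// Assumed bounds, taken from the POSIX nice range quoted in the PR description.
const NICE_MIN = -20;
const NICE_MAX = 19;

/** Whole-string parse: returns null unless the entire input is an integer in -20..19. */
function parseNiceValue(raw: string): number | null {
  const text = raw.trim();
  // parseInt("3.5") yields 3 and parseInt("10abc") yields 10, so validate the full string first.
  if (!/^[+-]?\d+$/.test(text)) return null;
  const n = Number(text);
  return n >= NICE_MIN && n <= NICE_MAX ? n : null;
}

// Hypothetical result shape; the real module may report this differently.
interface NiceResult {
  requested: number;
  effective: number;
  error: string | null;
}

/** Try to set the priority, then re-read the effective value whether or not the call succeeded. */
function applyNiceness(requested: number, pid = 0): NiceResult {
  let error: string | null = null;
  try {
    os.setPriority(pid, requested); // pid 0 means the current process
  } catch (err) {
    error = err instanceof Error ? err.message : String(err);
  }
  // Re-reading in both paths keeps "requested" and "effective" honest, e.g. after an EPERM.
  return { requested, effective: os.getPriority(pid), error };
}
```

Re-reading the effective value after a failure is what lets callers compare requested against effective instead of trusting the request.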
Wires the --nice <n> flag (and GBRAIN_NICE env) through the CLI (issue #1815):
- jobs work: applies niceness + registers the worker; cleanup on finally and process.on('exit').
- jobs supervisor: applies in the foreground-start path only (after the --detach fork), passes the apply result into MinionSupervisor.
- supervisor.ts: nice opts, extracted testable buildWorkerArgs (appends --nice), emits niceness on started/worker_spawned audit events.
- jobs stats / supervisor status: surface effective worker + supervisor nice.
- doctor: separate supervisor_niceness check (warns on requested != effective) so it can't clobber the supervisor crash-check precedence; registered in doctor-categories.
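The flag-over-env precedence and the `--nice` propagation to spawned workers can be sketched as follows. The function signatures are hypothetical: the real `parseNiceFlag` and `buildWorkerArgs` may take different arguments, and `resolveNice` is a name invented here for illustration.

```typescript
/** Pick the raw nice value: an explicit --nice flag wins over the GBRAIN_NICE env var. */
function resolveNice(
  flagValue: string | undefined,
  env: Record<string, string | undefined>,
): string | undefined {
  return flagValue ?? env["GBRAIN_NICE"];
}

/** Append --nice to the worker argv only when a value was requested. */
function buildWorkerArgs(baseArgs: readonly string[], nice: number | undefined): string[] {
  const args = [...baseArgs]; // copy so the caller's array is never mutated
  if (nice !== undefined) args.push("--nice", String(nice));
  return args;
}

// Example: a supervisor spawning `jobs work` with an inherited nice of 10.
const workerArgs = buildWorkerArgs(["jobs", "work"], 10);
console.log(workerArgs.join(" "));
```

Keeping the argv construction in a pure function is what makes the propagation testable without spawning a process.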
Unit tests for issue #1815: parseNiceValue rejects 3.5/10abc that parseInt would accept; applyNiceness re-reads effective on EPERM; registry ESRCH/EPERM + pid-reuse guard + brain-isolated path; readSupervisorPid states; parseNiceFlag flag>env precedence; buildWorkerArgs --nice propagation.
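The ESRCH/EPERM distinction exercised by these tests can be illustrated with a signal-0 probe. This is a sketch under assumptions: `WorkerRecord`, `probePid`, and `pruneWorkers` are hypothetical names, and the pid-reuse start-time guard mentioned above is deliberately left out.

```typescript
// Hypothetical registry entry; the real on-disk format is not shown in this PR page.
interface WorkerRecord {
  pid: number;
  requestedNice: number | null;
  effectiveNice: number;
}

type Liveness = "alive" | "gone";

/** Signal 0 performs the existence/permission check without delivering a signal. */
function probePid(pid: number): Liveness {
  try {
    process.kill(pid, 0);
    return "alive";
  } catch (err) {
    const code = (err as { code?: string }).code;
    // ESRCH: no such process, safe to prune.
    // EPERM: the process exists but belongs to another user, so keep it.
    return code === "ESRCH" ? "gone" : "alive";
  }
}

/** Keep only records whose pid still resolves to a live process. */
function pruneWorkers(
  records: readonly WorkerRecord[],
  probe: (pid: number) => Liveness = probePid,
): WorkerRecord[] {
  return records.filter((record) => probe(record.pid) === "alive");
}
```

Injecting the probe keeps the pruning logic testable without depending on which pids happen to exist on the test machine.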
…flag # Conflicts: # src/commands/doctor.ts
--nice flag for jobs work/supervisor (issue #1815). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- minions-deployment.md: niceness tuning section (full concurrency, low priority).
- KEY_FILES.md: entries for niceness.ts, worker-registry.ts, supervisor-pid.ts; supervisor.ts entry notes buildWorkerArgs + nice opts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The enrich_thin cycle phase (src/core/cycle.ts ALL_PHASES, between conversation_facts_backfill and skillopt) shipped without updating the e2e phase-order expectation, so dream-cycle-phase-order-pglite failed on master. Sync the expected list to the real ALL_PHASES order. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ll contract
v0.41.13.0 intentionally dropped the "--break-lock + --all is refused" guard so cron can self-heal every source in one call (sync.ts runBreakLock iterates sources under --all). The e2e test still asserted the old exit-1 refusal and failed on master. Assert the current contract: the combination is accepted and takes the iterate / no-active-sources path (exit 0, no refusal message).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The native fsevents watcher occasionally missed a freshly written file, timing out the 15s waitFor (~1/3 on master under load). Three fixes:
- inject a polling chokidar watcher via the source's _watchFactory seam (usePolling, 20ms interval) so detection never depends on fsevents timing;
- drop deterministic fixtures BEFORE start so the initial scan (ignoreInitial:false) emits them, keeping live-watch coverage only where it's robust;
- poll for the dedup hit instead of a fixed 600ms sleep.
15/15 green under stress.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
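The third fix, polling for a condition instead of sleeping a fixed interval, follows a common pattern that can be sketched as below. This `waitFor` is a generic stand-in written for illustration; the helper used by the actual test suite may have a different signature and defaults.

```typescript
/** Poll `check` until it returns true, or throw once `timeoutMs` has elapsed. */
async function waitFor(
  check: () => boolean | Promise<boolean>,
  timeoutMs = 15_000,
  intervalMs = 20,
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (await check()) return; // condition met: stop as early as possible
    if (Date.now() >= deadline) {
      throw new Error(`waitFor timed out after ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Compared with a fixed sleep, the poll returns as soon as the condition holds and fails with an explicit timeout when it never does, which removes both the wasted wait and the race.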
connect-bearer and serve-stdio-roundtrip init a PGLite brain and spawn serve,
but passed {...process.env} through — leaking an ambient DATABASE_URL /
GBRAIN_DATABASE_URL into the subprocess, which then came up on Postgres and
failed the `engine: pglite` assertion. Strip both DB vars from the spawned env
so the tests are deterministic whether or not the shell/CI has a DB URL set.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
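A minimal sketch of the env scrub described in this commit, assuming a helper that builds the child environment from scratch. `pgliteChildEnv` is a hypothetical name and the tests in this PR may strip the variables differently.

```typescript
// The two variables named in the commit message above.
const DB_ENV_VARS = new Set(["DATABASE_URL", "GBRAIN_DATABASE_URL"]);

/** Copy the parent env, dropping the DB URLs and any undefined entries. */
function pgliteChildEnv(parent: Record<string, string | undefined>): Record<string, string> {
  const env: Record<string, string> = {};
  for (const [key, value] of Object.entries(parent)) {
    if (value !== undefined && !DB_ENV_VARS.has(key)) env[key] = value;
  }
  return env;
}

// Usage: pass the result as the `env` option when spawning the subprocess.
const childEnv = pgliteChildEnv(process.env);
console.log("DATABASE_URL" in childEnv);
```

Building a fresh `Record<string, string>` rather than deleting keys from a spread copy also yields an env type with no `undefined` values.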
…flag # Conflicts: # CHANGELOG.md # VERSION # package.json
The DATABASE_URL/GBRAIN_DATABASE_URL strip used `delete` on a narrowly-typed env literal (tsc-only failure; bun test doesn't typecheck). Annotate connect-bearer's env as Record<string,string|undefined> and build serve-stdio's as a concrete Record<string,string> (StdioClientTransport.env rejects undefined). Runtime behavior unchanged (7/7 + 3/3 green). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
worker-registry.test.ts sets process.env.GBRAIN_HOME per-test so gbrainPath resolves to a temp dir, then lazy-imports the module — a process-global mutation the parallel isolation lint (rule R1) forbids. Rename to worker-registry.serial.test.ts: it runs in the serial pass (own bun process, max-concurrency=1) where env mutation is safe, and the lint skips *.serial files. No logic change (6/6 green); fixes the failing `verify` CI job. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…flag # Conflicts: # CHANGELOG.md # VERSION # docs/architecture/KEY_FILES.md # package.json # src/commands/jobs.ts # src/core/minions/supervisor.ts
mgunnin added a commit to mgunnin/gbrain that referenced this pull request on Jun 3, 2026
* upstream/master:
  v0.42.23.0 feat(jobs): --nice scheduling-priority flag for jobs work/supervisor (garrytan#1815) (garrytan#1820)
  v0.42.22.0 fix(minions): supervisor progress watchdog + worker DB self-defense — alive-but-wedged worker self-heals (garrytan#1801) (garrytan#1824)
  v0.42.21.0 fix(postgres): module-singleton ownership — canonical landing for the dream-cycle "connect() has not been called" class (garrytan#1404/garrytan#1471/garrytan#1619) (garrytan#1805)
  v0.42.20.0 fix: reliability wave — PGLite capture lock-pin + Postgres reconnect race + search embed-hang (garrytan#1762 garrytan#1745 garrytan#1775) (garrytan#1810)
  v0.42.19.0 fix(skillopt): close the last gap in the AI SDK v6 tool-loop fix (write-capture mapper + regression test) (garrytan#1809)
  v0.42.18.0 fix: sync orphan-pileup watchdog (garrytan#1633) + links-lag µs stamp (garrytan#1768) (garrytan#1807)
  v0.42.17.0 fix(sync): resumable incremental sync — killed mid-import no longer loses progress (garrytan#1794) (garrytan#1808)
  v0.42.16.0 feat(doctor): brain health as a solved problem — cause-ranked doctor + OOM-loop line + auto-drain + pool-reap (garrytan#1685) (garrytan#1802)
  v0.42.15.0 fix: decouple CLI primary output from process.stdout.isTTY (garrytan#1784) (garrytan#1806)
  v0.42.14.0 fix(zero-config): code-* readiness signal + init embedding-key validation + lock self-heal (garrytan#1780) (garrytan#1804)
  v0.42.13.0 fix(search): archive/ content findable by default, demoted not hard-excluded (garrytan#1777) (garrytan#1797)
  v0.42.12.0 feat: self-upgrading gbrain — invocation-riding update check + opt-in auto-upgrade (garrytan#1798)
  v0.42.11.0 feat(skillopt): held-out eval gate, honest receipts, ENFORCE + ablation opts (garrytan#1759)
  v0.42.10.0 feat(extract): opt-in global-basename wikilink resolution (closes garrytan#972) (garrytan#1388)
What
Adds a `--nice <n>` flag to `gbrain jobs work` and `gbrain jobs supervisor` that sets the process's OS scheduling priority (POSIX -20..19), propagates to spawned workers and their children, and surfaces the effective niceness in `jobs stats`, `jobs supervisor status`, and `gbrain doctor`. Closes #1815.

Full concurrency, low priority: the work finishes just as fast when the box is idle and yields when it's busy. In the incident that drove this, renicing the job tree took load from ~7 to ~3 with no measurable throughput loss. Distinct from the concurrency/inflight cap (#1801); composes with it (`--nice` = priority, concurrency = width).

How
- `niceness.ts`: `parseNiceValue` (whole-string parse), `applyNiceness` (re-reads effective in both success and failure paths), `getEffectiveNiceness`, `formatNice`.
- `worker-registry.ts`: workers self-register their real pid + requested/effective nice under `gbrainPath('workers')` (brain-isolated); `readWorkers` prunes ESRCH (keeps EPERM) with a pid-reuse start-time guard. Cleanup on `finally` and `process.on('exit')`.
- `supervisor-pid.ts`: `readSupervisorPid` extracted from the copy-pasted PID-file + liveness block (now shared by status/doctor/stats).
- `supervisor.ts`: nice opts, extracted testable `buildWorkerArgs` (appends `--nice`), emits niceness on `started`/`worker_spawned` audit events.
- `doctor.ts`: separate `supervisor_niceness` check (warns on requested ≠ effective) so it can't clobber the supervisor crash-check precedence; registered in `doctor-categories`.
- `jobs supervisor`: niceness is applied in the foreground-start path only (after the `--detach` fork), so the long-lived process gets reniced, not the throwaway parent.

Test plan
- `applyNiceness(7)` → `ps -o ni` confirms `7` → registry round-trips → cleanup unlinks.
- … `bun build --compile`).

Reviews
Plan cleared by `/plan-eng-review` (5 findings resolved) and `/codex` outside-voice (12 findings: 11 folded, 1 documented). Codex caught the detached-supervisor renice-ordering bug and the `parseInt("3.5")` gap.

🤖 Generated with Claude Code
Documentation
- `docs/guides/minions-deployment.md`: new "Lowering scheduling priority (`--nice`)" section covering the full-concurrency-low-priority pattern, the `GBRAIN_NICE` env, and how to confirm the effective value.
- `docs/architecture/KEY_FILES.md`: added per-file entries for `niceness.ts`, `worker-registry.ts`, `supervisor-pid.ts`; extended the `supervisor.ts` entry with `buildWorkerArgs` + nice opts.
- `CHANGELOG.md`: 0.42.23.0 entry (user-facing + contributors subsection).

Coverage: the `--nice` flag and `GBRAIN_NICE` have reference (README/deployment guide) + how-to (deployment guide examples) coverage. No diagram drift. No documentation debt.