Summary
gbrain dream (invoked via systemd timer cron) intermittently exits with code 1 after ~4 seconds of wall-time and ~7MB memory peak, having written no diagnostic output to its configured StandardOutput/StandardError log. The crash leaves the gbrain-cycle row in gbrain_cycle_locks un-released, compounding with #1534.
Symptoms
From journalctl -u gbrain-dream.service --no-pager:
May 25 09:00:00 CloudTron systemd[1]: Starting gbrain-dream.service - GBrain nightly dream cycle...
May 25 09:34:07 CloudTron systemd[1]: Finished gbrain-dream.service - GBrain nightly dream cycle.
May 25 09:34:07 CloudTron systemd[1]: gbrain-dream.service: Consumed 27.721s CPU time.
May 26 09:00:01 CloudTron systemd[1]: Starting gbrain-dream.service - GBrain nightly dream cycle...
May 26 09:00:05 CloudTron systemd[1]: gbrain-dream.service: Main process exited, code=exited, status=1/FAILURE
May 26 09:00:05 CloudTron systemd[1]: gbrain-dream.service: Failed with result "exit-code".
May 26 09:00:05 CloudTron systemd[1]: gbrain-dream.service: Consumed 2.618s CPU time, 7.0M memory peak, 0B memory swap peak.
A successful run takes 30-35 min wall-time, ~28s CPU, ~440 MB memory peak. The failure mode is a 4-second exit with 7 MB memory peak — the process barely started.
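The gap between the two profiles is wide enough to flag mechanically. As a rough sketch (not part of gbrain), the TypeScript below parses the `Consumed ...` accounting line quoted above and flags runs that look like an early crash. The names `parseConsumed` and `looksLikeEarlyCrash` and the 5 s / 50 MB thresholds are assumptions chosen for illustration, and the parser only handles the seconds-only CPU format shown in these logs.

```typescript
// Hypothetical triage helper, not part of gbrain or systemd.
interface Consumed {
  cpuSeconds: number;
  memoryPeakMB: number | null; // null when the line reports no memory peak
}

// Parses lines such as:
//   "... Consumed 2.618s CPU time, 7.0M memory peak, 0B memory swap peak."
// Assumes the seconds-only CPU format seen in the logs above.
function parseConsumed(line: string): Consumed | null {
  const cpu = /Consumed ([\d.]+)s CPU time/.exec(line);
  if (cpu === null) return null;

  const mem = /([\d.]+)([KMG]) memory peak/.exec(line);
  let memoryPeakMB: number | null = null;
  if (mem !== null) {
    const value = Number(mem[1]);
    const unit = mem[2];
    // Convert the K/M/G suffix to megabytes (1024-based).
    memoryPeakMB = unit === "K" ? value / 1024 : unit === "G" ? value * 1024 : value;
  }
  return { cpuSeconds: Number(cpu[1]), memoryPeakMB };
}

// Illustrative thresholds: far below the ~28 s CPU / ~440 MB of a good run.
function looksLikeEarlyCrash(c: Consumed, maxCpuSeconds = 5, maxMemoryMB = 50): boolean {
  const lowMemory = c.memoryPeakMB === null || c.memoryPeakMB < maxMemoryMB;
  return c.cpuSeconds < maxCpuSeconds && lowMemory;
}
```

Fed the two `Consumed` lines above, the 27.721 s run is not flagged and the 2.618 s / 7.0M run is.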
A second instance: the cycle attempt at 21:40 PST the same day also died (PID 3917507 left a stale lock).
What I tried
Reproduce interactively with the exact systemd ExecStart args:
Result: Skipped: another cycle is already running. (locked) — exit 0. (The previous crash left the lock held, so this path is benign and exits cleanly.)
gbrain dream --dry-run --json returned exit 0 with status: skipped, reason: cycle_already_running
env -i HOME=/root PATH=... to strip env contamination — same "Skipped" result
Check /var/log/gbrain/gbrain-dream.log — file uses StandardOutput=append so all runs commingle without per-run boundary. The 4s crash either wrote nothing or wrote ≤3 lines indistinguishable from prior runs ending mid-[cycle.conversation_facts_backfill].
Postgres health — 9 connections active of 100 max, no errors in postgres log, SELECT 1 returns fine.
systemd drop-ins — none for this service.
Recent config changes — /root/.config/api-keys.env last modified 24h before the crash, /root/.gbrain/pg.env modified 7d before, /etc/systemd/system/gbrain-dream.service modified 12h before.
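To make the config-change check repeatable, the modification times can be compared against the crash time directly. This is a minimal sketch with hypothetical names (`FileStamp`, `changedBefore`); the caller is assumed to supply each file's mtime, for example from `fs.statSync(path).mtime` in Node.

```typescript
// Hypothetical helper: lists files modified within a window before a crash.
interface FileStamp {
  path: string;
  mtime: Date; // e.g. fs.statSync(path).mtime in Node
}

interface RecentChange {
  path: string;
  hoursBeforeCrash: number;
}

function changedBefore(crash: Date, files: FileStamp[], windowHours: number): RecentChange[] {
  const msPerHour = 60 * 60 * 1000;
  const recent: RecentChange[] = [];
  for (const f of files) {
    const hours = (crash.getTime() - f.mtime.getTime()) / msPerHour;
    // Keep only changes that happened before the crash and inside the window.
    if (hours >= 0 && hours <= windowHours) {
      recent.push({ path: f.path, hoursBeforeCrash: hours });
    }
  }
  // Most recent change first.
  return recent.sort((a, b) => a.hoursBeforeCrash - b.hoursBeforeCrash);
}
```

With a 48-hour window and the three files above, this would list the service unit (12h) ahead of api-keys.env (24h) and omit pg.env (7d).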
Suggested upstream improvements
Per-run log demarcation: write ---- run START <ISO> pid=<N> args=... and ---- run END <ISO> exit=<N> duration=<ms> markers to make StandardOutput=append logs separable.
Startup heartbeat: write a startup ok line within the first 500ms of gbrain dream so failures BEFORE that line are distinguishable from failures AFTER.
Crash trap: wrap the top-level dream entrypoint in try/catch that flushes stderr with the exception class + stack before process.exit(1). Right now the 4s exit-1 produces zero diagnostic output.
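The three suggestions above could be combined in one small wrapper around the entrypoint. The following is a minimal TypeScript sketch, not gbrain's actual code: `Io`, `formatStart`, `formatEnd`, `formatCrash`, and `runWithMarkers` are hypothetical names, and the marker layout simply follows the format proposed above. Side effects are injected so the sketch stays independent of any particular runtime.

```typescript
// Hypothetical sketch: none of these names exist in gbrain.
// Side effects are injected so the wrapper can be tested without a real process.
interface Io {
  write(line: string): void; // e.g. (s) => process.stderr.write(s + "\n")
  exit(code: number): void;  // e.g. (c) => process.exit(c)
  now(): Date;               // e.g. () => new Date()
}

function formatStart(at: Date, pid: number, args: string[]): string {
  return `---- run START ${at.toISOString()} pid=${pid} args=${args.join(" ")}`;
}

function formatEnd(at: Date, exitCode: number, durationMs: number): string {
  return `---- run END ${at.toISOString()} exit=${exitCode} duration=${durationMs}`;
}

// Exception class and message first, then the stack if one is available.
function formatCrash(err: unknown): string {
  if (err instanceof Error) {
    return `crash: ${err.name}: ${err.message}\n${err.stack ?? "(no stack)"}`;
  }
  return `crash: non-Error value thrown: ${String(err)}`;
}

async function runWithMarkers(
  main: () => Promise<void>,
  io: Io,
  pid: number,
  args: string[],
): Promise<void> {
  const started = io.now();
  io.write(formatStart(started, pid, args)); // per-run boundary
  io.write("startup ok");                    // startup heartbeat
  let exitCode = 0;
  try {
    await main();
  } catch (err) {
    exitCode = 1;
    io.write(formatCrash(err));              // crash trap
  }
  const ended = io.now();
  io.write(formatEnd(ended, exitCode, ended.getTime() - started.getTime()));
  io.exit(exitCode);
}
```

With this shape, a run that dies before the wrapper executes (for example during module loading) leaves no START marker at all, which is itself a usable signal when reading an append-mode log.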
[…] gbrain-cycle locks. Fixing #1534 (doctor stale_locks: --break-lock hint hard-codes gbrain sync even when lock is gbrain-cycle) and adding (3) above, the crash trap, would close the loop.
Environment
/etc/systemd/system/gbrain-dream.service → /usr/local/bin/gbrain-job.sh dream --dir /data/brain --source default
Related