autopilot: minion worker crashes with "No database connection: connect() has not been called" #1491

@justemu

Description

The minion worker subprocess (gbrain jobs work) spawned by gbrain autopilot crashes repeatedly because engine.connect() has not been called before the worker's main loop attempts to access the database.

This prevents ALL autopilot cycles from completing, and causes data loss during the extract phase.

Environment

  • gbrain version: 0.41.14.0
  • Engine: Postgres
  • Config: agent.use_gateway_loop=true, chat_model: deepseek:deepseek-v4-flash
  • OS: Linux (WSL2 Ubuntu)

Symptoms

1. Worker crash loop (51 crashes recorded)

Promotion error: No database connection: connect() has not been called. Fix: Run gbrain init --supabase or gbrain init --url <connection_string>
[autopilot] worker exited code=1 signal=null after 1409ms, crashCount=4, cause=runtime_error
[autopilot] crash backoff 8111ms (crashCount=4)

Error counts:

  • 51 worker crashes
  • 265 "Promotion error" in promoteDelayed()
  • 446 total "No database connection" errors
  • 52 batch link row losses during extraction

2. Data loss during extract.links_fs

[extract.links_fs] 60/60 (100%)
[extract.links_fs] connection blip, retrying 26 rows in 500ms (No database connection: connect() has not been called. Fix: Run gbrain init --supabase or gbrain init --url <connection_string>)
  batch error (26 link rows lost): No database connection: connect() has not been called. Fix: Run gbrain init --supabase or gbrain init --url <connection_string>
[extract.links_fs] 60/60 (100%) done
Links: created 0 from 60 pages

3. Unhandled rejection during conversation_facts_backfill

[cycle.conversation_facts_backfill] start
[unhandledRejection] GBrainError: No database connection: connect() has not been called. Fix: Run gbrain init --supabase or gbrain init --url <connection_string>
    at getConnection (/.../gbrain/src/core/db.ts:153:15)
    at transaction (/.../gbrain/src/core/postgres-engine.ts:764:34)
    at failJob (/.../gbrain/src/core/minions/queue.ts:855:24)
    at executeJob (/.../gbrain/src/core/minions/worker.ts:831:39)
    at processTicksAndRejections (native:7:39)

4. All sources never complete a full cycle

[FAIL] cycle_freshness: Source 'aevum' has never completed a full cycle;
Source 'agent-arch' has never completed a full cycle; ... (all 9 sources)

Brain score permanently stuck at 51/100 because link extraction (3/25), timeline extraction (2/15), and orphan resolution (1/15) cannot complete.

Root Cause Analysis

The main gbrain CLI process correctly calls createEngine() → connectWithRetry() at startup (src/cli.ts:1687-1691). However, the minion worker is spawned by the supervisor (src/core/minions/supervisor.ts) as a separate subprocess running bun gbrain jobs work. This child process also goes through the full CLI initialization and should call connectWithRetry(), but something in the worker's startup path prevents the connection from being reliably established before promoteDelayed() or executeJob() attempts DB access.

The worker's queue.promoteDelayed() runs in the main loop before any job is claimed (src/core/minions/worker.ts:434), and if the engine's connection pool isn't ready at that point, the error is caught but the worker then crashes with an unhandled rejection when executeJob() later tries to call getConnection().
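
To make the failure sequence above concrete, here is a minimal sketch of the pattern as I understand it from the logs and stack trace. Every function here is a simplified, hypothetical stand-in that only reuses the names from the trace; it is not the actual gbrain code, and the `connected` flag is an assumption standing in for "connect() has been called".

```typescript
// Hypothetical stand-ins that only mirror the names seen in the stack trace.
let connected = false; // stands in for "connect() has been called"

function getConnection(): { query: (sql: string) => string } {
  if (!connected) {
    throw new Error("No database connection: connect() has not been called");
  }
  return { query: (sql) => `ok: ${sql}` };
}

// The promotion step catches the error and logs it, so the loop keeps going.
async function promoteDelayed(log: string[]): Promise<void> {
  try {
    getConnection().query("promote delayed jobs");
  } catch (err) {
    log.push(`Promotion error: ${(err as Error).message}`);
  }
}

// The failure handler needs the database too, so it throws for the same reason.
async function failJob(): Promise<void> {
  getConnection().query("mark job failed");
}

// If the job body fails and failJob() fails as well, the rejection escapes
// executeJob(). With no caller awaiting it, that is an unhandled rejection.
async function executeJob(job: () => Promise<void>): Promise<void> {
  try {
    await job();
  } catch {
    await failJob();
  }
}
```

In this sketch the first error is swallowed and only logged, while the second one surfaces from the error-handling path itself, which would match the failJob frame in the trace above.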

Contrast: Main process works fine

The parent autopilot process and direct gbrain dream invocations work correctly — DB access succeeds. Only the minion worker child process experiences this.

Suggested Fix Direction

The worker's main loop should verify that engine.connect() has been called before entering the promoteDelayed() / claim / execute cycle. Alternatively, the connection should be lazily established on first use rather than requiring an explicit connect() call.
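
A rough sketch of what either direction could look like, assuming the engine exposes an async connect(). The `Engine` interface, `LazyConnection`, and `runWorker` are hypothetical names used only for illustration; the real engine and worker APIs may differ.

```typescript
// Hypothetical minimal engine shape; the real gbrain engine may differ.
interface Engine {
  connect(): Promise<void>;
}

// Runs connect() at most once, on first use. Concurrent callers share the
// same in-flight promise instead of racing to open their own connection.
class LazyConnection {
  private ready: Promise<void> | null = null;
  private readonly engine: Engine;

  constructor(engine: Engine) {
    this.engine = engine;
  }

  ensureConnected(): Promise<void> {
    if (this.ready === null) {
      this.ready = this.engine.connect().catch((err) => {
        // Reset so a later call can retry after a failed attempt.
        this.ready = null;
        throw err;
      });
    }
    return this.ready;
  }
}

// Hypothetical worker loop: wait for the connection before the first
// promote / claim / execute tick, and never enter the loop without it.
async function runWorker(
  conn: LazyConnection,
  tick: () => Promise<void>,
  maxTicks: number
): Promise<void> {
  await conn.ensureConnected();
  for (let i = 0; i < maxTicks; i++) {
    await tick();
  }
}
```

The same ensureConnected() call could also sit inside getConnection() itself, which would cover the lazy variant without changing the worker loop.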

A workaround might be to pass GBRAIN_DATABASE_URL as an environment variable so the worker can self-initialize without depending on parent state.
