Skip to content

fix(subagent): drop JSON.stringify on jsonb writes (orchestrator slug-collection bug)#597

Open
vinsew wants to merge 1 commit into
garrytan:masterfrom
vinsew:vinsew/subagent-jsonb-double-encode-fix
Open

fix(subagent): drop JSON.stringify on jsonb writes (orchestrator slug-collection bug)#597
vinsew wants to merge 1 commit into
garrytan:masterfrom
vinsew:vinsew/subagent-jsonb-double-encode-fix

Conversation

@vinsew

@vinsew vinsew commented May 3, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes the v0.16.0+ jsonb double-encode bug in src/core/minions/handlers/subagent.ts that breaks dream synthesize's orchestrator slug-collection. Symptom: transcripts_processed=1, pages_written=0 even when child subagents successfully wrote pages to the DB.

Root cause

The four subagent_* jsonb writers (persistMessage, persistToolExec{Pending,Complete,Failed}) used engine.executeRaw (which routes to postgres.js's unsafe()) with JSON.stringify(value) plus $N::jsonb cast. Verified empirically:

P1 JSON.stringify(obj) + $::jsonb  →  jsonb_typeof='string'   (broken — current code)
P2 raw object         + (no cast)  →  jsonb_typeof='object'   (correct)
P3 raw object         + $::jsonb   →  jsonb_typeof='object'   (correct — this PR)
P4 sql.json(obj) via unsafe        →  jsonb_typeof='object'   (correct)

postgres.js v3 auto-encodes objects/arrays for jsonb columns. queue.ts:233 already passes raw objects for minion_jobs.data and was never affected. PGLite handles raw values identically (verified — all three patterns produce object jsonb on PGLite).

Symptom this resolves

collectChildPutPageSlugs in src/core/cycle/synthesize.ts:459 queries:

SELECT input->>'slug' FROM subagent_tool_executions WHERE ...

On string-typed jsonb the ->> operator returns NULL, so pages_written reports 0 even when child subagents successfully put_page'd. The reverse-write step that mirrors DB → markdown then has no targets, leaving real synthesized pages stranded in the DB. After this fix, the orchestrator picks up slugs correctly and the dream cycle closes the loop end-to-end.

Companion changes in this commit

  • repair-jsonb extended from 5 → 8 columns (subagent_messages.content_blocks, subagent_tool_executions.{input, output}); each target guarded with to_regclass() so pre-v0.15 brains that lack the subagent_* tables don't throw at upgrade time.
  • doctor's jsonb_integrity check extended to the same 8 columns with the same guard. Any user hitting this bug will see "X rows double-encoded ... Fix: gbrain repair-jsonb" instead of silent dream failures.
  • Header comment of repair-jsonb.ts updated: the prior claim that the parameterized $N::jsonb form was always safe was wrong — safety came from queue.ts/sources.ts/etc. passing raw objects, not from the binding form itself.

Field evidence

On the developer's brain at the time of fix (production gbrain deployment, postgres engine):

  • subagent_messages.content_blocks: 40/40 corrupt
  • subagent_tool_executions.input: 39/39 corrupt
  • subagent_tool_executions.output: 39/39 corrupt

gbrain repair-jsonb after this patch repaired all 118 rows. Subsequent gbrain dream --input <transcript> --json produced pages_written=2, reverse_write_count=2, child job completed, and the rendered markdown landed on disk.

Test plan

  • bun run typecheck clean
  • 245 tests pass across minions, subagent-handler, cycle-synthesize, cycle-patterns, repair-jsonb, migrations-v0_12_2, doctor
  • End-to-end: fresh transcript → gbrain dream --input → DB has object-typed jsonb → orchestrator returns correct slugs → markdown lands on disk
  • Reviewer should consider whether to ship a one-shot migration that auto-runs repair-jsonb for affected tables, or rely on gbrain doctor to nudge users — current PR ships the detection + repair tool but does not auto-trigger it

Upstream context

PR #525 (chmod fix, by another author) was unrelated to this jsonb work despite name overlap from local commits that mis-attributed it. PR #539 (open) does post-hoc UPDATE-based data repair in synthesize.ts/patterns.ts but does NOT patch the write path — this PR is the missing complement. They can land in either order: this PR stops the bleeding, #539 cleans existing brains.

🤖 Generated with Claude Code


View in Codesmith
Need help on this PR? Tag @codesmith with what you need.

  • Let Codesmith autofix CI failures and bot reviews

@Qodo-Free-For-OSS

Copy link
Copy Markdown

Hi, repair-jsonb.ts uses the same to_regclass($1) unqualified existence check and skips a target when it returns NULL, which can cause the command to report 0 repairs even when broken rows exist but the table is not resolvable via search_path. This is especially risky because migrations (v0_12_2 verify phase) treat total_repaired=0 as “no broken rows remain.”

Severity: remediation recommended | Category: correctness

How to fix: Use schema-aware existence check

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

repairJsonb() can skip repairs/counts if to_regclass($1) fails to resolve an unqualified table name under search_path. This can incorrectly report 0 repairs (and pass migration verify) while broken rows still exist.

Issue Context

The migration verifier (v0_12_2) only inspects total_repaired from repair-jsonb --dry-run --json.

Fix Focus Areas

  • src/commands/repair-jsonb.ts[129-140]
  • src/commands/migrations/v0_12_2.ts[59-80]

Suggested fix

Replace the to_regclass($1) check with a schema-explicit existence query, e.g.:

  • SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = current_schema() AND table_name = $1) AS exists

If the app assumes public, use table_schema='public' to avoid dependence on search_path. Keep the skip behavior, but make the existence test deterministic.


Found by Qodo code review

@vinsew vinsew force-pushed the vinsew/subagent-jsonb-double-encode-fix branch from ea91f3e to f25e0b6 Compare May 11, 2026 08:30
@vinsew vinsew force-pushed the vinsew/subagent-jsonb-double-encode-fix branch from f25e0b6 to b4a8f91 Compare May 23, 2026 19:47
…nb + doctor

The four `subagent_*` jsonb writers in `src/core/minions/handlers/subagent.ts`
(persistMessage, persistToolExecPending, persistToolExecComplete,
persistToolExecFailed) used `engine.executeRaw` with `JSON.stringify(value)`
plus `$N::jsonb` cast. On the postgres-engine path that goes through
postgres.js's `unsafe()`, that combination produces a jsonb string scalar
instead of an object — exactly the v0.12.0 double-encode shape, on a second
set of tables.

Verified empirically (probe inside this branch, not committed):

  P1 JSON.stringify(obj) + $::jsonb → jsonb_typeof='string'  (broken)
  P3 raw object + $::jsonb           → jsonb_typeof='object' (correct)
  P4 sql.json(obj) via unsafe        → jsonb_typeof='object' (correct)

The four call sites are changed to pass the raw value. postgres.js v3
auto-encodes objects/arrays for jsonb columns (already what queue.ts:233
relies on for `minion_jobs.data`, never broken). PGLite handles raw values
identically (verified — all three patterns produce object jsonb on PGLite).

Existing data on the user's brain at the time of fix:
  subagent_messages.content_blocks: 40/40 corrupt
  subagent_tool_executions.input:   39/39 corrupt
  subagent_tool_executions.output:  39/39 corrupt
All 118 rows repaired by `gbrain repair-jsonb` after extending TARGETS.

Symptom this resolves: dream synthesize's orchestrator
(`collectChildPutPageSlugs` in src/core/cycle/synthesize.ts:459) queries
`input->>'slug'` to gather slugs the subagent wrote; on string-typed jsonb
that operator returns NULL, so `pages_written` reports 0 even when child
subagents successfully wrote pages. The reverse-write step that mirrors
DB → markdown is then skipped, leaving real synthesized pages stranded
in the DB. After this fix, the orchestrator picks up slugs correctly and
the full dream cycle closes the loop.

Companion changes:
  - `repair-jsonb` extended from 5 → 8 columns (subagent_messages.content
    _blocks, subagent_tool_executions.{input,output}); guards each target
    with `to_regclass()` so pre-v0.15 brains that lack the subagent_*
    tables don't throw at upgrade time.
  - `doctor`'s `jsonb_integrity` check extended to the same 8 columns
    with the same guard. Any future user hitting this bug from upstream
    master gets a clear "X rows double-encoded ... Fix: gbrain
    repair-jsonb" warning instead of silent dream failures.
  - `repair-jsonb.ts` header comment retracted: the prior claim that
    parameterized `$N::jsonb` was always safe was wrong; safety came
    from queue.ts/etc. passing raw objects, not from the binding form.

Upstream history: PR garrytan#525 was a chmod fix mis-attributed by local commits;
the real outstanding upstream PR is garrytan#539 (zombie lock + post-hoc data
repair via UPDATE) which does NOT patch the write path. Master HEAD still
ships the buggy form. This patch is upstream-worthy as a follow-up to

Tests: 245 pass across minions, subagent-handler, cycle-synthesize,
cycle-patterns, repair-jsonb, migrations-v0_12_2, and doctor. typecheck
clean.
@vinsew vinsew force-pushed the vinsew/subagent-jsonb-double-encode-fix branch from b4a8f91 to 364911e Compare June 1, 2026 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants