v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR #1299)#1308
Merged
Merged
Conversation
…s note) PR #1299 from garrytan-agents shipped a 308-line production smoke-test report against v0.38.0.0 on Supabase+PgBouncer. Bringing the report verbatim into this worktree alongside the actual fixes (v0.38.3.0 wave). Privacy scrub passed per CLAUDE.md placeholder rule — no real people, companies, funds, or deals named. Editor's Note prepended to flag two re-diagnosed findings: - BUG-2 actual crash line is :1594-1597 (req.body === undefined → JSON.stringify returns literal undefined → Buffer.from throws), not the originally reported :1508. - WARN-5 root cause is `capture` missing from CLI_ONLY_SELF_HELP set; the detailed HELP constant exists but is unreachable. Co-Authored-By: garrytan-agents <garrytan-agents@users.noreply.github.com> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-fix: `gbrain capture --file foo.md` on a file that already has YAML frontmatter stamped a second outer `---` block whose `title` field was the file's own opening `---` delimiter. Users got pages with two frontmatter blocks: outer `title: '---'`, inner the real metadata. The parser then treated the second block as a body-side horizontal rule. Root cause: capture.ts:136-152 `buildContent` always prepended its own frontmatter without inspecting whether the input already had one. The title-from-first-line heuristic at :141 picked up the `---` delimiter when present. Fix: new pure helper `mergeCaptureFrontmatter(rawBody, opts)` parses existing frontmatter via gray-matter (the lib markdown.ts already uses) and merges capture's auto-fields with user-wins precedence on user- declared keys. Single output frontmatter block in all cases. Precedence: - `type`: opts.type (CLI flag) > userFm.type > 'note' - `title`: userFm.title > derived-from-body - `captured_via`: userFm.captured_via > opts.source > 'capture-cli' - `captured_at`: userFm.captured_at > now() (user can pre-stamp) - All other user keys (description, tags, slug, ...) pass through verbatim Files without frontmatter keep the original behavior (stamp fresh, wrap under derived `# heading` if body lacks markdown structure). CQ2 boil-the-lake test coverage in test/capture-build-content.test.ts (19 cases, 47 assertions): 13 specified cases including CJK title, CRLF line endings, UTF-8 BOM, empty frontmatter `---\n---\n`, malformed YAML, no-trailing-newline-before-body, user description/tags/slug passthrough; plus 3 deriveTitle helper cases, 2 --type CLI flag precedence cases, and the BUG-1 regression guard with the exact reported input shape. Decisions deferred to later wave phases: - CV3 source_kind taxonomy + CV15 canonical source resolver: Phase 3c - CV8 dedup hash from rawBody: Phase 3c (receipt) + Phase 3d (DB) - CV9 trim boundary split: Phase 3c - CV10 binary-byte guard: Phase 3c Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 2a) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… JSON
Pre-fix: a POST /ingest with NO body (no Content-Length, no body bytes
— common shape for misconfigured webhooks and the smoke-test repro)
leaves `req.body === undefined`. The body-coercion block's `else`
branch then called `Buffer.from(JSON.stringify(undefined), 'utf8')` —
and `JSON.stringify(undefined) === undefined` (the literal, not the
string), so `Buffer.from(undefined, 'utf8')` threw TypeError. The
unhandled throw reached express's default error handler, which
served an HTML 500 page. Clients expecting JSON envelopes broke.
Two changes:
1. Explicit `req.body == null` null-guard at the top of the handler
(BEFORE the body coercion). Catches both `null` and `undefined`
request bodies and returns the same 400 `empty_body` envelope as
the empty-Buffer guard at :1600. Closes the TypeError class
structurally.
2. Outer try/catch wrapping the entire handler body with a
`!res.headersSent` guard (codex F#16) so any unexpected throw —
downstream of the inner queue.add try/catch, in a logging
side-effect, or in the SSE broadcast — returns a JSON 500
envelope instead of leaking the HTML error page. Mirrors the F14
pattern around the MCP handler's transport.handleRequest at
serve-http.ts:1508-1520.
E2E regression test in test/e2e/serve-http-ingest-webhook.test.ts:
'BUG-2: POST with no body (undefined req.body) → 400 JSON envelope
(not 500 HTML)' uses `fetch(URL, {method:'POST'})` with no body field
to provoke the exact pre-fix shape. Asserts status !== 500, status
=== 400, content-type is application/json, body.error === 'empty_body'.
The existing 'empty body → 400' test at line 200 already covered the
empty-string-body case (Express raw() produces an empty Buffer; the
:1600 guard fires correctly). Both cases now reach the same envelope
via two structurally distinct paths.
Codex F#15 correction absorbed: the misleading 'body \"\"' → empty_body
test case was dropped from the plan. An empty JSON string body has
bytes and is not the same as a missing body.
Codex F#14 correction absorbed: header in the smoke-test verification
command is `Authorization:` not `Auth:` (updated in the plan; tests
already use the correct shape).
Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 2b)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… + CV12 COALESCE
Migration v81 added 4 nullable provenance columns to `pages`
(source_kind, source_uri, ingested_via, ingested_at). Pre-fix, put_page
wrote them into the FILE's frontmatter (operations.ts:637-644 write-
through path) but never to the DB columns. `gbrain call get_page | jq
.source_kind` returned null even when the JSON receipt claimed
`capture-cli`. This commit closes the loop on the write side.
Surface changes:
- src/core/types.ts: PageInput gains 4 optional provenance fields
(source_kind, source_uri, ingested_via, ingested_at) with docstring
explaining the trust model.
- src/core/import-file.ts: importFromContent opts accepts 3 (source_kind,
source_uri, ingested_via); threads them to tx.putPage. ingested_at is
NOT a caller-controllable param — engine stamps it.
- src/core/operations.ts put_page op: accepts 3 client params on the
wire schema (uniform across transports). CV6 trust gate: when
ctx.remote !== false, IGNORES client params and server-stamps
source_kind='mcp:put_page'/source_uri=null/ingested_via='mcp:put_page'.
Only ctx.remote === false (capture CLI, autopilot, dream cycle)
honors client values.
- src/core/postgres-engine.ts + pglite-engine.ts putPage: INSERT/UPDATE
SQL extended with 4 columns. ingested_at stamped to now() only when
ANY provenance field is being written this call (null otherwise).
ON CONFLICT clause uses COALESCE-preserve UPDATE (CV12) so a later
put_page without provenance does NOT erase the original first-write
audit trail — first-write-wins for routine edits.
CV6 closes the spoofing surface codex caught: a write-scope OAuth
token can no longer poison the audit trail with arbitrary labels
('source_kind: capture-cli' from an MCP agent). Anything that isn't
strictly `ctx.remote === false` falls through to server-stamped
'mcp:put_page', matching the v0.26.9 F7b fail-closed discipline.
CV12 closes the audit-trail-erasure trap: routine put_page edits (no
provenance args) preserve the original ingestion's source_kind /
ingested_at via `COALESCE(EXCLUDED.x, pages.x)`. Explicit re-ingestion
that passes new provenance overwrites (latest-ingestion-wins).
Tests in test/put-page-provenance.test.ts (11 cases, 29 assertions):
- Trusted local caller: client params populate DB columns; partial
provenance still triggers ingested_at stamp; omitting all leaves
all 4 null.
- CV6 spoofing guard: remote caller's source_kind='capture-cli' claim
becomes 'mcp:put_page'; ctx.remote === undefined treated as remote
(v0.26.9 F7b: fail-closed on anything not strictly false).
- CV12 COALESCE-preserve: second write without provenance preserves
first-write audit AND timestamp; explicit re-ingestion with new
provenance overwrites; remote second write after local first
records the most-recent honest source (mcp:put_page).
- T2 subagent regression: namespace check fires when provenance params
are present; subagent within wiki/agents/<id>/ succeeds with server-
stamped provenance per CV6; missing subagentId still fail-closes.
Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 3a)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ages Phase 3a wrote the 4 provenance columns into the DB via put_page. This phase makes them visible to the read side so the smoke-test verification command `gbrain call get_page <slug> | jq .source_kind` actually returns the value the write side just stored. Surface changes: - src/core/types.ts Page: 4 optional fields (source_kind, source_uri, ingested_via, ingested_at). Three-state read pattern matching v0.26.5's deleted_at convention — undefined when the SELECT projection didn't include the column (older callers), null when historical pre-v0.38 row, populated when the v0.38+ ingestion stamped it. - src/core/utils.ts rowToPage: maps the 4 columns to Page fields via the same conditional-spread pattern as deleted_at / effective_date. Older SELECTs without the projection continue to compile. - src/core/postgres-engine.ts + pglite-engine.ts getPage: projection extended to include the 4 columns. listPages uses `SELECT p.*` so the columns flow through automatically — no listPages SQL change. Both engines' putPage RETURNING clause already projects the 4 columns from Phase 3a, so the round-trip (write→get) is symmetric. get_page op + JSON renderer require no changes: `gbrain call get_page` goes through `runCall` (src/commands/call.ts:52) which is `console.log(JSON.stringify(result, null, 2))`. Since `result` is the Page object from get_page (which is the engine's getPage return, which is rowToPage), the new fields surface automatically. The markdown renderer (`gbrain get`) goes through serializeMarkdown which strips structured fields; that's the right behavior for the markdown view. Structured access stays on `gbrain call get_page` and `list_pages` (JSON output). Engine-parity test (T3) extended in test/e2e/engine-parity.test.ts with 2 new cases: - 'provenance columns: putPage writes + getPage returns identical shape on both engines' — seeds capture-cli provenance, asserts source_kind/source_uri/ingested_via match across engines and ingested_at is a Date on both. - 'provenance COALESCE-preserve UPDATE: parity on both engines (CV12)' — first write stamps capture-cli, second write WITHOUT provenance preserves the first-write source_kind / ingested_via on BOTH engines (CV12 first-write-wins is engine-uniform). Gated on DATABASE_URL — runs PGLite half always; Postgres half skips without it (existing engine-parity pattern). When the test fires, a drift between the two engines now fails loudly instead of waiting for a user's `gbrain migrate --to supabase` to surface the bug. Phase 3a test suite (test/put-page-provenance.test.ts) re-verified green (11 pass) after the read-path additions — no regression. Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 3b) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…te-side overhaul
Lands the seven decisions resolved in the plan-eng-review for the
capture CLI's write-side. Each addresses a specific smoke-test finding
or codex outside-voice concern. Atomic single commit because all
changes are in capture.ts and tightly coupled (CV9's trim split feeds
CV8's hash input which the tests depend on).
Decisions resolved:
CV3 — source_kind taxonomy: capture ALWAYS stamps source_kind='capture-cli'
in the receipt JSON AND threads it through to put_page's provenance
params. Pre-fix `parsed.source ?? 'capture-cli'` conflated the DB source
FK (where to write) with the ingestion-channel taxonomy (what kind of
ingestion this was). --source now ONLY maps to source_id; source_kind
is closed taxonomy per migration v81's documented set.
CV7 — thin-client --source rejection: when isThinClient(cfg) AND
parsed.source is set, exit 1 BEFORE any network call with a clear error
pointing at the right fix: `gbrain auth register-client <name>
--source <id>` on the server. Mirrors CV6 trust posture (server-side
OAuth client registration owns source scope; per-call override would
reopen the spoofing surface CV6 just closed).
CV8 (CLI side) — receipt content_hash now comes from normalized rawBody,
NOT the assembled fullContent (which contained a timestamp-bearing
frontmatter). Two captures of identical text now produce identical
content_hash, restoring the daemon's 24h LRU dedup at the CLI receipt
layer. The DB hash (importFromContent) gets the same treatment in
Phase 3d.
CV9 — trim boundary split: introduced `normalizeForHash(s)` pure
helper (trim + BOM strip + LF + NFKC). Hash input gets aggressive
normalization for dedup correctness; the STORED body (passed to
buildContent → mergeCaptureFrontmatter) preserves user bytes (CRLF,
BOM, whitespace) for round-trip fidelity. CQ2's CRLF/BOM tests
continue to pass.
CV10 — binary file guard: read --file via `readFileSync(path)` with NO
encoding (Buffer-side), then sniff first 8KB for NUL bytes via
`detectBinaryNullByte(buf)`. Mirror the same sniff for --stdin (which
also now reads Buffer-side via `readStdinBuffer()`). Real UTF-8 text
(including CJK, emoji, BOM) never contains NUL bytes; binary formats
(executables, archives, most image formats) do. Reject with friendly
error before UTF-8 decode mangles the bytes. Deterministic test
fixtures use `Buffer.from([...])` with explicit byte arrays instead
of /dev/urandom (which a 256-byte sample often had no NUL in).
CV15 — canonical source resolver: route through
`resolveSourceWithTier(engine, parsed.source, cwd)` from
src/core/source-resolver.ts (the v0.37.7.0 6-tier chain every other
CLI op uses). Honors flag → env (GBRAIN_SOURCE) → dotfile
(.gbrain-source) → local_path → brain_default → seed_default. Closes
the WARN-3-adjacent UX divergence where capture silently used
`parsed.source ?? 'default'`, ignoring env / dotfile / local_path /
brain_default tiers. Bonus: resolveSourceWithTier's assertSourceExists
throws a friendly error if the source is missing, BEFORE put_page is
called — making capture-level pre-flight redundant for the common
case (A2 fallback below handles TOCTOU + thin-client edge cases).
A2 — friendly FK error rewrite: new `maybeRewriteSourceFkError(err,
sourceId)` helper detects the Postgres FK-violation patterns
('pages_source_id_fk' OR 'foreign key constraint ... source') and
returns a paste-ready hint: `source '<id>' is not registered.
Register it first: gbrain sources add <id> --path <path>`. Applied
in BOTH the local-engine catch block AND the thin-client (callRemoteTool)
catch block per T1 — so the same friendly error surfaces regardless of
install type. Defense-in-depth alongside CV15's upstream check (covers
TOCTOU race + thin-client implicit source from dotfile pointing at a
server-deleted source).
WARN-1 receipt-side dedup, WARN-3 friendly source error, WARN-7 binary
guard, and the source_kind taxonomy fix all become user-visible via
this commit. The DB-side dedup (WARN-1's daemon side) lands in Phase 3d.
HELP text updated:
- Documents the 6-tier source resolution chain
- Notes thin-client --source restriction
- Notes binary content rejection
- Notes dedup behavior (whitespace + line endings normalized)
- Notes source_kind != source_id distinction
Tests in test/capture-runcapture.test.ts (26 cases, 30 assertions):
- CV10 detectBinaryNullByte: 10 cases incl. ASCII, CJK, emoji, BOM,
start/mid NUL, PNG magic, 8KB cap boundary, empty
- CV9 normalizeForHash: 6 cases incl. whitespace, BOM, CRLF, NFKC,
no-op on clean, whitespace-only → empty
- CV8 hash stability: 4 cases proving identical input → identical
hash regardless of whitespace / line endings / timing
- A2 maybeRewriteSourceFkError: 6 cases incl. raw PG message, wrapped
OperationError, postgres.js-wrapped, unrelated errors, missing
sourceId, non-Error throws
Phase 3a + Phase 2a tests re-verified green (30 pass across both
files) — no regression in the surfaces this commit didn't directly
touch.
Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 3c)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_hash Pre-fix: every gbrain capture invocation produced a fresh DB content_hash because parsed.frontmatter included captured_at (and now ingested_at) which change per call. Two consequences: - existing.content_hash === hash short-circuit never fired for identical body captures → every capture re-chunked and re-embedded unchanged content, burning embedding API spend - daemon-side 24h LRU dedup (separate consumer keyed on the same hash) silently never matched, defeating its design intent Phase 3c shipped the CLI-side fix (receipt hash from normalized rawBody). This commit shipps the DB-side fix. Approach: strip timestamp-bearing frontmatter keys before hashing, NOT strip all frontmatter. Whitelist: ['captured_at', 'ingested_at']. Future timestamp keys add to the list — small, stable surface. Why not strip ALL frontmatter? Sync would regress: a user editing a markdown file to add a tag changes the frontmatter without changing the body. The pre-fix hash captures that change; tag reconciliation fires. If we stripped all frontmatter, the hash wouldn't change, the short-circuit would fire, and the tag-add would silently no-op. The narrow whitelist preserves frontmatter-change-detection for real edits (tags, type, slug, description, ...) while ignoring the ephemeral timestamp keys that capture-cli and provenance-write-through stamp per-call. Tests in test/import-file.test.ts (4 new cases in 'CV8 DB content_hash stability' describe block): - captured_at differences → IDENTICAL hash → second capture status 'skipped' (short-circuit fires); putPage NOT called the second time - body change → DIFFERENT hash → second capture status 'imported' (real edits still flow through) - tag change → DIFFERENT hash → re-import fires (REGRESSION GUARD: proves frontmatter-change detection survives the strip) - ingested_at differences → IDENTICAL hash (provenance-only refresh doesn't invalidate the chunk cache) Combined with Phase 3c's CLI-side hash fix, WARN-1 (dedup not actually deduplicating) is now fully resolved: both the user-visible receipt hash AND the DB / daemon hash stabilize across identical-content captures. Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 3d) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sts BRAIN
WARN-5: `gbrain capture --help` printed only the generic short-circuit
fallback ('Usage: gbrain capture\\n\\ngbrain capture - run gbrain --help
for the full command list.'). Root cause: `capture` was missing from
CLI_ONLY_SELF_HELP at src/cli.ts:34-53. The detailed HELP constant at
src/commands/capture.ts:90+ existed but was unreachable because the
dispatcher's printCliOnlyHelp short-circuit at :101 fired first.
WARN-6: `gbrain --help` did not mention capture / brainstorm / lsd
anywhere. New v0.37/v0.38 commands were implemented and dispatched but
absent from the hardcoded printHelp text.
Two fixes in cli.ts:
1. Add 'capture' to CLI_ONLY_SELF_HELP. This skips the generic
short-circuit, allowing the dispatch flow to reach runCapture which
has its own --help branch printing the detailed HELP constant.
2. Add a pre-engine-bind '--help' short-circuit for capture in
handleCliOnly (mirrors the existing sync + reinit-pglite pattern).
Without this, `gbrain capture --help` on a fresh tmpdir with no
config would hit the engine bind at :1077 and exit with 'Cannot
connect to database' before runCapture's --help branch fires.
3. Add BRAIN section to printHelp text between TOOLS and SOURCES.
Documents capture / brainstorm / lsd with their key flags, matching
the tone of the existing grouped sections.
Tests in test/cli-help-discoverability.test.ts (6 cases, 31 assertions):
- WARN-5: capture --help contains every documented flag (--slug,
--type, --file, --stdin, --source, --quiet, --json)
- WARN-5: output is NOT the generic short-circuit fallback (presence
of 'Examples:' + length > 10 lines + does not match the bare-
short-circuit regex)
- WARN-5: -h short flag works too
- WARN-6: main --help mentions all 3 commands as command-line entries
- WARN-6: BRAIN section heading is present and the 3 commands appear
textually after it
- regression: existing top-level commands (init, doctor, get, search,
query, import, export, files, embed) still listed (snapshot guard
against accidental deletion of other groups during the BRAIN insertion)
Tests use spawnSync subprocess execution so the real dispatcher flow
is exercised end-to-end (no mocking of cli.ts internals).
Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 4a)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pre-fix at serve-http.ts:1091: the /admin/api/register-client route
destructured `scopes` from req.body and passed `scopes || 'read'` to
`oauthProvider.registerClientManual(name, grants, scopes, uris)`.
Three bugs in that single line:
1. Field-name mismatch: OAuth wire format uses `scope` (singular).
The destructure looked for `scopes` (plural). A request body sending
`{"scope": "read write"}` had `scopes === undefined`, fell through to
the `'read'` default, and silently created a read-only client. This
is the exact behavior the smoke test reported.
2. Array shapes crashed: registerClientManual's parseScopeString calls
`.split(' ')` on its argument. Arrays don't have `.split`, so
`{"scopes": ["read", "write"]}` threw TypeError mid-request,
surfacing as a 500.
3. Empty-array truthy: `[]` is truthy in JS so `scopes || 'read'`
returned the empty array, also crashing on split.
Codex outside-voice (CV12) also flagged: validation depth is
insufficient. `["read write"]` (a single-element array where the
element contains a space) silently passes the type check but produces
an unknown scope `"read write"` that registers as garbage. Same for
non-string elements (null, numbers), empty strings, and duplicates.
Fix: new `normalizeScopesInput(raw: unknown)` helper in src/core/scope.ts
handles all four valid input shapes and rejects everything else with
a typed error. The admin route accepts BOTH `scopes` (admin SPA) AND
`scope` (OAuth wire), normalizes, and surfaces a 400 invalid_scopes
on validation failure.
Validation matrix:
- undefined / null / missing → 'read' default
- string → split on /\s+/, dedupe, validate each element against
ALLOWED_SCOPES allowlist, re-join sorted
- string[] → reject non-string elements, reject empty strings, reject
internal-whitespace (catches ['read write'] bug shape), dedupe,
validate each element, re-join sorted
- everything else (number, boolean, plain object) → Error
- empty array / whitespace-only string after split → Error
- unknown scope name → InvalidScopeError ('Unknown scope "X". Allowed: ...')
Output is sorted for determinism so two registrations with the same
scope set produce identical DB rows.
Tests in test/scope-normalize.test.ts (28 cases, 30 assertions):
- Happy paths: 12 cases incl. all 4 shapes, dedupe in both directions,
whitespace tolerance (tab/newline), every hierarchy scope
- Rejection paths: 14 cases incl. number/object/boolean inputs, empty
array, empty string, non-string array elements, empty-string element,
whitespace-in-element (the codex bug), unknown scope name in both
string and array forms, mix of known+unknown
- Determinism: 2 cases proving sorted output is order-independent
Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 4b)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ith diagnostic Pre-fix: every gbrain capture invocation logged '[facts:absorb] failed to log gateway_error for inbox/...: No database connection: connect() has not been called. Fix: Run gbrain init...'. Non-fatal — the page write itself succeeded — but loud per-capture noise that the smoke test rightly flagged as a user-visible bug. Root cause: the facts subsystem grabs a separate engine handle that isn't connected on the CLI capture path. The actual write succeeds via the connected engine that capture uses; the facts:absorb log is a courtesy that the doctor health check reads. Codex flagged this as symptom-treatment vs root-cause-fix (CV13). Plan picked the middle path: suppress the per-capture noise NOW, instrument first-occurrence with a stack trace so the v0.38.4 fix knows where to look. Two changes: CQ1 — typed access via instanceof + .problem field on GBrainError (NOT string-match on .message). The class GBrainError already has structured (problem, cause_description, fix) fields per src/core/types.ts:1104. Matching the structured field is impossible to silently break: if someone edits the error wording in db.ts thinking it's cosmetic, the typed access still routes correctly. String-match on .message would be the same fragility class we just closed elsewhere in this wave (capture's FK error rewrite uses both patterns deliberately because raw PG messages don't go through GBrainError). CV13 — first-occurrence diagnostic: module-scoped _hasLoggedDisconnectedFactsAbsorb flag fires ONE stderr warn with the full stack trace the first time this class occurs in a process. Subsequent occurrences are silent. The next user reporting the warning gives us the call site without an extra round-trip. Other failure classes (GBrainError with a different .problem field, plain Error, anything else) keep the loud per-call warn so real subsystem errors (PgBouncer crash, schema drift, etc.) still surface. The suppression is narrowly scoped to the known-broken-but-non-fatal WARN-4 class only. Test seam: `_resetFactsAbsorbDisconnectedFlagForTests()` exported so each test can assert from a clean slate. Tests in test/facts-absorb-log.test.ts (4 new cases in 'WARN-4 disconnected-engine suppression' describe block): - First occurrence prints ONE warn with 'WARN-4' + 'First-occurrence trace' substring; subsequent 3 calls are silent - GBrainError with a DIFFERENT problem field still warns loudly (the suppression is narrow, not class-wide) - Plain Error (non-GBrainError) still warns loudly - The engine.logIngest call STILL fires for the suppressed case (the suppression is on the WARN output, not the attempt — preserves the doctor health check's read path if/when the wiring is fixed in v0.38.4) v0.38.4 follow-up TODO filed per the plan: trace why the facts pipeline opens a separate engine handle on the CLI capture path, and either share the connected engine OR no-op the absorb-log when called from a CLI context. The diagnostic stack trace this commit prints is the input that fix needs. Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 4c) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ucturedAgentError
Pre-fix: brainstorm + lsd silently produced no output on PgBouncer
transaction-mode environments. Postgres statement_timeout fired
canceling listPrefixSampledPages or hybrid search; the unhandled
error reached main()'s catch-all and surfaced as a generic
'gbrain: unknown error' line — after the user had already waited
through the 10-second cost-preview window. Zero ideas, no diagnostic,
no hint about what to do.
Three changes per the resolved decisions:
CV11 + T4 — orchestrator-level entry-point wrap (NOT per-call
whack-a-mole). The public runBrainstorm becomes a thin wrapper that
delegates to runBrainstormImpl inside try/catch; the catch runs
classifyBrainstormError on the thrown value. Adding a new internal
SQL call to runBrainstormImpl is automatically covered — codex F#20's
'scope too narrow' concern resolves structurally.
A3 — reuse StructuredAgentError (the v0.19.0 envelope every new
agent-facing surface uses) with code='brainstorm_timeout'. No new
BrainstormError class; matches CLAUDE.md's stated convention. Future
typed errors in brainstorm follow the same pattern.
T4 + codex F#19 — classifier matches SQLSTATE 57014 specifically (the
spec-defined query_canceled code) via postgres.js .code, alternate
.sqlState, or message-substring fallback. Hint wording reads 'query
canceled' (generic) covering all three PG cancel sub-causes:
statement_timeout (often PgBouncer transaction-mode), lock_timeout,
user-cancel. Honest under each.
CV11 CLI formatter — runBrainstormCli (used by both gbrain brainstorm
AND gbrain lsd) catches StructuredAgentError before main()'s catch-all
sees it. Prints in the cli.ts:188-191 OperationError-block shape:
'Error [<code>]: <message>' then ' Hint: <hint>'. JSON mode emits
the structured envelope (matches serializeError shape). Non-typed
errors fall through to the dispatcher's existing catch — natural
shape preserved (codex F#20 — no broad swallowing).
Files:
- src/core/brainstorm/error-classify.ts (new): isQueryCanceledError +
classifyBrainstormError pure helpers. Module isolated from the
orchestrator so the classifier can be unit-tested without spinning
up the full brainstorm pipeline.
- src/core/brainstorm/orchestrator.ts: imports the helpers; public
runBrainstorm becomes a try/catch wrapper around the unchanged
runBrainstormImpl. ~30 LOC change with zero body edits below.
- src/commands/brainstorm.ts: catches StructuredAgentError before the
generic main() handler. Imports StructuredAgentError from
'../core/errors.ts'.
Tests in test/brainstorm-timeout.test.ts (14 cases, 43 assertions):
- isQueryCanceledError: 8 cases covering postgres.js {code}, alternate
{sqlState}, message-substring fallbacks for all 3 cancel sub-causes,
case-insensitive SQLSTATE match, negative cases (different codes,
non-DB errors, null/undefined/non-object inputs)
- classifyBrainstormError: 5 cases pinning the StructuredAgentError
envelope (class, code, message), hint covers all 3 PG sub-causes
(codex F#19 honesty contract), non-57014 errors pass through with
SAME REFERENCE (codex F#20 — no clone, no swallow), null/undefined
pass through, classified Error.message channel is descriptive
- Source-shape regression guards: orchestrator.ts imports the helpers
AND wraps runBrainstormImpl in try/catch at the public entry point
(NOT per-call); commands/brainstorm.ts has the CLI formatter
recognizing StructuredAgentError with 'Error [' + 'Hint:' shape
WARN-10 root cause (PgBouncer-friendly SQL shape for listPrefixSampledPages)
deferred per the plan's out-of-scope rationale — this commit adds the
diagnostic surfacing so users know what hit them instead of silent
no-output. TODOS.md follow-up filed in Phase 6.
Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 5)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s regen VERSION 0.38.2.0 → 0.38.3.0; package.json mirror. CHANGELOG: ELI10-lead-first entry per CLAUDE.md voice rules. "Things you can now do" section covers all 12 user-visible fixes (BUG-1 frontmatter merge, BUG-2 /ingest empty body, WARN-1 dedup, WARN-3 friendly source error, WARN-5 capture --help, WARN-6 main --help, WARN-7 binary guard, WARN-8 provenance write+read, WARN-9 admin scopes, WARN-10 brainstorm timeout surfacing, WARN-2 type-overwrite documentation, WARN-4 facts:absorb suppression). "What you'd see in a concrete example" block shows real terminal output. "Things to watch" covers CV7 thin-client --source rejection, CV12 COALESCE- preserve UPDATE semantics, CV3 closed source_kind taxonomy, brainstorm diagnostic-only deferral, facts:absorb first-occurrence diagnostic. "To take advantage" verification block with paste-ready commands. "For contributors" credits @garrytan-agents for PR #1299 and links the plan + decision trace. TODOS.md: 7 new v0.39 follow-ups filed at the top (SQL-shape rewrite of listPrefixSampledPages for PgBouncer, magic-byte allowlist, facts:absorb root-cause trace, --source-kind override flag, ingest_capture Minion handler architecture migration, provenance-history table, ingest webhook provenance pass-through). CLAUDE.md: single consolidated v0.38.3.0 entry under "Key files" naming every touched file + every decision (D1, A1-A3, CQ1-CQ2, T1-T4, CV3, CV5-CV15) with their resolution. Future maintainers see the full surface from one paragraph instead of grepping the diff. llms.txt + llms-full.txt: regenerated via `bun run build:llms` per CLAUDE.md's mandatory chaser ('every CLAUDE.md edit needs a build:llms chaser or test/build-llms.test.ts fails in CI'). 7 cases pass post-regen. Verify gate passes: typecheck clean + all 8 shell pre-checks (privacy, jsonb, progress, wasm, admin-scope-drift, cli-executable, system-of- record, eval-glossary-fresh, synthetic-corpus-privacy, skill-brain- first, fuzz-purity). Trio audit: VERSION: 0.38.3.0 package.json: 0.38.3.0 CHANGELOG: ## [0.38.3.0] - 2026-05-22 Plan: ~/.claude/plans/system-instruction-you-are-working-async-popcorn.md (Phase 6) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oductionize # Conflicts: # CHANGELOG.md # TODOS.md # VERSION # package.json # src/commands/brainstorm.ts # src/core/brainstorm/orchestrator.ts # src/core/import-file.ts # src/core/operations.ts
…anges
Two CI failures from the post-merge state, both pre-existing tests
pinning implementation detail that v0.39.3.0's fixes legitimately
changed. Repair the tests, not the production behavior.
1) test/commands/capture.test.ts — 2 cases ('uses first non-empty
line as the title' + 'caps title at 80 chars') were checking the
YAML literal `title: "..."` (double-quoted). v0.39.3.0 Phase 2a
(BUG-1 frontmatter merge) replaced the hand-rolled `JSON.stringify`-
quoting with `matter.stringify()`, which follows YAML defaults:
simple strings emit unquoted (`title: Real first line`), special-
char strings get single-quoted. The semantic ("title equals X")
is correct; the literal-quoting check was incidental. Parse the
YAML and assert on the value via gray-matter.
2) test/fix-wave-structural.test.ts — the v0.36.1.x #1077 PKCE-
public-clients regex pinned the exact destructure `const { name,
scopes, tokenTtl, ... } = req.body`. v0.39.3.0 Phase 4b (WARN-9
admin scopes normalization) moved `scopes` to a separate read
line so the route can accept BOTH `scopes` (admin SPA) AND `scope`
(OAuth wire singular) via `?? `. Relax the destructure regex to
accept either layout AND add a NEW regex pinning the
`scopes ?? scope` fallback so the actual v0.39.3.0 contract is
load-bearing. PKCE-fix assertions (tokenEndpointAuthMethod ===
'none' + client_secret_hash = NULL + token_endpoint_auth_method =
'none') unchanged.
Both fixes are tests-only — no production code change.
Verify gate clean post-fix; 30/30 cases pass in the two affected
test files.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The merge resolution bumped VERSION + package.json + CHANGELOG header to v0.39.3.0 (per user direction; higher than master's existing v0.39.0.0 + v0.39.1.0 commit subjects). But the wave's source code comments + test file headers + the smoke-test report's Editor's Note + the CLAUDE.md extension entry all still carried the v0.38.3.0 internal label. Sed-pass across 25 files (24 in-tree + the smoke-test report Editor's Note): - 13 src/ files: capture.ts, cli.ts, serve-http.ts, operations.ts, import-file.ts, types.ts, utils.ts, scope.ts, postgres-engine.ts, pglite-engine.ts, facts/absorb-log.ts, brainstorm/orchestrator.ts, brainstorm/error-classify.ts - 10 test files: capture-build-content, capture-runcapture, put-page-provenance, scope-normalize, cli-help-discoverability, brainstorm-timeout, facts-absorb-log, import-file, e2e/engine-parity, e2e/serve-http-ingest-webhook - docs/v0.38-smoke-test-report.md (Editor's Note only — the filename + the report body's references to v0.38.0.0 stay since they identify the historical subject) - CLAUDE.md (extension entry tag) Comments-only change; no production behavior shift. Existing tests continue to pass (175 cases across 10 wave-specific test files). llms.txt + llms-full.txt regenerated to keep the CLAUDE.md update in sync (per CLAUDE.md's mandatory build:llms chaser). Trio audit re-confirmed: VERSION: 0.39.3.0 package.json: 0.39.3.0 CHANGELOG: ## [0.39.3.0] - 2026-05-22 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…oductionize # Conflicts: # CHANGELOG.md # VERSION # package.json
mgunnin
added a commit
to mgunnin/gbrain
that referenced
this pull request
May 28, 2026
* upstream/master: (22 commits) v0.41.4.0 wave: local providers + cross-platform stdin + gateway-routed dream judge (6 community PRs) (garrytan#1377) v0.41.3.0 fix(security/mcp): OAuth CORS lockdown + pre-register without DCR + validator surface (garrytan#1403) v0.41.2.0 feat: lens packs + epistemology unification — atoms + concepts as first-class units, calibration profile widening, gstack-learnings bridge (garrytan#1364) v0.41.1.0 feat: eval-loop wave — gbrain bench publish + gbrain eval gate close the LOOP (garrytan#1352) v0.41.0.0 feat(minions): fleet you supervise (4 field bugs + cathedral) (garrytan#1367) v0.40.10.0 feat: content sanity defense — junk-pattern throw + oversize-skip-embed (garrytan#1351) v0.40.9.0 feat(chunker): .sql indexing via tree-sitter + code-def on SQL DDL (garrytan#1173) (garrytan#1350) v0.40.8.1 docs: README rewrite + personal-brain + company-brain tutorials (garrytan#1345) v0.40.8.0 test: e2e + unit gap coverage + master flake root-cause fixes (garrytan#1313) v0.40.6.1 docs(todos): file v0.41 wave commitments + 7 verified-missing items (garrytan#1333) v0.40.7.0 Schema Cathedral v3 — agent-on-ramp + production rebuild of PR garrytan#1321 (garrytan#1327) v0.40.6.0 feat(sync): parallel sync --all + per-source lock invariant + sources status dashboard (productionized from PR garrytan#1314) (garrytan#1324) v0.40.5.0 Federated Sync v2 — parallel source sync + push triggers + per-source health (garrytan#1322) v0.40.4.0 feat(search): selective graph signals + per-stage attribution + audit-writer unification (garrytan#1300) v0.40.3.0 feat: contextual retrieval + cache invalidation gate + 4 deferred-item closures (garrytan#1323) v0.40.2.0 feat: trajectory routing for temporal + knowledge_update (gbrain think + LongMemEval) (garrytan#1296) v0.40.1.0 Track D — eval infrastructure (catch retrieval regressions, prove answer-quality wins) (garrytan#1298) v0.40.0.0 feat: agent-voice (Mars + Venus) + copy-into-host-repo skillpack paradigm (garrytan#1128) v0.39.3.0: productionize the v0.38 ingestion cathedral (smoke-test fix wave from PR garrytan#1299) (garrytan#1308) v0.39.2.0 feat(autopilot): per-source fan-out + cycle lock primitive + phase taxonomy (garrytan#1295) ...
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Productionizes the v0.38.0.0 ingestion cathedral release after a real production smoke test against Supabase+PgBouncer. Supersedes PR #1299 (docs-only smoke-test report from @garrytan-agents) by landing the report verbatim alongside the actual fixes for all 2 bugs + 10 warnings + 6 suggestions in one wave.
~/.claude/plans/system-instruction-you-are-working-async-popcorn.mdCloses / Supersedes
Test plan
bun run verify(typecheck + 8 shell pre-checks)bun test test/capture-build-content.test.ts(19 cases)bun test test/capture-runcapture.test.ts(26 cases)bun test test/put-page-provenance.test.ts(11 cases)bun test test/scope-normalize.test.ts(28 cases)bun test test/cli-help-discoverability.test.ts(6 cases)bun test test/brainstorm-timeout.test.ts(14 cases)bun test test/facts-absorb-log.test.ts(16 cases incl. 4 new)bun test test/import-file.test.ts(25 cases incl. 4 new)bun run test:e2e(engine-parity provenance cases + ingest empty-body E2E) — requires DATABASE_URL setup; run pre-merge🤖 Generated with Claude Code