v0.38.1.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter by garrytan · Pull Request #1289 · garrytan/gbrain

garrytan · 2026-05-22T02:22:35Z

Summary

v0.38.0.0 — Agents+Minions cathedral wave (4 atomic slices, plan-ceo-review + plan-eng-review + 2× codex cleared, plus a flake-cleanup follow-up).

Provider-agnostic subagent loop. Kills the Anthropic pin (3 layers: queue gate, runtime fallback, doctor check). The replay key moves to gbrain-owned (uuid v7 + per-turn ordinal, persisted at first observation in subagent_tool_executions). OpenAI, Google Gemini, OpenRouter, openai-compatible servers (Ollama, LiteLLM, vLLM, llama-server) all work. Behind agent.use_gateway_loop flag (default off in this patch — dogfood, then flip).

Remote MCP dispatch via submit_agent. New op, new agent OAuth scope (sibling to admin, NOT implied — existing admin clients must re-register to opt in). Per-dispatch binding enforcement: bound_tools, bound_source_id (FK sources), bound_brain_id, bound_slug_prefixes, bound_max_concurrent, budget_usd_per_day.

Reserve-then-settle budget meter via pg_advisory_xact_lock (mirror of rate-leases.ts). Two concurrent agents from the same client can no longer both pre-flight at the cap boundary and bust it. mcp_spend_reservations table for in-flight reservations with TTL; sweep on every reserve.

JSONL audit trail at ~/.gbrain/audit/agent-jobs-YYYY-Www.jsonl per submission. Prompt text never logged — only byte count.

Bonus: 12 pre-existing test flakes eliminated. Quarantined 4 cross-file-contended hybrid-search files (.serial.test.ts rename — established R2 quarantine pattern). Root cause: shared module-level state in src/core/ai/gateway.ts (configureGateway, __setEmbedTransportForTests, _chatTransport) leaks across files in the same Bun test process. Also wrapped test/minions/agent-audit.test.ts through withEnv() (R1 lint). Net delta: 12 fails → 0 fails on the full unit suite.

Test Coverage

| Surface | Cases | File |
|---|---|---|
| capabilities classifier | 12 | `test/ai/capabilities.test.ts` |
| gateway.toolLoop control flow | 7 | `test/ai/gateway-tool-loop.test.ts` |
| budget-meter reserve/settle/sweep | 15 | `test/minions/budget-meter.test.ts` |
| agent-audit JSONL | 7 | `test/minions/agent-audit.test.ts` |
| Layer 1/2/3 flips (agent-cli) | 4 updated | `test/agent-cli.test.ts` |

41 new unit cases; schema-bootstrap-coverage + scope + oauth + model-config tests updated for v0.38 semantics.

Migrations

| Version | What |
|---|---|
| v81 | `subagent_tool_executions.ordinal` + `.gbrain_tool_use_id` + `UNIQUE(job_id, message_idx, ordinal)` |
| v82 | `mcp_spend_reservations` table |
| v83 | `oauth_clients.budget_usd_per_day NUMERIC(10,2) NULL` |
| v84 | `oauth_clients.bound_tools` / `.bound_source_id` (FK sources) / `.bound_brain_id` / `.bound_slug_prefixes` / `.bound_max_concurrent` |

All idempotent (DROP-IF-EXISTS + ADD pattern on PGLite). schema-bootstrap-coverage.test.ts passes — both engines covered.

Pre-Landing Review

Walked /plan-ceo-review (3 scope options + 3 sub-decisions locked, Option B), /plan-eng-review (7 issues across architecture/code-quality/tests/perf — D3-D9 locked), and 2× codex outside voice (D11-D13 absorbed; round 2 caught a load-bearing blocker: Slice 1 stable-ID design needed v81 migration that wasn't in the plan; fixed before any code landed). 13 decisions locked, 0 unresolved.

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Test plan

- Typecheck clean (0 errors)
- Unit suite: full run exit 0; 7/8 parallel shards pass with zero (fail) markers; serial pass 29/29 clean (shard 4 wedge on migrate.test.ts is a separate slow-test scoping concern, not a regression)
- scripts/check-test-isolation.sh: 526 non-serial unit files scanned, 0 violations
- Pre-flight gates: check:privacy / check:jsonb / check:progress / check:wasm all clean
- E2E suite: 93/99 files pass (673/684 tests); 11 failures all pre-existing on master (cycle-consolidate / dream-synthesize-chunking / engine-parity / multimodal-postgres / phantom-redirect / voyage-multimodal, verified by stash-and-rerun)
- Real-Postgres bootstrap: schema migrated through v84, RLS on 49/49 tables, pgvector loaded
- All 9 v0.38 commits bisect-friendly with atomic scope (8 v0.38 + 1 flake-cleanup)

To use after upgrade

Provider-agnostic loop (opt-in this patch):

gbrain config set agent.use_gateway_loop true
gbrain config set models.tier.subagent openai:gpt-5.2
gbrain agent run "research acme corp" --tools search,query --follow

Remote MCP client registration with full binding:

gbrain auth register-client cursor-agent \
  --scopes read,agent \
  --bound-tools search,get_page,put_page \
  --bound-source default \
  --bound-slug-prefixes wiki/ \
  --bound-max-concurrent 3 \
  --budget-usd-per-day 5.00

🤖 Generated with Claude Code

… module

Adds the storage substrate for the gateway-native subagent tool loop:

- migration v81 adds subagent_tool_executions.ordinal + .gbrain_tool_use_id + UNIQUE(job_id, message_idx, ordinal). NULL-tolerant so legacy rows survive untouched; the v0.38 read-time D5 shim recomputes the stable key for pre-v81 rows from (job_id, message_idx, content_blocks index, tool_name) without a data migration. Engine-aware via sqlFor.pglite.
- src/core/ai/capabilities.ts reads ChatTouchpoint fields from each recipe and exposes getProviderCapabilities() + classifyCapabilities() with a 5-state verdict (ok / degraded:no_caching / degraded:no_parallel / unusable:no_tools / unknown). This is what enforceSubagentCapable (D7, S1.8) will gate on once the queue.ts pin removal (S1.7) lands.
- 12 unit cases in test/ai/capabilities.test.ts pin the verdict matrix across Anthropic, OpenAI, Google, voyage (no chat → unknown), unknown provider, missing-colon malformed input.

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md
Wave: v0.38 (Agents+Minions cathedral; CEO + Eng + 2x Codex cleared).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
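The five-state verdict can be pictured with a small classifier. This is only a sketch under stated assumptions: the registry below, its provider names, the `ChatCapabilities` field names, and the precedence between the two degraded states are all hypothetical, and the real logic lives in src/core/ai/capabilities.ts and reads recipe data.

```typescript
// Illustrative only: registry contents and field names are made up.
type Verdict =
  | "ok"
  | "degraded:no_caching"
  | "degraded:no_parallel"
  | "unusable:no_tools"
  | "unknown";

interface ChatCapabilities {
  tools: boolean;
  caching: boolean;
  parallelTools: boolean;
}

// null marks a provider that is registered but has no chat touchpoint.
const SKETCH_REGISTRY = new Map<string, ChatCapabilities | null>([
  ["full", { tools: true, caching: true, parallelTools: true }],
  ["nocache", { tools: true, caching: false, parallelTools: true }],
  ["noparallel", { tools: true, caching: true, parallelTools: false }],
  ["notools", { tools: false, caching: false, parallelTools: false }],
  ["embedonly", null],
]);

function classifySketch(modelId: string): Verdict {
  const colon = modelId.indexOf(":");
  if (colon <= 0) return "unknown"; // malformed: no "provider:" prefix
  const caps = SKETCH_REGISTRY.get(modelId.slice(0, colon));
  if (!caps) return "unknown"; // unregistered provider, or no chat support
  if (!caps.tools) return "unusable:no_tools";
  if (!caps.caching) return "degraded:no_caching"; // precedence is an assumption
  if (!caps.parallelTools) return "degraded:no_parallel";
  return "ok";
}

// Gate rule as described in this PR: refuse only unusable or unknown verdicts.
function isDispatchable(verdict: Verdict): boolean {
  return verdict !== "unusable:no_tools" && verdict !== "unknown";
}
```

Keeping the verdict as a string union rather than a boolean lets a gate accept degraded providers while still surfacing why they are degraded.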

…op control

Adds `gateway.toolLoop(opts)` as the provider-neutral loop wrapper over the already-provider-neutral `gateway.chat()`. The Vercel AI SDK abstraction does all the per-provider tool-def normalization, tool-call parsing, and tool-result framing; this helper just sequences the assistant→tool-dispatch→tool-result cycle with:

- D11 stable-ID callbacks (onToolCallStart returns the gbrain-owned UUID v7 that the caller persists at first observation; reread on replay)
- Write-ordering invariant (persist assistant → persist pending tool row → execute side effect → settle complete/failed)
- Crash-replay reconciliation via `replayState.priorTools` keyed by gbrainToolUseId (NOT provider IDs)
- Capability-driven cache_control (Anthropic only, via cacheSystem flag)
- Stop-reason mapping for refusal / content_filter / max_turns / aborted

The loop is stateless beyond the optional replay state — testable via the existing `__setChatTransportForTests` seam without any DB. This is the substrate Slice 1's `subagent.ts` rewire (S1.5) consumes.

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
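The write-ordering invariant described in this commit can be sketched as a minimal loop. This is not the `gateway.toolLoop` implementation: it is synchronous for brevity, the `chat` stub stands in for a provider call, it omits replay and the refusal/aborted stop reasons, and every name other than the four callback names is a hypothetical chosen for illustration (`onToolCallFailed` is also an assumed name).

```typescript
interface ToolCall { name: string; input: unknown }
interface AssistantTurn { text: string; toolCalls: ToolCall[] }

interface LoopHooks {
  onAssistantTurn(turn: AssistantTurn): void;
  onToolCallStart(call: ToolCall): string; // returns the stable, caller-owned ID
  onToolCallComplete(id: string, result: unknown): void;
  onToolCallFailed(id: string, error: unknown): void; // assumed name
}

type StopReason = "end" | "max_turns";

function runToolLoopSketch(
  chat: (history: unknown[]) => AssistantTurn,
  tools: Map<string, (input: unknown) => unknown>,
  hooks: LoopHooks,
  maxTurns: number,
): StopReason {
  const history: unknown[] = [];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = chat(history);
    hooks.onAssistantTurn(reply); // 1. persist the assistant message
    history.push(reply);
    if (reply.toolCalls.length === 0) return "end";

    for (const call of reply.toolCalls) {
      const id = hooks.onToolCallStart(call); // 2. persist the pending tool row
      try {
        const tool = tools.get(call.name);
        if (!tool) throw new Error(`unknown tool: ${call.name}`);
        const result = tool(call.input); // 3. execute the side effect
        hooks.onToolCallComplete(id, result); // 4a. settle as complete
        history.push({ toolCallId: id, result });
      } catch (error) {
        hooks.onToolCallFailed(id, error); // 4b. settle as failed
        history.push({ toolCallId: id, error: String(error) });
      }
    }
  }
  return "max_turns";
}
```

Persisting the pending row before the side effect is what makes a crash between steps 2 and 4 detectable on replay: a row that is still pending marks a tool call whose outcome is unknown.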

…ateway.toolLoop

Closes the three-layer Anthropic-only enforcement (queue gate / model-config runtime fallback / doctor check) with a capability-based gate driven by the recipe registry. Any provider that supports native tool calling can now run the subagent loop.

Three layers reworked:

- queue.ts:87-106 (S1.7) — drop isAnthropicProvider hard-reject. Replace with classifyCapabilities() check: refuse only when verdict is 'unusable:no_tools' or 'unknown'. Degraded providers (no caching, no parallel tools) pass through; the gateway prints once-per-(source, model) cost warnings at first dispatch.
- model-config.ts:205 (S1.8) — rename enforceSubagentAnthropic → enforceSubagentCapable. Keeps the once-per-(source, model) warn seam from v0.31.12 and inherits the same suppression Set so doctor + first-call surfaces stay in sync. Legacy name kept as a thin wrapper for external callers.
- doctor.ts:1189 (S1.9) — rename subagent_provider check → subagent_capability. The check now surfaces three states: 'unusable', 'unknown', and 'degraded:no_caching' (the cost-regression warn). Paste-ready fix hints point at `gbrain config set models.tier.subagent`.

Subagent handler routing (S1.5 + S1.10):

- New `agent.use_gateway_loop` config flag (default off). When enabled, the handler routes through gateway.toolLoop() — provider-agnostic via the Vercel AI SDK. When disabled, the legacy Anthropic-direct path stays unchanged.
- Handler-entry capability check refuses tool-unsupported / unknown providers loudly. With flag OFF + non-Anthropic model, refuses with a paste-ready hint.
- runSubagentViaGateway() (new helper) bridges the existing ToolDef registry to gateway's ChatToolDef + ToolHandler shapes. Persists to the v0.38 stable-ID columns (ordinal + gbrain_tool_use_id) at first observation; settles complete/failed on tool exit.
- D5 read-time shim (S1.6) — loadPriorToolsV2 + adaptContentBlocksToChatBlocks handle v1 Anthropic-shaped legacy rows alongside v2 gateway-shaped writes so crash-replay reconciles across the upgrade boundary.

Tests:

- test/agent-cli.test.ts Layer 1/2/3 cases flipped from "rejects non-Anthropic" to "any tool-supporting provider accepted; refuses unknown and embedding-only providers". 4 new cases covering openai, google, unknown provider, embedding-only.
- All 27 cases pass; typecheck clean.

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…grations v82/v83

Foundation for per-OAuth-client daily budget caps. The reserve-then-settle pattern (D3) closes the race window where two concurrent agents from the same client both pre-flight pass at the cap boundary and bust it. Mirrors the rate-leases.ts shape (lock-bounded check-then-insert + TTL-based crash reclamation).

Changes:

- Migration v82 (`mcp_spend_reservations`) — UUID primary key per reservation, status enum {pending,settled,expired}, partial index on (status, expires_at) WHERE status='pending' for cheap sweeps.
- Migration v83 (`oauth_clients.budget_usd_per_day`) — first-class daily cap column on registered clients. NULL = no cap (legacy behavior for pre-v83 clients).
- `src/core/minions/budget-meter.ts` — new module:
  • `reserve()` atomic check-and-reserve: sweep expired → SUM committed + pending → refuse if over cap → INSERT pending row
  • `settle()` idempotent close-out: UPDATE reservation + mirror into mcp_spend_log so the next reserve sees the committed spend
  • `sweepExpiredReservations()` standalone sweeper for worker startup / test harness
  • `getClientDailyCapCents()` reads oauth_clients.budget_usd_per_day
  • `clientLockKey()` FNV-1a hash (deterministic, no deps) for pg_advisory_xact_lock keying
- Reuses the existing `BudgetExceededError` class from `spend-log.ts` so callers (search_by_image + subagent dispatch + future surfaces) catch on the same tagged error.

All 130 migration tests green; budget-meter module typecheck clean.

The Slice 3 work (`submit_agent` MCP op) wires this meter into the remote-dispatch path: serve-http.ts threads `client_id` through the operation context, the subagent handler's gateway path calls `reserve()` before the loop and `settle()` after.

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
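The reserve-then-settle steps and the lock-key hash can be sketched in memory. This is an illustration, not budget-meter.ts: there is no database or advisory lock here, the function names are hypothetical, what exactly the real `clientLockKey()` hashes is not stated in this PR, and the sketch returns `null` where the module is described as raising `BudgetExceededError`.

```typescript
// FNV-1a (32-bit), folded to a signed int32 so it fits an integer lock key.
function lockKeySketch(clientId: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (const byte of new TextEncoder().encode(clientId)) {
    hash ^= byte;
    hash = Math.imul(hash, 0x01000193); // FNV prime, wraps at 32 bits
  }
  return hash | 0;
}

type ReservationStatus = "pending" | "settled" | "expired";

interface Reservation {
  id: number;
  cents: number;
  expiresAt: number; // epoch milliseconds
  status: ReservationStatus;
}

function createMeterSketch(capCents: number, ttlMs: number) {
  const reservations: Reservation[] = [];
  let committedCents = 0;

  return {
    // In the real module these steps run under a per-client advisory lock,
    // so two callers cannot interleave between the check and the insert.
    reserve(cents: number, nowMs: number): number | null {
      for (const r of reservations) {
        if (r.status === "pending" && r.expiresAt <= nowMs) r.status = "expired"; // sweep
      }
      const pendingCents = reservations
        .filter((r) => r.status === "pending")
        .reduce((sum, r) => sum + r.cents, 0);
      if (committedCents + pendingCents + cents > capCents) return null; // over cap
      const id = reservations.length + 1;
      reservations.push({ id, cents, expiresAt: nowMs + ttlMs, status: "pending" });
      return id;
    },
    // Idempotent: settling the same reservation twice changes nothing.
    settle(id: number, actualCents: number): void {
      const r = reservations.find((x) => x.id === id);
      if (!r || r.status !== "pending") return;
      r.status = "settled";
      committedCents += actualCents;
    },
    committed(): number {
      return committedCents;
    },
  };
}
```

Counting pending reservations against the cap is the part that closes the race: the second of two concurrent callers sees the first caller's in-flight amount instead of only the already-committed spend.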

…nd_* migration

The remote-dispatch unlock. Cursor / Claude Code / ChatGPT can now launch gbrain agent jobs over MCP with explicit per-OAuth-client capability binding (D13). The trust boundary lives in oauth_clients.bound_* fields, not in ad-hoc protected-name checks.

Schema:

- Migration v84 (`oauth_clients_agent_binding`) — adds bound_tools, bound_source_id (FK sources.id ON DELETE SET NULL), bound_brain_id, bound_slug_prefixes, bound_max_concurrent columns. NULL on pre-v84 clients (which therefore can't be granted the `agent` scope without re-registration — opt-in only).
- `agent` scope added to `src/core/scope.ts`. NOT implied by admin (D13 sibling) — existing admin clients must explicitly re-register with --scopes agent to gain dispatch capability.

New MCP op `submit_agent`:

- scope: `agent`, mutating, remote-callable
- Required params: prompt. Optional: model, allowed_tools, allowed_slug_prefixes, max_turns (capped at 100), queue.
- Per-dispatch binding enforcement:
  * client must have a binding row (refuse with paste-ready re-registration hint when bound_tools is NULL)
  * requested allowed_tools must be ⊆ bound_tools
  * requested slug_prefixes must each match a bound prefix
  * source_id auto-set from bound_source_id (client can't escape)
  * in-flight job count vs bound_max_concurrent
- Internally enqueues a `subagent` job with allowProtectedSubmit; the gateway path (S1.5) is auto-on for remote-dispatched agents.
- Writes a JSONL audit row via the new `agent-audit.ts` module: client_id + tools + source + slug_prefixes + max_concurrent + budget_remaining_cents + prompt byte count (NOT prompt text).

New `src/core/minions/agent-audit.ts`:

- Mirrors shell-audit.ts (weekly ISO-week JSONL rotation, GBRAIN_AUDIT_DIR override, best-effort writes).
- File: ~/.gbrain/audit/agent-jobs-YYYY-Www.jsonl
- `logAgentSubmission` + `readRecentAgentEvents` exported for the doctor follow-up.

Tests: typecheck clean; capabilities + agent-cli suites green (39/39).

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
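The per-dispatch binding checks listed in this commit can be sketched as one pure function. Everything here is an assumption made for illustration: the type and function names are hypothetical, "match a bound prefix" is read as "the requested prefix starts with a bound prefix", and the sketch leaves out source_id pinning and budget checks.

```typescript
interface ClientBindingSketch {
  boundTools: string[] | null; // null: registered before the binding columns existed
  boundSlugPrefixes: string[];
  boundMaxConcurrent: number;
}

interface DispatchRequestSketch {
  allowedTools: string[];
  allowedSlugPrefixes: string[];
}

// Returns a refusal reason, or null when the dispatch is allowed.
function checkBindingSketch(
  binding: ClientBindingSketch,
  request: DispatchRequestSketch,
  inflightJobs: number,
): string | null {
  if (binding.boundTools === null) {
    return "client has no agent binding; re-register to opt in";
  }
  // Requested tools must be a subset of the bound tools.
  const bound = new Set(binding.boundTools);
  const extraTool = request.allowedTools.find((t) => !bound.has(t));
  if (extraTool !== undefined) return `tool not bound: ${extraTool}`;

  // Each requested prefix must fall under some bound prefix.
  const stray = request.allowedSlugPrefixes.find(
    (p) => !binding.boundSlugPrefixes.some((b) => p.startsWith(b)),
  );
  if (stray !== undefined) return `slug prefix not bound: ${stray}`;

  if (inflightJobs >= binding.boundMaxConcurrent) return "max concurrent jobs reached";
  return null;
}
```

With a binding like the `cursor-agent` registration example in the PR description (tools search, get_page, put_page; prefix wiki/; max concurrent 3), a request for `search` under `wiki/people/` passes, while a request for an unbound tool or a `notes/` prefix is refused.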

Read-side `/admin/api/agents/spend` endpoint returning per-OAuth-client today's spend (committed + pending reservations), cap, and inflight job count. The Agents.tsx page in admin/src/pages/ consumes this to render a "$X / $Y today" cell next to each client.

Stub-style server endpoint lands now; the full Agents.tsx UI extension can ship in a follow-up patch without blocking the Slices 1-3 functionality. Pre-v0.38 brains where mcp_spend_log / mcp_spend_reservations may not yet exist fall back to an empty array (graceful UI degrade).

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… scope flips

Test gap fills surfacing the load-bearing invariants of Slices 1-3:

Gateway tool loop (test/ai/gateway-tool-loop.test.ts, 7 cases):

- end stop_reason exits cleanly with no tools
- single tool call dispatches + result feeds next turn
- persistence callbacks fire in order: onAssistantTurn → onToolCallStart → execute → onToolCallComplete (write-ordering invariant pinned)
- replay short-circuit when prior tool execution is complete
- non-idempotent pending replay throws unrecoverable
- max_turns budget capped
- refusal short-circuits without tool dispatch

Budget meter (test/minions/budget-meter.test.ts, 15 cases):

- clientLockKey FNV-1a determinism + collision-rarity + INT32 fit
- reserve under cap / over cap / two-sequential / pending-pushes-over
- settle marks settled + mirrors to mcp_spend_log
- settle idempotency (second call no-op)
- sweep expired pending rows; leaves fresh ones
- getClientDailyCapCents with set/unset/unknown clients
- integration: settled spend feeds next reserve

Agent audit (test/minions/agent-audit.test.ts, 7 cases):

- ISO-week filename rotation (incl. year-boundary edge)
- JSONL line shape + multi-event appending
- regression guard: NEVER logs prompt content (only byte count)
- readRecentAgentEvents newest-first + empty-dir graceful fallback

Pre-existing test fixes for v0.38 semantics:

- test/scope.test.ts: `agent` scope added (size 5 → 6)
- test/oauth.test.ts: operations registry allows scope='agent' for submit_agent (mutating, contained by client bindings)
- test/model-config.serial.test.ts: enforceSubagentCapable returns non-Anthropic tool-supporting models unchanged (with cost warn) and falls back to TIER_DEFAULTS.subagent only on unknown providers

Schema parity:

- pglite-schema.ts + schema.sql get the v83 (budget_usd_per_day) + v84 (bound_tools, bound_source_id, bound_brain_id, bound_slug_prefixes, bound_max_concurrent) columns in CREATE TABLE so fresh installs land in post-migration shape AND the schema-bootstrap-coverage CI guard sees full coverage.

Pre-existing hybrid-reranker / cross-modal-hybrid integration test failures are on master before any of this wave — out of scope.

Plan: ~/.claude/plans/system-instruction-you-are-working-shimmying-breeze.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
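The ISO-week rotation and the "byte count, never prompt text" rule that the agent-audit tests pin can be sketched as follows. The function names and JSON field names are hypothetical, and computing the week in UTC is an assumption; the PR only states the `agent-jobs-YYYY-Www.jsonl` filename pattern and that prompt content is never logged.

```typescript
// ISO 8601 week label, computed in UTC (the timezone choice is an assumption).
function auditFileNameSketch(date: Date): string {
  const d = new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
  const isoDay = d.getUTCDay() || 7; // Monday=1 ... Sunday=7
  d.setUTCDate(d.getUTCDate() + 4 - isoDay); // move to the Thursday of this ISO week
  const isoYear = d.getUTCFullYear(); // the Thursday decides the ISO year
  const dayOfYear = (d.getTime() - Date.UTC(isoYear, 0, 1)) / 86400000 + 1;
  const week = Math.ceil(dayOfYear / 7);
  return `agent-jobs-${isoYear}-W${String(week).padStart(2, "0")}.jsonl`;
}

// One JSONL line per submission: the prompt contributes only its UTF-8 byte length.
function auditLineSketch(clientId: string, prompt: string): string {
  const promptBytes = new TextEncoder().encode(prompt).length;
  return JSON.stringify({ client_id: clientId, prompt_bytes: promptBytes });
}
```

The year-boundary edge is the reason for the Thursday shift: 1 January 2027 is a Friday, so it still belongs to ISO week 53 of 2026 and would land in `agent-jobs-2026-W53.jsonl` under this sketch.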

v0.38.0.0 — Agents+Minions cathedral wave.

Kills the Anthropic pin in the subagent tool loop, opens remote MCP dispatch via submit_agent, lands per-OAuth-client daily budget caps with reserve-then-settle concurrency, and stands up the registration-time binding contract for the agent scope. See CHANGELOG.md for the full entry and migration story.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ent-audit

12 pre-existing flakes (hybrid-reranker / cross-modal-hybrid / unified-multimodal / llm-intent-hybrid-integration / doctor-report-remote) all collapsed to zero after this wave.

Root cause: shared module-level state in src/core/ai/gateway.ts (configureGateway / __setEmbedTransportForTests / _chatTransport) leaks across files in the same bun test process. Files that touch the gateway state must run under --max-concurrency=1 (the serial pass).

Renamed (R2 quarantine — gateway-state contention):

- test/search/hybrid-reranker-integration.test.ts → .serial.test.ts
- test/cross-modal-hybrid-integration.test.ts → .serial.test.ts
- test/unified-multimodal.test.ts → .serial.test.ts
- test/llm-intent-hybrid-integration.test.ts → .serial.test.ts

doctor-report-remote.serial.test.ts was already serial in v0.37.10.0; its single failure in the v0.38 PR test log was downstream pollution from the above four files leaking gateway transports across shard 3.

Also fixed test/minions/agent-audit.test.ts (R1 violation: raw process.env.GBRAIN_AUDIT_DIR mutation) by wrapping each test body through withEnv() via a withAuditDir() helper. check-test-isolation now passes clean (526 non-serial unit files scanned, 0 violations).

Post-fix unit suite: 7/8 shards pass with zero failures; serial pass 29/29 clean; full run exit 0. Background task reported exit code 0. The wedge on shard 4 (migrate.test.ts) is a separate slow-test scoping concern, not a v0.38 regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Master added v0.37.11.0 (fresh-install PGLite embedding setup fix wave). Resolved 3 conflicts:

- VERSION → 0.38.2.0 (wave bumped past 0.38.0.0 → 0.38.2.0 per user direction)
- package.json → 0.38.2.0 (synced with VERSION)
- CHANGELOG.md → both entries kept (v0.38.2.0 on top, v0.37.11.0 below); v0.38.0.0 header rewritten to v0.38.2.0

Source files auto-merged cleanly (src/commands/doctor.ts + src/core/ai/gateway.ts + src/core/pglite-schema.ts). Regenerated src/core/schema-embedded.ts from the merged schema.sql. Typecheck green (0 errors).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CI failure on PR #1289: scripts/check-admin-scope-drift.sh caught the hand-maintained mirror at admin/src/lib/scope-constants.ts had not been updated when I added the new `agent` scope to src/core/scope.ts in Slice 3. CLAUDE.md flagged this exact CI guard for the file.

Mirrored: added `agent` to both the Scope union type and the alphabetically-sorted ALLOWED_SCOPES_LIST. Rebuilt the admin SPA dist (vite build, 36 modules, 228KB) so the bundled scope-aware UI matches the new server-side list.

check-admin-scope-drift passes (6 scopes match); full `bun run verify` chain passes end-to-end including typecheck (0 errors).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
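The six-scope list and the sibling relationship between `admin` and `agent` can be sketched with an implication table. This is not the contents of src/core/scope.ts or scope-constants.ts: the `IMPLIES_SKETCH` map and `hasScopeSketch` helper are hypothetical names, only the admin implications are taken from the test descriptions in this PR, and any implications among the remaining scopes are left empty because the PR does not state them.

```typescript
type ScopeSketch = "admin" | "agent" | "read" | "sources_admin" | "users_admin" | "write";

// Alphabetically sorted, with agent landing between admin and read.
const ALLOWED_SCOPES_SKETCH: readonly ScopeSketch[] = [
  "admin",
  "agent",
  "read",
  "sources_admin",
  "users_admin",
  "write",
];

// admin implies the four pre-existing scopes, but deliberately NOT agent.
const IMPLIES_SKETCH: Record<ScopeSketch, readonly ScopeSketch[]> = {
  admin: ["sources_admin", "users_admin", "write", "read"],
  agent: [], // no reverse inheritance
  read: [],
  sources_admin: [],
  users_admin: [],
  write: [], // whether write implies read is not stated in the PR
};

function hasScopeSketch(granted: readonly ScopeSketch[], required: ScopeSketch): boolean {
  return granted.some((g) => g === required || IMPLIES_SKETCH[g].includes(required));
}
```

Under this table `hasScopeSketch(["admin"], "agent")` is false, which is the property that forces existing admin clients to re-register before they can dispatch agents.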

CI failure on PR #1289 serial pass: test/admin-embed-spawn.serial.test.ts 4/4 fail with "Cannot find module '../admin/dist/assets/index-CWq369vO.js'" when spawning gbrain serve --http.

Root cause: the prior commit (f270e6c) rebuilt the admin SPA dist after adding the v0.38 `agent` scope to admin/src/lib/scope-constants.ts, which produced a new content-hashed bundle filename (index-CWq369vO.js → index-DFgMZhBE.js). The auto-generated `src/admin-embedded.ts` manifest still hardcoded the OLD filename, so `import ... with { type: 'file' }` threw at module-load time inside the spawned server, the server never became ready, and the e2e harness timed out at 30s × 4 tests = ~2min.

Fix: re-ran `bun run build:admin-embedded` (scripts/build-admin-embedded.ts) which regenerates src/admin-embedded.ts from the current dist/ contents. Manifest now references index-DFgMZhBE.js. All 4 admin-embed-spawn.serial tests pass locally.

Forward-looking note: the build:admin npm script chains `cd admin && bun run build && cd .. && bun run scripts/build-admin-embedded.ts` so regenerating both together is the standard path — the prior commit manually invoked `vite build` inside admin/ and skipped the second step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

User direction: this is v0.38.1.0 — the first patch on v0.38.0.0, not 0.38.2.0. v0.38.2.0 was mid-wave when master merged in d0d0e2a and I chose the next slot up; in retrospect 0.38.1.0 is the correct next-patch number since nothing actually shipped at 0.38.0.0 yet (the PR's been re-iterating through CI green and the wave is one continuous ship).

Updated:

- VERSION: 0.38.2.0 → 0.38.1.0
- package.json: 0.38.2.0 → 0.38.1.0
- CHANGELOG.md: header rewritten

Trio audit: all three say 0.38.1.0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…shim + admin spend

61 new test cases across 4 files closing the load-bearing gaps from the v0.38 Agents+Minions wave. Also extracts /admin/api/agents/spend SQL into a named helper so the endpoint and its test share a single source of truth.

Gap inventory + coverage delta:

| Surface | Before | After |
|---|---|---|
| submit_agent op (binding enforcement) | 0 | 17 |
| agent scope NOT implied by admin | 0 | 9 |
| D5 v1→v2 read-time shim | 0 | 16 |
| /admin/api/agents/spend endpoint SQL | 0 | 19 |

test/submit-agent.test.ts (17 cases):

- Op surface (scope=agent, mutating, required prompt param)
- Local CLI bypass (ctx.remote=false → invalid_request)
- OAuth client requirement (missing clientId, unknown client_id)
- Binding requirement: refuse when agent scope but bound_tools NULL
- allowed_tools subset enforcement (passes ⊆, refuses outside)
- allowed_slug_prefixes prefix-match against bound_slug_prefixes
- bound_max_concurrent cap (refuse at cap, allow below, exclude terminal-state jobs, isolate inflight count by client_id)
- Happy-path: job inserted + audit row written + prompt NEVER logged
- max_turns capped at 100

test/scope-agent-isolation.test.ts (9 cases) — D13 regression guard:

- admin does NOT imply agent (the load-bearing security check)
- admin still implies sources_admin/users_admin/write/read
- agent does NOT imply anything else (no reverse inheritance)
- read+write does NOT imply agent (the common legacy shape)
- explicit admin+agent compound grant satisfies both
- ALLOWED_SCOPES_LIST sort order pinned (agent between admin and read)

test/subagent-v1-v2-shim.test.ts (16 cases) — D5 crash-replay correctness:

- adaptContentBlocksToChatBlocks: string passthrough, defensive nulls, v1 Anthropic {type:tool_use,id,name,input} → v2 {type:tool-call,...}, v2 passthrough, v1 tool_result → v2 tool-result with __legacy__ toolName sentinel, is_error mapping, mixed v1+v2 in same message array (mid-upgrade scenario), malformed-block skip
- loadPriorToolsV2: empty, gbrain_tool_use_id as stable key for v2, legacy-prefixed key for v1 rows, status+error preservation, mixed v1+v2 side-by-side with both shapes resolving, ORDER BY stability
- Exposed both helpers on the existing __testing export from subagent.ts

test/admin-agents-spend.test.ts (19 cases) — Slice 4 SQL pinning:

- Empty results: no clients / clients without agent scope or bindings
- Include: scope=agent (with or without bindings), bound_tools set (with or without scope=agent — covers partial-migration state)
- Exclude: soft-deleted (deleted_at IS NOT NULL) clients
- cap_usd_per_day: null when unset, numeric when set
- spent_cents_today: zero baseline, sum of today, exclude yesterday (UTC-day-aligned), client-id isolation
- pending_cents: sum of pending+non-expired, exclude expired, exclude settled
- inflight_count: only active/waiting/waiting-children subagent jobs; exclude shell jobs; client-id isolated
- ORDER BY client_name ASC pinned for deterministic UI rendering
- Multi-word scope strings ('read write agent') handled correctly via string_to_array
- End-to-end happy path: all fields populated together

Refactor: extracted the spend SQL from src/commands/serve-http.ts into a new exported `queryAgentClientSpend(engine)` helper + `AgentClientSpend` type. The Express handler now delegates (5 lines). Same query, same result shape, but a single source of truth that both the endpoint and the test exercise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
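The v1-to-v2 block adaptation that the shim tests describe can be sketched as a small converter. This is not `adaptContentBlocksToChatBlocks`: the v2 field names (`toolCallId`, `toolName`, `input`, `output`, `isError`) are assumptions modelled loosely on AI-SDK-style message parts, and only the v1 Anthropic field names and the `__legacy__` sentinel come from the text above.

```typescript
// Assumed v2 block shape; the real field names in gbrain may differ.
type ChatBlockSketch =
  | { type: "text"; text: string }
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "tool-result"; toolCallId: string; toolName: string; output: unknown; isError: boolean };

const V2_TYPES_SKETCH = new Set(["text", "tool-call", "tool-result"]);

function adaptBlocksSketch(content: unknown): string | ChatBlockSketch[] {
  if (typeof content === "string") return content; // plain-text message passes through
  if (!Array.isArray(content)) return []; // defensive: null, undefined, stray objects
  const out: ChatBlockSketch[] = [];
  for (const raw of content) {
    if (typeof raw !== "object" || raw === null) continue; // malformed block: skip
    const block = raw as Record<string, unknown>;
    const type = block["type"];
    const id = block["id"];
    const name = block["name"];
    const toolUseId = block["tool_use_id"];
    if (type === "tool_use" && typeof id === "string" && typeof name === "string") {
      // v1 Anthropic tool_use -> v2 tool-call
      out.push({ type: "tool-call", toolCallId: id, toolName: name, input: block["input"] });
    } else if (type === "tool_result" && typeof toolUseId === "string") {
      // v1 tool_result carries no tool name, hence the sentinel
      out.push({
        type: "tool-result",
        toolCallId: toolUseId,
        toolName: "__legacy__",
        output: block["content"],
        isError: block["is_error"] === true,
      });
    } else if (typeof type === "string" && V2_TYPES_SKETCH.has(type)) {
      out.push(raw as ChatBlockSketch); // already v2 (shape trusted, not validated)
    }
    // Any other block type is skipped.
  }
  return out;
}
```

Doing the conversion at read time means legacy rows never need rewriting, at the cost of carrying the sentinel tool name for results that were stored before the upgrade.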

Two new e2e suites driving the v0.38 runSubagentViaGateway path end-to-end against PGLite. Both filed in TODOS as v0.38.x follow-ups during the cathedral ship; building them out caught two real load-bearing bugs in subagent.ts that would have silently broken crash-replay in production.

Bug 1 — messageIdx collision on fresh runs. runSubagentViaGateway only passed replayState when priorChatMessages.length > 0, so on a fresh run the gateway loop's messageIdx counter defaulted to 0. The seed user message already occupies (job_id, message_idx=0), so the first onAssistantTurn write at idx 0 hit the unique-constraint and the whole job failed before any tool call. Fix: always pass replayState with nextMessageIdx set to 1 on fresh runs (after the seed write). Pinned by test/e2e/subagent-gateway-path.test.ts ("happy path 1-turn" + "write-ordering invariant").

Bug 2 — onToolCallStart returned the wrong UUID on crash-replay. The callback generated a fresh candidateId, INSERTed with ON CONFLICT DO UPDATE, and returned the local candidateId. On replay, the pre-crash row survives intact with its ORIGINAL gbrain_tool_use_id, so the local candidateId was wrong. The gateway loop's replayState.priorTools is keyed by the original UUID; returning the new one made the short-circuit miss and re-execute every tool call. Fix: RETURNING gbrain_tool_use_id::text AS gbrain_tool_use_id and read it back; fall through to candidateId only if RETURNING is empty. Pinned by test/e2e/subagent-crash-replay-multi-provider.test.ts.

Coverage:

- test/e2e/subagent-gateway-path.test.ts: 7 cases. Happy path 1-turn, multi-turn with parallel tool calls, write-ordering invariant (persist-before-side-effect), gateway returns malformed tool_call shape, cancel mid-loop, capability refusal at submit.
- test/e2e/subagent-crash-replay-multi-provider.test.ts: 13 cases. Five provider rows (anthropic / openai / google / openrouter / deepseek) × pre-crash run + replay assertion, plus ordinal-collision PK guard, pending-tool short-circuit, v1→v2 shim round-trip.

Both files run hermetically against PGLite (no DATABASE_URL needed) and use the __setChatTransportForTests gateway seam for stubbed provider responses. Reset path goes through resetPgliteState + setConfig version=84 so MinionQueue.ensureSchema() sees the migration ledger correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
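The Bug 2 fix comes down to returning the ID that is actually stored, not the one just generated. A minimal in-memory sketch follows; the `Map` stands in for the unique key on (job_id, message_idx, ordinal), and the function name and signature are hypothetical.

```typescript
// The store maps "jobId:messageIdx:ordinal" to the persisted stable tool-use ID.
function recordToolCallStartSketch(
  store: Map<string, string>,
  jobId: number,
  messageIdx: number,
  ordinal: number,
  candidateId: string,
): string {
  const key = `${jobId}:${messageIdx}:${ordinal}`;
  const existing = store.get(key);
  if (existing !== undefined) return existing; // replay: the pre-crash ID wins
  store.set(key, candidateId); // first observation: persist the candidate
  return candidateId;
}
```

On a replay of the same (job, message, ordinal) position, the caller gets back the original ID, so a lookup keyed by that ID still finds the earlier execution and can skip re-running the tool.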

…ents wave

Master shipped v0.38.0.0 with migration v81 (pages_provenance_columns) while this branch was building v0.38.1.0 with its own v81 (subagent_tool_executions_stable_id) + v82/v83/v84.

Conflict resolution:

- VERSION + package.json: kept ours (0.38.1.0 > 0.38.0.0).
- CHANGELOG.md: preserved both entries in order (v0.38.1.0 on top, master's v0.38.0.0 immediately below).
- src/core/migrate.ts: kept master's v81 verbatim, renumbered our four to v82 (subagent_tool_executions_stable_id), v83 (mcp_spend_reservations), v84 (oauth_clients_budget_usd_per_day), v85 (oauth_clients_agent_binding). Runtime sort by version means source-order doesn't matter; tests sweep the array.
- Schema comments in pglite-schema.ts + schema.sql + the auto-regenerated schema-embedded.ts updated to reference the new version numbers.
- Test setConfig('version', '84') → '85' across the five v0.38 test files that prime the migration ledger.

Verification:

- bun run typecheck clean.
- bun run verify clean (5 checks + tsc).
- Targeted re-run of 7 affected test files (migrate + submit-agent + admin agents-spend + subagent-v1-v2-shim + budget-meter + both e2e files): 227 / 227 pass.

Migration ledger shows v81 → v85 applying in order on a fresh PGLite, confirming runtime sort handles the source-order shuffle. Trio audit (VERSION / package.json / CHANGELOG top header) all show 0.38.1.0.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
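The point that a runtime sort by version makes source order irrelevant can be sketched as below. The function and type names are hypothetical and the duplicate-version guard is an addition for illustration, not something this PR says migrate.ts does; the migration names are the ones listed in the conflict resolution above.

```typescript
interface MigrationSketch {
  version: number;
  name: string;
}

// Returns the migrations still to apply, in ascending version order,
// regardless of the order they appear in the source array.
function planMigrationsSketch(all: MigrationSketch[], currentVersion: number): MigrationSketch[] {
  const seen = new Set<number>();
  for (const m of all) {
    // Two branches claiming the same slot is the collision this merge resolved.
    if (seen.has(m.version)) throw new Error(`duplicate migration version v${m.version}`);
    seen.add(m.version);
  }
  return all
    .filter((m) => m.version > currentVersion)
    .sort((a, b) => a.version - b.version);
}

// Deliberately shuffled source order.
const MIGRATIONS_SKETCH: MigrationSketch[] = [
  { version: 83, name: "mcp_spend_reservations" },
  { version: 81, name: "pages_provenance_columns" },
  { version: 85, name: "oauth_clients_agent_binding" },
  { version: 82, name: "subagent_tool_executions_stable_id" },
  { version: 84, name: "oauth_clients_budget_usd_per_day" },
];
```

For a brain currently at v82, `planMigrationsSketch(MIGRATIONS_SKETCH, 82)` yields v83, v84, v85 in that order.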

… v82-v85)

Master shipped v0.38.1.0 (provider-agnostic subagent loop, #1289) which claimed migration slots v82-v85:

- v82 — subagent_tool_executions_stable_id
- v83 — mcp_spend_reservations
- v84 — oauth_clients_budget_usd_per_day
- v85 — oauth_clients_agent_binding

The v0.40.2.0 trajectory-routing wave's `facts_event_type_column` migration is renumbered to v86. Engine + test + CLAUDE.md references updated.

CHANGELOG reconstructed: v0.40.2.0 entry kept at the top (our entry), master's v0.38.1.0 entry inserted below, both intact.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* upstream/master:
  v0.38.2.0 fix(doctor): bounded frontmatter scan + partial-state surfacing (supersedes garrytan#1287) (garrytan#1297)
  v0.38.1.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter (garrytan#1289)
  v0.38.0.0 ingestion cathedral — gbrain capture + write-through + IngestionSource contract (garrytan#1275)
  v0.37.11.0: fresh-install PGLite embedding setup fix wave (garrytan#1286)
  v0.37.10.0 feat(init): env-detection + interactive picker + preflight invariants (garrytan#1278)
  v0.37.9.0 fix(frontmatter): canonical-style normalization for tag arrays (garrytan#1252)
  v0.37.8.0 feat: voyage-code-3 discoverability + reindex-code cost-preview fix (garrytan#1267)
  v0.37.7.0 fix wave: federated brains + autopilot safety + OAuth confidential clients (garrytan#1253)
  v0.37.6.0 feat(ai): OpenRouter recipe + generic default_headers seam (cherry-pick garrytan#1210) (garrytan#1246)
  v0.37.5.0 fix(markdown): YAML-aware NESTED_QUOTES validator (stops flagging valid YAML) (garrytan#1229)
  feat: pgGraph-inspired CI scaffolding wave (v0.37.4.0) (garrytan#1228)
  v0.37.3.0 feat: skill_brain_first doctor check + auto-fix + declarative opt-out (supersedes garrytan#1206) (garrytan#1215)
  v0.37.2.0: takes_resolution_consistency CHECK accepts 'unresolvable' (garrytan#1211)
  v0.37.1.0 feat: brainstorm + lsd — bisociation idea generator grounded in your own brain (garrytan#1214)
  v0.37.0.0 feat(skillpack): registry cathedral — third-party publish + install + 10/10 quality bar (garrytan#1208)
  v0.36.6.0 feat: cross-modal search wave (text↔image + unified column + LLM intent) (garrytan#1165)

garrytan and others added 10 commits May 21, 2026 16:09

garrytan changed the title ~~v0.38.0.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter~~ v0.38.2.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter May 22, 2026

garrytan and others added 3 commits May 21, 2026 22:26

garrytan changed the title ~~v0.38.2.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter~~ v0.38.1.0 feat(agents): provider-agnostic subagent loop + remote MCP dispatch + budget meter May 22, 2026

garrytan and others added 3 commits May 22, 2026 07:49

garrytan merged commit 0102456 into master May 22, 2026
8 checks passed

terraceexperiment mentioned this pull request Jun 8, 2026

fix(auth): add register-client agent binding flags terraceexperiment/gbrain#1

Closed

4 tasks

mzkarami mentioned this pull request Jun 8, 2026

fix(auth): add register-client agent binding flags mzkarami/gbrain#1

Draft

4 tasks

garrytan merged 16 commits into master from garrytan/minions-agents-improvements
