fix: add missing items field to two array schemas (Gemini Pro strict JSON Schema compat) by DmitryBMsk · Pull Request #910 · garrytan/gbrain

DmitryBMsk · 2026-05-12T09:35:04Z

Problem

Two array properties in src/core/ are declared with type: 'array' but no items field, which strict JSON Schema validators reject. Most notably, Gemini Pro's tool-call schema validator throws:

LLM request rejected: Invalid schema for function 'gbrain__extract_facts':
In context=('properties', 'entity_hints'), array schema missing items.

Repro

1. Run gbrain serve as a stdio MCP child of any host that uses Gemini Pro as its chat model (OpenClaw 2026.5.4 with mcp.servers.<name>.command config is one example; ChatGPT tool calls and other strict-schema LLM gateways behave the same).
2. The daemon enumerates tools and forwards schemas to the LLM.
3. The whole gbrain MCP surface (60 tools) becomes unavailable for that session because the LLM API rejects the request before any tool can be called.

End-to-end repro on my live OpenClaw deployment: bundle-mcp logs show failed to start server "gbrain": McpError: MCP error -32000: Connection closed, and the Telegram bot returns the Invalid schema for function 'gbrain__extract_facts' message verbatim. After the patch is applied locally and the stdio child is restarted, the bot enumerates all 60 gbrain__* tools and successfully calls e.g. gbrain__get_stats returning real numbers from Supabase.

Fix

Add the missing items field to both schemas. Behavior unchanged — these are JSON Schema metadata, not runtime contracts.

`src/core/operations.ts` — `extract_facts.params.entity_hints`

The description already says "canonical entity slugs" and the handler casts the value via p.entity_hints as string[]. Schema now matches:

-    entity_hints: { type: 'array', description: 'Existing canonical entity slugs ...' },
+    entity_hints: { type: 'array', items: { type: 'string' }, description: 'Existing canonical entity slugs ...' },

`src/core/resolvers/builtin/x-api/handle-to-tweet.ts` — `outputSchema.candidates`

The TypeScript interface XTweetCandidate already defines the per-element shape; the JSON Schema now spells the same fields out for any consumer that runs a JSON Schema validator on the resolver output:

-      candidates: { type: 'array' },
+      candidates: {
+        type: 'array',
+        items: {
+          type: 'object',
+          properties: {
+            tweet_id: { type: 'string' },
+            text: { type: 'string' },
+            created_at: { type: 'string', format: 'date-time' },
+            score: { type: 'number' },
+            url: { type: 'string', format: 'uri' },
+          },
+          required: ['tweet_id', 'text', 'created_at', 'score', 'url'],
+        },
+      },
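To make the correspondence between the interface and the new `items` schema concrete, here is a minimal sketch. The `XTweetCandidate` declaration below is a hypothetical mirror inferred from the schema in the diff above (the real interface in the repo may differ in detail), and `candidateItemSchema` and `isCandidate` are illustrative names, not part of gbrain.

```typescript
// Hypothetical mirror of the XTweetCandidate interface; the real definition
// lives in the gbrain source and may differ in detail.
interface XTweetCandidate {
  tweet_id: string;
  text: string;
  created_at: string; // ISO 8601 date-time
  score: number;
  url: string;
}

// The item schema from the diff above, written as a plain constant.
const candidateItemSchema = {
  type: 'object',
  properties: {
    tweet_id: { type: 'string' },
    text: { type: 'string' },
    created_at: { type: 'string', format: 'date-time' },
    score: { type: 'number' },
    url: { type: 'string', format: 'uri' },
  },
  required: ['tweet_id', 'text', 'created_at', 'score', 'url'],
} as const;

// Minimal structural check (illustrative only): every required key is present
// and has the primitive type the schema names. `format` is not checked.
function isCandidate(value: unknown): value is XTweetCandidate {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return candidateItemSchema.required.every(
    (key) => typeof record[key] === candidateItemSchema.properties[key].type,
  );
}
```

Because `required` lists all five fields, an empty object fails this check, which is the same property a real JSON Schema validator would enforce on the resolver output.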

Why not a typedef-driven generator?

Could be a future improvement (run the same TS interface through a to-json-schema step) — out of scope for this fix. Both schemas now match the TypeScript types that already exist alongside them.

Test plan

Reproduced the gbrain__extract_facts rejection on a live OpenClaw 2026.5.4 + Gemini Pro deployment (private OCI setup).
Applied the patch, restarted the OpenClaw stdio child, confirmed bundle-mcp no longer errors and all 60 gbrain__* tools are surfaced into the LLM tool inventory.
Confirmed gbrain__get_stats and gbrain__search round-trip successfully end-to-end (host → MCP → Supabase) after the patch.
Repo-level test suite — I trust the existing CI to catch regressions; both diffs are pure JSON Schema metadata, no code path changes.

Notes

I grepped for other type: 'array' declarations without items in src/; only these two showed up in master at the time of writing (commit 17b190e).
Other model providers may be more lenient: Claude Sonnet 4.6 accepts the unfixed schema, which is likely why this slipped through. Google's strict validator for Gemini Pro catches it.


… Schema compat)

Two array properties were declared with type: 'array' but no items field, which Gemini Pro's strict JSON Schema validator rejects:

Invalid schema for function 'gbrain__extract_facts': In context=('properties', 'entity_hints'), array schema missing items.

Effect on real deployments: when OpenClaw 2026.5.4 (or any host that uses Gemini Pro as the chat model) registers gbrain via stdio MCP, the daemon spawns 'gbrain serve', enumerates tools, and forwards their JSON Schemas to the LLM API. The schema is rejected on every request, blocking the entire gbrain__* tool surface (60 tools) for that session.

Fixes:
- src/core/operations.ts: entity_hints (extract_facts input) gets items: { type: 'string' }, which matches the existing description ('canonical entity slugs') and runtime cast 'p.entity_hints as string[]'.
- src/core/resolvers/builtin/x-api/handle-to-tweet.ts: candidates output gets items matching XTweetCandidate interface (tweet_id, text, created_at, score, url).

No runtime behavior change. Schema metadata only.

…chemas

Adds invariant tests that walk every operation inputSchema and every builtin resolver inputSchema/outputSchema, collecting paths where { type: 'array' } lacks an items field. The arrays.length === 0 assertion is the regression guard; it would have caught both schemas fixed in the previous commit, and will catch any future drift on the same class of bug.

- test/mcp-tool-defs.test.ts: walks buildToolDefs(operations). Catches input schema arrays missing items (e.g. extract_facts.entity_hints).
- test/resolvers.test.ts: walks xHandleToTweetResolver and urlReachableResolver schemas. Catches output schema arrays missing items (e.g. handle-to-tweet outputSchema.candidates), which buildToolDefs doesn't cover.

Pure unit tests, no network/db required. Local run: 6 pass mcp-tool-defs, 55 pass resolvers.
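The schema walk described in this commit can be sketched as below. This is an assumption-laden illustration rather than the code in the test files: `SchemaNode` and `findArraysMissingItems` are hypothetical names, and the walker only follows `properties` and `items`, which is enough for the two schemas discussed here.

```typescript
// Loose JSON Schema shape; only the keywords this walker inspects are typed.
type SchemaNode = {
  type?: string;
  items?: SchemaNode;
  properties?: Record<string, SchemaNode>;
  [keyword: string]: unknown;
};

// Hypothetical walker: returns the path of every { type: 'array' } node
// that has no `items` keyword, recursing through properties and items.
function findArraysMissingItems(node: SchemaNode, path = '$'): string[] {
  const missing: string[] = [];
  if (node.type === 'array') {
    if (node.items === undefined) {
      missing.push(path);
    } else {
      missing.push(...findArraysMissingItems(node.items, `${path}.items`));
    }
  }
  for (const [name, child] of Object.entries(node.properties ?? {})) {
    missing.push(...findArraysMissingItems(child, `${path}.properties.${name}`));
  }
  return missing;
}

// The entity_hints parameter before and after the fix.
const unfixedInput: SchemaNode = {
  type: 'object',
  properties: {
    entity_hints: { type: 'array', description: 'Existing canonical entity slugs ...' },
  },
};
const fixedInput: SchemaNode = {
  type: 'object',
  properties: {
    entity_hints: {
      type: 'array',
      items: { type: 'string' },
      description: 'Existing canonical entity slugs ...',
    },
  },
};

console.log(findArraysMissingItems(unfixedInput)); // [ '$.properties.entity_hints' ]
console.log(findArraysMissingItems(fixedInput)); // []
```

Returning paths rather than a boolean means a failing assertion can name the offending property directly.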

… placement (#1053)

* refactor(mcp): centralize ParamDef→JSON Schema via shared paramDefToSchema

Three duplicate inline mappers existed across the MCP surface:
- src/mcp/tool-defs.ts (stdio MCP buildToolDefs)
- src/commands/serve-http.ts:837 (live HTTP MCP tools/list)
- src/core/minions/tools/brain-allowlist.ts:84 (subagent tool registry)

Each had subtly different items propagation. The HTTP MCP variant dropped items entirely, leaving extract_facts.entity_hints broken for OAuth-authenticated remote agents even after a buildToolDefs-only patch. The subagent variant propagated one level of items but used the same shallow shape so nested arrays would silently drop.

Extract a single recursive paramDefToSchema helper exported from src/mcp/tool-defs.ts and have all three mappers consume it. Closes the bug class at the architecture level instead of patching one site at a time.

The helper copies type, description, enum, default, and recursively rebuilds items so array-of-arrays preserves inner shape. Key ordering (type, description, enum, default, items) matches the pre-v0.34 inline mappers so JSON.stringify output stays byte-stable for every existing operation that does not use nested arrays.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(schema): add items to extract_facts.entity_hints and handle-to-tweet candidates

Two array fields shipped without the items property required by JSON Schema. Strict-mode validators (Gemini Pro structured outputs, OpenAI strict tool definitions) reject the entire schema when any type:'array' lacks items. Downstream agents on those providers couldn't use extract_facts or the x_handle_to_tweet resolver.

extract_facts.entity_hints: declared items: { type: 'string' } matching the handler at src/core/operations.ts:2733 which already coerces the runtime value to string[].

handle_to_tweet outputSchema.candidates: full XTweetCandidate spec including required + additionalProperties: false. The XTweetCandidate TypeScript interface declares all five fields as required; without required in the JSON Schema, a validator would accept {} as a valid candidate. additionalProperties: false closes the OpenAI strict-mode contract.

19 community PRs (#1028 #999 #980 #979 #910 #904 #847 #832 #863 #862 #812 for entity_hints; #910 caught candidates) converged on these locations. This wave cherry-picks the deepest variant (#910 surfaced both bugs) and centralizes via the paramDefToSchema helper from the preceding commit so the live HTTP MCP tools/list path is also fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: DmitryBMsk (PR #910)

* fix(git-remote): move --no-recurse-submodules after the subcommand verb

Git CLI accepts two flag positions:

git [global -c flags] <subcommand> [subcommand flags] [args]

Global -c config flags belong before the verb. Subcommand-specific flags (like --no-recurse-submodules) belong after. Pre-v0.34 GIT_SSRF_FLAGS spliced both kinds before the verb, so cloneRepo invoked:

git -c http.followRedirects=false ... --no-recurse-submodules clone URL DIR

Real git rejects this with exit 129 ("unknown option: --no-recurse-submodules") because --no-recurse-submodules is a clone subcommand flag, not a global config flag. Every remote-source clone broke in production from v0.28 onward. The fake-git harness in test/git-remote.test.ts exits 0 regardless of argv shape, which is why CI never caught it.

Split GIT_SSRF_FLAGS (3 -c config flags, spread BEFORE the verb) from GIT_SSRF_SUBCOMMAND_FLAGS (--no-recurse-submodules, spread AFTER the verb). cloneRepo and pullRepo both spread the new constant after their respective verbs. The constant names signal the position rule so future additions land in the right place.

7 community PRs converged on this location (#1023 #1020 #985 #963 #846 #842; #800 doesn't exist). This wave cherry-picks the semantic-constant approach from #846's GIT_SSRF_SUBCOMMAND_FLAGS name (the clearest signal of the position rule).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(mcp+git+resolvers): structural array-items + subcommand-position guards

Three new tests / test groups close the bug classes the wave fixes:

test/mcp-tool-defs.test.ts: recursive structural guard walks every operation's inputSchema and fails with a property path if any type:'array' lacks items.type. Explicit fixture assertions for extract_facts.entity_hints.items.type and a synthetic nested-array ParamDef pinning items.items.type recursion. Without the explicit fixtures the legacyInlineMap byte-equality test is mirror-theater: mirroring both sides of the equality preserves the blind spot.

test/git-remote.test.ts: split snapshot test into GIT_SSRF_FLAGS (3 global -c entries) and GIT_SSRF_SUBCOMMAND_FLAGS (--no-recurse-submodules). cloneRepo + pullRepo argv tests now assert the subcommand flag appears AFTER the verb index. Pre-v0.34 the pinned argv slice prefix included --no-recurse-submodules, which baked the bug into the test suite (codex catch).

test/resolvers.test.ts: recursive walk over both inputSchema AND outputSchema for builtin resolvers (xHandleToTweetResolver, urlReachableResolver). Explicit imports rather than getDefaultRegistry(), which starts empty until commands/resolvers.ts runs (codex catch on a hollow-walk failure mode). Dedicated case pins candidates items shape including required + additionalProperties.

Reference legacyInlineMap in mcp-tool-defs.test.ts mirrors the new recursive paramDefToSchema helper. No current op uses nested arrays so the byte-equality test stays green for every existing operation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(e2e): raise rerank timeouts for ZE live cold-start

The first rerank call of a CI run hits ZeroEntropy's cold-start latency (observed ~5-6s on Tier 2 LLM Skills runners; subsequent calls < 500ms). Two timeouts fired simultaneously at ~5s:

1. bun:test's default 5000ms per-test timeout caused (fail).
2. gateway.rerank's DEFAULT_RERANK_TIMEOUT_MS = 5000 fired right after, reported as "Unhandled error between tests".

The next rerank test (top_n=2) ran in 409ms because the API was already warm. Cold-start is the only issue.

Pass explicit timeoutMs to each rerank() call and a longer per-test timeout (30s) on both ZE rerank tests. Production DEFAULT_RERANK_TIMEOUT_MS stays at 5s for the search hot path; these E2E tests bypass it locally without changing the default that protects user latency.

Unrelated to the fix-wave in this PR (mcp-tool-defs + git-remote + resolver guards). Lands here to keep Tier 2 LLM Skills green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.35.2.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync for v0.35.2.0

Update CLAUDE.md Key files annotations for the v0.35.2.0 fix wave:
- src/mcp/tool-defs.ts: document new exported recursive paramDefToSchema helper and the three-consumer centralization (stdio MCP, HTTP MCP tools/list, subagent registry).
- src/core/minions/tools/brain-allowlist.ts: paramsToInputSchema now consumes the shared helper.
- src/commands/serve-http.ts: tools/list handler now consumes the shared helper (closes the HTTP MCP items-dropped bug class).
- src/core/git-remote.ts: new entry. Documents the GIT_SSRF_FLAGS (global config, pre-verb) vs GIT_SSRF_SUBCOMMAND_FLAGS (subcommand-scoped, post-verb) split, the 7-month silent regression, and the position-anchored regression guard in test/git-remote.test.ts.

Regenerated llms-full.txt to match.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: rebump version to v0.35.3.0

Queue moved while this PR was open: v0.35.2.0 was claimed by master's v0.35.1.0 sibling work. Advancing one slot. No code changes; only:
- VERSION + package.json: 0.35.2.0 → 0.35.3.0
- CHANGELOG.md: rewritten header + inline references
- CLAUDE.md: rewritten 4 key-file annotations
- llms-full.txt + llms.txt: regenerated to mirror CLAUDE.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
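The flag-position rule from the git-remote commit can be illustrated with a short sketch. The two constant names come from the commit message; everything else is an assumption. In particular, the message says GIT_SSRF_FLAGS holds three `-c` entries but quotes only one, so only that one appears here, and `buildCloneArgv` and `subcommandFlagsFollowVerb` are hypothetical helpers, not the repo's cloneRepo or its tests.

```typescript
// Global config flags go BEFORE the subcommand verb. The commit message names
// three entries but quotes only this one, so the other two are left out here.
const GIT_SSRF_FLAGS: readonly string[] = ['-c', 'http.followRedirects=false'];

// Subcommand-scoped flags go AFTER the verb.
const GIT_SSRF_SUBCOMMAND_FLAGS: readonly string[] = ['--no-recurse-submodules'];

// Hypothetical helper: builds the argv that would be passed to `git` for a clone.
function buildCloneArgv(url: string, dir: string): string[] {
  return [...GIT_SSRF_FLAGS, 'clone', ...GIT_SSRF_SUBCOMMAND_FLAGS, url, dir];
}

// Position-anchored guard: every subcommand flag must sit after the verb.
function subcommandFlagsFollowVerb(argv: readonly string[], verb: string): boolean {
  const verbIndex = argv.indexOf(verb);
  return (
    verbIndex >= 0 &&
    GIT_SSRF_SUBCOMMAND_FLAGS.every((flag) => argv.indexOf(flag) > verbIndex)
  );
}
```

A guard of this shape fails on the pre-fix ordering (`... --no-recurse-submodules clone URL DIR`) even when the git binary under test is a stub that always exits 0.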

garrytan · 2026-05-18T19:51:04Z

Thanks @DmitryBMsk — extract_facts.entity_hints (and the other MCP array-schema items fixes) already ship in master as of v0.35.3.0. The shared mapper is paramDefToSchema in src/mcp/tool-defs.ts (used by stdio MCP, HTTP MCP, and the subagent registry); test/mcp-tool-defs.test.ts walks every array param and fails the suite if any lacks items.type. If your install is still hitting this on next gbrain upgrade, please reopen with the output of gbrain doctor --json.

Closing as already-shipped. Real appreciation for chasing this — the same bug was independently reported by ~12 contributors which is exactly the kind of signal that gets the structural fix prioritized.
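For readers who have not opened src/mcp/tool-defs.ts, the shared mapper named above can be pictured roughly as follows. This is a sketch based only on the commit message's description (copy type, description, enum, default in that key order, then rebuild items recursively); the `ParamDef` shape here is hypothetical and the real helper may handle more fields.

```typescript
// Hypothetical ParamDef shape; the real type in gbrain may carry more fields.
interface ParamDef {
  type: string;
  description?: string;
  enum?: readonly string[];
  default?: unknown;
  items?: ParamDef;
}

type JsonSchema = Record<string, unknown>;

// Sketch of a recursive ParamDef -> JSON Schema mapper. Keys are assigned in
// the order type, description, enum, default, items, so JSON.stringify output
// keeps that order.
function paramDefToSchema(def: ParamDef): JsonSchema {
  const schema: JsonSchema = { type: def.type };
  if (def.description !== undefined) schema.description = def.description;
  if (def.enum !== undefined) schema.enum = [...def.enum];
  if (def.default !== undefined) schema.default = def.default;
  // Recursing here is what preserves the inner shape of array-of-arrays.
  if (def.items !== undefined) schema.items = paramDefToSchema(def.items);
  return schema;
}
```

A shallow mapper that copied only `type` and `description` would drop `items` silently, which is the failure mode the PR #1053 commit message attributes to the HTTP MCP variant.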

DmitryBMsk added 2 commits May 12, 2026 10:34

garrytan mentioned this pull request May 16, 2026

v0.35.3.0 fix wave: extract_facts items + git --no-recurse-submodules placement #1053

Merged

8 tasks

garrytan closed this May 18, 2026

garrytan mentioned this pull request May 26, 2026

Fix git submodule flag placement #846

Closed

1 task

DmitryBMsk wants to merge 2 commits into garrytan:master from DmitryBMsk:fix/array-schema-missing-items
