fix(mcp): extract_facts entity_hints array schema missing items field (rejected by OpenAI strict validator) #831

@bautrey

Repro

OpenAI's strict JSON-schema validator (used by gpt-5.5 / gpt-5-codex and any client that runs tools/list through the official OpenAI tool-schema validator) rejects the entire MCP tool list with:

LLM request rejected: Invalid schema for function 'gbrain__extract_facts': In context=('properties', 'entity_hints'), array schema missing items.

When this fires, every gbrain MCP tool becomes unavailable to that agent — not just extract_facts — because the strict validator rejects the whole tool list when any single tool def fails validation.

Root cause

src/core/operations.ts:2396 declares entity_hints as a JSON-schema array without an items field:

entity_hints: { type: 'array', description: 'Existing canonical entity slugs the agent has already resolved. Helps the extractor pick the right slug.' },

JSON Schema itself allows an array schema without items (any element type is then accepted), but OpenAI's strict-mode validator requires every array schema to declare its items. The handler at line 2421 already treats the value as string[]:

entityHints: Array.isArray(p.entity_hints) ? (p.entity_hints as string[]) : undefined,

so the right items schema is { type: 'string' }.

The ParamDef type already supports this — src/core/operations.ts:198 has items?: ParamDef, and the only other array-typed MCP param in the file (pages_updated at line 1692) sets it correctly:

pages_updated: { type: 'array', required: true, items: { type: 'string' } },

buildToolDefs in src/mcp/tool-defs.ts already passes items through to the emitted JSON schema when present, so this is purely a missing-field bug at the operation declaration site.
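To illustrate why the declaration site is the only place that needs to change, here is a minimal sketch of an items pass-through. It is not the actual buildToolDefs code: the ParamDef shape is simplified, and JsonSchemaProp and paramToSchema are hypothetical names used only for this example.

```typescript
// Hypothetical, simplified shapes. The real ParamDef in src/core/operations.ts
// may carry more fields than shown here.
interface ParamDef {
  type: 'string' | 'number' | 'boolean' | 'object' | 'array';
  description?: string;
  required?: boolean;
  items?: ParamDef;
}

interface JsonSchemaProp {
  type: string;
  description?: string;
  items?: JsonSchemaProp;
}

// Convert one ParamDef into a JSON-schema property.
// `items` is copied only when the declaration provides it, so a missing
// `items` at the declaration site is missing in the emitted schema too.
function paramToSchema(p: ParamDef): JsonSchemaProp {
  const out: JsonSchemaProp = { type: p.type };
  if (p.description !== undefined) out.description = p.description;
  if (p.items !== undefined) out.items = paramToSchema(p.items);
  return out;
}

const broken = paramToSchema({ type: 'array', description: 'slugs' });
const fixed = paramToSchema({ type: 'array', items: { type: 'string' }, description: 'slugs' });

console.log(JSON.stringify(broken)); // {"type":"array","description":"slugs"}
console.log(JSON.stringify(fixed));  // {"type":"array","description":"slugs","items":{"type":"string"}}
```

Under this assumption, a converter that silently defaulted items would hide the bug rather than fix it, which is why the fix below touches only the operation declaration.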

Fix

-    entity_hints: { type: 'array', description: 'Existing canonical entity slugs the agent has already resolved. Helps the extractor pick the right slug.' },
+    entity_hints: { type: 'array', items: { type: 'string' }, description: 'Existing canonical entity slugs the agent has already resolved. Helps the extractor pick the right slug.' },

Scope check (other array params)

I scanned src/core/operations.ts and src/mcp/ for type: 'array' props missing items: and found only this one in MCP-exposed tool definitions. The only other hit (candidates: { type: 'array' } in src/core/resolvers/builtin/x-api/handle-to-tweet.ts:102) is an internal resolver outputSchema that doesn't go through buildToolDefs, so it doesn't reach OpenAI's tool validator. It is still valid JSON Schema, but it is underspecified (the array accepts any element type) and would fail the same strict check, so it is worth fixing if you want a clean sweep. I'm leaving that one out of this issue to keep scope tight; happy to file separately if useful.

Verified

I patched our pinned install (commit 9c60b3a, v0.31.3) on a live pod, restarted the MCP server, ran tools/list, and confirmed:

  • The emitted extract_facts.inputSchema.properties.entity_hints now has "items": {"type": "string"}.
  • All 61 gbrain tools pass a recursive "every type: array has items" check.
  • OpenAI strict-validator rejection no longer reproduces.
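For reference, the recursive "every type: array has items" check from the list above can be sketched roughly as follows. This is an illustration rather than the exact script I ran: findArraysMissingItems and the sample tool list (including the name example_tool) are made up for the example, and the input is assumed to be the parsed tools/list result.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Walk any JSON value and return the path of every object node that has
// type "array" but no "items" key.
function findArraysMissingItems(node: Json, path: string = '$'): string[] {
  if (node === null || typeof node !== 'object') return [];

  const hits: string[] = [];
  if (Array.isArray(node)) {
    for (let i = 0; i < node.length; i++) {
      hits.push(...findArraysMissingItems(node[i], `${path}[${i}]`));
    }
    return hits;
  }

  if (node.type === 'array' && !('items' in node)) hits.push(path);
  for (const key of Object.keys(node)) {
    hits.push(...findArraysMissingItems(node[key], `${path}.${key}`));
  }
  return hits;
}

// Hypothetical sample: one tool with the bug, one declared correctly.
const tools: Json = [
  {
    name: 'extract_facts',
    inputSchema: { type: 'object', properties: { entity_hints: { type: 'array' } } },
  },
  {
    name: 'example_tool',
    inputSchema: {
      type: 'object',
      properties: { pages_updated: { type: 'array', items: { type: 'string' } } },
    },
  },
];

console.log(findArraysMissingItems(tools));
// [ '$[0].inputSchema.properties.entity_hints' ]
```

An empty result means every array schema in the list declares its items. A check like this could also run as a unit test over the buildToolDefs output to catch regressions.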

Happy to send a PR if helpful. I wanted to surface this with the analysis first since it's a one-line change.
