chore: preserve one more schema layer during large tool compaction#27084

Merged
celia-oai merged 1 commit into main from dev/cc/schema-change on Jun 8, 2026

@celia-oai celia-oai commented Jun 8, 2026

Collaborator

Summary

Some customer MCP tools expose large input schemas that exceed Codex's compact schema budget even after description stripping. Today, the final compaction pass collapses complex schemas starting at depth 2, which can erase important shallow call structure such as small anyOf branches, required fields, and help-mode entry points. In one reported case, this degraded a tool schema into query: any | any, leaving the model without enough structure to discover the required help call.

This change raises the deep-schema collapse boundary from depth 2 to depth 3. That preserves one additional layer of the tool contract while still collapsing deeper expensive subtrees to {} when a schema remains over budget.
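
To illustrate the boundary change, here is a minimal sketch of depth-based collapsing. It is not the Codex implementation: `Node`, `collapse`, `render`, and `sample` are hypothetical names, and the sketch assumes the root sits at depth 0 and that every nested object counts as one level.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for a JSON schema node (not the Codex type).
#[derive(Debug, Clone, PartialEq)]
enum Node {
    /// A scalar leaf such as "string".
    Scalar(String),
    /// An object with named children.
    Object(BTreeMap<String, Node>),
}

/// Replace every object found at `max_depth` or deeper with an empty object.
/// Assumption: the root is depth 0 and each object nesting adds one level.
fn collapse(node: &Node, depth: usize, max_depth: usize) -> Node {
    match node {
        Node::Scalar(s) => Node::Scalar(s.clone()),
        Node::Object(children) => {
            if depth >= max_depth {
                return Node::Object(BTreeMap::new());
            }
            Node::Object(
                children
                    .iter()
                    .map(|(k, v)| (k.clone(), collapse(v, depth + 1, max_depth)))
                    .collect(),
            )
        }
    }
}

/// Render a node as compact JSON-like text so sizes and shapes can be compared.
fn render(node: &Node) -> String {
    match node {
        Node::Scalar(s) => format!("\"{s}\""),
        Node::Object(children) => {
            let parts: Vec<String> = children
                .iter()
                .map(|(k, v)| format!("\"{k}\":{}", render(v)))
                .collect();
            format!("{{{}}}", parts.join(","))
        }
    }
}

/// Build the chain a -> b -> c -> d -> "string" as a small test input.
fn sample() -> Node {
    let mut node = Node::Scalar("string".to_string());
    for key in ["d", "c", "b", "a"] {
        let mut map = BTreeMap::new();
        map.insert(key.to_string(), node);
        node = Node::Object(map);
    }
    node
}
```

Under those assumptions, `render(&collapse(&sample(), 0, 2))` gives `{"a":{"b":{}}}`, while a boundary of 3 gives `{"a":{"b":{"c":{}}}}`: one more layer of names survives before the subtree becomes `{}`.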

What Changed

  • Increased MAX_COMPACT_TOOL_SCHEMA_DEPTH from 2 to 3.
  • Updated the schema compaction traversal test to assert the new collapse boundary.
  • The resulting compacted shape keeps useful shallow structure, for example:
    • top-level argument names
    • shallow anyOf branches
    • required object fields
    • nested property names one level deeper than before

Validation

  • Ran just test -p codex-tools: 81 tests passed.
  • Ran a golden schema corpus comparison over 214 discovered tool input schemas under golden_schemas/*/mcp_tools/*/input_schema.json.
    • Depth 2 and depth 3 had identical percentile token counts across the corpus.
    • Both ended with 0 / 214 schemas over 1k tokens.
    • Both ended with 0 / 214 schemas over the 4,000-byte compact JSON budget.
    • Only one golden schema changed, increasing from 49 to 56 tokens, so this does not appear to introduce a meaningful corpus-wide regression.

Corpus percentile results:

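| Percentile | Depth 2 (tokens) | Depth 3 (tokens) |
|------------|------------------|------------------|
| p0         | 9                | 9                |
| p10        | 31               | 31               |
| p25        | 54               | 54               |
| p50        | 81               | 81               |
| p75        | 143              | 143              |
| p90        | 290              | 290              |
| p95        | 431              | 431              |
| p99        | 600              | 600              |
| max        | 832              | 832              |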
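
A comparison like this can be produced with a small percentile helper run once per depth setting. The following is a sketch that assumes the nearest-rank method; the PR does not state which percentile definition or tokenizer was used, and `percentile` and `summarize` are hypothetical names.

```rust
/// Nearest-rank percentile over an ascending slice of token counts.
/// Returns None for an empty slice or a percentile above 100.
fn percentile(sorted: &[usize], p: usize) -> Option<usize> {
    if sorted.is_empty() || p > 100 {
        return None;
    }
    // rank = ceil(p / 100 * n), clamped so that p0 maps to the smallest value.
    let rank = (p * sorted.len() + 99) / 100;
    Some(sorted[rank.max(1) - 1])
}

/// Sort the counts and report the same percentile rows as the table
/// (p100 corresponds to the "max" row).
fn summarize(mut counts: Vec<usize>) -> Vec<(usize, usize)> {
    counts.sort_unstable();
    [0, 10, 25, 50, 75, 90, 95, 99, 100]
        .iter()
        .filter_map(|&p| percentile(&counts, p).map(|v| (p, v)))
        .collect()
}
```

Calling `summarize` on the per-schema token counts for each depth and comparing the two result vectors row by row shows whether the depth change moved any percentile.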

@celia-oai celia-oai changed the title from "changes" to "chore: preserve one more schema layer during large tool compaction" Jun 8, 2026
@celia-oai celia-oai marked this pull request as ready for review June 8, 2026 22:54
@celia-oai celia-oai requested a review from pakrym-oai June 8, 2026 22:54

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Contributor

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0d1f9d3e58

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

 // schema budget.
 const MAX_COMPACT_TOOL_SCHEMA_BYTES: usize = 4_000;
-const MAX_COMPACT_TOOL_SCHEMA_DEPTH: usize = 2;
+const MAX_COMPACT_TOOL_SCHEMA_DEPTH: usize = 3;

Contributor

[P1] Keep large object schemas under the compaction budget

For MCP tools whose large schema has many second-level object properties, raising the collapse boundary to 3 preserves every property's type/properties wrapper, and there is no later fallback for pure object schemas because the next pass only prunes compositions. A schema shaped like event -> f000..f199 -> nested -> leaf compacted to under the 4KB budget at depth 2 but stays at around 10KB after this change, so the code can now send tool schemas well over the stated compact-schema budget instead of degrading them further.
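
The concern can be made concrete with a toy model. The sketch below is not part of the merged change and does not reproduce the byte counts quoted in the review: `render_wide` is a hypothetical renderer that emits only property names (no type/properties wrappers) for the event -> f000..f199 -> nested -> leaf shape, and `fit_to_budget` is one hypothetical way to degrade further by stepping the depth down until the output fits.

```rust
/// Byte budget taken from the PR; the depth ceiling matches the new constant.
const BUDGET_BYTES: usize = 4_000;
const MAX_DEPTH: usize = 3;

/// Hypothetical renderer for a wide schema with `fields` second-level
/// properties, keeping `depth` levels of names. Depths above 3 are treated
/// like 3 in this toy model.
fn render_wide(fields: usize, depth: usize) -> String {
    let body = match depth {
        0 => return "{}".to_string(),
        1 => String::new(),
        _ => (0..fields)
            .map(|i| {
                // At depth 3 each field keeps its "nested" child, collapsed to {}.
                let inner = if depth >= 3 { "\"nested\":{}" } else { "" };
                format!("\"f{i:03}\":{{{inner}}}")
            })
            .collect::<Vec<_>>()
            .join(","),
    };
    format!("{{\"event\":{{{body}}}}}")
}

/// Hypothetical fallback: start at MAX_DEPTH and step down until the
/// rendering fits the budget or depth 0 is reached.
fn fit_to_budget<F: Fn(usize) -> String>(render_at_depth: F) -> (usize, String) {
    let mut depth = MAX_DEPTH;
    loop {
        let out = render_at_depth(depth);
        if out.len() <= BUDGET_BYTES || depth == 0 {
            return (depth, out);
        }
        depth -= 1;
    }
}
```

In this toy model, 200 fields fit the budget at depth 2 but not at depth 3, so `fit_to_budget(|d| render_wide(200, d))` settles on depth 2, while a schema with only a few fields keeps all three levels.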

Collaborator Author

should be fine given the golden schema test

Collaborator Author

this is best effort not restrictive anyways

@celia-oai celia-oai enabled auto-merge (squash) June 8, 2026 23:05
@celia-oai celia-oai merged commit 6042e58 into main Jun 8, 2026
31 checks passed
@celia-oai celia-oai deleted the dev/cc/schema-change branch June 8, 2026 23:07
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 8, 2026