feat(tools): defer low-frequency built-in tools to reduce initial prompt size #4022
…mpt size Mark Monitor, SendMessage, Skill, TaskStop, TodoWrite, and WebFetch as shouldDefer=true so their full schemas are excluded from the initial function-declaration list. The model discovers them on demand via ToolSearch, aligning with Claude Code's deferral strategy for infrequently used tools.
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
wenshao
left a comment
…tem prompt
TodoWrite is mandated as high-frequency by the core system prompt
("Use VERY frequently", "IMPORTANT: Always use the TODO_WRITE tool to
plan and track tasks throughout the conversation"). Deferring it forces
a ToolSearch round-trip before every required call, adding latency and
hurting prompt compliance.
SkillTool's full description carries the dynamically generated
<available_skills> listing and the BLOCKING invocation rules
("invoke the relevant Skill tool BEFORE generating any other response").
The deferred-tool summary truncates this, so the model loses both the
list of skills and the activation contract from the initial prompt.
Keep Monitor / SendMessage / TaskStop / WebFetch deferred — those are
genuinely infrequent and have no system-prompt requirement.
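The truncation concern above can be illustrated with a minimal sketch. Everything here is hypothetical: the `summarizeDeferredTool` helper, the 80-character limit, and the sample description and skill names are assumptions made for illustration, not the project's actual implementation.

```typescript
// Hypothetical sketch: the helper name, the limit, and the sample text are assumptions.
interface ToolInfo {
  name: string;
  description: string;
}

const SUMMARY_LIMIT = 80; // assumed truncation length, chosen for the example

// Render one line of a deferred-tool summary: JSON-quoted name plus a truncated description.
function summarizeDeferredTool(tool: ToolInfo, limit: number = SUMMARY_LIMIT): string {
  const flat = tool.description.replace(/\s+/g, ' ').trim();
  const short = flat.length > limit ? flat.slice(0, limit) + '...' : flat;
  return `${JSON.stringify(tool.name)}: ${short}`;
}

// Made-up description in the spirit of the Skill tool: rules first, dynamic listing last.
const skillTool: ToolInfo = {
  name: 'skill',
  description:
    'Execute a skill within the main conversation. ' +
    'Invoke the relevant Skill tool BEFORE generating any other response. ' +
    '<available_skills>pdf, xlsx, review</available_skills>',
};

const summaryLine = summarizeDeferredTool(skillTool);
console.log(summaryLine);
// The dynamic listing sits past the cut-off, so it never reaches the initial prompt.
console.log(summaryLine.includes('<available_skills>')); // false
```

Any fixed-length summary has this property: content near the end of a long description is the first thing dropped.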
wenshao
left a comment
Review Summary
The 4 tool-deferral markers in this PR are correct — each tool gets appropriate shouldDefer=true, alwaysLoad=false, and a searchHint. The previously-flagged Skill and TodoWrite deferrals were correctly reverted. All 322 unit tests pass.
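As a rough illustration of how a searchHint could drive keyword matching, here is a minimal sketch. The `DeferredToolMeta` shape, the `searchDeferredTools` function, and the word-overlap scoring are assumptions, and the pairing of each hint string with a tool is inferred from the PR description. The real ToolSearch ranking may work differently.

```typescript
// Hypothetical metadata shape; only the three field names come from the review text.
interface DeferredToolMeta {
  name: string;
  shouldDefer: boolean;
  alwaysLoad: boolean;
  searchHint: string;
}

// Hint strings as listed in the PR description; the tool pairing is inferred.
const deferredTools: DeferredToolMeta[] = [
  { name: 'Monitor', shouldDefer: true, alwaysLoad: false, searchHint: 'monitor watch tail log stream background' },
  { name: 'SendMessage', shouldDefer: true, alwaysLoad: false, searchHint: 'send message task communicate notify' },
  { name: 'TaskStop', shouldDefer: true, alwaysLoad: false, searchHint: 'task stop cancel kill background' },
  { name: 'WebFetch', shouldDefer: true, alwaysLoad: false, searchHint: 'web fetch url http download content' },
];

// Assumed scoring: count query words that appear in the hint, then rank by that count.
function searchDeferredTools(query: string, tools: DeferredToolMeta[]): string[] {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return tools
    .filter((t) => t.shouldDefer && !t.alwaysLoad)
    .map((t) => {
      const hintWords = new Set(t.searchHint.toLowerCase().split(/\s+/));
      const score = terms.filter((term) => hintWords.has(term)).length;
      return { name: t.name, score };
    })
    .filter((r) => r.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((r) => r.name);
}

console.log(searchDeferredTools('download url', deferredTools)); // [ 'WebFetch' ]
```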
Suggestions (infrastructure — not in this diff, already in main)
These apply to the tool-search/prompts/tool-registry infrastructure merged previously. File against the main branch or address in a follow-up:
- select: comma splitting (tool-search.ts): split(',') would break on MCP tool names containing commas when the model pastes the JSON-quoted form. Low probability (MCP tool names rarely contain commas) but worth hardening.
- JSON.stringify round-trip gap: buildDeferredToolsSection renders names as JSON.stringify(name), but select: mode in tool-search only strips outer quotes, so names with embedded " or \\ won't resolve via select:.
- MCP descriptions semantically unfiltered: Structural defenses (JSON.stringify, truncation, guard prompt) prevent syntax escape but not semantic prompt injection via malicious MCP tool descriptions.
- No observability: revealDeferredTool / getFunctionDeclarations have no production logs, so deferred-tool state changes are silent, making headless debugging difficult.
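One possible way to harden the two select: issues is a quote-aware parser that decodes JSON-quoted entries with JSON.parse instead of splitting on every comma. The sketch below is only an illustration; `parseSelectNames` is a hypothetical helper, not a function in tool-search.ts, and the accepted input format is an assumption.

```typescript
// Hypothetical helper: splits the argument of a select: query into tool names.
// Entries may be bare (web_fetch) or JSON-quoted ("my,tool"). A quoted entry is
// decoded with JSON.parse so embedded commas, quotes, and backslashes survive.
function parseSelectNames(arg: string): string[] {
  const names: string[] = [];
  let i = 0;
  while (i < arg.length) {
    // Skip separators and spaces between entries.
    while (i < arg.length && (arg[i] === ',' || arg[i] === ' ')) i++;
    if (i >= arg.length) break;

    if (arg[i] === '"') {
      // Scan to the matching unescaped quote, stepping over escape pairs.
      let j = i + 1;
      while (j < arg.length && arg[j] !== '"') {
        j += arg[j] === '\\' ? 2 : 1;
      }
      if (j >= arg.length) {
        // Unterminated quote: treat the remainder as one bare name.
        names.push(arg.slice(i).trim());
        break;
      }
      const literal = arg.slice(i, j + 1);
      try {
        names.push(JSON.parse(literal) as string);
      } catch {
        // Not valid JSON: fall back to stripping the outer quotes only.
        names.push(literal.slice(1, -1));
      }
      i = j + 1;
    } else {
      // Bare entry: read up to the next comma.
      let j = i;
      while (j < arg.length && arg[j] !== ',') j++;
      const bare = arg.slice(i, j).trim();
      if (bare) names.push(bare);
      i = j;
    }
  }
  return names;
}

// Round trip: names rendered with JSON.stringify resolve back to the originals.
const trickyNames = ['web_fetch', 'my,tool', 'say "hi"', 'back\\slash'];
const rendered = trickyNames.map((n) => JSON.stringify(n)).join(',');
console.log(parseSelectNames(rendered));
```

Decoding with JSON.parse keeps the parser symmetric with JSON.stringify on the rendering side, so the two cannot drift apart on escaping rules.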
— deepseek-v4-pro via Qwen Code /review
tanzhenxin
left a comment
Review
The deferral lands cleanly. Constructor positional args match DeclarativeTool's signature exactly, BaseDeclarativeTool doesn't override it, and the three fallback paths (subagent surfaces via filtered declarations, resume-history reveal, eager-reveal when tool_search is excluded) all check out. The call to hold TodoWrite and Skill eager is right — the second commit shows you caught that Skill's description carries dynamic <available_skills> content that would be lost when truncated in the deferred-tools section.
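The three fallback paths can be pictured with a toy registry. This is a sketch under stated assumptions: the class `SketchToolRegistry`, its fields, the `revealFromHistory` helper, and the tool names `read_file` and `web_fetch` are hypothetical, and it returns tool names where the real code would return full schemas. Only the method names getFunctionDeclarations, getFunctionDeclarationsFiltered, and revealDeferredTool come from the discussion.

```typescript
// Hypothetical declaration shape used only for this sketch.
interface ToolDecl {
  name: string;
  shouldDefer?: boolean;
  alwaysLoad?: boolean;
}

class SketchToolRegistry {
  private readonly tools: ToolDecl[];
  private readonly revealed = new Set<string>();

  constructor(tools: ToolDecl[]) {
    this.tools = tools;
  }

  // Main-agent list: deferred tools stay hidden until revealed. If tool_search
  // itself is not registered, deferred tools are exposed eagerly (assumed rule).
  getFunctionDeclarations(): string[] {
    const searchAvailable = this.tools.some((t) => t.name === 'tool_search');
    return this.tools
      .filter(
        (t) => !t.shouldDefer || t.alwaysLoad === true || this.revealed.has(t.name) || !searchAvailable,
      )
      .map((t) => t.name);
  }

  // Subagent list: filters by allowed names only and ignores shouldDefer.
  getFunctionDeclarationsFiltered(allowed: string[]): string[] {
    return this.tools.filter((t) => allowed.includes(t.name)).map((t) => t.name);
  }

  // Called after the model discovers a tool via search.
  revealDeferredTool(name: string): void {
    if (this.tools.some((t) => t.name === name && t.shouldDefer)) {
      this.revealed.add(name);
    }
  }

  // Resume path: reveal every deferred tool that the restored history already called.
  revealFromHistory(calledToolNames: string[]): void {
    for (const name of calledToolNames) this.revealDeferredTool(name);
  }
}

const registry = new SketchToolRegistry([
  { name: 'tool_search' },
  { name: 'read_file' },
  { name: 'web_fetch', shouldDefer: true, alwaysLoad: false },
]);

console.log(registry.getFunctionDeclarations()); // web_fetch hidden
console.log(registry.getFunctionDeclarationsFiltered(['web_fetch'])); // subagent still sees it
registry.revealFromHistory(['web_fetch']);
console.log(registry.getFunctionDeclarations()); // web_fetch now included
```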
Verdict
APPROVE.
Summary
- shouldDefer=true: Monitor, SendMessage, TaskStop, and WebFetch
- searchHint for ToolSearch keyword matching
Motivation
PR #3589 introduced the ToolSearch deferred-tool mechanism but only deferred Cron, AskUserQuestion, ExitPlanMode, LSP, and MCP tools. Claude Code defers many more built-in tools. This PR closes the gap to further reduce initial prompt token usage.
Tools deferred
- Monitor: monitor watch tail log stream background
- SendMessage: send message task communicate notify
- TaskStop: task stop cancel kill background
- WebFetch: web fetch url http download content
Not deferred (reverted in ca7e1a2):
- <available_skills> listing and BLOCKING invocation rules that would be lost in the truncated deferred-tool summary
Safety
- getFunctionDeclarationsFiltered (used by subagents) bypasses shouldDefer filtering, so subagents still get full tool schemas
- client.ts)
Test plan
- npx vitest run packages/core/src/tools/ (1071 tests passed)
- npx vitest run packages/core/src/core/client.test.ts (96 tests passed)
- npx vitest run packages/core/src/core/prompts.test.ts (60 tests passed)
- npx tsc --noEmit (no type errors)