fix(tool-registry): add lazy factory registration with inflight concurrency dedup#3297
Conversation
- Add inflight map to deduplicate concurrent ensureTool() calls - stop() awaits inflight promises before disposing tools - copyDiscoveredToolsFrom() iterates source.tools directly - getAllTools/getFunctionDeclarationsFiltered warn when factories pending - registerLazy catches isToolEnabled() exceptions gracefully - Add concurrency, retry, stop() inflight, and getAllToolNames() tests Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
When ensureTool() finds a cached tool, delete any leftover factory entry for the same name so warmAll() does not attempt to re-load an already instantiated tool. Also replace @ts-expect-error with explicit (scheduler as any) cast in coreToolScheduler.test.ts to fix the TS2578 unused-directive error. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
📋 Review SummaryThis PR implements a lazy tool registration system in 🔍 General Feedback
🎯 Specific Feedback🔴 CriticalFile: File: 🟡 HighFile: File: File: 🟢 MediumFile: File: File: 🔵 LowFile: File: File: File: ✅ Highlights
|
docs/startup-optimization-analysis.md is an internal research artifact and should not be part of the public PR. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
…ttled in warmAll - Narrow `inflight` map type from Promise<AnyDeclarativeTool | undefined> to Promise<AnyDeclarativeTool> (factory always resolves to a tool or throws) - Change warmAll() from Promise.all to Promise.allSettled so a single factory failure no longer prevents other factories from being awaited; failures are logged as warnings instead of propagated as exceptions Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
70b43b4 to
1ffefdf
Compare
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
Resolve PR #3297 conflicts with main while preserving lazy tool factory registration. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Resolve latest conflicts in core lazy tool registration paths and agent runtime integration. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
wenshao
left a comment
There was a problem hiding this comment.
[Suggestion] warmAll() now swallows factory/import/constructor failures via Promise.allSettled(...); debugLogger.warn(...), but Config.initialize() still treats await this.toolRegistry.warmAll() as startup preparation for built-in tools. A broken core tool can now fail to load during startup while session initialization still succeeds, leaving users with a partially initialized toolset and delayed runtime failures. Consider making startup warming fail fast for built-in/core tool factories, or adding a strict warm path used by Config.initialize().
— gpt-5.4 via Qwen Code /review
1. warmAll() strict mode: add `{ strict?: boolean }` option so
Config.initialize() can fail fast when a built-in tool factory
throws, instead of silently leaving the session partially
initialised.
2. Backward-compatible type re-exports: restore `export type { ... }`
for all tool classes removed from the package root (EditTool,
ShellTool, AgentTool, cron tools, etc.). TypeScript consumers
retain type compatibility at zero runtime cost. Runtime value
imports of these classes remain a documented breaking change and
must use the direct module path.
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Fixed in 63ccc39. Added |
Add three test cases for warmAll({ strict: true }):
- throws when a factory fails
- does not throw in non-strict mode (default)
- still loads successful tools before throwing in strict mode
Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ — gpt-5.4 via Qwen Code /review
…rrency dedup (QwenLM#3297) Closes QwenLM#3221. Introduces a lazy factory API on ToolRegistry (registerFactory, ensureTool, warmAll, getAllToolNames) as infrastructure for future esbuild code-splitting (QwenLM#3226). With the current single-bundle build, the lazy API does not change startup time on its own — the primary immediate value is fixing three pre-existing bugs uncovered while designing it. Bug fixes: - Concurrent instantiation (P0): the original ensureTool had no concurrency protection around `await factory()` — two concurrent calls for the same tool both passed the cache check and each ran the factory, producing two instances. AgentTool and SkillTool register SubagentManager listeners in their constructors, so the extra instance leaked listeners. Fix: a per-name `inflight: Map<string, Promise<Tool>>` so concurrent ensureTool() calls share a single promise. On factory rejection the inflight entry is cleared so a subsequent call can retry. - stop() resource leak: stop() only disposed tools already in `this.tools`; tools still loading in `inflight` when stop() ran finished afterward and were never disposed. Fix: await Promise.allSettled(inflight.values()) before the dispose loop. - Cache hit left stale factory: ensureTool's cache-hit branch did not delete the factory entry, so warmAll() would re-invoke the factory for an already-loaded tool. Fix: delete the factory on cache hit. Additional hardening in response to review feedback: - warmAll({ strict?: boolean }): strict mode re-throws the first factory failure rather than swallowing it. Config.initialize() uses strict: true so a broken built-in tool fails startup fast instead of silently leaving a partially initialized registry; runtime-path callers (GeminiChat, agent runtime, etc.) continue to use the non-strict default and log failures via debugLogger. - getAllTools() and getFunctionDeclarationsFiltered() emit a debug warning when called while unloaded factories remain, nudging callers toward warmAll() without hard-breaking existing code paths. - copyDiscoveredToolsFrom() now iterates source.tools.values() directly instead of source.getAllTools() — the copy path deals only with already-discovered MCP/command tools and should not trigger the unloaded-factory warning. - MemoryTool and SkillTool config parsing was extracted into memory-config.ts and skill-utils.ts so a factory can resolve tool metadata without importing the tool module. Tests: - tool-registry.test.ts adds 128 lines covering: concurrent ensureTool runs the factory exactly once, warmAll and ensureTool overlap, retries succeed after a prior factory failure, stop() disposes tools that finish loading after stop was called, and warmAll strict vs default behavior. - 33 existing call sites across cli, core, agents, and subagents were updated to await warmAll() before bulk tool access.
Closes #3221
Summary
本 PR 在
ToolRegistry中引入了惰性工厂注册机制,主要解决以下问题:核心问题修复
1. 并发实例化 bug(P0)
原来的
ensureTool()在await factory()期间没有并发保护:两个同时到达的调用都能通过缓存检查,各自执行一次 factory,导致同一工具被实例化两次。AgentTool和SkillTool的构造函数会注册 SubagentManager 变更监听器,重复实例化会泄漏第一个实例的监听器。修复: 引入
inflight: Map<string, Promise<...>>—— 同名工具的并发加载复用同一个 promise,factory 只执行一次。失败时清理 inflight 以允许后续重试。2.
stop()资源泄漏stop()原来只 dispose 已加载的this.tools,但正在 inflight 中加载的工具完成后不会被 dispose。修复:
stop()先await Promise.allSettled(this.inflight.values()),再遍历 dispose,确保无遗漏。3. 缓存命中后 factory 残留
ensureTool()在缓存命中时未删除对应 factory,导致warmAll()会尝试重新加载已实例化的工具。修复: 缓存命中分支增加
this.factories.delete(name)。新增 API
registerFactory(name, factory)ensureTool(name)warmAll()getAllToolNames()说明
Test plan
Level 1 — Unit Tests
Expected: 306 tests passed, 2 skipped.
Key test cases:
tool-registry.test.tsensureTool concurrency > two concurrent calls run factory only oncetool-registry.test.tsensureTool concurrency > warmAll and ensureTool overlaptool-registry.test.tsensureTool concurrency > retries after factory failuretool-registry.test.tsstop > disposes tools that finish loading after stop is calledLevel 2 — Startup Verification
Expected: REPL starts without errors, no
Failed to warm tool factorywarnings.Level 3 — Per-tool Functional Verification
Send the following prompts in sequence to verify each tool module loads and executes correctly:
读取 package.json 的内容read_file在 /tmp 创建文件 test-lazy.txt,内容为 hellowrite_file把 /tmp/test-lazy.txt 里的 hello 改成 worldedit执行 echo $PWDshell在 packages/core/src 里搜索关键词 ensureToolgrep/ripgrep列出 packages/core/src/tools 目录的文件ls抓取 https://example.com 页面的标题web_fetch人工验证:






读取 package.json 的内容
在 /tmp 创建文件 test-lazy.txt,内容为 hello
执行 echo $PWD
在 packages/core/src 里搜索关键词 ensureTool
列出 packages/core/src/tools 目录的文件
Level 4 — Concurrent Tool Loading
Expected: all three results returned successfully, no duplicate-instantiation warnings.

Level 5 — Agent and Skill Tools
AgentToolandSkillToolregister SubagentManager listeners in their constructors — the primary risk area for the listener-leak bug.Expected: skill list displayed without errors. If subagents are configured, trigger one and confirm it completes without listener-leak warnings.

Level 6 — MCP Tools (if configured)
copyDiscoveredToolsFrom()was modified. Verify MCP tools still appear in the tool list and are callable.Level 7 — Full Regression
npm testNote: failures in
integration-tests/sdk-typescript/tool-control.test.tsare pre-existing — they require a live API endpoint and are unrelated to this PR.🤖 Generated with Claude Code