feat(core): Workflow P3 — agent({schema, agentType, model, isolation:'worktree'}) (#4721)#5034
…'worktree'}) (#4721)

Adds the P3 dispatch options to the workflow runtime, completing the contract qwen-code's workflow tool matches against upstream Claude Code 2.1.168. P1/P2 stubs (workflow-sandbox.ts:508-527) are replaced with production paths routed through `SubagentManager.createAgentHeadless` so per-call model overrides go through `buildRuntimeContentGeneratorView` (provider routing), per-agent MCP servers / hooks get isolated lifecycles, and worktree-isolated subagents run against a rebound Config.

- agent({agentType: 'X'}) resolves against the declarative-agents registry (#4842 + #4996) via findSubagentByName; unresolved names throw "agent({agentType}): agent type 'X' not found" verbatim from upstream.
- agent({model: 'qwen3-max'}) is threaded into SubagentConfig.model so the runtime view sees it (modelConfigOverrides alone would only swap the model name within the existing provider's view).
- Workflow's disallowed-tool floor [SendMessage, ExitPlanMode] is unioned with the agentType's own disallowedTools so a permissive agentType cannot re-enable them for a workflow subagent.
- agent({isolation: 'worktree'}) provisions a fresh worktree via GitWorktreeService.createUserWorktree (slug agent-<7hex>, mirrors AgentTool 1849-1963), rebinds cwd/getTargetDir/getFileService/getWorkspaceContext on a prototype-chained Config override, and on completion auto-removes the worktree if clean or preserves the path + branch (appended to the result string) when the subagent left changes. Parent-dirty trees are refused with a clear error to avoid silently running the subagent against a stale HEAD.
- agent({isolation: 'remote'}) throws "agent({isolation:'remote'}) is not available in this build" verbatim (upstream 2.1.168 parity).
- agent({schema: S}) injects a per-call SyntheticOutputTool (existing tools/syntheticOutput.ts, AJV-backed) into a fresh per-subagent ToolRegistry built via rebuildToolRegistryOnOverride, then watches AgentEventEmitter TOOL_CALL/TOOL_RESULT events for `structured_output` invocations. A successful call's args are captured as the dispatch return value (object, not string); after two failed attempts the third failure aborts the dispatch and throws "subagent completed without calling StructuredOutput (after 2 in-conversation nudges)" verbatim. No agent-core.ts changes: the entire 2-nudge counter lives in the dispatch layer so the shared subagent loop is unaffected.

The sandbox's agent() wrapper now revives per-call object returns into the vm realm (JSON round-trip inside the vm runInContext block), closing the same T1/T8/T14 host-prototype-escape vector that P2's per-element revival closed for parallel/pipeline. Two new sandbox security tests (constructor-chain probe + non-JSON-serializable collapse) regress this.

WorkflowAgentResult widens from `string` to `string | object`; the fast-path (no agentType/model/isolation/schema) is preserved byte-for-byte to keep P1/P2 zero-overhead.

Tests: 159 workflow-suite tests + 217 adjacent (subagents / syntheticOutput / agent-override) all green. Real-LLM E2E follow-up planned (mirroring P2's 13/13 qwen3-max validation).

Related #4721 (parent design; multi-phase, not closed by this PR)
Related #4732 (P1 merged) #4947 (P2 merged) #4842 #4996 (declarative agents)
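The schema-mode counter described above can be sketched as a small state machine. This is a minimal illustration, not the PR's code: `SchemaModeStateSketch`, `ToolResultEventSketch`, and the `shouldAbort` flag are hypothetical names, and the event shape is an assumption. Only the tool name `structured_output` and the terminal error string come from the description.

```typescript
// Hypothetical event shape; the real AgentEventEmitter payload may differ.
interface ToolResultEventSketch {
  toolName: string;
  success: boolean;
  args?: unknown;
}

// Counts failed structured_output attempts in the dispatch layer.
// Two failures are tolerated (the "nudges"); the third one aborts.
class SchemaModeStateSketch {
  private failures = 0;
  private captured: { args: unknown } | null = null;
  shouldAbort = false;

  onToolResult(ev: ToolResultEventSketch): void {
    if (ev.toolName !== 'structured_output') return;
    if (this.captured !== null || this.shouldAbort) return;
    if (ev.success) {
      // A validated call's args become the dispatch return value.
      this.captured = { args: ev.args };
      return;
    }
    this.failures += 1;
    if (this.failures > 2) this.shouldAbort = true;
  }

  result(): unknown {
    if (this.captured !== null) return this.captured.args;
    throw new Error(
      'subagent completed without calling StructuredOutput (after 2 in-conversation nudges)',
    );
  }
}
```

Keeping the counter in a per-call object like this is one way to leave the shared subagent loop untouched, which is the design goal the description states.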
Thanks for the PR! Template looks good ✓: all required sections are present, it is bilingual, and the test plan is detailed.

On direction: This is Phase 3 of the Dynamic Workflows port (P1 #4732, P2 #4947 already merged), completing the

On approach: The scope is large (+1262/-81, 7 files) but well-justified: four distinct capabilities ( The

Moving on to code review. 🔍

Chinese notes

Thanks for the contribution! The template is complete ✓: all required sections are present, it is bilingual, and the test plan is detailed.

Direction: This is P3 of the Dynamic Workflows port (P1 #4732 and P2 #4947 are already merged), completing

Approach: The scope is large (+1262/-81, 7 files) but well justified: four different capabilities (

Moving on to code review 🔍

Qwen Code · qwen3.7-max
Code Review

Independent proposal (before reading diff): To add

Comparison with the diff: The PR matches this proposal point-for-point and exceeds it in several ways: the double-finally for dispose + worktree cleanup, the child AbortController for schema-mode early termination, the parent-dirty refusal for worktree isolation, and the per-call

No critical blockers found. Specific observations:

Test Results

Workflow suite (4 files, 160 tests)

Adjacent regression (10 files, 217 tests)

Typecheck & Lint

Real-scenario tmux testing

Not applicable. The Workflow tool is gated behind

The PR author explicitly noted real-LLM E2E as a follow-up commit, and the remaining risk is model-behavior shape ("does the model actually call

Chinese notes

Code Review

Independent proposal (before reading the diff): To give

Comparison with the diff: The PR aligns with the proposal above point by point and exceeds expectations in these areas: the double-finally for dispose + worktree cleanup, the child AbortController for early termination in schema mode, the parent-dirty refusal for worktree isolation, and concurrent

No critical blocking issues. Specific observations:

Test Results

Real-scenario tmux testing

Not applicable. The Workflow tool is gated by

The PR author explicitly marked real-model E2E as a follow-up commit. The remaining risk is model behavior ("will the model call

Qwen Code · qwen3.7-max
Reflection

This is the third phase of a well-sequenced port. P1 laid the dispatch foundation, P2 added parallel/pipeline, and P3 completes the per-call

Going back to my independent proposal from Stage 2a: the PR matches it point-for-point and exceeds it in the areas that matter: the double-finally for dispose + worktree cleanup, the child AbortController for schema early termination, and the per-call

The code reads like someone who understands the upstream codebase deeply and is porting with intent, not blindly copying. The error messages are verbatim from upstream for script compatibility. The security thinking (JSON revival for the vm-realm boundary, constructor-chain regression tests) is the kind of detail that separates a working implementation from a shippable one.

377 tests, typecheck clean, lint clean. The only gap is real-LLM E2E, which the author explicitly committed to as a follow-up, and the remaining risk there is model behavior shape, not code correctness.

Approving. ✅

Chinese notes

Reflection

This is the third phase of a well-paced port. P1 laid the dispatch foundation, P2 added parallel/pipeline, and P3 completes the per-call

Looking back at the independent proposal from Stage 2a: the PR aligns with it point by point and exceeds it in the key areas: the double-finally for dispose + worktree cleanup, the child AbortController for early schema termination, and the per-call

The code reads as a port done with intent by someone who understands the upstream codebase deeply, not a blind copy. The error messages match the upstream literals to keep scripts compatible. The security thinking (JSON revival at the vm-realm boundary, constructor-chain regression tests) is the kind of detail that separates "it runs" from "it can ship".

All 377 tests are green, typecheck is clean, and lint is clean. The only gap is real-model E2E, which the author has explicitly committed to follow up on; the remaining risk is in model behavior, not code correctness.

Approved ✅

Qwen Code · qwen3.7-max
qwen-code-ci-bot
left a comment
LGTM, looks ready to ship. ✅
Code Coverage Summary
CLI Package - Full Text Report

Core Package - Full Text Report

For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run.
…st gaps

R1 of pre-push adversarial self-review on PR #5034 surfaced 6 confirmed findings across 6 diverse lenses (correctness / security / reuse-altitude / self-invariant / consumer-breakage / test-gaps). Each finding faced 2 independent skeptics defaulting to refuted=true; 6 survived majority challenge.

Source code:
- Worktree-preserved suffix wording now matches AgentTool's formatWorktreeSuffix (agent.ts:1700-1719) verbatim, including the `git worktree add <path> <branch>` recovery hint for the directory-removed-but-branch-preserved race.

Test gaps closed:
- schema-mode success after 1 nudge (round-2 args captured)
- schema-mode success after 2 nudges (round-3 args captured)
- schema-mode + agentType together: floor disallowedTools still unioned
- schema-mode caller-abort takes priority over the StructuredOutput terminal error (signal.aborted check at workflow-orchestrator.ts:489-490)
- override path dispose() runs in finally on the success path
- override path dispose() runs in finally on the terminate-mode-error path

Declined R1 finding: negative tests for invalid opt types (schema/model/agentType passed null/number/empty-string). Adding upfront type validation is scope creep: upstream does not, P1/P2 do not, and the workflow tool is model-authored where these inputs are extremely unlikely. Existing AJV / SubagentManager downstream errors are descriptive enough. Will revisit if R2 makes a stronger case.

166/166 tests pass (workflow suite + adjacent + workflow-orchestrator). typecheck + lint clean across packages/core, packages/cli, integration-tests, sdk, webui.
…itize + 12 tests

R2 of pre-push adversarial self-review on PR #5034. 6 diverse-lens finders (60 agents, ~2.5M tokens, 24 min) over the R1-fix-applied code, with 2 independent skeptics defaulting to refuted=true. 12 confirmed survivors after adversarial verify; decisions below.

Security (FIX):
- agent() wrapper in workflow-sandbox.ts now JSON-revives agentOpts inside the vm runInContext block BEFORE passing them to the host dispatch. Closes a Proxy/inherited-getter escape that P3 introduced along with the user-supplied schema object: a script could have wrapped agentOpts.schema in a Proxy whose getter ran host-side code during SyntheticOutputTool construction / AJV compile. Same mechanism as args / parallel-result revival.
- runOverridePath now sanitizes opts.agentType through sanitizeForErrorMessage() (control chars → space) before interpolation into the "agent type 'X' not found" error message. Prevents a model-authored agentType containing CRLF / NUL from fragmenting a single-line error across log records / OTLP fields.

Reuse-altitude (FIX):
- Added JSDoc block to WorkflowWorktreeIsolation interface documenting each field's role for cleanup.

Test gaps (FIX, 12 new tests):
- agentType control-char sanitization regression
- dispose() runs in finally when subagent.execute throws
- isolation:'worktree' provision error branches (5): nested parent / git unavailable / not a git repo / parent dirty / createUserWorktree returns failure
- isolation:'worktree' cleanup branches (3): removeUserWorktree fails / branchPreserved race / removeUserWorktree throws; each preserves the worktree (or branch) with the right user-facing suffix
- combinations (2): model + isolation:'worktree' threads model AND provisions worktree; schema + isolation:'worktree' returns structured payload verbatim (preserved suffix only on string return)

Test infrastructure: vi.mock'd GitWorktreeService at the module level (partial mock; preserves the existing exports the unrelated worktreeCleanup.ts depends on) with a per-test beforeEach reset.

Declined R2 findings (kept the R1 line):
- [major] Schema parameter upfront validation: same scope-creep decline as R1. Upstream doesn't do it; AJV's downstream error is descriptive enough.
- [major] Worktree provision extracted to shared util with AgentTool: agreed in principle but out of P3 scope. A separate refactor PR should land that with AgentTool maintainers in the loop.

178/178 tests pass (workflow + adjacent suites). typecheck + lint clean across packages/core, packages/cli, integration-tests, sdk, webui.
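The agentType sanitization in this commit can be illustrated with a short sketch. The function below is a hypothetical stand-in, not the real `sanitizeForErrorMessage()`: it assumes "control chars" means the ASCII C0 range plus DEL, and it reuses the error wording quoted in the PR description.

```typescript
// Hypothetical stand-in for sanitizeForErrorMessage(); the real helper may
// cover a different character set. Each control character becomes a space.
function sanitizeControlCharsSketch(value: string): string {
  return value.replace(/[\u0000-\u001f\u007f]/g, ' ');
}

// Builds the single-line "not found" message from a model-authored agentType.
function agentTypeNotFoundMessage(agentType: string): string {
  return `agent({agentType}): agent type '${sanitizeControlCharsSketch(agentType)}' not found`;
}
```

With this shape, an agentType such as `'a\r\nb'` stays on one log line because the CR and LF are each replaced by a space.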
Runtime verification: real TUI, tmux-driven (maintainer merge reference)

Verdict: PASS. Built PR head

Claim (my read of the diff)

P3 stops the sandbox from rejecting

Method

Steps: before/after, all at the
| agent() option | Baseline (9b4ba60e, P1/P2) | PR (6b4d7216, P3) |
|---|---|---|
| isolation:'remote' | stub "scheduled for a later phase" | "not available in this build" |
| isolation:'worktree' | stub "scheduled for a later phase" | worktree provisioned + subagent ran + lifecycle |
| schema (success) | stub "scheduled for P3" | returns validated object |
| schema (no tool call) | n/a (stub) | "after 2 in-conversation nudges" |
| agentType (not found) | stub "not supported in P1" | "agent type 'X' not found" |
| agentType:'Explore' | stub "not supported in P1" | resolves + routes → result |

Findings
- All four P3 options work at the real surface, and each shows the exact upstream-aligned message/behavior the PR claims. The `worktree` path is the most convincing: a real git worktree + branch appeared under `.qwen/worktrees/` and the subagent ran inside it, then the lifecycle appended the preserve note.
- The worktree completed via the preserve branch even though the subagent didn't change anything, matching the pre-existing `hasWorktreeChanges` / `.qwen-session` marker quirk the PR description flags under S7. I confirmed it's a clean run that still preserves; it's the shared AgentTool cleanup-detection path, not something P3 introduced. Not a blocker.
- ⚠️ `agent({model})` was the one option I did not exercise at the TUI. Observing it needs two genuinely different providers to see the routing actually switch; my single mock provider can't show that. It's covered by the author's real-LLM E2E (S-series) and the `model is threaded into SubagentConfig.model` unit test. Flagging it as the gap in my runtime evidence.
- The subagent token stream is mocked (deterministic, no API key), so this verifies P3's dispatch / sandbox / worktree-lifecycle wiring inside the real CLI process, not model quality. The author's 7/7 run against qwen3-coder-plus covers the live-model side; the two are complementary.
- Scaffolding note (not about the PR): the mock is stateless and keyed off the tools advertised per request. When I added the schema-fail toggle I restarted it before re-running, after the lesson from PR #5020 (fix(cli): drop tool calls after cancellation) about stale mock processes.
Build: npm install && npm run build on both worktrees, clean. Tests not re-run locally (CI + the author's 159+217 suite cover them); this is runtime evidence only.
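Chinese version (verification report)

Runtime verification: tmux-driven real TUI (merge reference)

Verdict: PASS. Built PR head 6b4d7216 locally and drove the real workflow tool in the real TUI (QWEN_CODE_ENABLE_WORKFLOWS=1), verifying all four of P3's agent() options end to end. Each option has a clean before/after comparison against the PR's merge-base 9b4ba60e (P1/P2): the same script hits the sandbox's "scheduled for …" stub on the baseline.

My read of the diff

P3 makes the sandbox stop rejecting agent({schema|model|isolation|agentType}) and route to the real dispatch implementation instead: schema injects a per-call structured_output tool and returns the validated args as an object; agentType resolves via SubagentManager.findSubagentByName (unresolved → verbatim error); isolation:'worktree' opens a git worktree under .qwen/worktrees/agent-<hex>; isolation:'remote' throws "not available in this build". The diff matches this.

Method

- The PR worktree plus a baseline worktree at merge-base `9b4ba60e` (P1/P2, so the only difference is P3), both built with `npm install && npm run build`.
- The real TUI (`--approval-mode yolo`, `QWEN_CODE_ENABLE_WORKFLOWS=1`) running in tmux, with `$HOME` isolated.
- A local mock OpenAI server plays three roles depending on the tools declared in each request: the main agent (sees `workflow`) emits a `workflow` tool call carrying the test script; the schema-mode subagent (sees `structured_output`) emits that tool call; a plain subagent (Explore/worktree) returns text. So the main agent really calls the `workflow` tool, the orchestrator really executes the script, and `agent()` really dispatches subagents through the P3 code; only the model's token stream is mocked.

Steps: before/after, all observed at the workflow tool card

- `agent({isolation:'remote'})`. PR: "not available in this build"; baseline: "not supported in P1 ... scheduled for a later phase".
- ✅ `agent({schema})` success path. The PR returns a real object `{"primary_color":"blue","confidence":0.9}`, mock roles `main→sub-schema→main` (the subagent really called `structured_output` and the event listener captured its args); baseline: "scheduled for P3".
- 🔍 `agent({schema})` failure path (PR). The subagent insists on answering in text and never calls `structured_output`: "subagent completed without calling StructuredOutput (after 2 in-conversation nudges)" (the upstream-verbatim terminal error).
- `agent({agentType:'NoSuchAgent_42xyz'})`. PR: "agent type 'NoSuchAgent_42xyz' not found"; baseline: "not supported in P1".
- ✅ `agent({agentType:'Explore'})` resolves and routes (PR). `✓ Workflow`, result "GREEN", roles `main→sub-plain→main` (the built-in Explore resolved and the subagent ran on its tool surface); baseline: "not supported in P1".
- ✅ `agent({isolation:'worktree'})` full lifecycle (PR). `✓ Workflow`, and the worktree was really created: `git worktree list` shows `.qwen/worktrees/agent-6b2b00d [worktree-agent-6b2b00d]`, the subagent ran inside it, and on completion the result had "[worktree preserved at ... on branch ...]" appended. Baseline: "scheduled for a later phase", and no worktree directory was created (the stub rejected before provisioning).
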
| agent() option | Baseline (9b4ba60e, P1/P2) | PR (6b4d7216, P3) |
|---|---|---|
| isolation:'remote' | stub "scheduled for a later phase" | "not available in this build" |
| isolation:'worktree' | stub "scheduled for a later phase" | worktree created + subagent ran + lifecycle |
| schema (success) | stub "scheduled for P3" | returns validated object |
| schema (no tool call) | n/a (stub) | "after 2 in-conversation nudges" |
| agentType (not found) | stub "not supported in P1" | "agent type 'X' not found" |
| agentType:'Explore' | stub "not supported in P1" | resolves + routes → result |

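Observations

- All four P3 options work at the real surface, and each shows the upstream-aligned message/behavior the PR claims. The worktree path is the most convincing: a real git worktree + branch appeared under `.qwen/worktrees/`, the subagent ran inside it, and the lifecycle appended the preserve note.
- The worktree took the preserve branch even though the subagent changed nothing, consistent with the pre-existing `hasWorktreeChanges` / `.qwen-session` marker quirk the PR description flags under S7. I confirmed this is a clean run that is still preserved; it is the cleanup-detection path shared with AgentTool, not something P3 introduced, and it does not block the merge.
- ⚠️ `agent({model})` is the only option I did not verify at the TUI. Observing it needs two genuinely different providers to see the routing switch; my single mock provider cannot show that. It is covered by the author's real-LLM E2E and the "model is threaded into SubagentConfig.model" unit test. Flagged as the gap in my runtime evidence.
- The subagent's token stream is mocked (deterministic, no API key needed), so this verification proves the wiring of P3's dispatch / sandbox / worktree lifecycle inside the real CLI process, not model quality. The author's 7/7 run on qwen3-coder-plus covers the real-model side; the two are complementary.
- Scaffolding note (unrelated to the PR): the mock is stateless and picks its role from the tools declared per request. After adding the schema-fail toggle I restarted it before re-running, having learned the lesson about stale mock processes from PR #5020 (fix(cli): drop tool calls after cancellation).

Build: `npm install && npm run build` passes cleanly on both worktrees. Tests were not re-run locally (CI + the author's 159+217 suite cover them); this report provides runtime evidence only.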
wenshao
left a comment
Reviewed the full diff plus the API contracts it links against, and ran the suites locally.
Verified locally (fresh worktree at 6b4d721):
- 160/160 workflow suite + 209/209 adjacent regression (subagents / syntheticOutput / agent-override), all green
- `tsc --noEmit` and `eslint` clean on touched files
Contract checks that hold (each verified against the actual implementations, not just the PR text):
- Schema override sets `TOOL_REGISTRY_REBUILT`, so `buildSubagentContextOverride` skips its own rebuild and the per-call `SyntheticOutputTool` survives into the subagent's registry. This is the most fragile interaction in this design, and it is handled correctly.
- Validation failures surface as `TOOL_RESULT success:false` through CoreToolScheduler, so the 3-failure abort counting is real; agent-core's un-emitted-callId backfill keeps `pendingArgs` from dangling.
- Ephemeral config (no `tools`, floor disallowedTools) → `convertToRuntimeConfig` yields `tools: ['*']`, a tool surface equivalent to the fast path.
- Worktree provision/rebind/cleanup mirrors AgentTool line-for-line (dirty-parent refuse, fail-closed checks, double-finally with null-out against double cleanup).
- vm-realm revival runs inside the vm bootstrap; the two security regressions pin the constructor-chain escape closed.
- Fast path is byte-for-byte preserved (indentation-only diff).
Findings: one High (schema+agentType drops the resolved agent's persona — upstream contract appends; see inline), two Medium (abort-listener leak per schema call; terminateMode misdiagnosis + nudge wording; see inline), and two Low:
- L1: no unit coverage for the worktree lifecycle (provision/rebind/cleanup; E2E S7 only, harness not committed) and none for the schema+agentType / schema+worktree combinations. Committing the harness under `integration-tests/` as offered in the PR description would close most of this.
- L2: schema-mode preserved-worktree info goes to debugLogger only, invisible to the script. This is acknowledged in the code comment as a deliberate tradeoff; worth revisiting once P4's narrator lands.
Overall: solid engineering — the dangerous interactions are all consciously handled with regression tests, and the PR honestly discloses the pre-existing session-marker quirk found during S7. Direction is approve; I'd fix H1 (small: append instead of replace + one combo test) and ideally M1 before merge.
Round 1 (15:41) + Round 2 (17:24) review from wenshao surfaced 7 inline
findings across schema-mode dispatch correctness, worktree cleanup
coverage, and error attribution. Each fix is paired with a regression
test that was RED before the change landed.
T0 [Critical] Worktree leak when schema setup throws after provision
workflow-orchestrator.ts: outer try MOVED to start immediately after
provisionWorkflowWorktree. Previously the try opened only after
createSchemaConfigOverride / createSchemaModeState / signal listener
attachment — so any throw in those three (broken MCP server during
the per-call ToolRegistry rebuild was the trigger wenshao cited)
orphaned the just-provisioned worktree under .qwen/worktrees/.
Test: "isolation:'worktree' + schema setup throws → worktree is
still cleaned up" — simulates createToolRegistry failure during
createSchemaConfigOverride; asserts removeUserWorktree was called.
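
The T0 fix comes down to where the `try` opens. A minimal sketch, using synchronous hypothetical callbacks (`provision`, `setup`, `run`, `cleanup`) in place of the real async orchestrator functions, which are not shown in this thread:

```typescript
interface WorktreeSketch {
  path: string;
}

interface DispatchDepsSketch<T> {
  provision: () => WorktreeSketch; // stands in for provisionWorkflowWorktree
  setup: () => void; // stands in for schema config / state / listener setup
  run: () => T; // stands in for the subagent execution
  cleanup: (wt: WorktreeSketch) => void; // stands in for worktree cleanup
}

// The try opens immediately after provisioning, so a throw from setup()
// still reaches the finally block and the worktree is cleaned up.
function dispatchWithWorktreeSketch<T>(deps: DispatchDepsSketch<T>): T {
  const worktree = deps.provision();
  try {
    deps.setup();
    return deps.run();
  } finally {
    deps.cleanup(worktree);
  }
}
```

If `setup()` ran before the `try`, as in the pre-fix ordering described above, its exception would skip `cleanup()` and leave the worktree behind.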
T1 [Critical] / T4 [H1] agentType + schema silently dead-ended
workflow-orchestrator.ts: schema-mode augmented config now (a)
appends ToolNames.STRUCTURED_OUTPUT to baseConfig.tools when the
allowlist is restricted (no '*' and doesn't already contain it), so
prepareTools / getFunctionDeclarationsFiltered doesn't filter
structured_output out of the subagent's surface; (b) preserves the
resolved agentType's persona by APPENDING the schema-contract
instruction block instead of replacing the systemPrompt outright.
Replace remains only on the ephemeral no-agentType path where
baseConfig.systemPrompt IS WORKFLOW_SUBAGENT_SYSTEM_PROMPT (schema
variant is its strict superset; avoids two near-identical prompts).
Tests: structured_output appears in the allowlist alongside the
agentType's existing tools; persona prompt is contained in the
effective systemPrompt.
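
A sketch of the T1/T4 augmentation under stated assumptions: `SubagentConfigSketch` is a hypothetical two-field shape, the string `'structured_output'` stands in for `ToolNames.STRUCTURED_OUTPUT`, and the three prompt arguments are placeholders for the real prompt constants.

```typescript
const STRUCTURED_OUTPUT_SKETCH = 'structured_output'; // stand-in for ToolNames.STRUCTURED_OUTPUT

interface SubagentConfigSketch {
  tools?: string[];
  systemPrompt: string;
}

function augmentForSchemaModeSketch(
  base: SubagentConfigSketch,
  ephemeralPrompt: string, // placeholder for WORKFLOW_SUBAGENT_SYSTEM_PROMPT
  schemaVariantPrompt: string, // placeholder for its schema superset
  schemaContractBlock: string, // placeholder for the appended instruction block
): SubagentConfigSketch {
  // (a) Only a restricted allowlist (no '*') needs structured_output appended.
  let tools = base.tools;
  if (
    tools !== undefined &&
    tools.indexOf('*') === -1 &&
    tools.indexOf(STRUCTURED_OUTPUT_SKETCH) === -1
  ) {
    tools = [...tools, STRUCTURED_OUTPUT_SKETCH];
  }
  // (b) Replace only the ephemeral default prompt; otherwise append so the
  // resolved agentType's persona is preserved.
  const systemPrompt =
    base.systemPrompt === ephemeralPrompt
      ? schemaVariantPrompt
      : `${base.systemPrompt}\n\n${schemaContractBlock}`;
  return { tools, systemPrompt };
}
```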
T2 [Suggestion] / T5 [M1] Parent-abort listener leaked per schema call
workflow-orchestrator.ts: named listener stored at outer scope,
removed in the outer finally regardless of how the dispatch ended.
Previous `{ once: true }` only auto-removed on actual parent abort;
the happy-path schema dispatch — success capture / 3-failure abort
fires the CHILD controller without the parent ever aborting — left
the listener stuck on the per-run signal. With N schema calls per
workflow N listeners + N child-controller closures accumulated.
Test: 5 sequential schema dispatches over the same parent signal
end with zero live listeners.
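
The T2 listener fix can be sketched with the standard `AbortController` API. The function name and return shape below are hypothetical; only the pattern (a named listener removed in `finally`, instead of relying on `{ once: true }`) comes from the commit message.

```typescript
// Runs work() with a child signal that follows the parent signal, then
// detaches the parent listener no matter how the work ended.
async function runWithChildAbortSketch<T>(
  parentSignal: AbortSignal,
  work: (childSignal: AbortSignal) => Promise<T>,
): Promise<{ result: T; child: AbortController }> {
  const child = new AbortController();
  const onParentAbort = (): void => child.abort();
  if (parentSignal.aborted) {
    child.abort();
  } else {
    parentSignal.addEventListener('abort', onParentAbort);
  }
  try {
    const result = await work(child.signal);
    return { result, child };
  } finally {
    // { once: true } would only detach after an actual parent abort;
    // explicit removal also covers the success and child-abort paths.
    parentSignal.removeEventListener('abort', onParentAbort);
  }
}
```

After any number of sequential calls on the same parent signal, a later parent abort no longer reaches the finished children, which is the observable effect of having no listeners left attached.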
T6 [M2] Terminate mode misdiagnosed as nudge exhaustion
workflow-orchestrator.ts: schema path now distinguishes
terminateMode before attributing failure to schema mode. TIMEOUT /
MAX_TURNS / ERROR throw the existing "did not complete (terminate
mode: X)" message that the non-schema path uses. Only the actual
schema-failure cases produce schema wording, and those are split:
attempts > 2 keeps the upstream-verbatim "(after 2 in-conversation
nudges)" wording; attempts === 0 throws an accurate "no validation
attempt — model produced plain-text content" instead of misleadingly
citing nudges that never happened. (The existing 0-call test was
updated to match the new accurate message; the 3-failure test
retains the verbatim wording.)
Tests: parametric over TIMEOUT/MAX_TURNS/ERROR asserting "did not
complete"; companion test pinning the verbatim wording to the
3-failure path.
T3 [Suggestion] Schema-mode JSON revival sentinel — clarified
workflow-sandbox.ts: added a block comment documenting that the
JSON-round-trip + null-on-throw is a SECURITY backstop (errors-as-data
convention from parallel/pipeline) rather than a contract path —
unreachable in production schema mode because the host return is
LLM tool_call args, always JSON-serializable. No behavior change.
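
The round-trip with null-on-throw that T3 documents can be sketched as follows. This is a simplified illustration: the function name is hypothetical, and the realm switch is omitted, since in the sandbox the parse step is described as running inside the vm context rather than in the host.

```typescript
// JSON round-trip backstop: returns a plain-data copy, or null when the
// value cannot be serialized (for example a circular structure).
function jsonReviveOrNullSketch(value: unknown): unknown {
  try {
    const text = JSON.stringify(value);
    // JSON.stringify yields undefined for functions and undefined itself.
    if (typeof text !== 'string') return null;
    return JSON.parse(text);
  } catch (_err) {
    return null;
  }
}
```

The copy carries only own enumerable data: getters are evaluated once during serialization, and function-valued properties are dropped.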
Tests: 75/75 orchestrator + 111/111 sandbox/tool/limiter green.
typecheck + lint clean across packages/core and packages/cli.
R1+R2 self-review commits (e1c5ec7 / 62624a9) precede this commit
on the same branch — they predate wenshao's review and address
distinct findings; reviewer L1 (worktree-lifecycle unit coverage) is
already closed by R2's 11 worktree tests.
Round 1 + 2: addressed in 99a15fe

Thanks @wenshao for the runtime verification on top of two careful inline rounds. All 7 inline threads above have explicit replies + are resolved; this is the per-finding outcome:
Two notes on your review summary
Suite after this round: 75/75 orchestrator + 111/111 sandbox/tool/limiter green; |
@qwen-code /triage
Thanks for the PR! Template looks good ✓: all required headings are present, the bilingual section is included, and the test plan is thorough.

On direction: this is a well-scoped phase in a multi-phase port of upstream Claude Code's workflow dispatch contract. The upstream CHANGELOG confirms workflows are actively evolving (recent fixes for worktree isolation, attribution headers, trigger keywords). P3 completes the four

On approach: the phased architecture (P1 → P2 → P3) has held up well. Each phase adds a focused slice without re-architecting prior work. The 7-file scope is tight for what it delivers: four dispatch options, security hardening, and 403 passing tests. The fast-path preservation (byte-for-byte unchanged when no opts are set) is the right call for zero-overhead backward compatibility.

Moving on to code review. 🔍

Chinese notes

Thanks for the contribution! The template is complete ✓: all required headings are present, the bilingual section is included, and the test plan is thorough.

Direction: this is a clearly scoped phase of porting upstream Claude Code's workflow dispatch contract to qwen-code. The upstream CHANGELOG confirms that the workflow feature is still actively evolving (recent fixes for worktree isolation, attribution headers, trigger keywords, and so on). P3 completes the four that P1/P2 reserved as stubs

Approach: the phased architecture (P1 → P2 → P3) has held up well. Each phase adds a focused slice without redesigning earlier work. The 7-file scope is compact for what it delivers (four dispatch options, security hardening, 403 passing tests). Preserving the fast path (byte-for-byte unchanged when no opts are set) is the right way to keep zero-overhead backward compatibility.

Moving on to code review 🔍

Qwen Code · qwen3.7-max
Code Review

Read all 7 changed files in the worktree. The implementation is clean and well-structured.

What I independently would have done: extend

Findings, no blockers:

No AGENTS.md violations detected. The code stays within the workflow subsystem, follows ESM + TypeScript strict conventions, and tests are collocated with source.

Test Results

Unit + Integration (worktree, PR branch
Reflection

Stepping back: this PR delivers exactly what P1/P2 designed for. The four

My independent proposal matched the PR's approach: two-path dispatch, SubagentManager resolution, per-call schema state, worktree provisioning via GitWorktreeService. The PR exceeds it in the details: the security hardening (JSON revival for object returns, agentType sanitization, opts round-tripping), the review-driven fixes (R3 T0 worktree leak, T2 listener accumulation, M2 terminate-mode disambiguation), and the fast-path preservation all show careful engineering.

The test evidence is strong: 403 unit/integration tests across all affected surfaces, clean typecheck and lint, sandbox-level verification confirming the option routing, and two independent runtime verifications (collaborator's mock-driven TUI + author's real-LLM E2E). The one gap

The pre-existing worktree cleanup-detection quirk (

This is ready to ship. ✅

Chinese notes

Reflection

Stepping back: this PR delivers exactly what P1/P2 designed for. The four

My independent proposal matches the PR's approach: two-path dispatch, SubagentManager resolution, per-call schema state, and worktree provisioning through GitWorktreeService. The PR goes beyond it in the details: the security hardening (JSON revival for object returns, agentType sanitization, opts round-tripping), the review-driven fixes (R3 T0 worktree leak, T2 listener accumulation, M2 terminate-mode disambiguation), and the fast-path preservation all show careful engineering.

The test evidence is strong: 403 unit/integration tests cover all affected surfaces, typecheck and lint are clean, sandbox-level verification confirms the option routing, and there are two independent runtime verifications (the collaborator's mock-driven TUI + the author's real-LLM E2E).

Ready to merge. ✅

Qwen Code · qwen3.7-max
qwen-code-ci-bot
left a comment
LGTM, looks ready to ship. ✅
🔬 Local runtime verification — Workflow P3
| probe | outcome | observed |
|---|---|---|
| agent({schema}) | ✅ returns validated object | { "answer": "HELLO" } (subagent's structured_output args, AJV-validated, revived into the vm realm) |
| {schema} no tool call | ✅ terminal error | subagent completed without calling structured_output (no validation attempt — model produced plain-text content). |
| {schema} invalid args ×3 | ✅ terminal error | subagent completed without calling StructuredOutput (after 2 in-conversation nudges). ← the R3 (wenshao M2) message split works |
| {isolation:'remote'} | ✅ throws | agent({isolation:'remote'}) is not available in this build. |
| {agentType:'…nonexistent…'} | ✅ throws | agent({agentType}): agent type '…' not found. |
| {isolation:'bogus-mode'} | ✅ throws | agent({isolation: 'bogus-mode'}): unknown isolation mode. Known modes are: 'worktree', 'remote'. |
| {schema: <circular>} | ✅ throws (security guard) | agent() opts contain a non-JSON-serializable value: Converting circular structure to JSON … |
| {isolation:'worktree'} (clean tree) | ✅ provisions + runs subagent in the worktree | result returned; worktree at .qwen/worktrees/agent-<hex> on branch worktree-agent-<hex> |
| {isolation:'worktree'} (dirty tree) | ✅ refuses | parent working tree at … has uncommitted changes that would not propagate … Commit or stash the changes, then re-run. |

So the schema/structured-output contract (both failure modes), agentType resolution + not-found, isolation routing/validation, the dirty-tree refusal, and the vm/host non-serializable security guard all work end-to-end through the real subagent stack. The node:vm opts-revival and the object→vm-realm return revival both hold (the schema probe returns a real validated object, not a string).
⚠️ Bug: isolation worktrees are never auto-removed — they leak on disk
The tool description promises "the worktree is auto-removed if no changes." It never is. My worktree subagent made zero changes, yet every run preserved the worktree and returned done\n\n[worktree preserved: …/agent-<hex> (branch worktree-agent-<hex>)]. Three runs left three worktrees + three worktree-agent-* branches behind until I removed them by hand.
Root cause (in writeWorktreeSessionMarker, packages/core/src/services/gitWorktreeService.ts — outside this PR's diff, but this PR's cleanup contract newly depends on it): it writes the .qwen-session exclude rule to git rev-parse --git-dir (the per-worktree gitdir), but git reads info/exclude from the common gitdir (--git-common-dir). For a linked worktree these differ, so the rule lands where git never looks. Proven directly:
# inside the provisioned worktree:
--git-dir = …/.git/worktrees/agent-fc72cdc ← code writes the exclude here
--git-common-dir = …/.git ← git actually reads from here
$ cat …/.git/worktrees/agent-fc72cdc/info/exclude → .qwen-session (rule IS present)
$ git status --porcelain → ?? .qwen-session (still untracked!)
# move the SAME rule to the common exclude instead:
$ echo .qwen-session >> …/.git/info/exclude ; git status --porcelain → (empty — now ignored)
So the marker always shows untracked → hasWorktreeChanges() always returns true → cleanupWorkflowWorktree always takes the "preserve" branch. The .qwen/worktrees/ dir (and a branch per call) accumulates indefinitely. The same writeWorktreeSessionMarker backs the existing AgentTool worktree path (agent.ts:1955), so this isn't introduced here — but P3 widens the blast radius, since a workflow can fan out many isolated subagents.
Fix: one word — resolve --git-common-dir instead of --git-dir when locating info/exclude in writeWorktreeSessionMarker. (Verified at the git level above; the orchestrator's cleanup is gated solely on hasWorktreeChanges/hasUnmerged, so a correctly-placed exclude makes the no-change path auto-remove.)
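The proposed fix can be sketched as a pure path computation. This is an illustration only: `RevParseSketch` is a hypothetical injected runner standing in for `git rev-parse <flag>`, and the real `writeWorktreeSessionMarker` is not shown in this thread.

```typescript
// Hypothetical runner: returns the output of `git rev-parse <flag>` when
// executed inside the provisioned worktree.
type RevParseSketch = (flag: '--git-dir' | '--git-common-dir') => string;

// Locates info/exclude in the common gitdir, which is where git reads
// exclude rules from, including for linked worktrees.
function infoExcludePathSketch(revParse: RevParseSketch): string {
  const commonDir = revParse('--git-common-dir').trim().replace(/\/+$/, '');
  return `${commonDir}/info/exclude`;
}
```

For the worktree in the transcript above, this resolves to `…/.git/info/exclude` rather than `…/.git/worktrees/agent-fc72cdc/info/exclude`.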
Observations / notes
- The schema-failure message split is a nice touch and works exactly as the R3 review intended: no-attempt (plain text) vs 2-nudge exhaustion (3 invalid `structured_output` calls) produce distinct, accurate terminals.
- `agent({model})` rides the same `createAgentHeadless` override path as `agentType`/`isolation` (both verified); I did not assert it independently because the mock is model-agnostic, so the request would just carry a different model id.
- Feature is gated off in the product (no settings/CLI wiring for `workflowsEnabled`; only the `QWEN_CODE_ENABLE_WORKFLOWS` env gate exists), so none of this is user-reachable yet, which is consistent with a P3 building-block PR.
Reproduce
# real CLI, workflow tool enabled via the existing env gate, pointed at a mock LLM
QWEN_CODE_ENABLE_WORKFLOWS=1 OPENAI_BASE_URL=http://127.0.0.1:PORT/v1 OPENAI_API_KEY=sk-mock \
OPENAI_MODEL=mock-model QWEN_HOME=<isolated> \
node packages/cli/dist/index.js --approval-mode yolo --output-format json -p "run the workflow"
# mock: main turn → workflow tool_call with a script that try/catches agent() with each option;
# schema subagent → structured_output({answer}); BADARGS subagent → structured_output({wrong});
# FAILTEST subagent → plain text. Run once in a clean git repo, once in a dirty one.
# observe: the returned summary (all 9 probes) + `.qwen/worktrees/agent-* ` left behind after a no-change run.
Verified on Linux, Node v22, git 2.47.3. Real subagents via production dispatch; deterministic stub model. The workflow ran under tmux against the built packages/cli/dist/index.js.
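Chinese version (translated to English)
🔬 Local real-run verification: Workflow P3 agent({schema, agentType, model, isolation}) (real CLI + workflow tool under tmux)
Verdict: ✅ all P3 functional logic passes, but one bug turned up: isolated worktrees are never auto-cleaned (details below).
I ran the real built CLI, enabled the workflow tool through the existing QWEN_CODE_ENABLE_WORKFLOWS=1 env gate (no changes to the PR's code), and pointed it at a deterministic mock LLM. The model calls the workflow tool with a script that probes every new agent() option; the workflow's real production dispatch spawns real subagents, all of which hit the mock. I ran it once in a clean git working tree and once in a tree with uncommitted changes.
Results: every agent() option behaves as documented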
| Probe | Result | Observed |
|---|---|---|
| `agent({schema})` | ✅ returns the validated object | `{ "answer": "HELLO" }` (the subagent's `structured_output` args, AJV-validated and revived into the vm realm) |
| `{schema}` without calling the tool | ✅ terminal error | subagent completed without calling structured_output (no validation attempt — model produced plain-text content). |
| `{schema}` with invalid args ×3 | ✅ terminal error | subagent completed without calling StructuredOutput (after 2 in-conversation nudges). ← the R3 (wenshao M2) message split is in effect |
| `{isolation:'remote'}` | ✅ throws | agent({isolation:'remote'}) is not available in this build. |
| `{agentType:'…nonexistent…'}` | ✅ throws | agent({agentType}): agent type '…' not found. |
| `{isolation:'bogus-mode'}` | ✅ throws | agent({isolation: 'bogus-mode'}): unknown isolation mode. Known modes are: 'worktree', 'remote'. |
| `{schema: <circular reference>}` | ✅ throws (safety guard) | agent() opts contain a non-JSON-serializable value: Converting circular structure to JSON … |
| `{isolation:'worktree'}` (clean) | ✅ creates the worktree and runs the subagent inside it | result returned; worktree at `.qwen/worktrees/agent-<hex>`, branch `worktree-agent-<hex>` |
| `{isolation:'worktree'}` (dirty) | ✅ refused | parent working tree at … has uncommitted changes … Commit or stash the changes, then re-run. |
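So the schema/structured-output contract (both failure modes), agentType resolution and not-found, isolation routing/validation, dirty-tree refusal, and the vm/host non-serializable safety guard all run end to end through the real subagent path. Both the node:vm opts revival and the revival of object return values into the vm realm hold (the schema probe returned a real validated object, not a string).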
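The realm-revival property can be illustrated with a small `node:vm` probe. This is a sketch under stated assumptions, not the sandbox's actual code: `hostResult` stands in for a schema-mode return value, and `probe` is a hypothetical helper. It shows that a host-realm object exposed directly lets script code reach the host `Function` constructor, while an object rebuilt with `JSON.parse` inside the context only reaches the context's own `Function`.

```typescript
import * as vm from "node:vm";

// Stand-in for a schema-mode dispatch result created in the host realm.
const hostResult = { answer: "HELLO" };

/**
 * Runs the constructor-chain probe inside a fresh vm context and reports
 * what `typeof process` looks like from the reached Function constructor.
 */
function probe(exposeRaw: boolean): string {
  const context = vm.createContext({
    raw: exposeRaw ? hostResult : undefined,
    json: JSON.stringify(hostResult),
  });
  const script = exposeRaw
    ? // Host object: constructor.constructor is the HOST Function.
      'raw.constructor.constructor("return typeof process")()'
    : // Revived object: JSON.parse runs with the context's own builtins.
      'JSON.parse(json).constructor.constructor("return typeof process")()';
  return vm.runInContext(script, context) as string;
}

console.log(probe(true)); // host realm is reachable
console.log(probe(false)); // only the vm realm is reachable
```

The design point is that the JSON round-trip happens inside the context, so the rebuilt object's prototype chain belongs to the vm realm and the constructor chain no longer leads back to host globals.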
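⚠️ Bug: isolated worktrees are never auto-cleaned and leak on disk
The tool description promises that a worktree with no changes is removed automatically. It never is. My worktree subagent made no changes, yet every run preserved the worktree and returned done\n\n[worktree preserved: …/agent-<hex> (branch worktree-agent-<hex>)]. Three runs left three worktrees plus three worktree-agent-* branches until I deleted them by hand.
Root cause (in writeWorktreeSessionMarker, packages/core/src/services/gitWorktreeService.ts; not in this PR's diff, but this PR's cleanup contract newly depends on it): it writes the .qwen-session exclude rule under git rev-parse --git-dir (the per-worktree private gitdir), but git reads info/exclude from the common gitdir (--git-common-dir). For a linked worktree the two differ, so the rule lands somewhere git never looks. Proven directly: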
# inside the created worktree:
--git-dir = …/.git/worktrees/agent-fc72cdc ← the code writes the exclude here
--git-common-dir = …/.git ← git actually reads from here
$ cat …/.git/worktrees/agent-fc72cdc/info/exclude → .qwen-session (the rule is there)
$ git status --porcelain → ?? .qwen-session (still untracked!)
# write the same rule to the common exclude instead:
$ echo .qwen-session >> …/.git/info/exclude ; git status --porcelain → (empty; now ignored)
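So the marker always shows as untracked → hasWorktreeChanges() always returns true → cleanupWorkflowWorktree always takes the "preserve" branch. The .qwen/worktrees/ directory (plus one branch per call) accumulates indefinitely. The same writeWorktreeSessionMarker backs the existing AgentTool worktree path (agent.ts:1955), so this is not introduced by this PR, but P3 widens the blast radius, since a workflow can fan out many isolated subagents.
Fix: one word. Resolve --git-common-dir instead of --git-dir when locating info/exclude in writeWorktreeSessionMarker. (Verified at the git level above; the orchestrator's cleanup is gated solely on hasWorktreeChanges/hasUnmerged, so once the exclude is in the right place the no-change path auto-removes.)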
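The cleanup gate described above can be sketched as a pure decision over two booleans. This is an assumption-laden illustration: `decideCleanup` and `hasChangesFromPorcelain` are hypothetical names, and treating any non-empty line of `git status --porcelain` output as a change is a simplification of whatever `hasWorktreeChanges()` really does.

```typescript
interface WorktreeState {
  hasChanges: boolean; // stands in for hasWorktreeChanges()
  hasUnmerged: boolean; // stands in for hasUnmergedWorktreeCommits()
}

type CleanupAction = "remove" | "preserve";

// Preserve when anything at all looks dirty; remove only when both are false.
function decideCleanup(state: WorktreeState): CleanupAction {
  return state.hasChanges || state.hasUnmerged ? "preserve" : "remove";
}

// Simplified reading of `git status --porcelain`: any non-empty line counts.
function hasChangesFromPorcelain(porcelain: string): boolean {
  return porcelain.split("\n").some((line) => line.trim().length > 0);
}

// With the exclude rule misplaced, the marker shows up as untracked:
const leaked = decideCleanup({
  hasChanges: hasChangesFromPorcelain("?? .qwen-session\n"),
  hasUnmerged: false,
});

// With the rule in the shared info/exclude, status is empty:
const cleaned = decideCleanup({
  hasChanges: hasChangesFromPorcelain(""),
  hasUnmerged: false,
});

console.log(leaked, cleaned);
```

This makes the failure mode visible: the gate itself is sound, and a single stray untracked file is enough to force the preserve branch on every run.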
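Other observations
- The schema-failure message split is well done and matches the intent of the R3 review exactly: no attempt (plain text) and 2-nudge exhaustion (3 invalid `structured_output` calls) produce distinct, accurate terminal messages.
- `agent({model})` takes the same `createAgentHeadless` override path as `agentType`/`isolation` (both verified); I did not assert it separately because the mock is model-agnostic, so the request would only carry a different model id.
- The feature is switched off in the product (no settings/CLI wiring for `workflowsEnabled`; only the `QWEN_CODE_ENABLE_WORKFLOWS` env gate exists), so users cannot reach it yet, which is consistent with a P3 infrastructure PR.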
Reproduce
# real CLI, workflow tool enabled via the existing env gate, pointed at a mock LLM
QWEN_CODE_ENABLE_WORKFLOWS=1 OPENAI_BASE_URL=http://127.0.0.1:PORT/v1 OPENAI_API_KEY=sk-mock \
OPENAI_MODEL=mock-model QWEN_HOME=<isolated> \
node packages/cli/dist/index.js --approval-mode yolo --output-format json -p "run the workflow"
# mock: main turn → returns a workflow tool_call whose script try/catches agent() with each option;
# schema subagent → structured_output({answer}); BADARGS subagent → structured_output({wrong});
# FAILTEST subagent → plain text. Run once in a clean git repo, once in a dirty one.
# observe: the returned summary (all 9 probes) + the worktree still left under `.qwen/worktrees/agent-*` after a no-change run.
Verified on Linux, Node v22, git 2.47.3. Real subagents via production dispatch; deterministic stub model. The workflow ran under tmux against the built packages/cli/dist/index.js.
What this PR does
Implements phase P3 of the Dynamic Workflows port, building on the merged P1 (#4732) and P2 (#4947). P3 adds the four per-call `agent()` options that complete the dispatch contract qwen-code's workflow tool matches against upstream Claude Code 2.1.168:
- `agent({agentType: 'X'})` resolves against the declarative-agents registry (shipped via #4842 + #4996) using `config.getSubagentManager().findSubagentByName(name)`. Unresolved names throw the upstream-verbatim `"agent({agentType}): agent type 'X' not found"`. Resolved configs flow through `SubagentManager.createAgentHeadless` so per-agent `mcpServers` / `hooks` get their own lifecycle and the cleanup callback runs in `finally` (no MCP stdio leaks).
- `agent({model: 'qwen3-max'})` is threaded into `SubagentConfig.model` so `buildRuntimeContentGeneratorView` sees the override and routes to a different provider correctly. `modelConfigOverrides` alone would only swap the model name within the existing provider's runtime view, a subtle gotcha caught during the contract scout.
- `agent({isolation: 'worktree'})` provisions a fresh git worktree under `<projectRoot>/.qwen/worktrees/agent-<7hex>` via `GitWorktreeService.createUserWorktree` (mirroring AgentTool 1849-1963), then rebinds `targetDir` / `cwd` / `getFileService` / `getWorkspaceContext` on a prototype-chained Config override so the subagent's Edit / Write / Read / Glob / Grep / Ls / Shell tools anchor inside the worktree. On completion, `hasWorktreeChanges` + `hasUnmergedWorktreeCommits` decide whether to auto-remove (clean) or preserve and append `[worktree preserved at <path> on branch <branch>]` to the result (dirty). Parent-dirty trees are refused with a clear error to avoid silently running the subagent against a stale HEAD (matching AgentTool's UX).
- `agent({isolation: 'remote'})` throws `"agent({isolation:'remote'}) is not available in this build"` verbatim (upstream 2.1.168 parity: the binary ships the option but gates the feature off).
- `agent({schema: S})` injects a per-call `SyntheticOutputTool` (existing `tools/syntheticOutput.ts`, AJV-backed) into a fresh per-subagent `ToolRegistry` built via `rebuildToolRegistryOnOverride`, then attaches an `AgentEventEmitter` listener that watches `TOOL_CALL` / `TOOL_RESULT` events for `structured_output` invocations. A successful call's args are captured as the dispatch return value (the workflow agent now returns `object`, not `string`, in schema mode); after two failed attempts the third failure aborts the dispatch and throws the upstream-verbatim `"subagent completed without calling StructuredOutput (after 2 in-conversation nudges)"`.
No changes to `agent-core.ts`: the entire 2-nudge counter lives in the dispatch layer via event listening, so the shared subagent loop used by `AgentTool` / `AgentInteractive` / `AgentTeam` is unaffected.
Architecture notes:
- The fast path (no `agentType` / `model` / `isolation` / `schema`) is preserved byte-for-byte from P1/P2: direct `AgentHeadless.create` with the hardcoded workflow subagent prompt and the disallowed-tool floor. Zero added overhead for the common case.
- The workflow's disallowed-tool floor `[SendMessage, ExitPlanMode]` is `Array.from(new Set([floor, agentType.disallowedTools]))` unioned BEFORE `convertToRuntimeConfig` runs, so the manager's `transformToToolNames` normalizes display names / MCP patterns for the merged set. There is no duplicated normalization in workflow code and no path where a permissive `agentType` can re-enable a workflow-forbidden tool.
- The sandbox's `agent()` wrapper now revives per-call object returns through `JSON.parse(JSON.stringify(value))` inside the vm `runInContext` block. This closes the same T1 / T8 / T14 host-prototype-escape vector that PR #4947 closed for `parallel()` / `pipeline()`: schema mode returns a host-realm object, and handing it to the script verbatim would reopen the escape via `result.constructor.constructor("return process")()`. Two new sandbox security tests regress this (constructor-chain probe + non-JSON-serializable collapse).
- `WorkflowAgentResult` widens from `string` to `string | object`. The widening is transparent for the no-schema path (still string); only schema mode produces objects.
Why it's needed
The P1/P2 stubs surfaced clear `"scheduled for P3"` error messages, but the workflow tool was unusable for the upstream `/deep-research`-style scripts that rely on `agent({schema})` for typed outputs, `agent({agentType})` for picking the right subagent kind, `agent({model})` for cost-sensitive routing, and `agent({isolation: 'worktree'})` for parallel file-mutating subagents. P3 unlocks all four. Since #4842 (frontmatter v1) + #4996 (mcpServers + hooks) merged the declarative-agents registry with CC 2.1.168 parity, the `SubagentManager.createAgentHeadless` API was already wired with full model / MCP / hooks lifecycle support; P3 just routes through it rather than reimplementing.
Reviewer Test Plan
How to verify
Unit + integration (workflow suite, 159 tests; adjacent regression, 217 tests):
New tests covering P3 specifically:
- `workflow-sandbox.test.ts`:
  - `agent({schema/agentType/model/isolation:'worktree'/'remote'})` options reach dispatch verbatim; the sandbox no longer rejects them
  - `{isolation:'not-a-real-mode'}` still throws at sandbox level since no dispatch can interpret it
- `workflow-orchestrator.test.ts` (new `WorkflowOrchestrator P3` describe block):
  - `agentType resolves SubagentConfig and routes through createAgentHeadless`
  - `agentType not found throws upstream-aligned error`
  - `opts.model is threaded into SubagentConfig.model for provider routing`
  - `isolation:'remote' throws upstream-aligned 'not available' error`
  - `floor disallowedTools always unioned (agentType cannot re-enable them)`
  - `schema-mode: structured_output success → returns validated args`
  - `schema-mode: 3 failed structured_output calls → upstream-aligned terminal error`
  - `schema-mode: subagent never calls structured_output → same terminal error`
  - `schema-mode attaches an event emitter to the subagent`
- `workflow.test.ts`:
  - `WorkflowTool` for schema mode (sandbox → orchestrator → dispatch → object capture → `safeStringifyResult`)
`tsc --noEmit` and `eslint` are clean on all touched files.
Real-LLM E2E: 7/7 passing against qwen3-coder-plus via DashScope
A standalone Node harness drives `WorkflowOrchestrator` + `createProductionDispatch` against the real qwen3-coder-plus model. Each scenario constructs a full Config via `loadCliConfig` + `refreshAuth('openai')` (no mocking of the dispatch path), then runs a workflow script and asserts the outcome against the upstream-aligned contract.
Scenario coverage:
| # | Scenario | What it exercises |
|---|---|---|
| S1 | `agent({schema})` happy path | `structured_output` with valid args → event listener captures args → returned to script as object → `safeStringifyResult` outputs `{"primary_color":"blue","confidence":0.9}` |
| S2 | `agent({agentType:'NoSuchAgent_42xyz'})` | `agent({agentType}): agent type 'NoSuchAgent_42xyz' not found` |
| S3 | `agent({isolation:'remote'})` | `agent({isolation:'remote'}) is not available in this build` |
| S4 | `agent({agentType:'Explore'})` | `Explore` resolves via `SubagentManager.findSubagentByName('Explore')` → routes to its (fast) model + read-only tool surface → returns the expected answer |
| S5 | `agent("...")` (no opts) | `AgentHeadless.create` without going through `SubagentManager.createAgentHeadless` |
| S6 | `parallel([() => agent(...,{schema:S}), () => agent(...,{schema:S})])` | |
| S7 | `agent({isolation:'worktree'})` | `provisionWorkflowWorktree` → `createUserWorktree` → `createWorktreeConfigOverride` (Object.create + 6 cwd rebinds) → subagent runs in the worktree → `cleanupWorkflowWorktree` post-spawn |
Pre-existing infrastructure note found during S7: when a clean subagent run leaves the worktree functionally untouched, `cleanupWorkflowWorktree` still preserves it because `GitWorktreeService.hasWorktreeChanges` sees the `.qwen-session` ownership marker as `?? .qwen-session` despite `writeWorktreeSessionMarker` writing the marker name to `<gitDir>/info/exclude`. This is the same `cleanupWorktreeIsolation` path AgentTool uses (agent.ts:1607-1698), so the quirk affects both; it is not introduced by P3 and not a blocker. Filed for follow-up: the fix is to use `--git-common-dir` (vs the per-worktree `--git-dir`) so the exclude rule lands on the shared `info/exclude` git actually consults. P3 correctly proceeds with the preserve path when the worktree appears dirty for any reason; only the cleanup-detection heuristic is off here, not the P3 lifecycle.
Harness command:
The harness source is local (not committed to this PR); happy to add it under `integration-tests/` if the maintainer prefers a reproducible E2E artifact alongside the workflow code.
Evidence (Before & After)
N/A — non-user-visible backend change (workflow execution primitives). The verifiable evidence is the unit + integration test count above.
Tested on
Environment
Node v22.21.1, npm 10.x.
Risk & Scope
- Schema-mode object returns cross the vm boundary; the resulting host-prototype-escape vector is closed by `JSON.parse(JSON.stringify(value))` revival inside the vm `runInContext` block (same mechanism PR #4947 used for parallel / pipeline array results) + two security regression tests verifying the constructor-chain escape stays in the vm realm. Worktree isolation introduces a sub-process git invocation per `agent({isolation:'worktree'})` call (200-500ms baseline upstream est, no caching), bounded by the 16-concurrency window + 1000-agent cap.
- P4 (`/workflows` UI + `extractAndStripMeta` + phase-tree progress), P5 (`budget` global), P6 (resume via JSONL journal), P7 (Ultracode session mode + keyword trigger) remain follow-ups per #4721. Real-LLM E2E follow-up noted above.
- `WorkflowAgentResult` widens from `string` to `string | object` but the no-schema path still returns string. `isWorkflowsEnabled()` gate (off by default) unchanged.
Linked Issues
Related #4721 (parent design — multi-phase, not closed by this PR)
Related #4732 (P1 merged) #4947 (P2 merged)
Related #4842 #4996 (declarative agents; `SubagentManager.createAgentHeadless` API used here)
Chinese description (translated to English)
What this PR does
Implements phase P3 of the Dynamic Workflows port, building on the merged P1 (#4732) and P2 (#4947). P3 wires up all four per-call `agent()` options so that the workflow tool's dispatch contract lines up with Claude Code 2.1.168:
- `agent({agentType: 'X'})` resolves against the declarative-agents registry via `SubagentManager.findSubagentByName(name)` (shipped in #4842 + #4996). A miss throws the upstream literal `"agent({agentType}): agent type 'X' not found"`. A resolved config goes through the full lifecycle via `SubagentManager.createAgentHeadless`: per-agent `mcpServers` / `hooks` are each isolated, and the cleanup callback runs in `finally`, so MCP stdio never leaks.
- `agent({model: 'qwen3-max'})` is injected into `SubagentConfig.model` so `buildRuntimeContentGeneratorView` sees the override and routes the provider correctly. Setting only `modelConfigOverrides` is a pitfall found during the contract scout: it only swaps the model name inside the current provider's runtime view.
- `agent({isolation: 'worktree'})` creates a new worktree at `<projectRoot>/.qwen/worktrees/agent-<7hex>` via `GitWorktreeService.createUserWorktree` (mirroring AgentTool 1849-1963), then rebinds `targetDir` / `cwd` / `getFileService` / `getWorkspaceContext` with a prototype-chain Config override so the subagent's Edit/Write/Read/Glob/Grep/Ls/Shell tools anchor inside the worktree. On completion, `hasWorktreeChanges` + `hasUnmergedWorktreeCommits` decide between auto-remove (no changes) and preserve plus appending `[worktree preserved at <path> on branch <branch>]` to the result (dirty). A parent worktree with uncommitted changes is refused so the subagent never sees a stale HEAD (consistent with AgentTool's behavior).
- `agent({isolation: 'remote'})` throws the literal `"agent({isolation:'remote'}) is not available in this build"` (in the upstream 2.1.168 binary the option exists but is switched off).
- `agent({schema: S})` injects a per-call `SyntheticOutputTool` (the existing `tools/syntheticOutput.ts`, AJV-validated) via `rebuildToolRegistryOnOverride` into a new per-subagent `ToolRegistry`, then attaches an `AgentEventEmitter` listener for `TOOL_CALL` / `TOOL_RESULT`. The args of a successful call become the dispatch return value (in schema mode `agent()` returns `object`, not `string`); after two failures the third failure aborts the dispatch and throws the upstream literal `"subagent completed without calling StructuredOutput (after 2 in-conversation nudges)"`.
Zero changes to `agent-core.ts`: the 2-nudge counter lives entirely in the dispatch layer through event listening, and the shared subagent loop (used by `AgentTool` / `AgentInteractive` / `AgentTeam`) is not touched.
Architecture notes:
- The fast path (no `agentType` / `model` / `isolation` / `schema`) is identical to P1/P2: direct `AgentHeadless.create` + the hardcoded workflow subagent prompt + the disallow floor. Zero extra overhead.
- The disallowed-tool floor `[SendMessage, ExitPlanMode]` is unioned via `Array.from(new Set([floor, agentType.disallowedTools]))` before `convertToRuntimeConfig`, so every entry goes through `transformToToolNames` together for display name / MCP pattern normalization. Workflow code does not repeat the normalization, and a permissive `agentType` has no way to bring back a tool the workflow forbids.
- The sandbox's `agent()` wrapper now revives per-call object returns with `JSON.parse(JSON.stringify(value))` inside the vm `runInContext` block. This closes the same T1/T8/T14 host prototype escape that PR #4947 closed for `parallel()` / `pipeline()`: schema mode returns a host object, and feeding it straight to the script would walk the host `Object.prototype` chain via `result.constructor.constructor("return process")()`. Two new sandbox security tests regress this (constructor-chain probe + non-JSON-serializable values collapsing to null).
- `WorkflowAgentResult` widens from `string` to `string | object`. The no-schema path still returns string and only schema mode produces objects, so it is fully backward compatible.
Why it's needed
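The P1/P2 stubs threw `"scheduled for P3"` errors, but the workflow tool was unusable for `/deep-research`-style scripts that depend on `agent({schema})` for typed output, `agent({agentType})` for choosing the subagent kind, `agent({model})` for cost-sensitive routing, and `agent({isolation: 'worktree'})` for modifying files in parallel. P3 unlocks all of them. With #4842 (frontmatter v1) + #4996 (mcpServers + hooks) merged, the declarative-agents registry is aligned with CC 2.1.168 and the `SubagentManager.createAgentHeadless` API already has the full model / MCP / hooks lifecycle wired up. P3 reuses it instead of rebuilding it.
Testing and verification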
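Unit + integration (workflow suite 159 tests, adjacent regression 217 tests, 376 tests in total, all green). The exact commands are the same as in the English version.
Real-model E2E follow-up in the style of P2: P2 bypassed `createProductionDispatch` (P2 changed the orchestrator layer). Everything P3 changes is inside `createProductionDispatch`, so the P2 harness shape does not apply. The two viable approaches, (a) running the full CLI bundle and (b) a standalone Node harness that replicates ~200 LOC of CLI Config init, were both tried on this branch, and neither could avoid polluting `package-lock.json` and `packages/vscode-ide-companion/NOTICES.txt`. I will add the real-model E2E in a separate commit on this branch (following P2's S1..S6 scenario shape). The unit tests above already exhaust every event-driven path in schema mode (TOOL_CALL capture / TOOL_RESULT success-failure attribution / the 2-nudge boundary / abort propagation). The remaining risk is the shape of model behavior ("will qwen3-max actually call `structured_output` under the schema-mode system prompt?"), which is orthogonal to the code correctness of this PR.
Risk and scope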
- Schema-mode object returns cross the vm boundary; the host-prototype-escape vector is closed by per-call `JSON.parse(JSON.stringify(value))` revival inside the `runInContext` block (the same mechanism PR #4947 used for parallel/pipeline array results) + two security regression tests verifying the constructor-chain escape stays inside the vm realm. Worktree isolation adds one sub-process git invocation per `agent({isolation:'worktree'})` call (upstream est 200-500ms, no caching), bounded by the 16-concurrency window + 1000-agent cap.
- P4 (`/workflows` UI + `extractAndStripMeta` + phase-tree progress), P5 (`budget` global), P6 (JSONL journal resume), and P7 (Ultracode session mode + keyword trigger) remain follow-ups under #4721. Real-model E2E follow-up as noted above.
- `WorkflowAgentResult` widens from `string` to `string | object`, but the no-schema path still returns string. The `isWorkflowsEnabled()` gate (off by default) is unchanged.