fix(skills): use full YAML parser for frontmatter to support block scalars#4870
Conversation
…alars The custom `yaml-parser.ts` does not handle YAML block scalar syntax (`>` folded, `|` literal). When a SKILL.md uses `description: >` with indented continuation lines, only the `>` character is captured as the description value. Switch both skill loading paths (`skill-load.ts` and `skill-manager.ts`) to use the `yaml` npm package (already installed for hooks parsing) as the primary parser, with a fallback to the simple parser for malformed YAML that the strict parser rejects. This also simplifies `skill-manager.ts` by removing the special-case re-parse for `hooks:` — the full parser now handles all fields uniformly. Closes #4869
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
…ed parser
- Add `yaml` to `packages/core/package.json` explicit dependencies
- Move the full-YAML-with-fallback logic into `yaml-parser.ts` as the
single `parse()` entry point, eliminating the duplicated wrapper
- Add null/type guard so `yaml.parse('')` (returns null) falls back
cleanly instead of throwing TypeError
- Add debug logging when the full parser fails and fallback triggers
- Tighten `|` block scalar test to use `toBe` instead of `toContain`
- Move block scalar tests to `yaml-parser.test.ts` (no mock interference)
- Add `>-` strip chomping test
- `parseSimple` is an internal fallback, no external caller needs it — stop exporting it via `index.ts`'s `export *` - Remove the outdated "yaml package would need to be added" comment in `subagent-manager.ts` since `yaml` is now an explicit dependency
There was a problem hiding this comment.
Pull request overview
This PR fixes skill frontmatter parsing for YAML block scalars (> folded / | literal) by upgrading the shared YAML parsing utility to use the full yaml npm parser (with a fallback to the legacy simple parser on parse failure). It also simplifies SkillManager by removing the previous “re-parse hooks with full YAML” special case, since the shared parser now supports nested YAML uniformly.
Changes:
- Update
utils/yaml-parser.tsto preferyaml.parse()(full YAML support) with a fallback to the existing simple parser. - Add unit tests covering folded/literal block scalars and strip chomping (
>-) behavior. - Remove the now-unnecessary hooks re-parse logic in
skill-manager.ts, and addyamlas an explicit core dependency.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| packages/core/src/utils/yaml-parser.ts | Switch to full YAML parsing via yaml with fallback to the legacy parser. |
| packages/core/src/utils/yaml-parser.test.ts | Add coverage for > / ` |
| packages/core/src/subagents/subagent-manager.ts | Remove outdated comment about needing to add the yaml dependency. |
| packages/core/src/skills/skill-manager.ts | Drop special-case hooks re-parse; rely on the shared parser output for hooks. |
| packages/core/package.json | Add yaml dependency explicitly for the core package. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Use `schema: 'core'` to prevent date-like strings (e.g. `2024-01-01`) from being coerced into Date objects, which would break parse→stringify roundtrips in claude-converter and subagent-manager - Add yaml-parser unit tests for: Date non-coercion, bare keys returning null, explicit null/tilde, yes/no as strings (YAML 1.2), empty input, comment-only input, malformed YAML fallback - Expand skill-load real-parser integration tests: literal block scalar, strip-chomped folded, allowedTools array, complex mixed frontmatter, malformed YAML graceful fallback
|
Thanks for the PR! 👋 Template check: the body uses On direction: clearly aligned. This is a real, reproducible bug — On approach: scope feels exactly right for the problem. The minimum change would have been "call One small question worth thinking about (not a blocker): the new Moving on to code review + testing. 🔍 中文说明感谢贡献! 模板: PR 正文用了 方向: 对齐。这是真实可复现的 bug—— 方案: 范围正好。最小改动就是"两个调用点换成 一个非阻塞的小问题:新的 进入代码审查与测试 🔍 — Qwen Code · qwen3.7-max |
2a. Code ReviewIndependent proposal (before reading diff): the obvious fix is to call What the PR actually does: matches my proposal almost exactly.
No critical blockers. Typecheck clean, all three test suites green (21 yaml-parser + 63 skill-load + 72 skill-manager). The One minor observation: the fallback is logged at 2b. Real-Scenario TestingSet up a test skill with ---
name: triage-block-scalar-test
description: >
This is a folded multiline description that should
appear as a single paragraph, not as the literal `>` character.
when_to_use: when the user asks to test YAML block scalar parsing
---Ran a harness that imports each version's Before (main — custom parser only)Bug reproduces exactly as reported in #4869: the custom parser sees the line After (this PR — full parser + fallback)Bug fixed. The folded description is parsed per YAML 1.2 spec (lines joined with spaces, trailing newline from the implicit clip). Full tmux capture中文说明2a. 代码审查独立方案(先看 diff 之前): 显而易见的做法是在 PR 实际做法: 和我的方案几乎一致。
无关键阻塞。 类型检查通过,三个测试套件全绿(21 个 yaml-parser + 63 个 skill-load + 72 个 skill-manager)。 一个小观察:回退日志是 2b. 真实场景测试准备了一个 ---
name: triage-block-scalar-test
description: >
This is a folded multiline description that should
appear as a single paragraph, not as the literal `>` character.
when_to_use: when the user asks to test YAML block scalar parsing
---跑了一个 harness:通过 修复前(main - 仅自定义解析器): bug 与 #4869 报告完全一致——自定义解析器看到 修复后(本 PR - 完整解析器 + 回退): bug 修复。折叠描述按 YAML 1.2 规范解析(行之间用空格拼接,隐式 clip 带来尾部换行)。 — Qwen Code · qwen3.7-max |
|
Stepping back — this one is clean across every axis I'd want to check:
The only mild reservation is the same one I flagged in Stage 1 and 2: fallback is logged at Approving. Ship it. ✅ 中文说明退一步看——这个 PR 在我想检查的每个维度上都干净:
唯一一点保留和 Stage 1、2 一样:回退用 批准。可以合并 ✅ — Qwen Code · qwen3.7-max |
qwen-code-ci-bot
left a comment
There was a problem hiding this comment.
LGTM, looks ready to ship. ✅ The before/after harness unambiguously shows the bug fixed, tests green, scope tight.
`yaml.parse` returns `null` for bare keys (`hooks:` with no value), while the old simple parser returned `''`. The existing `!== undefined` guards don't catch null, which would cause a TypeError in parseHooksConfig and a misleading error for allowedTools. Change to `!= null` (catches both null and undefined) in both skill-load.ts and skill-manager.ts.
… guards Move the null→undefined normalization into `parse()` via `stripNullValues` so that callers keep using strict `!== undefined` checks. This avoids the loose `!= null` pattern and fixes the contract for all five call sites (skill-load, skill-manager, subagent-manager, claude-converter, rulesDiscovery) in one place.
- Prevent prototype pollution via Object.create(null) in both stripNullValues and parseSimple (fixes __proto__ injection vector) - Recursively sanitize nested objects (Date/Uint8Array/null/__proto__) - Upgrade fallback/non-object log level from debug to warn - Filter !!timestamp/!!binary explicit tags (belt-and-suspenders) - Extract shared parseAllowedToolsField helper to deduplicate skill-load.ts and skill-manager.ts - Add runtime type guard for hooksRaw in skill-manager.ts - Fix fallback tests to use genuinely invalid YAML input
The fallback path returned parseSimple() output directly, bypassing stripNullValues(). This caused inconsistent output: the main path stripped null values so callers could use `!== undefined`, but the fallback preserved them. Wrap the fallback return in stripNullValues() and add a test verifying null-stripping consistency across both paths.
|
Qwen Code review did not complete successfully: Qwen review timed out after 55 minutes. See workflow logs. |
Local runtime verification — real TUI driven via tmux (Linux) ✅Built and ran this PR's head (
Before / After (
|
| Step | Result |
|---|---|
description: > folded → full one-line description in /skills panel + popup (base: bare >) |
✅ |
description: | literal → both lines shown, newline preserved (base: bare |) |
✅ |
description: >- chomped variant → clean folded text (base: bare >-) |
✅ |
Invoked /multiline-folded end-to-end → skill body reached the model, exact marker reply SKILL_BODY_EXECUTED multiline-folded |
✅ |
Fallback path: duplicate description: key (full parser throws uniqueKeys) → skill still loads via simple-parser fallback, last-wins value displayed |
✅ |
description: 2024-01-15 → stays the plain string 2024-01-15 in the panel (no Date conversion) |
✅ |
metadata: { __proto__: … } pollution-shaped frontmatter → skill loads, CLI healthy |
✅ |
Bare description: (null) → skill absent from the panel on both builds (13 skills each) — behavior parity, no new rejection |
✅ |
Subagent surface: .qwen/agents/folded-agent.md with description: > → /agents → View Agent shows the full folded description |
✅ |
| Regression: CLI launched in the qwen-code repo itself → all 14 project skills + 7 bundled = 21/21 listed with intact descriptions, none lost to the parser swap | ✅ |
Notes (non-blocking)
- The fallback is doing real work, not just insurance: the duplicate-key fixture is rejected by
yaml.parse(duplicate keys throw in YAML 1.2) and would have vanished withoutparseSimple— it loaded with the old parser's last-wins semantics, matching base behavior exactly. - A literal-block (
|) description renders its embedded newline in the completion popup (second line uncolumned under the name). Correct YAML semantics, slightly irregular visually — only worth a thought if skill authors start using|widely;>is the natural choice for descriptions. - Skills with a bare
description:are skipped silently (no entry, no warning in the panel) — true on base as well, so not from this PR; just noting it since the new null-stripping path is what enforces it now. - Not covered here: the VS Code extension surface mentioned in the test plan (CLI only) and Windows (Linux run; macOS was covered by the author).
Merge reference
- The fix is observable exactly as claimed at every CLI surface where skill descriptions appear, the fallback preserves loading of frontmatter the strict parser rejects, and a repo-wide regression pass lost zero skills. Branch was
MERGEABLEagainstmainat verification time. - Full step-by-step readable tmux logs (3 sessions: base, PR, repo-regression) are saved locally; happy to attach any section on request.
中文摘要
在 Linux 上用 tmux 驱动真实构建的 CLI(PR head bad7f19 与 base 各一个会话),在用户真实可见的界面验证:/skills 面板、斜杠补全弹窗、真实技能调用、以及 /agents 对话框(subagent frontmatter 共用同一个 parser)。
- Base:
description: >/|/>-在面板和弹窗里只显示>、|、>-一个字符,后续行被静默丢弃 —— bug 复现。 - PR:同样的 fixture 全部显示完整描述;
>折叠为一行,|保留换行,>-去尾换行;/multiline-folded真实调用成功(模型按 body 要求精确回复标记);.qwen/agents中使用>的 subagent 描述同样完整显示。 - 探针全部通过:重复 key(full parser 抛错)经 fallback 仍以 last-wins 语义加载(与 base 行为一致);
2024-01-15保持字符串;__proto__形状的 metadata 不影响加载;裸description:两个版本都同样跳过(行为对齐);在 qwen-code 仓库本体运行回归,21/21 个技能全部正常列出,无一丢失。 - 未覆盖:VS Code 扩展界面(仅 CLI)与 Windows(本次为 Linux;作者已测 macOS)。结论:修复在所有 CLI 表面按声明生效,可作为合并参考。
|
@qwen-code /triage |
|
Thanks for the PR! 👋 Template check: headings match On direction: clearly aligned. This is a real, reproducible bug — On approach: scope feels right for the problem. The minimum change would have been "call Moving on to code review + testing. 🔍 中文说明感谢贡献! 模板: 标题与 方向: 对齐。这是真实可复现的 bug—— 方案: 范围合理。最小改动就是"两个调用点换成 进入代码审查与测试 🔍 — Qwen Code · qwen3.7-max |
2a. Code ReviewIndependent proposal (before reading diff): call What the PR actually does: matches and exceeds the independent proposal.
No critical blockers. Typecheck clean, all four test suites green (33 + 63 + 72 + 7 = 175 tests). CI green on macOS, Windows, and Linux. Lint and CodeQL pass. 2b. Real-Scenario TestingCreated a test SKILL.md with Before (main — custom parser only)Bug reproduces exactly as reported in #4869: the custom parser sees After (this PR — full parser + fallback)Bug fixed. The folded description is parsed per YAML 1.2 spec. 中文说明2a. 代码审查独立方案(先看 diff 之前): 在 PR 实际做法: 与独立方案一致甚至更好。
无关键阻塞。 类型检查通过,四个测试套件全绿(33 + 63 + 72 + 7 = 175 个测试)。CI 在 macOS、Windows、Linux 全绿。Lint 和 CodeQL 通过。 2b. 真实场景测试准备了 修复前(main): bug 与 #4869 报告完全一致——自定义解析器返回 修复后(本 PR): bug 修复。折叠描述按 YAML 1.2 规范正确解析。 — Qwen Code · qwen3.7-max |
|
Stepping back — this PR has improved materially since the first triage run. Two rounds of review with wenshao addressed every concern raised:
No reservations. Ship it. ✅ 中文说明退一步看——这个 PR 自第一次 triage 以来有了实质性的改进。与 wenshao 的两轮 review 解决了所有提出的问题:
没有保留意见。可以合并 ✅ — Qwen Code · qwen3.7-max |
qwen-code-ci-bot
left a comment
There was a problem hiding this comment.
LGTM, looks ready to ship. ✅
| */ | ||
| export function parse(yamlString: string): Record<string, unknown> { | ||
| try { | ||
| const result = yaml.parse(yamlString, { |
There was a problem hiding this comment.
[Critical] Missing uniqueKeys: false — by default the yaml library throws Map keys must be unique on duplicate keys. This triggers the silent fallback to parseSimple, which corrupts block scalars (the exact problem this PR fixes). A SKILL.md with an accidental duplicate key would load with a garbled description: '>' instead of the actual text, with no visible error in production (debugLogger.warn is a no-op without an active debug session).
| const result = yaml.parse(yamlString, { | |
| const result = yaml.parse(yamlString, { | |
| schema: 'core', | |
| uniqueKeys: false, |
— qwen3.7-plus via Qwen Code /review
| expect(typeof result['created']).toBe('string'); | ||
| }); | ||
|
|
||
| it('should sanitize nested objects recursively', () => { |
There was a problem hiding this comment.
[Suggestion] The PR description calls out "Date/Uint8Array coercion guards" but only !!timestamp/Date has test coverage. The !!binary → Uint8Array path through sanitizeValue() is untested. Add a test alongside this one:
it('should not resolve !!binary explicit tags', () => {
const input = 'name: test\ndata: !!binary SGVsbG8=';
const result = parse(input);
expect(typeof result['data']).toBe('string');
});Without this, the Uint8Array branch in sanitizeValue() (line 64) is a security guard with no regression test.
— qwen3.7-plus via Qwen Code /review
The "known limitations" tests pinned garbled output from the simple YAML parser, with a comment saying they would fail once js-yaml landed. PR QwenLM#4870 integrated the full yaml library, so these tests now expect the correct nested-YAML parse results. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The "known limitations" tests pinned garbled output from the simple YAML parser, with a comment saying they would fail once js-yaml landed. PR QwenLM#4870 integrated the full yaml library, so these tests now expect the correct nested-YAML parse results. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ity follow-up) (#4996) * fix(core): replace yaml-parser stringify with eemeli/yaml for safe nested round-trip PR #4870 swapped `parse` over to the `yaml` library so block scalars and nested structures load correctly, but left the hand-rolled `stringify` in place. The hand-rolled serializer only walks one level of nesting and emits `[object Object]` for any value below — so any caller that does `parse → modify → stringify` (e.g. SubagentManager saving a frontmatter file, or a Claude-Code-format converter writing back a `.qwen/agents/*.md`) silently corrupts nested fields like `mcpServers` or `hooks` on disk. This commit: - Delegates `stringify` to `yaml.stringify` with `lineWidth: 0`, matching the parse side's library choice and unlocking arbitrary-depth round-trip. - Drops the now-unused `formatValue` helper. - Replaces the byte-exact escape-sequence assertions with property-based `parse(stringify(x)) === x` round-trip checks — the previous tests pinned hand-rolled quote output that eemeli/yaml legitimately chooses differently from. The contract that matters at the public API boundary is round-trip, not stable bytes. - Adds two CC-shape nested round-trip tests (mcpServers + hooks) that would have been impossible to express under the old serializer. The parse-side safety guards added in PR #4870 (schema 'core', timestamp / binary tag filtering, `Object.create(null)`, Date/Uint8Array sanitize, parseSimple fallback) are preserved untouched. * feat(core): port declarative-agent mcpServers + hooks end-to-end Builds on PR #4842 (which deferred these two fields) and PR #4870 (which made `parse` nested-safe) by adding the remaining surface + runtime wiring so a `.qwen/agents/*.md` with per-agent MCP servers and hooks works the same way as the equivalent `.claude/agents/*.md`. ## Schema layer `agent-frontmatter-schema.ts` gains two lenient DL7-parity parsers: - `parseAgentMcpServers` — keeps a record-of-records shape and drops per-key entries whose value is a scalar / array / null (mirrors CC's `gS8` shallow validation; per-spec union is enforced later by the MCP loader). - `parseAgentHooks` — keeps a record-of-arrays shape and drops events whose value isn't an array (mirrors CC's `TKO` / `_u`). Both return `undefined` when no entries survive shape filtering, so the caller can omit the field entirely rather than emit an empty object. ## SubagentConfig surface - `types.ts`: adds optional `mcpServers?: Record<string, unknown>` and `hooks?: Record<string, unknown>`. - `subagent-manager.ts` `parseSubagentContent`: extracts both fields with warn-and-drop on top-level shape failure, matching the existing posture for `permissionMode` / `maxTurns` / `color`. - `subagent-manager.ts` `serializeSubagent`: emits both back to YAML. The previous skip-list carve-out in `claude-converter.ts` (`NESTED_FIELDS_NOT_ROUND_TRIPPABLE`) is gone — round-trip is safe now that the YAML stringifier is eemeli/yaml. - `claude-converter.ts`: emits `mcpServers` (was missing) when converting from a CC plugin agent file. ## Runtime wiring - `hookRegistry.ts`: new public `addAgentHooks(hooks, scopeId): () => void` appends ephemeral entries tagged with `agentScope`, runs them through the same per-definition validation pipeline as session/user/project hooks (so a malformed entry is logged + dropped instead of breaking the spawn), and returns an unregister callback. The duplicate-detection key includes `agentScope` so identical hooks from different subagents — or from a subagent and the session — coexist instead of swallowing one another. v1 scope limitation: while a subagent's entries live in the registry they fire for every event of their declared type regardless of which agent is currently active. Per-agent scope filtering at firing time is a follow-up; the limitation is documented in the user-facing doc. - `subagent-manager.ts` `buildSubagentContextOverride`: now takes the `SubagentConfig` and, when per-agent `mcpServers` are present, overrides `getMcpServers()` on the subagent's Config wrapper to return the union of session + agent servers (agent wins on key collision, matching CC's `scope: 'agent'` semantics). The skip-rebuild optimization is bypassed in this case — without a fresh `rebuildToolRegistryOnOverride` anchored on the override Config, the pre-existing wrapper-owned `McpClientManager` would still see only the session set and the discovery loop below would silently no-op. After rebuild, the loop explicitly discovers each per-agent server so its tools land in the subagent's registry before `AgentHeadless.run`. - `subagent-manager.ts` `createAgentHeadless`: when `config.hooks` is set, registers via `HookRegistry.addAgentHooks` with a per-spawn scope ID (`agent:<name>:<uuid>`) and wraps the caller-provided `AgentHooks.onStop` so the unregister callback fires after the caller's handler, in a `finally` block to survive a throwing user handler. Errors from `AgentHeadless.create` itself unregister the hooks before re-throwing. ## Tests - `agent-frontmatter-schema.test.ts`: +8 tests covering shape filtering, drop-on-bad-top-level, drop-when-empty for both new parsers. - `subagent-manager.test.ts`: +7 tests covering parse, serialize, and drop-on-malformed for both fields. - `subagent-manager-override.test.ts`: +2 tests for the session+agent MCP merge and the no-override pass-through case. - `hookRegistry.test.ts`: +4 tests for `addAgentHooks` — scope tag, no-collision with existing same-identity entries, two concurrent agents keep their own copies, empty payload no-op. * docs(core): declarative agents follow-up + yaml-parser audit - `docs/yaml-parser-replacement.md` is the new audit doc covering the PR #4870 → eemeli/yaml decision (parse-side), the security probe results (`maxAliasCount` default, `!!js/function` becomes literal string + warning, merge keys disabled by default, custom-tag filter for timestamp/binary), and the stringify-side gap this follow-up closes. - `docs/declarative-agents-port.md` status table now marks `mcpServers` and `hooks` as **shipped (follow-up)** with a one-line note on the runtime wiring strategy + v1 scope limitation for hooks. The reverse-engineering record below the table is unchanged and remains the reference for the still-deferred fields (`effort`, `memory`, `isolation`, `initialPrompt`, `skills`). - `docs/users/features/sub-agents.md` adds `mcpServers` + `hooks` rows to the CC-compatibility table and a full example frontmatter showing all four shipped fields composed together. The v1 hooks scope limitation is called out as a blockquote so users picking up per-agent hooks know to prefer fire-globally-safe handlers (logging) over behavior-mutating ones until the firing-time scope filter lands. * fix(core): self-review round 1 — proto-pollution defense + parallel MCP discovery Audited the three commits adversarially across four lenses (correctness, security, reuse, test quality). Two real findings, two test gaps. ## Real bugs fixed ### `parseAgentMcpServers` / `parseAgentHooks` could pollute the result's prototype Both helpers wrote into a plain `{}` while iterating an input object that yaml-parser hands back as null-prototype. A YAML key of literal `__proto__` survives that null-prototype guarantee as an own property, so `Object.entries(record)` walks it, and the assignment `result['__proto__'] = spec` triggers the inherited setter — silently re-wiring `result`'s prototype chain to point at the attacker's spec. Object.prototype itself stays untouched (the setter is per-instance) and the downstream spread `{ ...result }` only copies own enumerables, so the pollution doesn't directly leak through current callers. But returning an object with a hijacked prototype is a latent footgun: a future caller that uses `for…in`, `result.someKey`, or any property access that misses the own table would pick up values from the polluted chain. Switching to `Object.create(null)` makes the assignment a plain own property — matching the null-prototype invariant the rest of `yaml-parser.ts` already maintains. Two regression tests pin the defense (one per parser), each constructing a null-prototype input the way `yaml.parse` actually produces it (`{ __proto__: … }` in an object literal triggers the setter at construction and so does NOT reproduce the attack input shape). ### Per-agent MCP discovery serialised through every server `buildSubagentContextOverride` called `discoverToolsForServer` in a sequential `for…of` loop with `await`. A misbehaving server (stdio command that hangs at startup, remote endpoint that times out at the MCP layer's default 30s) blocks every following server, so the subagent spawn paid the sum of every per-server timeout instead of the max. Switched to `Promise.allSettled` over the server list. Each call still carries the MCP layer's own per-server timeout (`stdio` default 30s, remote default 5s, per-spec `discoveryTimeoutMs` override); `allSettled` only removes the serialisation between siblings. Rejections still log-and-drop so a single bad server doesn't block its siblings' tools from landing in the subagent's registry. ## Test gaps closed ### `addAgentHooks` coexistence test only asserted count The "coexists with session/user hooks of the same identity" test asserted `getAllHooks()` had length 2 after the add but did NOT verify which entries were present. A regression that dropped `agentScope` from the dedup key would still leave two entries by ordering luck and silently break concurrent-agent isolation. Replaced the bare count check with explicit assertions for `(source: User, agentScope: undefined)` + `(source: Session, agentScope: 'agent:test:def')`. ### Empty-record edge cases for `parseAgentMcpServers` / `parseAgentHooks` The existing "returns undefined when nothing survives shape filtering" tests covered the case where every key had a bad shape. The empty input case `parseAgent…({})` was on the same code path but never exercised — callers rely on the `undefined → omit field` behaviour for both. Added one test per parser. * fix(core): pr #4996 review round 1 — leak fixes via explicit dispose contract Reviewer flagged two Criticals + one Suggestion + one Nice-to-have. The two Criticals share a root cause (subagent execute()'s inner try/finally doesn't fire on every exit path), so they fold into a single API change. ## [Critical] Hook cleanup leak on AgentHeadless.execute() early exits `wrapAgentHooksForCleanup` relied on `onStop` firing inside execute()'s inner try/finally. Two early-exit paths bypass that finally: 1. `createChat()` returning null at agent-headless.ts:224-226 — returns before the outer `try` at 233 is even entered. 2. `prepareTools()` throwing at 234 — propagates through the outer `finally` at 335, which only calls `abortController.abort()` and never reaches the inner finally that fires `onStop`. The pre-fix `catch` block only guarded `AgentHeadless.create()`, not `execute()`. Leaked HookRegistry entries fire globally for every matching event in the session, polluting unrelated tool calls. ## [Critical] Per-agent MCP server processes leak after every spawn `discoverToolsForServer` connects real MCP clients (stdio child processes, HTTP/SSE sockets) in the force-rebuilt subagent ToolRegistry. Nothing stopped that registry: `Config.shutdown` only reaches the root's `this.toolRegistry`, and AgentTool's existing `agentConfig.getToolRegistry() .stop()` (fg + bg + resume finally blocks) only stops the parent's registry, not the override's distinct fresh one. Every subagent invocation that declared `mcpServers` orphaned a child process for the rest of the host process's lifetime. ## Shared fix — caller-driven `dispose` contract `SubagentManager.createAgentHeadless` now returns `{ subagent, dispose }`. Callers MUST invoke `dispose()` in the same `finally` block that wraps `subagent.execute()`. That `finally` lives in the caller's scope (AgentTool fg/bg, BackgroundAgentResumeService), which is reachable on every execute() exit — including the two early- exit paths the previous `onStop` hook never reached. `dispose` is a single closure that calls the previously-separate cleanup callbacks in order: 1. `unregisterAgentHooks` returned from `HookRegistry.addAgentHooks` (when per-agent hooks were registered). 2. `disposeRegistry` returned alongside the new `buildSubagentContextOverride` return shape `{ context, disposeRegistry }` (set only when this call force-rebuilt the registry for `mcpServers`). Both cleanups are wrapped in `try/finally` that logs and re-arms so an exception in one path doesn't block the other and doesn't double-fire. The pre-existing constructor-failure catch in `createAgentHeadless` now runs the same closure directly — the caller never received the return value, so it cannot fire `dispose` itself. The three callers gain one variable + one `void dispose?.().catch()` inside their existing finally: - `agent.ts:2039` (foreground) — finally at `agent.ts:2904` - `agent.ts:2131` (background) — finally at `agent.ts:2502` - `background-agent-resume.ts:630` (resume) — finally at `:852` Fork subagents share the parent's lifecycle; their `dispose` stays undefined and the `?.()` is a no-op. `wrapAgentHooksForCleanup` is removed — it was load-bearing only for the happy + inner-reasoning-loop-failure paths and is now obsolete. ## [Suggestion] Repeated guard condition `config.hooks && Object.keys(config.hooks).length > 0` no longer appears in both `if` / `else if` arms. Single outer guard + nested branch on `hookRegistry`. ## [Nice-to-have] claude-converter.ts:290 missing `.trim()` `stringifyYaml(newFrontmatter).trim()` brings the converter into line with `subagent-manager.ts:651`. Without trim, eemeli/yaml's trailing newline produced an extra blank line before the closing `---` delimiter — cosmetic (both readers tolerate it) but the asymmetry between the two writers was a real consistency bug. ## Tests 3 new RED-first tests in `subagent-manager.test.ts` pin the dispose contract: 1. `returns { subagent, dispose }; dispose unregisters per-agent hooks` 2. `dispose unregisters even when execute() never runs (early-exit leak fix)` — the case where `createChat()` → null or `prepareTools()` throws 3. `dispose is a safe no-op when neither hooks nor mcpServers are declared` Override test helper updated to destructure the new `buildSubagentContextOverride` return shape. AgentTool + BackgroundAgent test mocks updated to return `{ subagent, dispose }`. All 2087 in-scope tests pass. * refactor(core): /simplify cleanup pass on the dispose contract Three cleanups from a multi-angle quality review (reuse / simplification / altitude lenses, all on commit 720f0e4): 1. Drop the null-out guard inside `runCleanup`. Both inner callbacks are already idempotent at the source (`HookRegistry.addAgentHooks` filters removal by `agentScope`; `ToolRegistry.stop` documents itself idempotent), so the outer `unregisterAgentHooks = undefined` / `disposeSubagentRegistry = undefined` finally blocks were buying nothing beyond a marginal short-circuit on duplicate `dispose()` calls — at the cost of 8 LOC and a "is this load-bearing?" question for readers. Comment now states the idempotency guarantee explicitly. 2. Rename the `buildSubagentContextOverride` return field `disposeRegistry` → `cleanup`. Pairs the sibling override builder `createApprovalModeOverride` whose return shape is `ApprovalModeOverrideHandle = { config, cleanup }`. The `context` field stays as-is because `config` would shadow the same-named parameter inside this method's scope (the parent helper doesn't take a `config` parameter, which is why it can use that name). 3. Add a 4th test pinning the constructor-failure cleanup path. The `try { ... } catch { await runCleanup(); throw }` block at the end of `createAgentHeadless` runs when `AgentHeadless.create` rejects — at which point the caller has not received `{ subagent, dispose }` and cannot run cleanup itself. The three existing tests covered the happy-path and execute()-never-runs scenarios; this one closes the "constructor blows up after hooks were registered" gap. No behavior change to callers — same `{ subagent, dispose }` return shape, same dispose semantics. Findings skipped: - Parallelizing the two cleanups inside `runCleanup`: synchronous unregister + async registry stop, the `await` only blocks the registry stop; the order has no measurable cost. - Parallelizing the parent/agent registry stops at the 3 call sites: they already run concurrently because the call sites use `void X.stop().catch(...)` (fire-and-forget), not `await`. - Extracting a `executeHeadlessSubagent` helper that owns the dispose lifecycle: real win against future call-site drift, but reaches well outside the round-1 review diff into AgentTool's three execution shapes (fg sync / bg fire-and-forget / resume embedded). - Fixing `AgentHeadless.execute()`'s early-exit paths upstream: the round-1 commit's explicit altitude choice; revisiting it would re-open a settled design call.
What this PR does
Switches the YAML frontmatter parser for SKILL.md files from a hand-rolled minimal parser (
yaml-parser.ts) to the fullyamlnpm package (v2.8.1, already installed) as the primary parser, with an automatic fallback to the old simple parser when the full parser rejects malformed input. This makes skill descriptions written with YAML block scalar syntax (>folded,|literal) parse correctly instead of showing the raw>or|character.Why it's needed
The custom
yaml-parser.ts(193 lines) only handles flatkey: valuepairs, single-level arrays, and one-level nested objects. It has no support for YAML block scalar indicators (>,>-,|,|-). When a SKILL.md uses standard YAML multiline syntax likedescription: >, only the>character is captured as the description — the continuation lines are silently dropped. Claude Code handles the same SKILL.md files correctly because it uses a full YAML parser.Reviewer Test Plan
How to verify
/test-multilineor check the skill picker — the full description should display instead of>npx vitest run packages/core/src/utils/yaml-parser.test.ts packages/core/src/skills/skill-load.test.ts packages/core/src/skills/skill-load.real-parser.test.ts— all 98 tests passEvidence (Before & After)
before:


qc:
cc:
Tested on
Environment (optional)
Local runtime:
npx vitest runfor unit/integration tests. CI covers all three platforms.Risk & Scope
yaml.parse()may return different types than the old simple parser for edge cases. Mitigated by usingschema: 'core'(prevents date-like strings from becoming Date objects) and adding!= nullguards (handles bare keys returning null instead of empty string).parse()inyaml-parser.tsis a shared utility, sosubagent-manager,claude-converter, andrulesDiscoveryalso switch to the full parser. I checked each call site — they all read fields via bracket access withArray.isArray/|| ''defaults, so the null-stripping and null-prototype objects introduced here are compatible. Net effect for those callers is they gain the same block scalar support.stringifyfunction was not upgraded — it remains the simple implementation. This is safe today since all current stringify call sites build objects from scratch, but a future parse→modify→stringify roundtrip with deeply nested YAML would needyaml.stringify().Linked Issues
Closes #4869
中文说明
将 SKILL.md 的 YAML frontmatter 解析器从手写的极简 parser 切换到完整的
yamlnpm 包作为主解析器,解析失败时自动回退到旧的简单 parser。这样使用 YAML block scalar 语法(>折叠、|保留换行)编写的 skill 描述能够正确解析,而不是只显示>或|字符。根因:自研的
yaml-parser.ts(193行)只支持平铺的key: value对、单层数组和一层嵌套对象,完全没有处理 YAML block scalar 语法。当 SKILL.md 使用description: >时,只有>字符被捕获为描述内容,后续的缩进行被静默丢弃。Claude Code 使用完整的 YAML 解析器处理相同的 SKILL.md 文件,所以能正常显示。主要改动:
yaml-parser.ts:新的parse()函数优先使用yaml.parse(input, { schema: 'core' })解析,失败时回退到旧的parseSimple()skill-manager.ts:移除冗余的 hooks 二次解析,统一使用新的 parser;!== undefined改为!= null防止空值穿透skill-load.ts:同样修复allowedToolsRaw的 null guardpackage.json:将yaml添加为显式依赖parse()是共享工具函数,subagent-manager、claude-converter、rulesDiscovery等调用方也同步切换到完整解析器。已逐一确认这些调用点的用法(方括号取值 +Array.isArray/|| ''兜底)与 null 剥离、null-prototype 对象兼容,它们同时获得 block scalar 支持