refactor(core): TaskBase envelope + foreground subagent persistence#3970
Conversation
Establishes a shared `TaskBase` envelope across the agent / shell / monitor task registries with a mandatory `outputFile` field. Brings the foreground subagent path into compliance with the new contract, so it now leaves the same JSONL transcript + meta sidecar on disk that backgrounded subagents have always produced — closing the only gap where a registered task wrote nothing. Renames the agent-task discriminator from `flavor: 'foreground' | 'background'` to claw-code's `isBackgrounded: boolean`; the deprecated names are kept as one-release type aliases. PR 1 of the task-registry-unification design. PR 2 will collapse the three per-kind registries into one thin TaskRegistry plus per-kind modules.
E2E test summaryHeadless + interactive runs, fresh temp workdir per group. Baseline = global Central assertion — foreground subagent now persists (irreproducible baseline contrast). Prompt: spawn an agent (no
Graceful foreground cancel (new, interactive). Spawn a foreground subagent on a long Background shell path (regression).
On-disk status strings are all lowercase canonical — the status-type rename did not perturb them. |
The alias only preserved the type name; no in-tree caller used it,
and after the field rename no realistic external consumer use survives
(reading entry.flavor / writing { flavor: ... } both fail at the use
site regardless of whether the alias resolves). Drop it instead of
carrying a hollow shim.
wenshao
left a comment
There was a problem hiding this comment.
Two Critical issues requiring attention before merge; details in inline comments below.
- Register before writing the meta sidecar so a register() failure can't leave an orphaned 'running' meta file behind. writeAgentMeta is best-effort and never throws, so the inverse failure mode (registry entry without sidecar) is a benign degradation. - Cache getGitBranch by cwd at the agent module level so foreground launches don't pay a fresh git rev-parse exec each time. Branches don't change within a process under normal use; the transcript annotation is best-effort audit metadata. - Document on cancel() that foreground entries take a partial path through the method — Map deletion is the caller's responsibility via unregisterForeground() in the tool-call's finally path.
…reground-persistence # Conflicts: # packages/core/src/agents/background-tasks.ts # packages/core/src/services/monitorRegistry.ts # packages/core/src/tools/agent/agent.ts # packages/core/src/tools/monitor.ts
wenshao
left a comment
There was a problem hiding this comment.
Additional findings (cannot map to diff lines)
[Suggestion] packages/core/src/tools/agent/agent.ts:~1244 — 后台子 agent 路径中 writeAgentMeta() 仍在 registry.register() 之前调用(此顺序在本次 PR 修改之前已存在)。前台路径已修正为先 register 后 writeMeta,建议后台路径同步调整以避免未来的孤儿 meta 文件风险。
[Suggestion] packages/core/src/agents/background-tasks.test.ts:1222 — 测试名称 'default flavor (absent) behaves as background' 及注释引用了已移除的 flavor 字段。isBackgrounded 现在是必填 boolean,不存在"缺失"场景,建议更新测试名称。
— deepseek-v4-pro via Qwen Code /review
…tor/task-base-foreground-persistence # Conflicts: # packages/core/src/agents/background-tasks.ts # packages/core/src/services/monitorRegistry.ts # packages/core/src/tools/agent/agent.ts # packages/core/src/tools/monitor.ts
The foreground finally block in agent.ts mapped any non-ERROR, non-CANCELLED terminate mode (including MAX_TURNS, TIMEOUT, SHUTDOWN) to 'completed' in the sidecar, so post-mortem readers and resume logic saw a successful status for runs that actually hit a guardrail. Flip the ternary to mirror the background path: GOAL -> completed, CANCELLED -> cancelled, else -> failed. Also reorder the background launch so registry.register() runs before writeAgentMeta(), matching the foreground path. Both paths now share the same orphaned-meta guarantee.
The "default flavor (absent) behaves as background" test name and its backwards-compat comment referenced the old optional flavor field, but the registration shape has required isBackgrounded for a while now — there is no "absent" path to exercise. Rename it to describe what the assertion actually covers: that background entries fire a task- notification on complete.
|
Re: the additional findings review —
Done in d67db4c — background launch now registers before writing the meta sidecar, matching the foreground order. Worth noting that
Done in 441e3cc — renamed to |
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
…ithub.com/QwenLM/qwen-code into refactor/task-base-foreground-persistence
wenshao
left a comment
There was a problem hiding this comment.
Additional review findings (cannot map to diff lines):
[Suggestion] monitorRegistry.ts:444,583 — dispatchOwnerLifecycleWake and dispatchNotification still use the deprecated type alias MonitorEntry in their signatures while every other method in the file has migrated to MonitorTask. Functionally identical but creates inconsistency.
[Suggestion] monitorRegistry.ts:297 — cancel() has asymmetric ordering between its two branches (notify=false: settle-before-abort vs default: abort-before-settle). The asymmetry is intentional but undocumented at the call site. Consider adding a JSDoc note.
[Suggestion] background-tasks.ts:717 — The <output-file> XML tag in the task notification envelope includes the full absolute outputFile path, exposing filesystem structure and session IDs to the LLM. Pre-existing behavior, but foreground persistence extends the exposure surface.
— deepseek-v4-pro via Qwen Code /review
The local `BackgroundTaskStatus` union was byte-identical to the new shared `TaskStatus` defined in `tasks/types.ts`. Replace it with a `@deprecated` type alias so external consumers (notably `nonInteractiveCli.ts`) keep compiling unchanged while the canonical name lives in one place.
…rdering Two small consistency wins flagged in review: 1. `dispatchOwnerLifecycleWake` and `dispatchNotification` were the only methods on the registry still typed with the deprecated `MonitorEntry` alias. Rename their parameters to `MonitorTask` to match every other signature in the file. 2. `cancel()` orders `settle()` and `abort()` differently between its two branches, which is intentional (silent cancel locks the terminal status before abort listeners run; default cancel lets a naturally-completing operation settle through its own terminal path). Document that asymmetry in a JSDoc on the method so the next reader doesn't have to reverse-engineer it.
|
Re: the additional review findings —
Fixed in 97ca23b — renamed both signatures to
Also fixed in 97ca23b — added a JSDoc on
Leaving this one alone in this PR. As you noted it's pre-existing behavior, and the parent agent already has access to the session-keyed transcript root (it has to, to locate outputs in the first place), so the absolute path doesn't add capability the LLM lacks. Reshaping the envelope to a relative token is the kind of change that wants its own design pass rather than a quiet tack-on here. Happy to file a follow-up issue for it if you'd like to track it separately. |
…reground-persistence
The `BackgroundTaskStatus` alias was introduced in 91b59a8 as a `@deprecated` synonym for external SDK consumers (notably `nonInteractiveCli.ts`). New internal references in this PR's own file kept the old name; migrate them so the only remaining usage of the deprecated alias is the alias declaration itself. No behavior change — the alias is `= TaskStatus` so the union is identical.
The foreground finally block maps GOAL→completed, CANCELLED→cancelled, and everything else (ERROR, MAX_TURNS, TIMEOUT, SHUTDOWN) → failed. Only the GOAL branch was asserted; the failed-mode fallback had no coverage even though the same mapping recently regressed (d67da6d50) and had to be fixed by review. Adds a table-driven case mocking getTerminateMode to ERROR / MAX_TURNS / TIMEOUT and asserting patchAgentMeta receives status: 'failed'. CANCELLED is already covered by the "foreground CANCELLED prefixes the partial result" test below.
wenshao
left a comment
There was a problem hiding this comment.
Incremental review since last pass (97ca23b→84ad2b8d): 2 new commits — BackgroundTaskStatus→TaskStatus alias cleanup + foreground failed-branch test coverage. All previously flagged issues are resolved.
Note: 9 tsc type errors were detected in the worktree but all are in files NOT modified by this PR's commits (config.test.ts, InputPrompt.test.tsx, ModelStatsDisplay.test.tsx). These are pre-existing base-branch issues and should not block this PR.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
|
Re: the round-7 review — thanks for the pass. Windows CI failure noted; investigating separately and it doesn't look related to the diff in this PR. Confirmed the 9 tsc errors you flagged are pre-existing on |
wenshao
left a comment
There was a problem hiding this comment.
Incremental review since last pass (97ca23b→84ad2b8d): 2 new commits — BackgroundTaskStatus→TaskStatus alias cleanup + foreground failed-mode terminal status coverage. All previously flagged Critical issues have been resolved. Build and all 388 tests pass. Remaining findings are test-coverage Suggestions.
Extends the foreground terminate-mode it.each to assert that CANCELLED is recorded as `cancelled` in the on-disk sidecar — the existing cancel-prefix test only verified the LLM-visible payload, leaving the patchAgentMeta mapping uncovered. A regression flipping CANCELLED → 'failed' would now fail this case.
|
Re: the round-8 review — thanks for the pass. Round-8 disposition of the 5 suggestions:
Windows CI failure still tracked separately — looking unrelated so far. |
wenshao
left a comment
There was a problem hiding this comment.
Review Summary (Round 6)
No new issues found. All previously-identified concerns from rounds 1-5 have been addressed in follow-up commits.
Verified
flavor→isBackgroundedrename: all gate polarities correctly flipped across agent registry, UI dialog, resume service, and tests- Foreground persistence wiring: cleanup ordering is correct — JSONL writer released before meta sidecar patched, ensuring flushed transcript by the time sidecar reports terminal status
fgTerminalStatusmapping: GOAL → completed, CANCELLED → cancelled, else → failed (covers ERROR/MAX_TURNS/TIMEOUT/SHUTDOWN)- Register-before-write ordering:
registry.register()runs beforewriteAgentMeta()on both foreground and background paths, preventing orphaned meta files getGitBranchmemoization: cached per cwd, eliminating per-launchexecSync- Deprecated type aliases:
BackgroundTaskEntry,BackgroundTaskFlavor,AgentDialogEntry,BackgroundTaskStatusall preserved as one-release aliases
Deterministic
| Check | Result |
|---|---|
| Build | ✅ Pass |
| Tests (220) | ✅ All pass |
| eslint | 0 issues |
| tsc | 9 type errors in test files (appear pre-existing; config.test.ts, InputPrompt.test.tsx, ModelStatsDisplay.test.tsx) |
CI
Windows test failure is pre-existing and not caused by this PR.
— DeepSeek/deepseek-v4-pro via Qwen Code /review
The outputFile/metaPath regexes hardcoded forward slashes, so the foreground JSONL+meta reservation test failed on Windows where paths use backslashes. Accept either separator.
|
Re: the round-6 review — thanks for the pass, and good news on the only blocker. The Windows CI failure was not pre-existing — it was caused by this PR. The Root-caused and fixed in db7d695 — the two directory-separator tokens now accept either separator ( Once the Windows check goes green, this should be clear to restore to Approve. Could you take another look when CI completes? |
…putFile contract A throwing register() subscriber in executeBackground() would leak the already-spawned child + open output stream, unreachable by /tasks / task_stop. Mirror the promote path's defensive try/catch: abort the entry's controller, destroy the stream, and rethrow so the launch fails visibly. Also correct the TaskBase.outputFile contract: agent JSONL is materialized on the writer's first append, which is the launch prompt at attach time — not the first runtime event. A subagent cancelled before any event still leaves a prompt-only JSONL plus meta, not meta alone.
|
E2E verification (db7d695): ✅ A/B/C all green on
Local pre-flight (PR branch
Note on the Repro env: macOS, Node 22.22.2, two isolated cwds |
…th regex `expect.stringMatching(/subagents\/test-session-id\/agent-...$/)` fails on Windows because path.join() uses backslashes there. Switch to `[\\/]` to accept either separator, matching the existing pattern at line 1741. The test (added in #3970) has been silently red on Windows since that merge; nobody noticed until this PR's rebase made CI run. Caught by the CI loop on PR #3982.
…th regex `expect.stringMatching(/subagents\/test-session-id\/agent-...$/)` fails on Windows because path.join() uses backslashes there. Switch to `[\\/]` to accept either separator, matching the existing pattern at line 1741. The test (added in #3970) has been silently red on Windows since that merge; nobody noticed until this PR's rebase made CI run. Caught by the CI loop on PR #3982.
…th regex `expect.stringMatching(/subagents\/test-session-id\/agent-...$/)` fails on Windows because path.join() uses backslashes there. Switch to `[\\/]` to accept either separator, matching the existing pattern at line 1741. The test (added in #3970) has been silently red on Windows since that merge; nobody noticed until this PR's rebase made CI run. Caught by the CI loop on PR #3982.
Summary
TaskBaseenvelope (id,kind,description,status,startTime,endTime?,outputFile,outputOffset,notified,abortController) that the agent, shell, and monitor task types now intersect into. The agent-task discriminator changes fromflavor: 'foreground' | 'background'toisBackgrounded: boolean; the old name is kept as a one-release deprecated type alias for external SDK consumers. The foreground subagent path now wires the same JSONL transcript writer and meta sidecar that backgrounded subagents already had, so every registered task writes to a path on disk.outputFileforces the foreground gap closed as a direct consequence of the new contract — post-mortem of a cancelled or crashed foreground subagent now has on-disk evidence beyond what made it into the parent's tool result. The rename aligns vocabulary with the upstream design (isBackgrounded: boolean) and removes a string-union discriminator that every read site was already evaluating as a binary fact.flavor→isBackgroundedrename is mechanical but the gates flip polarity — confirm each rewritten gate evaluates to the same boolean per entry as before (background path still emits the XML notification, still pins the headless holdback; foreground path still skips both). Second, the foreground persistence wiring's cleanup ordering — the JSONL writer is released before the meta sidecar is patched to its terminal status, so a post-mortem reader sees a flushed transcript by the time the sidecar reportscompleted/failed/cancelled. Third, the deprecatedBackgroundTaskEntryandBackgroundTaskFlavoraliases — they preserve type names for one release; runtime payloads need to migrate to the newisBackgroundedfield (all in-tree callers already have).Validation
run_in_background: true. Run against the globalqwen0.15.8 binary as a baseline and againstnode dist/cli.jsfor verification.subagents/<sessionId>/agent-<id>.jsonlandagent-<id>.meta.jsonwith a terminal status; background subagent run unchanged; no<task-notification>envelope for foreground; envelope still emitted for background.node dist/cli.js "Use the agent tool to spawn a subagent that reads package.json and lists top-level keys" --approval-mode yolo --output-format json. After it returns, check~/.qwen/projects/<sanitized-cwd>/subagents/<sessionId>/— there should be a.jsonlwith at least one record and a.meta.jsonwithstatus: "completed". Repeat withrun_in_background: trueand confirm the same files plus a<task-notification>envelope in the headless stream.Scope / Risk
unregisterForeground); not addressed here. The cleanup ordering in the foregroundfinallyblock is load-bearing — closing the writer before patching the meta is what guarantees post-mortem readers see a flushed transcript matching the reported status.AgentTerminateMode.ERROR) is not externally reproducible from a model-controlled prompt —ERRORrequires chat-init failure or a thrown exception inside the reasoning loop, neither of which the prompt surface can trigger. The code path is exercised by unit tests and shares the samepatchAgentMetaternary as the verified A1 (completed) and A3 (cancelled) branches.BackgroundTaskEntryis renamed toAgentTask;BackgroundTaskFlavoris no longer used. Both are kept as deprecated type aliases for one release for external SDK consumers; the alias resolves to the new shape, so any consumer constructing the oldflavor-shaped payload at runtime needs to migrate toisBackgrounded: boolean. All in-tree callers (agent tool's foreground/background register sites, the resume service, dialog/hook render paths) are updated.Testing Matrix
Testing matrix notes: tested on Linux against
node dist/cli.js. Other platforms / install paths not exercised; no platform-specific code in this PR.