feat(cli): headless support and SDK task events for background agents by tanzhenxin · Pull Request #3379 · QwenLM/qwen-code

tanzhenxin · 2026-04-17T03:17:48Z

TLDR

Second PR in the background-subagent stack. The base PR (#3076) added the run_in_background parameter and interactive notification delivery. This PR takes three next steps:

Unifies cron and background notifications into a single typed queue so they render consistently in the UI and drain through the same loop.
Extends background agents to headless mode. Headless runs now hold the process open until background agents finish, drain their notifications as additional turns, and exit cleanly on SIGINT/SIGTERM.
Emits structured SDK events (task_started, task_notification) on the JSON output stream so programmatic consumers can track background agents without parsing display text.

Also includes a round of review-driven fixes for the base feature's permission handling, cancellation, error propagation, and a subtle assignment-order race that made the headless notification drain silently stall.

Screenshots / Video Demo

N/A — this is a headless/SDK-facing change plus a UI consistency fix. The observable differences:

Interactive: cron fires now render as a single ● Cron: <label> notification line, matching background agent notifications, instead of ● Cron: … followed by a duplicate > … user-message line.
Headless: a run that launches a background agent now waits for the agent to finish (and for the parent to respond to the resulting notification) before exiting, and the JSON stream carries system/task_started and system/task_notification events alongside the existing user message.

Dive Deeper

Why unify cron and background notifications

The base PR built a dedicated notification queue for background agents while cron stayed on its own separate queue. Two queues meant two drain loops, two idle-detection paths, two places where rendering logic had to stay in sync, and two places where bugs had to be fixed. The unified queue carries a typed discriminator so producers stay distinct but consumers share one pipeline. The consistent ● <label> rendering is a direct consequence.
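
A minimal sketch of the unified-queue idea, assuming the item shape named in the commit message (`{ displayText, modelText, sendMessageType }`). The `NotificationQueue` class, the helper functions, and the string values standing in for `SendMessageType` are illustrative names, not the PR's actual code.

```typescript
// Hypothetical stand-in for the real SendMessageType enum.
type SendMessageType = 'notification' | 'cron';

// Item shape shared by both producers, as described in the commit message.
interface QueueItem {
  displayText: string; // short label shown in the UI
  modelText: string; // full text sent to the model
  sendMessageType: SendMessageType; // typed discriminator
}

class NotificationQueue {
  private items: QueueItem[] = [];

  push(item: QueueItem): void {
    this.items.push(item);
  }

  // Drain in FIFO order through one consumer callback; returns the count handled.
  drain(handle: (item: QueueItem) => void): number {
    let handled = 0;
    let next: QueueItem | undefined;
    while ((next = this.items.shift()) !== undefined) {
      handle(next);
      handled++;
    }
    return handled;
  }
}

// Producers stay distinct...
function cronItem(prompt: string): QueueItem {
  return { displayText: `Cron: ${prompt}`, modelText: prompt, sendMessageType: 'cron' };
}

function agentItem(summary: string, modelText: string): QueueItem {
  return { displayText: summary, modelText, sendMessageType: 'notification' };
}

// ...while rendering is shared, which is where the consistent "● <label>" line comes from.
function renderLine(item: QueueItem): string {
  return `● ${item.displayText}`;
}
```

With one queue there is a single place to decide ordering and rendering; the discriminator is only consulted where behavior must differ, such as prompt preprocessing for cron.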

Why headless deserves full support

Background agents are most useful to headless / SDK consumers: a script that spawns a long-running agent and polls or streams its progress is the exact scenario the feature was built for. The original implementation rejected run_in_background in non-interactive mode because the notification delivery loop only existed in the interactive path. That gap is now closed: headless runs have a cron-and-notification drain loop and a terminal hold-back phase that waits for running background agents before emitting the final result. SIGINT/SIGTERM cleanly aborts any in-flight agents rather than pinning the process until they complete on their own.
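
The hold-back phase can be pictured roughly as follows. This is a sketch under stated assumptions: `registry.getRunning()` and `registry.abortAll()` are named in the commit messages, but the `RegistryLike` interface, the `holdBack` function, its parameters, and its return values are hypothetical.

```typescript
// Hypothetical minimal view of the background task registry.
interface RegistryLike {
  getRunning(): readonly unknown[];
  abortAll(): void;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Keep the process alive until no agent is running and no notification is queued.
async function holdBack(
  registry: RegistryLike,
  drainQueue: () => Promise<void>, // runs queued notifications as extra turns
  queueLength: () => number,
  signal: AbortSignal,
  pollMs = 100, // the commit message mentions a 100ms poll
): Promise<'completed' | 'cancelled'> {
  for (;;) {
    if (signal.aborted) {
      // SIGINT/SIGTERM: stop agents promptly instead of waiting them out.
      registry.abortAll();
      return 'cancelled';
    }
    await drainQueue();
    // Check both conditions: an agent may have pushed a notification while draining.
    if (registry.getRunning().length === 0 && queueLength() === 0) {
      return 'completed';
    }
    await sleep(pollMs);
  }
}
```

A caller would emit the final result on `'completed'` and take the cancellation path (non-zero exit) on `'cancelled'`.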

Why SDK events in addition to user messages

The notification already arrives as a user-role message carrying the display line. That's fine for conversational use but awkward for SDK consumers who want to know when an agent started, when it finished, what its status was, and what resources it used — without parsing free-form text. The new task_started and task_notification system events expose that data as a structured payload with task_id, status, and usage fields. Agent output still lands in the user message for conversational history; the structured event is additive.

Bug fixes rolled into this PR

Review and E2E testing surfaced several correctness issues that needed to land alongside the new surface area:

Background agents were losing the parent's resolved approval mode — the foreground path uses a config override so that trusted folders or agent-level approvalMode settings apply correctly; the background path originally bypassed that and fell back to the raw session config.
Non-interactive cron drains could race. Concurrent cron fires started overlapping drain loops, which in turn let the final-result emission fire while a cron turn was still streaming. The drain is now single-flight, and the cron-wait check won't resolve while a drain is active.
Queued-turn tool execution was dropping the stream-json approval update callback. In that input mode, permission-gated tools invoked by a notification follow-up would hang waiting for approvals that never reached the SDK.
Queued-turn API errors were being silently formatted as assistant text. Text-mode failures now surface as non-zero exit codes, matching the main turn loop.
Interactive cron prompts lost @/slash/shell preprocessing. The unification refactor accidentally short-circuited cron alongside plain notifications. Only notifications skip preprocessing; cron goes through the normal prompt path.
Cron prompts rendered twice (● Cron: plus a > user message) once preprocessing was restored. The user-message history item is now suppressed for cron specifically, so preprocessing runs but rendering stays single.
SIGINT during the cron wait phase was ignored. Recurring cron jobs never drop the scheduler size to zero, so the previous abort path was unreachable. An abort listener now resolves the cron promise so the hold-back loop can run.
The notification drain silently stalled in headless mode. A JavaScript assignment-order race in the single-flight drain wrapper left the drain reference stuck on a resolved promise forever, so callbacks that pushed notifications onto the queue never got processed. The fix moves the reference-clearing into a microtask that runs after the outer assignment.

Design-level decisions worth noting

Forks keep the full parent tool list. The fork path intentionally passes the parent's tool declarations verbatim (including agent and cron tools) so the fork shares the parent's DashScope cache prefix exactly. Forks are context-sharing extensions, not isolated subagents — the general subagent exclusion list doesn't apply. Recursive forks are still blocked by the ALS-based guard.
Background agents inherit the resolved approval mode. If a trusted folder escalates the parent from default to auto-edit, the background agent sees auto-edit too. auto-edit's auto-approval still only applies to edit-like confirmation types; anything else (notably shell) hits the background-specific deny-by-default, which is the only path that makes sense when there is no UI to prompt.
Background forks use the fork construction path. The base PR's background branch constructed a plain headless agent even when the tool call was an implicit fork, which threw away the parent-context fork setup. Background forks now go through the fork construction path too.
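
The permission behavior described above can be sketched as a small decision function. This is a simplified illustration, not the PR's code: the type names, the `ruleDecision` and `hookDecision` fields, and the `'edit' | 'exec' | 'other'` confirmation types are assumptions, and only the `default` and `auto-edit` modes are modeled.

```typescript
type ApprovalMode = 'default' | 'auto-edit';
type ConfirmationType = 'edit' | 'exec' | 'other'; // hypothetical categories
type Decision = 'allow' | 'deny';

interface BackgroundPermissionRequest {
  ruleDecision?: Decision; // explicit allow/deny permission rule, if any matched
  approvalMode: ApprovalMode; // the *resolved* mode inherited from the parent
  confirmationType: ConfirmationType;
  hookDecision?: Decision; // result of a PermissionRequest hook, if one decided
}

function decideBackgroundPermission(req: BackgroundPermissionRequest): Decision {
  // 1. Permission rules apply the same way as in the foreground.
  if (req.ruleDecision !== undefined) return req.ruleDecision;
  // 2. auto-edit auto-approves edit-like confirmations only.
  if (req.approvalMode === 'auto-edit' && req.confirmationType === 'edit') return 'allow';
  // 3. A hook may still override the denial.
  if (req.hookDecision !== undefined) return req.hookDecision;
  // 4. No UI to prompt, so deny by default.
  return 'deny';
}
```

For example, in a trusted folder that resolves to `auto-edit`, an edit is allowed while a shell command falls through to the deny-by-default step.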

Reviewer Test Plan

The most impactful behaviors to verify manually:

Interactive cron rendering. With QWEN_CODE_ENABLE_CRON=1, ask the model to schedule a one-minute cron and wait for a fire. The fire should appear as a single ● Cron: <label> line, followed by the model's response. No duplicate user-prompt line above it.
Interactive mixed producers. Schedule a cron, then launch a background agent before the first cron fire. Both producers should deliver ●-prefixed notifications in FIFO order, each triggering its own model turn.
Cron @file / slash preprocessing. Schedule a cron whose prompt uses @<path>; when it fires, verify the file contents are expanded into the prompt exactly as they would be for a typed user message.
Headless background agent happy path. In --output-format json, ask the model to launch a background agent and say LAUNCHED. The process should exit 0 after the agent completes; the stream should contain system/task_started, the drain's user message, system/task_notification with the structured payload, and the drain-turn assistant response before the final result/success.
Headless failed / graceful-failure background agent. Same shape, but with a prompt that forces the subagent to handle a missing path. The task_notification payload's status should reflect reality (typically completed because the subagent handles the error gracefully).
Headless SIGTERM during cron. With a recurring cron plus a background agent, send SIGTERM. The process should exit cleanly within a couple of seconds rather than pinning until SIGKILL.
Interactive background agent permissions. From a default-mode session in a trusted folder, launch a background agent that only needs shell access. Shell confirmations should be auto-denied (no UI available); the notification should reflect that. In the same session, an edit-type tool should succeed because the resolved approval mode is auto-edit.

Unit suite:

cd packages/cli && npx vitest run src/nonInteractiveCli.test.ts src/ui/hooks/useGeminiStream.test.tsx
cd packages/core && npx vitest run src/agents/background-tasks.test.ts src/tools/agent/agent.test.ts

Testing Matrix

	🍏	🪟	🐧
npm run	❓	❓	✅
npx	❓	❓	❓
Docker	❓	❓	❓
Podman	❓	-	-
Seatbelt	❓	-	-

Linked issues / bugs

Stacks on #3076.

Migrate cron from its own queue (cronQueueRef / cronQueue) to the shared notification queue used by background agents. Both producers now push the same item shape { displayText, modelText, sendMessageType } and a single drain effect / helper processes them in FIFO order.

Cron fires render as HistoryItemNotification (● prefix) instead of HistoryItemUser (> prefix), with a "Cron: <prompt>" display label. Records use subtype 'cron' for clean resume and analytics separation.

Lift the non-interactive rejection for background agents. Register a notification callback in nonInteractiveCli.ts with a terminal hold-back phase (100ms poll) that keeps the process alive until all background agents complete and their notifications are processed.

Emit `task_started` when a background agent registers and `task_notification` when it completes, fails, or is cancelled, so headless/SDK consumers can track lifecycle without parsing display text. Model-facing text is now structured XML with status, summary, truncated result, and usage stats. Completion stats (tokens, tool uses, duration) are captured from the subagent and included in both the SDK payload and the model XML.

- Background subagents now inherit the resolved approval mode from agentConfig instead of the raw session config, so a subagent with `approvalMode: auto-edit` (or execution in a trusted folder) keeps that override when it runs asynchronously.
- Non-interactive cron drains are single-flight: concurrent cron fires now await the same in-flight drain, and the cron-done check gates on it, preventing the final result from being emitted while a cron turn is still streaming.
- Background forks go through createForkSubagent so they retain the parent's rendered system prompt and inherited history instead of degrading to a plain FORK_AGENT.

…rain

- Hold-back loop now reacts to SIGINT/SIGTERM: when the main abort signal fires it calls registry.abortAll() so background agents with their own AbortControllers stop promptly instead of pinning the process open.
- Queued-turn tool execution forwards the stream-json approval update callback (onToolCallsUpdate) so permission-gated tools inside a background-notification follow-up emit can_use_tool requests.
- Queued-turn stream loop mirrors the main loop's text-mode handling of GeminiEventType.Error, writing to stderr and throwing so provider errors produce a non-zero exit code instead of silently succeeding.
- Interactive cron prompts go through the normal slash/@-command/shell preprocessing again; only Notification messages skip that path.

Cron prompts already render as a `● Cron: …` notification via the queue drain, so adding them again as a `USER` history item produced a duplicate `> …` line.

The non-interactive cron phase awaits a Promise that resolves only when scheduler.size reaches 0 and no drain is in flight. Recurring cron jobs never drop the scheduler size to 0 on their own, so the previous abort handling (added to the hold-back loop) was unreachable — the process hung indefinitely after SIGINT/SIGTERM. Attach an abort listener inside the promise so abort stops the scheduler and resolves immediately, allowing the hold-back loop to run and the process to exit cleanly.
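
A minimal sketch of that abort listener, assuming a scheduler exposing `size` and `stop()`. The `SchedulerLike` interface, `createCronWait`, and the `isDraining` callback are hypothetical names for illustration; the real code in nonInteractiveCli.ts may be structured differently.

```typescript
// Hypothetical minimal view of the cron scheduler.
interface SchedulerLike {
  readonly size: number;
  stop(): void;
}

function createCronWait(
  scheduler: SchedulerLike,
  isDraining: () => boolean,
  signal: AbortSignal,
): { done: Promise<void>; checkDone: () => void } {
  let settle: () => void = () => {};
  const done = new Promise<void>((resolve) => {
    settle = resolve; // the executor runs synchronously, so settle is set here
  });

  // Normal path: resolve once no jobs remain and no drain is in flight.
  const checkDone = (): void => {
    if (scheduler.size === 0 && !isDraining()) settle();
  };

  // Abort path: a recurring job keeps size > 0 forever, so abort must resolve directly.
  const onAbort = (): void => {
    scheduler.stop();
    settle();
  };
  if (signal.aborted) onAbort();
  else signal.addEventListener('abort', onAbort, { once: true });

  checkDone();
  return { done, checkDone };
}
```

The caller would invoke `checkDone()` after each cron fire and each drain, then `await done` before entering the hold-back loop.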

Plumb the scheduler's callId into AgentToolInvocation via an optional setCallId hook on the invocation, detected structurally in buildInvocation. The agent tool forwards it as toolUseId on the BackgroundTaskRegistry entry so completion notifications can carry a <tool-use-id> tag and SDK task_started / task_notification events can emit tool_use_id — letting consumers correlate background completions back to the original Agent tool-use that spawned them.

drainLocalQueue wrapped its body in an async IIFE and cleared the promise reference via finally. When the queue is empty the IIFE has no awaits, so its finally runs synchronously as part of the RHS of the assignment `drainPromise = (async () => {...})()`, clearing drainPromise BEFORE the outer assignment overwrites it with the resolved promise. The reference then stayed stuck on that fulfilled promise forever, so later calls short-circuited through `if (drainPromise) return drainPromise` and never processed queued notifications.

Symptom: in headless `--output-format json` (and `stream-json`), task_started emitted but task_notification never did, even after the background agent completed. The process sat in the hold-back loop until SIGTERM.

Fix: move the null-clearing out of the async body into an outer `.finally()` on the returned promise. The `.finally()` callback runs as a microtask after the current synchronous block, so it clears the drainPromise reference after the outer assignment has happened, rather than before it where the assignment would overwrite the null.
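
The race is easy to reproduce in isolation. The sketch below is a standalone reconstruction, not the PR's `drainLocalQueue`: `makeDrainer`, its `fixed` flag, and the `isHeld` probe are hypothetical, and `await Promise.resolve()` stands in for an awaited model turn.

```typescript
function makeDrainer(fixed: boolean) {
  const queue: string[] = [];
  const processed: string[] = [];
  let drainPromise: Promise<void> | null = null;

  const body = async (clearInside: boolean): Promise<void> => {
    try {
      while (queue.length > 0) {
        const item = queue.shift() as string;
        await Promise.resolve(); // stand-in for an awaited model turn
        processed.push(item);
      }
    } finally {
      // Buggy variant: with an empty queue there is no await, so this runs
      // synchronously, BEFORE the caller's assignment below.
      if (clearInside) drainPromise = null;
    }
  };

  const drain = (): Promise<void> => {
    if (drainPromise) return drainPromise; // single-flight short-circuit
    if (fixed) {
      // Fixed variant: the .finally() callback is a microtask, so it runs
      // after this assignment has completed.
      drainPromise = body(false).finally(() => {
        drainPromise = null;
      });
    } else {
      // The assignment overwrites the null written by the inner finally.
      drainPromise = body(true);
    }
    return drainPromise;
  };

  return { queue, processed, drain, isHeld: () => drainPromise !== null };
}
```

With `fixed = false`, calling `drain()` once on an empty queue leaves `isHeld()` true forever, and an item pushed afterwards is never processed. With `fixed = true`, the reference is cleared once the drain settles and a later `drain()` picks up the new item.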

tanzhenxin · 2026-04-17T03:20:08Z

E2E Test Report

Plan: knowledge/qwen-code/e2e-tests/background-subagent.md (12 test cases — trimmed from 16, dropped one schema smoke test and three strict derivatives).

Environment: Linux, bundled build (node dist/cli.js, qwen-code 0.14.5), model glm-5.1 via DashScope compatible-mode endpoint, approval mode yolo except where noted.

Results

Group	Test	Mode	Result
B (Execution)	B1: Background agent completes and notifies parent	Interactive	✅ PASS
B (Execution)	B2: Multiple background agents can run concurrently	Interactive	✅ PASS
C (Error)	C1: Background agent failure delivers error notification	Interactive	✅ PASS
D (Cleanup)	D1: Background agents cleaned up on session exit	Interactive	✅ PASS
E (Permissions)	E1: Background agent denies tool calls needing interactive approval	Interactive	✅ PASS
F (Headless)	F1: Background agent accepted in headless mode	Headless	✅ PASS
F (Headless)	F2: Hold-back waits for background agent before exit	Headless	✅ PASS
G (Cron render)	G1: Cron fire renders as notification, not user message	Interactive	✅ PASS
H (Mixed)	H1: Cron and background agent notifications coexist	Interactive	✅ PASS
I (Headless mixed)	I1: Both producers drain through shared queue in headless	Headless	✅ PASS
J (XML)	J1: Notification contains XML with task metadata	Interactive	✅ PASS
K (SDK events)	K1: task_started and task_notification events on stream	Headless	✅ PASS

All 12 tests pass against the final branch state.

Regressions found and fixed during this pass

Several of the commits in this PR came out of the E2E run itself:

G1/H1 cron dual-render. Restoring @/slash preprocessing for cron prompts accidentally re-added a user-role history item alongside the ● Cron: notification. Fix suppresses the user-message addItem for cron type only.
I1 SIGTERM hang. A recurring cron keeps scheduler.size > 0 forever, so the cron-wait await new Promise had no path to resolve on SIGTERM. The hold-back abort path was unreachable until this was fixed. Now an abort listener on the cron promise resolves it so the hold-back loop can run and emitResult can fire.
K1/K2/L1 notification drain stall. The single-flight drainLocalQueue had a JavaScript assignment-order race: when the queue was empty, the async IIFE ran to completion synchronously (no awaits), and its finally { drainPromise = null } ran during the RHS of drainPromise = (async () => {...})(), so the outer assignment overwrote the null with the resolved promise. The reference got stuck forever, future calls short-circuited without ever draining, and notifications pushed after the first call were silently lost. The fix moves the null-clearing into a .finally() microtask attached to the promise after the outer assignment.

A structured-debugging journal for the drain race is in knowledge/qwen-code/investigations/task-notification-not-emitted-headless.md.

Representative K1 output after the fix

Fresh run, 15s total:

system/init
assistant/
assistant/
user/
system/task_started
user/
assistant/
user/
system/task_notification
assistant/
assistant/
result/success

task_notification.data:

{
  "task_id": "Explore-1776395495122",
  "tool_use_id": "tool-d4b991a27c684e72be65fb3df4796025",
  "status": "completed",
  "usage": { "total_tokens": 26355, "tool_uses": 5, "duration_ms": 10197 }
}
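
A consumer could track background agents from the JSON stream along these lines. This is a sketch, not an SDK API: the payload fields mirror the sample above, but the event envelope (`type`, `subtype`, `data`), the presence of `task_id` on `task_started`, the `'running'` placeholder status, and the `trackTasks` function are assumptions.

```typescript
interface TaskUsage {
  total_tokens: number;
  tool_uses: number;
  duration_ms: number;
}

// Fields taken from the sample payload; optionality is an assumption.
interface TaskEventData {
  task_id: string;
  tool_use_id?: string;
  status?: string;
  usage?: TaskUsage;
}

// Assumed envelope: "system/task_started" is read as type "system", subtype "task_started".
interface StreamEvent {
  type: string;
  subtype?: string;
  data?: TaskEventData;
}

// Fold a sequence of JSON lines into the latest known status per task_id.
function trackTasks(jsonLines: string[]): Map<string, string> {
  const statusById = new Map<string, string>();
  for (const line of jsonLines) {
    if (line.trim() === '') continue;
    const event = JSON.parse(line) as StreamEvent;
    if (event.type !== 'system' || event.data === undefined) continue;
    if (event.subtype === 'task_started') {
      statusById.set(event.data.task_id, 'running');
    } else if (event.subtype === 'task_notification') {
      statusById.set(event.data.task_id, event.data.status ?? 'unknown');
    }
  }
  return statusById;
}
```

Any task still mapped to `'running'` at the end of the stream would indicate a `task_started` without a matching `task_notification`.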

What was dropped from the plan (compared to the earlier working copy)

A1 (schema has run_in_background) — B1 covers it implicitly.
F3 (multiple bg agents in headless) — covered by B2 + F1/I1.
K2 (failed bg agent → task_notification) — identical code path to K1, only the status field differs. Verified once inline after the drain-race fix; payload shape matches.
L1 (completion stats) — same code path as K1; verified once as the primary repro case for the drain-race fix.

github-actions · 2026-04-17T03:20:36Z

📋 Review Summary

This PR successfully extends the background-subagent feature to headless mode, unifies cron and background notifications into a single queue, and adds structured SDK events for programmatic consumers. The implementation is well-architected with careful attention to race conditions, cleanup, and consistency between interactive and non-interactive modes. Several important bug fixes from the base PR are included, particularly around approval mode propagation and fork subagent context preservation.

🔍 General Feedback

Strong architectural decision: Unifying cron and background notifications into a single typed queue eliminates code duplication and ensures consistent rendering
Excellent race condition handling: The drainPromise single-flight pattern with .finally() null-clearing shows deep understanding of async timing issues
Comprehensive headless support: Background agents now work in non-interactive mode with proper hold-back logic and SIGINT/SIGTERM handling
Well-structured SDK events: The task_started and task_notification system events provide clean structured data without breaking conversational history
Thoughtful bug fixes: Approval mode propagation, fork context preservation, and toolUseId tracking address critical gaps from the base PR

🎯 Specific Feedback

🔴 Critical

packages/cli/src/nonInteractiveCli.ts:548-560 - The drainLocalQueue function's race condition prevention is clever but fragile. The comment explains the issue well, but this pattern should be extracted into a reusable helper with tests. Consider creating a SingleFlightDrainQueue<T> class that encapsulates this logic.
packages/core/src/tools/agent/agent.ts:1020-1030 - Background agents in non-interactive mode now work, but the comment states "PermissionRequest hooks still run and can override the denial" - this needs verification that the hook system actually functions correctly in headless mode where there's no UI to present prompts.

🟡 High

packages/cli/src/nonInteractiveCli.ts:470-475 - The localQueue.push for cron jobs creates a simple string label, but the background agent path includes rich metadata (sdkNotification). Consider making the cron queue items structurally consistent with notification items for easier future maintenance.
packages/core/src/agents/background-tasks.ts:230-245 - The XML notification format is good for model consumption, but the MAX_RESULT_LENGTH truncation at 2000 chars could cut off important error context. Consider making this configurable or at least logging when truncation occurs.
packages/cli/src/ui/hooks/useGeminiStream.ts:1928 - The unified queue drain effect fires on every notificationTrigger change. If multiple notifications arrive rapidly, this could cause excessive re-renders. Consider debouncing or batching.

🟢 Medium

packages/core/src/agents/background-tasks.ts:68-77 - The registerCallback try-catch silently swallows errors. At minimum, this should log the error (which it does), but consider whether failing the callback should fail the registration or at least surface to the caller.
packages/core/src/tools/agent/agent.ts:1027-1038 - The fork vs non-fork background agent creation is duplicated logic that could be extracted. The comment helps, but a private method like createBackgroundSubagent() would improve readability.
packages/cli/src/nonInteractiveCli.ts:706-713 - The cleanup in finally block catches and ignores errors, which is reasonable, but consider logging at debug level if cleanup fails for troubleshooting.
packages/core/src/services/chatRecordingService.ts:318-322 - The recordNotificationLike helper is good, but the subtype parameter could be a union type that's extended when new notification types are added, providing compile-time safety.

🔵 Low

packages/cli/src/nonInteractiveCli.ts:268 - The LocalQueueItem interface is defined inline. Consider moving this to a shared types file if it might be reused or referenced elsewhere.
packages/core/src/agents/background-tasks.ts:25 - The AgentCompletionStats interface uses toolUses but the XML output uses tool_uses (line 241). Pick one naming convention for consistency.
packages/cli/src/ui/utils/resumeHistoryUtils.ts:262-267 - The fallback text for cron vs notification is a nice touch, but these strings should be i18n'd if the project has internationalization.
packages/core/src/tools/agent/agent.ts:410 - The callId property is optional but has no JSDoc explaining when it might be absent.

✅ Highlights

Excellent comment quality: The detailed comments explaining the drainPromise race condition (lines 524-536) and the fork subagent context preservation (lines 1027-1034) are exemplary - they explain the "why" clearly
Proper cleanup patterns: Signal handler cleanup, callback nulling in finally blocks, and registry cleanup show good resource management discipline
Test coverage: The new tests for toolUseId propagation (both presence and absence cases) demonstrate thorough edge case thinking
Consistent rendering: Cron notifications now match background agent notifications visually (● <label> format), improving UX consistency
Structured event design: The SDK events cleanly separate conversational history (user messages) from programmatic tracking (system events) without duplication

…sn't erase the line

Headless text mode wrote `resultMessage.result` without a trailing newline. In a TTY, zsh themes that use PROMPT_SP (powerlevel10k, agnoster, …) detect the missing `\n` and emit `\r\033[K` before drawing the next prompt, which wipes the final line off the screen. Pipe-captured output was unaffected, so the bug only surfaced for interactive shell users, most visibly in the background-agent flow where the drain-loop's final assistant message is the *only* stdout write in text mode.

Append `\n` to both the success (stdout) and error (stderr) writes.

Mirror the simplified blurb from .claude/skills/structured-debugging/SKILL.md (knowledge repo). Drops the round-by-round narrative; keeps the contradiction + two lessons.

…neralized path, value-logging guidance)

Mirror of knowledge repo commit 38eb28d into the qwen-code .qwen/skills copy.

…ging/

Mirrors knowledge/.claude/skills/structured-debugging/examples/headless-bg-agent-empty-stdout.md so the links in the .qwen copy of the skill resolve.

…-notifications

# Conflicts:
#	packages/core/src/tools/agent/agent.ts

Three regressions surfaced by Codex review of feat/background-subagent:

- Cron drain rejections were dropped by a bare `void`, so a failing queued turn left the outer Promise unresolved and hung the run. Route drain failures through the Promise's reject so they propagate to the outer catch.
- The background-agent registry entry was inserted before `createForkSubagent()` / `createAgentHeadless()` was awaited. Failed init returned an error from the tool call but left a phantom `running` entry, and the headless hold-back loop (`registry.getRunning()`) waited forever. Register only after init succeeds.
- SIGINT/SIGTERM during the hold-back phase aborted background tasks, then fell through to `emitResult({ isError: false })`, so a cancelled `qwen -p ...` exited 0 with the prior assistant text. Route through `handleCancellationError()` so cancellation exits non-zero, matching the main turn loop.

`feadf052f` appended `\n` to text-mode `emitResult` output, but the nonInteractiveCli tests still asserted the pre-change strings. Update the 11 affected assertions to expect the trailing newline.

Four additional issues from the PR review that the prior regression-fix commit didn't cover:

- Escape XML metacharacters when interpolating `description`, `result`, `error`, `agentId`, `toolUseId`, and `status` into the task-notification envelope. Subagent output (which itself may carry untrusted tool output, fetched HTML, or another agent's notification) could contain `</result>` or `</task-notification>` and forge sibling tags the parent model would treat as trusted metadata. Truncate result text *before* escaping so the truncation never slices through an entity like `&amp;`.
- Emit the terminal notification from `cancel()` and `abortAll()`. The fire-and-forget `complete()`/`fail()` from the subagent task is guarded by `status !== 'running'` and was no-op'd after cancellation, so SDK consumers saw `task_started` with no matching `task_notification`, breaking the contract this PR establishes. Updated two race-guard tests that asserted the old behavior.
- Call `adapter.finalizeAssistantMessage()` before the abort-triggered early return inside `drainOneItem`'s stream loop. Without it, `startAssistantMessage()` had already been called, so stream-json mode left `message_start` unpaired.
- Enforce `config.getMaxSessionTurns()` in `drainOneItem` for symmetry with the main turn loop. Cron fires and notification replies otherwise bypass the budget cap in headless runs.
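
The truncate-then-escape ordering from the first item can be sketched as below. The function names and the five-entity escape table are illustrative; the 2000-character limit comes from the review comment about MAX_RESULT_LENGTH, and the `<result>` tag from the commit message.

```typescript
// Limit mentioned in the review comment on background-tasks.ts.
const MAX_RESULT_LENGTH = 2000;

// Escape the five XML metacharacters; '&' must go first so the entities
// produced by later replacements are not escaped a second time.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: truncate the raw text first, then escape, so the cut
// can never land in the middle of an entity such as "&amp;".
function renderResultTag(result: string, maxLength: number = MAX_RESULT_LENGTH): string {
  const truncated = result.length > maxLength ? result.slice(0, maxLength) : result;
  return `<result>${escapeXml(truncated)}</result>`;
}
```

Escaping first and truncating second could leave a dangling fragment like `&am` at the cut point; the reverse order keeps every emitted entity whole.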

* feat(core): add run_in_background support for Agent tool Enable sub-agents to run asynchronously via `run_in_background: true` parameter. Background agents execute independently from the parent, which receives an immediate launch confirmation and continues working. A notification is injected into the parent conversation when the background agent completes. Key changes: - BackgroundTaskRegistry tracks lifecycle of background agents - Agent tool gains async execution path with fire-and-forget semantics - Background agents use YOLO approval mode to prevent deadlock - Independent AbortControllers survive parent ESC cancellation - CLI bridges notifications via useMessageQueue for between-turn delivery - State race guards prevent complete/fail after cancellation - Session cleanup aborts all running background agents * feat(background): improve notification formatting and UI handling - Add prefix/separator protocol to distinguish background notifications from user input - Show concise summary in UI while sending full details to LLM - Add 'notification' history item type with specialized display - Add 'background' agent status for background-running agents - Prevent notifications from polluting prompt history (up-arrow) - Truncate long descriptions in display text This improves the UX for background agents by showing cleaner, more concise notifications while preserving full context for the LLM. Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com> * fix(background): reject run_in_background in non-interactive mode Headless mode skips AppContainer, so the notification callback is never registered and background agent results would be silently dropped. Return an error prompting the model to retry without run_in_background. * refactor(background): replace prefix/separator protocol with typed notification queue Replace the stringly-typed \x00__BG_NOTIFY__\x00 prefix/separator encoding with a typed notification path using SendMessageType.Notification. 
- Add SendMessageType.Notification to the enum - Change BackgroundNotificationCallback to emit (displayText, modelText) - Move notification queue from AppContainer into useGeminiStream (mirrors the cron queue pattern): register on registry, queue structured items, drain on idle via submitQuery - prepareQueryForGemini short-circuits for Notification type (skips slash commands, shell mode, @-commands, prompt history logging) - Remove BACKGROUND_NOTIFICATION_PREFIX/SEPARATOR constants * refactor(background): move abortAll to Config.shutdown Background agent cleanup belongs in Config.shutdown() alongside other resource teardown (skillManager, toolRegistry, arenaRuntime), not in AppContainer's registerCleanup. This also ensures headless mode gets cleanup for free. * fix(background): persist notification items for session resume Background agent notifications were missing after session resume because they were never recorded in the chat history. The model text was absent from the API history and the display item was lost. - Add recordNotification() to ChatRecordingService — stores as user-role message with subtype 'notification' and displayText payload - Thread notificationDisplayText through submitQuery → sendMessageStream - Restore as HistoryItemNotification in resumeHistoryUtils * fix(background): replace YOLO with deny-by-default for background agents Background agents were using YOLO approval mode which auto-approves all tool calls — too permissive. Replace with shouldAvoidPermissionPrompts which auto-denies tool calls that need interactive approval, matching claw-code's approach. The permission flow for background agents is now: 1. L3/L4 permission rules (allow/deny) — same as foreground 2. Approval mode overrides (AUTO_EDIT for edits) — same as foreground 3. PermissionRequest hooks — can override the denial 4. 
Auto-deny: if no hook decided, deny because prompts are unavailable

* fix(background): add missing getBackgroundTaskRegistry mock in useGeminiStream tests

* refactor(core): move fork subagent params from execute() to construction time

Identity-shaping fork inputs (parent history, generationConfig, tool decls, env-skip flag) were threaded through `AgentHeadless.execute()`'s options bag and re-passed by the SubagentStop hook retry loop. They belong on the agent's construction-time configs, not its per-invocation options.

- PromptConfig gains `renderedSystemPrompt` (verbatim, bypasses templating and userMemory injection) and drops the `systemPrompt`/`initialMessages` XOR so fork can carry both. createChat skips env bootstrap when `initialMessages` is non-empty.
- AgentHeadless.execute() shrinks to (context, signal?). Fork dispatch in agent.ts builds synthetic PromptConfig/ModelConfig/ToolConfig from the parent's cache-safe params and calls AgentHeadless.create directly (bypassing SubagentManager). Parent's tool decls flow through verbatim including the `agent` tool itself for cache parity.
- Recursive-fork prevention switches from fork-side tool stripping to a runtime guard. The previous `isInForkChild(history)` helper was dead code (it scanned the main GeminiClient's history, not the fork child's chat). Replaced with `isInForkExecution()` backed by AsyncLocalStorage: the fork's background execution runs inside `runInForkContext`, and the ALS frame propagates through the standard async chain into nested AgentTool.execute() calls where the guard fires.

* refactor(core): move agent tool files into dedicated tools/agent/ directory

Move agent.ts, agent.test.ts, and fork-subagent.ts under tools/agent/ and update all import paths accordingly.

* refactor(core): remove dead temp and top_p fields from ModelConfig

These fields were never populated from subagent frontmatter and served no purpose in the fork path either. The ModelConfig interface retains only the actively-used model field.

* refactor(core): read parent generation config directly instead of getCacheSafeParams

Fork subagent now reads system instruction and tool declarations from the live GeminiChat via getGenerationConfig() instead of the global getCacheSafeParams() snapshot. This removes the cross-module coupling between the agent tool and the followup infrastructure.

* fix(core): prevent duplicate tool declarations when toolConfig has only inline decls

prepareTools() treated asStrings.length === 0 as "add all registry tools", which is correct when no tools are specified at all, but wrong when the caller provides only inline FunctionDeclaration[] (no string names). The fork path passes parent tool declarations as inline decls for cache parity, so prepareTools was adding the full registry set on top, duplicating every non-excluded tool. Add onlyInlineDecls.length === 0 to the condition so that pure-inline toolConfigs bypass the registry entirely.

* feat(core): support agent-level `background: true` in frontmatter

Subagent definitions can now declare `background: true` in their YAML frontmatter to always run as background tasks. This is OR'd with the `run_in_background` tool parameter, which is useful for monitors, watchers, and proactive agents so the LLM doesn't need to remember to set the flag.

* fix(core): address background subagent lifecycle gaps

- Inherit bgConfig from agentConfig so the resolved approval mode is preserved for background agents (foreground would run AUTO_EDIT but background fell back to DEFAULT, which combined with shouldAvoidPermissionPrompts would auto-deny every permission request).
- Honor SubagentStop blocking decisions in background runs by looping on hook output up to 5 iterations, matching runSubagentWithHooks.
- Check terminate mode before reporting completion; non-GOAL modes (ERROR, MAX_TURNS, TIMEOUT) are now reported as failures instead of emitting a success notification for an incomplete run.
- Exclude SendMessageType.Notification from the UserPromptSubmit hook guard so background completion messages are not rewritten or blocked as if they were user input.

* feat(cli): headless support and SDK task events for background agents (#3379)

* feat(cli): unify notification queue for cron and background agents

Migrate cron from its own queue (cronQueueRef / cronQueue) to the shared notification queue used by background agents. Both producers now push the same item shape { displayText, modelText, sendMessageType } and a single drain effect / helper processes them in FIFO order.

Cron fires render as HistoryItemNotification (● prefix) instead of HistoryItemUser (> prefix), with a "Cron: <prompt>" display label. Records use subtype 'cron' for clean resume and analytics separation.

Lift the non-interactive rejection for background agents. Register a notification callback in nonInteractiveCli.ts with a terminal hold-back phase (100ms poll) that keeps the process alive until all background agents complete and their notifications are processed.

* feat(cli): emit SDK task events for background subagents

Emit `task_started` when a background agent registers and `task_notification` when it completes, fails, or is cancelled, so headless/SDK consumers can track lifecycle without parsing display text. Model-facing text is now structured XML with status, summary, truncated result, and usage stats. Completion stats (tokens, tool uses, duration) are captured from the subagent and included in both the SDK payload and the model XML.
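
To illustrate how a programmatic consumer might use the `task_started` / `task_notification` events described above, here is a minimal sketch that scans JSON lines from the output stream and reports tasks that started but never reached a terminal state. Only the `system` type, the two subtypes, and `tool_use_id` are named in this PR; the `task_id` and `status` field names and the `findUnfinishedTasks` helper are assumptions for illustration, not the actual payload schema.

```typescript
// Hypothetical event shape. `type`, `subtype`, and `tool_use_id` come from
// the PR text; `task_id` and `status` are assumed names for this sketch.
interface TaskEvent {
  type: "system";
  subtype: "task_started" | "task_notification";
  task_id: string;
  tool_use_id?: string;
  status?: "completed" | "failed" | "cancelled";
}

// Narrow an arbitrary parsed JSON value to a TaskEvent.
function isTaskEvent(value: unknown): value is TaskEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.type === "system" &&
    (v.subtype === "task_started" || v.subtype === "task_notification") &&
    typeof v.task_id === "string"
  );
}

// Returns the ids of tasks that emitted task_started without a matching
// task_notification. Non-JSON lines and unrelated messages are skipped.
function findUnfinishedTasks(jsonLines: string[]): string[] {
  const running = new Set<string>();
  for (const line of jsonLines) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // not a JSON line
    }
    if (!isTaskEvent(parsed)) continue;
    if (parsed.subtype === "task_started") {
      running.add(parsed.task_id);
    } else {
      running.delete(parsed.task_id);
    }
  }
  return Array.from(running);
}
```

A consumer that sees a non-empty result after the stream ends would know the start/terminal pairing was broken for those tasks.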
* fix: address codex review issues for background subagents

- Background subagents now inherit the resolved approval mode from agentConfig instead of the raw session config, so a subagent with `approvalMode: auto-edit` (or execution in a trusted folder) keeps that override when it runs asynchronously.
- Non-interactive cron drains are single-flight: concurrent cron fires now await the same in-flight drain, and the cron-done check gates on it, preventing the final result from being emitted while a cron turn is still streaming.
- Background forks go through createForkSubagent so they retain the parent's rendered system prompt and inherited history instead of degrading to a plain FORK_AGENT.

* fix(cli): restore cancellation, approval, and error paths in queued drain

- Hold-back loop now reacts to SIGINT/SIGTERM: when the main abort signal fires it calls registry.abortAll() so background agents with their own AbortControllers stop promptly instead of pinning the process open.
- Queued-turn tool execution forwards the stream-json approval update callback (onToolCallsUpdate) so permission-gated tools inside a background-notification follow-up emit can_use_tool requests.
- Queued-turn stream loop mirrors the main loop's text-mode handling of GeminiEventType.Error, writing to stderr and throwing so provider errors produce a non-zero exit code instead of silently succeeding.
- Interactive cron prompts go through the normal slash/@-command/shell preprocessing again; only Notification messages skip that path.

* fix(cli): skip duplicate user-message item for cron prompts

Cron prompts already render as a `● Cron: …` notification via the queue drain, so adding them again as a `USER` history item produced a duplicate `> …` line.

* fix(cli): honor SIGINT/SIGTERM during cron scheduler wait

The non-interactive cron phase awaits a Promise that resolves only when scheduler.size reaches 0 and no drain is in flight. Recurring cron jobs never drop the scheduler size to 0 on their own, so the previous abort handling (added to the hold-back loop) was unreachable: the process hung indefinitely after SIGINT/SIGTERM. Attach an abort listener inside the promise so abort stops the scheduler and resolves immediately, allowing the hold-back loop to run and the process to exit cleanly.

* feat(core): propagate tool-use id through background agent notifications

Plumb the scheduler's callId into AgentToolInvocation via an optional setCallId hook on the invocation, detected structurally in buildInvocation. The agent tool forwards it as toolUseId on the BackgroundTaskRegistry entry so completion notifications can carry a <tool-use-id> tag and SDK task_started / task_notification events can emit tool_use_id, letting consumers correlate background completions back to the original Agent tool-use that spawned them.

* fix(cli): drain single-flight race kept task_notification from emitting

drainLocalQueue wrapped its body in an async IIFE and cleared the promise reference via finally. When the queue is empty the IIFE has no awaits, so its finally runs synchronously as part of the RHS of the assignment `drainPromise = (async () => {...})()`, clearing drainPromise BEFORE the outer assignment overwrites it with the resolved promise. The reference then stayed stuck on that fulfilled promise forever, so later calls short-circuited through `if (drainPromise) return drainPromise` and never processed queued notifications.

Symptom: in headless `--output-format json` (and `stream-json`), task_started emitted but task_notification never did, even after the background agent completed. The process sat in the hold-back loop until SIGTERM.

Fix: move the null-clearing out of the async body into an outer `.finally()` on the returned promise. `.finally()` runs as a microtask after the current synchronous block, so it clears the latest drainPromise reference instead of the pre-assignment null.
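
The assignment-order race described above can be reproduced in isolation. The sketch below is not the code from nonInteractiveCli.ts; `makeBuggyDrainer`, `makeFixedDrainer`, `queue`, and `handle` are hypothetical names chosen to show the pattern, with the buggy variant clearing the reference inside the async body and the fixed variant clearing it in an outer `.finally()`.

```typescript
type Handler = (item: string) => Promise<void>;

// Buggy variant: `finally` lives inside the async IIFE. With an empty queue
// the IIFE never awaits, so `drainPromise = null` runs synchronously, and the
// outer assignment then stores the already-fulfilled promise permanently.
function makeBuggyDrainer(queue: string[], handle: Handler): () => Promise<void> {
  let drainPromise: Promise<void> | null = null;
  return function drain(): Promise<void> {
    if (drainPromise) return drainPromise;
    drainPromise = (async () => {
      try {
        while (queue.length > 0) await handle(queue.shift()!);
      } finally {
        drainPromise = null;
      }
    })();
    return drainPromise;
  };
}

// Fixed variant: the reference is cleared from an outer `.finally()`, which
// runs as a microtask after the assignment below has completed.
function makeFixedDrainer(queue: string[], handle: Handler): () => Promise<void> {
  let drainPromise: Promise<void> | null = null;
  return function drain(): Promise<void> {
    if (drainPromise) return drainPromise;
    const current: Promise<void> = (async () => {
      while (queue.length > 0) await handle(queue.shift()!);
    })().finally(() => {
      // Only clear if no newer drain has replaced this one.
      if (drainPromise === current) drainPromise = null;
    });
    drainPromise = current;
    return current;
  };
}
```

Calling the buggy drainer once on an empty queue is enough to wedge it: every later call returns the stale fulfilled promise and queued items are never handled, which matches the symptom of `task_started` without `task_notification`.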
* fix(cli): append newline to text-mode emitResult so zsh PROMPT_SP doesn't erase the line

Headless text mode wrote `resultMessage.result` without a trailing newline. In a TTY, zsh themes that use PROMPT_SP (powerlevel10k, agnoster, …) detect the missing `\n` and emit `\r\033[K` before drawing the next prompt, which wipes the final line off the screen. Pipe-captured output was unaffected, so the bug only surfaced for interactive shell users, most visibly in the background-agent flow where the drain-loop's final assistant message is the *only* stdout write in text mode. Append `\n` to both the success (stdout) and error (stderr) writes.

* docs(skill): tighten worked-example blurb in structured-debugging

Mirror the simplified blurb from .claude/skills/structured-debugging/SKILL.md (knowledge repo). Drops the round-by-round narrative; keeps the contradiction + two lessons.

* docs(skill): mirror SKILL.md improvements (reframing failure mode, generalized path, value-logging guidance)

Mirror of knowledge repo commit 38eb28d into the qwen-code .qwen/skills copy.

* docs(skill): mirror worked example into .qwen/skills/structured-debugging/

Mirrors knowledge/.claude/skills/structured-debugging/examples/headless-bg-agent-empty-stdout.md so the .qwen copy of the skill links resolve.

* docs(skill): mirror generalized side-note path guidance

* fix(cli): harden headless cron and background-agent failure paths

Three regressions surfaced by Codex review of feat/background-subagent:

- Cron drain rejections were dropped by a bare `void`, so a failing queued turn left the outer Promise unresolved and hung the run. Route drain failures through the Promise's reject so they propagate to the outer catch.
- The background-agent registry entry was inserted before `createForkSubagent()` / `createAgentHeadless()` was awaited. Failed init returned an error from the tool call but left a phantom `running` entry, and the headless hold-back loop (`registry.getRunning()`) waited forever. Register only after init succeeds.
- SIGINT/SIGTERM during the hold-back phase aborted background tasks, then fell through to `emitResult({ isError: false })`, so a cancelled `qwen -p ...` exited 0 with the prior assistant text. Route through `handleCancellationError()` so cancellation exits non-zero, matching the main turn loop.

* test(cli): update stdout/stderr assertions for trailing newline

`feadf052f` appended `\n` to text-mode `emitResult` output, but the nonInteractiveCli tests still asserted the pre-change strings. Update the 11 affected assertions to expect the trailing newline.

* fix: address review comments on background-agent notifications

Four additional issues from the PR review that the prior regression-fix commit didn't cover:

- Escape XML metacharacters when interpolating `description`, `result`, `error`, `agentId`, `toolUseId`, and `status` into the task-notification envelope. Subagent output (which itself may carry untrusted tool output, fetched HTML, or another agent's notification) could contain `</result>` or `</task-notification>` and forge sibling tags the parent model would treat as trusted metadata. Truncate result text *before* escaping so the truncation never slices through an entity like `&amp;`.
- Emit the terminal notification from `cancel()` and `abortAll()`. The fire-and-forget `complete()`/`fail()` from the subagent task is guarded by `status !== 'running'` and was no-op'd after cancellation, so SDK consumers saw `task_started` with no matching `task_notification`, breaking the contract this PR establishes. Updated two race-guard tests that asserted the old behavior.
- Call `adapter.finalizeAssistantMessage()` before the abort-triggered early return inside `drainOneItem`'s stream loop. Without it, `startAssistantMessage()` had already been called, so stream-json mode left `message_start` unpaired.
- Enforce `config.getMaxSessionTurns()` in `drainOneItem` for symmetry with the main turn loop. Cron fires and notification replies otherwise bypass the budget cap in headless runs.

* fix: address codex review comments for background subagents

- Wrap background fork execute() in runInForkContext so the recursive-fork guard (AsyncLocalStorage-based) fires when a background fork's child model calls `agent` again. Previously only the foreground fork path was wrapped, so background forks could spawn nested implicit forks.
- Emit queued terminal task_notifications on SIGINT/SIGTERM before handleCancellationError exits. abortAll() enqueues cancellation notifications via the registry callback, but the process was exiting before the drain loop had a chance to flush them, leaving stream-json consumers that already saw task_started without a matching terminal task_notification. Extracted the SDK-emit block into a shared emitNotificationToSdk helper reused by the normal drain and the cancellation flush.
- Skip notification/cron subtypes in ACP HistoryReplayer. These records are persisted as type: 'user' so the model's chat history keeps them for continuity, but they were never user input; replaying them leaked raw <task-notification> XML (and cron prompts) back into the ACP session as if the user typed them.

* test(cli): sync JsonOutputAdapter text-mode assertions with trailing newline

Commit 0da1182 appended a newline to text-mode emitResult output (zsh PROMPT_SP fix) and updated the nonInteractiveCli tests, but four assertions in JsonOutputAdapter.test.ts were missed. Update them to expect the trailing newline so CI passes.

* refactor: simplify background subagent plumbing

- Extract the SubagentStop hook blocking-decision loop into a runSubagentStopHookLoop helper so the foreground and background paths no longer duplicate the iteration/abort/log scaffolding.
- Unify BackgroundTaskRegistry.abortAll to delegate to cancel, removing copy-pasted abort/notification bookkeeping.
- Drop the unused findByName and BackgroundAgentEntry.name field.
- In nonInteractiveCli drain, hoist inputFormat and toolCallUpdateCallback out of the inner tool loop, and drop the unreachable try/catch around the readonly registry.
- Trim boilerplate doc/narration comments while keeping load-bearing WHY comments.

* fix: address codex review comments for background subagents

- Use tool callId (or short random suffix) instead of Date.now() for background agentIds; avoids registry collisions when parallel same-type agents launch in the same millisecond.
- Reset loopDetector and lastPromptId for Notification turns so a prior turn's loop count doesn't trip LoopDetected on the notification response.
- Replay notification/cron displayText in ACP HistoryReplayer so the assistant reply has an antecedent in resumed transcripts.

---------

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
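
The "truncate before escaping" rule from the notification-envelope fix above can be shown with a small sketch. This is not the PR's implementation: `escapeXml`, `renderResult`, and the `maxChars` parameter are hypothetical names, and only the `<result>` tag is taken from the commit text.

```typescript
// Replace the five XML metacharacters with their predefined entities.
// `&` must be handled first so later replacements are not double-escaped.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Truncate the raw text first, then escape. Doing it in the other order
// could cut an entity such as "&amp;" in half and emit malformed XML.
function renderResult(result: string, maxChars: number): string {
  const truncated = result.length > maxChars ? result.slice(0, maxChars) : result;
  return `<result>${escapeXml(truncated)}</result>`;
}
```

With this ordering, subagent output containing a literal `</result>` ends up as inert text inside the envelope rather than closing the tag early.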

tanzhenxin added 8 commits April 16, 2026 16:17

fix(cli): skip duplicate user-message item for cron prompts

d5ce385

Cron prompts already render as a `● Cron: …` notification via the queue drain, so adding them again as a `USER` history item produced a duplicate `> …` line.
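The rendering rule behind this fix can be sketched as below. This is an illustrative model only: `QueuedNotification`, `HistoryItem`, and `renderQueuedItem` are hypothetical names, and the real queue and history types in the codebase may differ. The `● Cron: <label>` and `● <label>` formats are taken from the PR description.

```typescript
// Hypothetical typed queue item; `kind` is the discriminator.
type QueuedNotification =
  | { kind: 'cron'; label: string; prompt: string }
  | { kind: 'background_agent'; label: string; prompt: string };

// Hypothetical UI history item.
type HistoryItem = { type: 'notification' | 'user'; text: string };

// Each queued item yields exactly one notification line. The prompt is
// still sent to the model, but it is not echoed as a user history item,
// which is what previously produced the duplicate `> …` line for cron.
function renderQueuedItem(item: QueuedNotification): HistoryItem[] {
  const prefix = item.kind === 'cron' ? 'Cron: ' : '';
  return [{ type: 'notification', text: `● ${prefix}${item.label}` }];
}
```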

tanzhenxin linked an issue Apr 17, 2026 that may be closed by this pull request

Bring subagent system to feature parity with Claude Code #2409

Closed

tanzhenxin added the type/feature-request New feature or enhancement request label Apr 17, 2026

tanzhenxin added 5 commits April 17, 2026 12:49

docs(skill): tighten worked-example blurb in structured-debugging

ba99d10

Mirror the simplified blurb from .claude/skills/structured-debugging/SKILL.md (knowledge repo). Drops the round-by-round narrative; keeps the contradiction + two lessons.

docs(skill): mirror SKILL.md improvements (reframing failure mode, ge…

1104a03

…neralized path, value-logging guidance) Mirror of knowledge repo commit 38eb28d into the qwen-code .qwen/skills copy.

docs(skill): mirror worked example into .qwen/skills/structured-debug…

58e8a1f

…ging/ Mirrors knowledge/.claude/skills/structured-debugging/examples/ headless-bg-agent-empty-stdout.md so the .qwen copy of the skill links resolve.

docs(skill): mirror generalized side-note path guidance

8efdfa4

wenshao requested changes Apr 17, 2026

View reviewed changes

tanzhenxin added 4 commits April 17, 2026 14:14

Merge branch 'feat/background-subagent' into feat/background-subagent…

09825e3

…-notifications # Conflicts: # packages/core/src/tools/agent/agent.ts

test(cli): update stdout/stderr assertions for trailing newline

cf9d93f

`feadf052f` appended `\n` to text-mode `emitResult` output, but the nonInteractiveCli tests still asserted the pre-change strings. Update the 11 affected assertions to expect the trailing newline.

tanzhenxin merged commit 0da1182 into feat/background-subagent Apr 17, 2026
12 checks passed

mabry1985 mentioned this pull request May 2, 2026

upstream port: background agents subsystem — XL, scope before commit (#3076, #3379, #3471, #3488, #3642, #3687, #3720, #3739) protoLabsAI/protoCLI#191

Closed

tanzhenxin deleted the feat/background-subagent-notifications branch June 13, 2026 13:29

tanzhenxin merged 17 commits into feat/background-subagent from feat/background-subagent-notifications
