refactor(core): collapse three task registries into one TaskRegistry#3982
refactor(core): collapse three task registries into one TaskRegistry#3982tanzhenxin wants to merge 13 commits into
Conversation
E2E Test Reproduction ReportTest plan: Build under test: Methodology: Headless tests with Results
Notes
Test environment
|
752a0a4 to
2764df1
Compare
| }); | ||
| }); | ||
| mockMonitorRegistry.abortAll.mockImplementation(() => { | ||
| mockMonitorAbortAll.mockImplementation(() => { |
There was a problem hiding this comment.
[Critical] Incomplete test migration — 6 test cases further down in this file (lines 2547, 2663, 2787, 3207, 3334, 3447) still reference the old getBackgroundTaskRegistry API and the removed mockMonitorRegistry variable. The beforeEach block was correctly migrated to use getTaskRegistry + hoisted mocks, but these deeper test cases were missed.
This causes 7 TypeScript compilation errors (TS2339/TS2552) and prevents the entire test suite from running.
Affected test groups: structured-output shutdown tests, monitor-event drain tests, and holdback-gate tests.
Fix: Migrate the remaining 6 test cases to use getTaskRegistry() and the hoisted mock surface. Replace mockMonitorRegistry.setNotificationCallback (line 3207) with mockSetMonitorNotificationCallback (already hoisted).
— qwen3.7-max via Qwen Code /review
| .mockImplementationOnce(() => undefined); | ||
|
|
||
| const registry = mockConfig.getBackgroundShellRegistry(); | ||
| const registry = mockConfig.getTaskRegistry(); |
There was a problem hiding this comment.
[Critical] Incomplete test migration — 13 assertions further down in this file (lines 3696, 4062, 4105, 4113, 4126, 4204, 4277, 4371, 4383, 4466, 4559, 4565, 4778) still call the old per-kind registry methods cancel, complete, and fail on TaskRegistry. These methods no longer exist — the new API uses per-kind free functions (shellRequestCancel, shellComplete, shellFail, etc.).
This causes 13 TypeScript compilation errors (TS2339) and 18 runtime test failures in the "foreground-to-background promote" suites.
Fix: Update these assertions to mock and assert against the per-kind free functions instead of the removed registry methods.
— qwen3.7-max via Qwen Code /review
| 'a-just-finished', | ||
| 'a-quick-and-old', | ||
| ]); | ||
| expect(result.current.entries.map(entryId)).toEqual(['s1', 'a1', 'm1']); |
There was a problem hiding this comment.
[Critical] Test expectation does not match the new sort order. With startTime values a1=100, s1=50, m1=200 and the new buildMerged descending-startTime bucket sort for running entries, m1 (200) sorts first, not last. The actual result is ['m1', 'a1', 's1'].
The same issue appears at line 178: expects ['m1', 'a1'] but the new sort produces ['a1', 'm1'] (a1.startTime=100 > m1.startTime=50).
Fix: Update expectations to match the descending-startTime sort:
| expect(result.current.entries.map(entryId)).toEqual(['s1', 'a1', 'm1']); | |
| expect(result.current.entries.map(entryId)).toEqual(['m1', 'a1', 's1']); |
And at line 178:
expect(result.current.entries.map(entryId)).toEqual(['a1', 'm1']);
— qwen3.7-max via Qwen Code /review
| memoryManager.listTasksByType('dream', projectRoot), | ||
| ); | ||
|
|
||
| const unsubscribeRegistry = registry.subscribe(() => { |
There was a problem hiding this comment.
[Suggestion] The registry.subscribe callback calls buildMerged() (which includes listDreamTasks + sort) on every fireChange, including activity bursts (agentAppendActivity) and monitor event emissions. The old code's setStatusChangeCallback explicitly excluded activity updates, and MonitorRegistry.emitEvent fired no callback at all.
While registrySnapshotShape correctly suppresses React re-renders, the CPU cost of listDreamTasks (reads MemoryManager, filters, sorts) + building the full merged array + computing the shape signature is paid on every event — even when the shape hasn't changed. Under sustained monitor output or rapid agent tool calls, this becomes a hot path doing no useful work.
Fix: Use the entry argument for a fast-path check before calling buildMerged():
const unsubscribeRegistry = registry.subscribe((entry) => {
if (entry) {
const prev = statusCache.get(entryId(entry));
if (prev === entry.status) return;
statusCache.set(entryId(entry), entry.status);
}
const merged = buildMerged();
// ... shape check as before
});— qwen3.7-max via Qwen Code /review
| // Declared up-front (before `abortHandler`) so that the synchronous abort | ||
| // path — either `entryAc.signal.aborted` already true at registration | ||
| // time, or `registry.register()` throwing — can flush via | ||
| // time, or `monitorRegister(registry, )` throwing — can flush via |
There was a problem hiding this comment.
[Suggestion] Malformed function call in comment: monitorRegister(registry, ) has a dangling comma with a missing second argument. This was introduced during the rename from registry.register(registration) to monitorRegister(registry, registration).
| // time, or `monitorRegister(registry, )` throwing — can flush via | |
| // time, or `monitorRegister(registry, registration)` throwing — can flush via |
— qwen3.7-max via Qwen Code /review
| // `Task` implementation is wired up. Idempotent — re-importing this | ||
| // module (or constructing additional Config instances) does not | ||
| // re-register. | ||
| registerTaskKind(AgentTaskKind); |
There was a problem hiding this comment.
[Suggestion] Four registerTaskKind(...) side-effect calls are placed between two groups of import declarations. While ESM hoists all imports before executing module-level code (so this is functionally correct), it is visually surprising — imports continue at line 90 after this executable block. Most style guides and linters expect all imports to precede executable statements.
Fix: Move the registerTaskKind(...) block below the last import declaration in this file, so imports are contiguous.
— qwen3.7-max via Qwen Code /review
| * registry`, which ESM tolerates but JS hoisting around `const` | ||
| * exports does not. | ||
| * | ||
| * Init order: `Config.initialize()` calls `registerAllTaskKinds()` |
There was a problem hiding this comment.
[Suggestion] The comment says "Config.initialize() calls registerAllTaskKinds()" but in reality registration happens as a module-load side effect at config.ts:86-89 (top-level registerTaskKind(...) calls). There is no registerAllTaskKinds() function and no Config.initialize() method.
This creates a test infrastructure trap: _resetTaskKindsForTest() clears the dispatcher table, but there is no exported function to re-register all kinds. A test author who calls it in afterEach will break all subsequent tests that use registry.kill() or cancelSelected(), with an error message pointing to a nonexistent method.
Fix: Update the comment to say "module-load side effects in config.ts" and either export a registerAllTaskKinds() function for test recovery or remove _resetTaskKindsForTest() if it's not meant to be used.
— qwen3.7-max via Qwen Code /review
| cb: MonitorNotificationCallback | undefined, | ||
| ): void { | ||
| notificationCallback = cb; | ||
| } |
There was a problem hiding this comment.
[Suggestion] agentNotificationCallbacks and agentLifecycleCallbacks are module-level Maps that are not cleared by monitorAbortAll — only by monitorReset. Since Config.close() calls monitorAbortAll(this.taskRegistry, { notify: false }) but not monitorReset, these Maps survive session teardown.
In tests with multiple Config instances or a daemon process with session recycling, stale callbacks could route notifications to dead owner agents.
Fix: Have monitorAbortAll also clear these Maps (when called with { notify: false }), or have Config.close() call monitorReset after monitorAbortAll.
— qwen3.7-max via Qwen Code /review
…ask module Folds PR #4324's per-instance BackgroundTaskRegistry cap into the collapsed tasks/agent-task.ts module as kind-local module state. The new architecture's per-kind module is the natural home for an agent-only behavior — the generic TaskRegistry no longer needs to carry it. What moves into tasks/agent-task.ts: - MAX_CONCURRENT_BACKGROUND_AGENTS and the env-var resolver - agentAssertCanStartBackground(registry) free function - The race-guard inside agentRegister() (skips re-registers of already-running entries so resume can recover) - setAgentBackgroundCapForTest() for test override (mirrors the existing setAgentNotificationCallback pattern) What moves out of agent.ts: - The early preflight calls agentAssertCanStartBackground instead of registry.assertCanStartBackgroundAgent(). - The post-register error path is unchanged structurally — it still aborts the bgAbortController, fires SubagentStop, cleans up the worktree, and returns the cap message to the model. Resume path (background-agent-resume.ts) now wraps agentRegister in the same try/catch #4324 added, but routes to the new free function. Also addresses wenshao's review on #3982: 1. config.ts — moves the four registerTaskKind(...) calls below all imports, instead of dangling between two import groups. 2. dispatcher.ts — comment updated to describe actual init order (module-load side effect in config.ts), removes the fictional registerAllTaskKinds() / Config.initialize() references. 3. monitor.ts — fixes a stale comment that still referred to monitorRegister(registry, ) (dangling comma). 4. monitor-task.ts — monitorAbortAll now clears the module-level owner-routed callbacks so a daemon process recycling sessions doesn't leak handlers to dead owner agents. 5. useBackgroundTaskView.ts — moves the registry shape probe to getAll() BEFORE buildMerged() so activity bursts and monitor event bumps don't pay the listDreamTasks cost. Tests added: - tasks/agent-task.test.ts — covers env resolution, cap enforcement (running-only, foreground exclusion, paused exclusion, resume race exception), and module-level cap reset. Tests fixed (rebase fallout): - nonInteractiveCli.test.ts — six structured-output assertions migrated from mockBackgroundTaskRegistry.abortAll to the module-level mockAgentAbortAll mock. - shell.test.ts — 13 assertions migrated from registry.{cancel,complete,fail} to shellTaskModule.shell{Cancel,Complete,Fail} (called with (registry, ...)). - useBackgroundTaskView.test.ts — sort assertions updated to the new descending-startTime active-bucket order. - agent.test.ts — adds the getStopHookBlockingCap mock that was lost in the auto-merge with PR #4208. - clearCommand.test.ts — drops the SessionStart assertion that was lost in the auto-merge with PR #4115. - background-agent-resume.test.ts — monitorRegistry is now a set of vi.spyOn against the new module-level free functions. - InlineParallelAgentsDisplay.test.tsx — makeRegistryConfig exposes getTaskRegistry (matching the renamed Config method).
2764df1 to
942ea3f
Compare
|
@wenshao Thanks for the thorough review. Rebased onto current Critical (3) — all resolved
Suggestions (5) — all addressed
Bonus: integrated PR #4324 (background-agent concurrency cap)#4324 merged after this PR was opened and added
E2E smoke report (rebuilt bundle)
All four runs produced clean stderr (no unexpected warnings or stack traces). Verification
Ready for re-review. |
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
✅ Local build + test + runtime verification — PR #3982结论 / Verdict: 行为保持型大重构,构建/类型/测试/运行时接线均通过,建议合并。 Large behaviour-preserving refactor; builds, types, both test suites and runtime wiring all pass — recommend merge. 这是一个把三个 per-kind registry 合并成单一 For a "no user-observable behavior change" refactor of this size, the authoritative test is: clean build/typecheck, the rewritten suites green (the real safety net), correct API surface + back-compat shims, and a runtime check that the dispatcher's register-on-startup doesn't hit the import cycle it was designed to avoid. Method
Builds & typecheck ✅
Test suites ✅ (the behavioural safety net)
The rewritten task tests run green: The 2 cli failures are NOT caused by this PR. Both are in
I checked out
API surface & back-compat ✅
Runtime smoke (tmux, real CLI) ✅Launched the built CLI in an isolated tmux sandbox:
i.e. the migrated command subscribes to the new Notes for reviewers
Environment note (unrelated to this PR)This checkout's Bottom line: clean builds + typecheck, core fully green, cli green except 2 pre-existing root-environment failures that are identical on |
wenshao
left a comment
There was a problem hiding this comment.
A couple of additional findings that don't map to a single changed diff line:
[Suggestion] packages/core/src/config/config.ts:3729 & :3738 — the AgentTask type is still imported from the deprecated re-export shim ../agents/background-tasks.js, whose own header says "New code should import from '../tasks/agent-task.js' directly" and that it's "scheduled for removal one release after this lands." Line 87 of the same file already switched to the canonical ../tasks/agent-task.js. Pointing these two type references at the canonical module keeps config.ts internally consistent and lets the shim be removed atomically as planned.
[Suggestion] Test coverage — this PR deletes the three registry test suites (background-tasks.test.ts, backgroundShellRegistry.test.ts, monitorRegistry.test.ts, ~2,900 lines). Two gaps in the replacement suites are now exercised only through mocks: (1) the new dispatcher's kind→kill dispatch (tasks/dispatcher.ts + TaskRegistry.kill in tasks/registry.ts) — the centerpiece of the refactor — has no direct test (its only production caller, the dialog cancel, mocks getTaskByType); (2) the agent terminal-notification XML builder (tasks/agent-task.ts emitNotification) lost the cases the deleted suite had (XML escaping, <output-file>/<usage>/<tool-use-id> tags, foreground-suppression, double-notify guard) and now only runs against mockRegistry. A small dispatcher / registry.kill unit test plus porting the notification-XML cases would restore confidence in the parts the refactor changed most.
— claude-opus-4-8 via Claude Code /qreview
| /subagents\/test-session-id\/agent-file-search-.*\.jsonl$/, | ||
| ), | ||
| metaPath: expect.stringMatching( | ||
| /subagents[\\/]test-session-id[\\/]agent-file-search-.*\.meta\.json$/, | ||
| /subagents\/test-session-id\/agent-file-search-.*\.meta\.json$/, |
There was a problem hiding this comment.
[Critical] Windows CI is red on this test (Test (windows-latest, Node 22.x)). This PR narrowed both path regexes from the cross-platform character class [\\/] (used on main) to forward-slash-only \/. On Windows the reserved outputFile / metaPath use \ separators, so subagents\test-session-id\… no longer matches and the mockRegistry.register objectContaining assertion fails. The sibling agent-monitor test at line 1741 still uses [\\/], confirming this was an accidental narrowing rather than intentional.
| /subagents\/test-session-id\/agent-file-search-.*\.jsonl$/, | |
| ), | |
| metaPath: expect.stringMatching( | |
| /subagents[\\/]test-session-id[\\/]agent-file-search-.*\.meta\.json$/, | |
| /subagents\/test-session-id\/agent-file-search-.*\.meta\.json$/, | |
| /subagents[\\/]test-session-id[\\/]agent-file-search-.*\.jsonl$/, | |
| ), | |
| metaPath: expect.stringMatching( | |
| /subagents[\\/]test-session-id[\\/]agent-file-search-.*\.meta\.json$/, |
— claude-opus-4-8 via Claude Code /qreview
| const registry = config.getBackgroundTaskRegistry(); | ||
| const onActivity = (entry: AgentTask) => { | ||
| if (entry.agentId !== selectedAgentIdForActivity) return; | ||
| if (!dialogOpen || dialogMode !== 'detail' || !selectedLiveEntityId) return; |
There was a problem hiding this comment.
[Suggestion] Behavioral regression vs. the pre-refactor code. This per-entry live-refresh subscription gates on dialogMode !== 'detail', so it never registers when the detail view is entered from the LiveAgentPanel (dialogMode === 'detail-from-panel'). The detail body renders in both modes, and every sibling effect (lines 985, 1007) gates on isDetailMode (=== 'detail' || === 'detail-from-panel'). In detail-from-panel mode the selected agent's recentActivities / a monitor's eventCount no longer refresh on each event — for a running entry the 1s wall-clock tick masks it to ~1s lag; for a paused/terminal-but-mutating entry there's no compensating timer, so the body can appear frozen. For a refactor billed as behavior-preserving this is an unintended divergence.
| if (!dialogOpen || dialogMode !== 'detail' || !selectedLiveEntityId) return; | |
| if (!dialogOpen || !isDetailMode || !selectedLiveEntityId) return; |
Also add isDetailMode to this effect's dependency array (line 979), mirroring the sibling effect at 985-995, so eslint react-hooks/exhaustive-deps (enforced by lint:ci --max-warnings 0) stays clean.
— claude-opus-4-8 via Claude Code /qreview
| const registry = config.getBackgroundTaskRegistry(); | ||
| const onActivity = (entry: AgentTask) => { | ||
| if (entry.agentId !== selectedAgentIdForActivity) return; | ||
| if (!dialogOpen || dialogMode !== 'detail' || !selectedLiveEntityId) return; |
There was a problem hiding this comment.
[Critical] dialogMode !== 'detail' should be !isDetailMode. The old code used !isDetailMode (which includes dialogMode === 'detail' || dialogMode === 'detail-from-panel'); this rewrite narrowed the guard to only 'detail', silently dropping the 'detail-from-panel' path.
When the user navigates into detail from the live agent panel, this subscription effect short-circuits — activityTick never bumps, so the detail view's activity display and monitor event counts freeze at the moment the detail view opened. The wall-clock effect at line 989 correctly uses !isDetailMode, confirming this is an unintended regression.
| if (!dialogOpen || dialogMode !== 'detail' || !selectedLiveEntityId) return; | |
| if (!dialogOpen || !isDetailMode || !selectedLiveEntityId) return; |
Also add isDetailMode to the dependency array:
| if (!dialogOpen || dialogMode !== 'detail' || !selectedLiveEntityId) return; | |
| }, [dialogOpen, dialogMode, isDetailMode, config, selectedLiveEntityId]); |
— qwen3.7-max via Qwen Code /review
| vi.fn().mockReturnValue([]), | ||
| ); | ||
|
|
||
| import { vi, type Mock, type MockInstance } from 'vitest'; |
There was a problem hiding this comment.
[Critical] Duplicate import { vi, type Mock, type MockInstance } from 'vitest' — the original import still exists at line 92. This produces 6 typecheck errors (TS2300 Duplicate identifier for vi, Mock, MockInstance) and causes tsc --noEmit to fail, blocking npm run build.
| import { vi, type Mock, type MockInstance } from 'vitest'; |
(Remove this line — the import at line 92 already covers these types.)
— qwen3.7-max via Qwen Code /review
| @@ -0,0 +1,20 @@ | |||
| /** | |||
There was a problem hiding this comment.
[Critical] Missing test files for the new task subsystem modules. The old test suites (monitorRegistry.test.ts: 918 lines, background-tasks.test.ts: 1561 lines, backgroundShellRegistry.test.ts: 409 lines) were deleted without replacement:
monitor-task.ts(703 lines) — 0 tests. Critical untested paths:monitorEmitEventauto-stop re-entrancy, idle-timeout auto-stop, owner-scoped notification routing, dual cancel ordering, terminal-entry retention cap (128).registry.ts(218 lines) — 0 tests. Untested: duplicate-register warning,fireChangeexception isolation across multiple listeners,evict/killmiss paths.dispatcher.ts(110 lines) — 0 tests. Untested:getTaskByTypethrow on unregistered kind, re-register overwrite.dream-task.ts(199 lines) — 0 tests. Untested:listDreamTasksfiltering/cap,toDreamTaskmetadata mapping.agent-task.ts(995 lines) — only concurrency cap test. Untested: cancel grace timer with fallback finalize,agentFinalizeCancelled, message queue/drain/wait,pruneTerminalEntries.
— qwen3.7-max via Qwen Code /review
| registry: config.getTaskRegistry(), | ||
| memoryManager: config.getMemoryManager(), | ||
| }; | ||
| void Promise.resolve( |
There was a problem hiding this comment.
[Critical] getTaskByType(target.kind) is evaluated as an argument to Promise.resolve(...) before the Promise is constructed. If the kind is unregistered, getTaskByType throws synchronously during argument evaluation, which escapes the .catch() chain entirely and propagates into the React useCallback handler — crashing the component tree.
| void Promise.resolve( | |
| void Promise.resolve() | |
| .then(() => getTaskByType(target.kind).kill(entryId(target), ctx)) | |
| .catch((err) => { |
— qwen3.7-max via Qwen Code /review
| const entry = registry.get(monitorId) as MonitorTask | undefined; | ||
| if (!entry || entry.kind !== 'monitor' || entry.status !== 'running') return; | ||
|
|
||
| registry.update<MonitorTask>(monitorId, (current) => { |
There was a problem hiding this comment.
[Critical] registry.update() wraps eventCount += 1, unconditionally firing fireChange() on every monitor stdout line. The old MonitorRegistry.emitEvent() deliberately excluded per-event mutations from statusChangeCallback — the old JSDoc was explicit: "per-event registry mutations (eventCount / droppedLines) are deliberately excluded so the footer pill or AppContainer don't churn under heavy event traffic."
The detail-view subscriber in BackgroundTasksDialog.tsx:966 bumps activityTick (triggering a full React re-render) on every fireChange. A busy monitor (tail -f /var/log/syslog, npm run watch) producing 100+ lines/sec causes 100+ React re-renders/sec on the detail view.
Suggested fix: mutate eventCount/lastEventTime directly on the entry reference without going through registry.update(), matching the old code's deliberate approach. Or add a silent option to registry.update() that suppresses fireChange.
— qwen3.7-max via Qwen Code /review
| const registry = this.config.getMonitorRegistry(); | ||
| const registry = this.config.getTaskRegistry(); | ||
| const ownerAgentId = getCurrentAgentId() ?? undefined; | ||
|
|
There was a problem hiding this comment.
[Suggestion] This manually filters registry.getByKind('monitor').filter(e => e.status === 'running') instead of using the already-exported getRunningMonitorTasks(registry) helper from monitor-task.ts. The helper was extracted precisely for this purpose.
| const running = getRunningMonitorTasks(registry); |
— qwen3.7-max via Qwen Code /review
|
|
||
| if (exitCode !== null) { | ||
| registry.update<MonitorTask>(monitorId, (current) => { | ||
| current.exitCode = exitCode; |
There was a problem hiding this comment.
[Suggestion] monitorComplete makes two registry.update calls (one for exitCode, then settle() for status+endTime), firing fireChange twice per terminal transition. monitorFail has the same pattern. By contrast, shellComplete/shellFail in shell-task.ts combine all fields into a single registry.update call.
Consider merging into a single update to match the shell-task pattern:
registry.update<MonitorTask>(monitorId, (current) => {
current.exitCode = exitCode ?? undefined;
current.status = 'completed';
current.endTime = Date.now();
return current;
});— qwen3.7-max via Qwen Code /review
| * code should never call this — session shutdown uses per-kind | ||
| * `*AbortAll` helpers so each kind can run its own cleanup. | ||
| */ | ||
| _resetForTest(): void { |
There was a problem hiding this comment.
[Suggestion] _resetForTest() clears entries and listeners but does not reset the module-level singletons in agent-task.ts (notificationCallback, registerCallback, messageWaiters) or monitor-task.ts (notificationCallback, registerCallback, agentNotificationCallbacks, agentLifecycleCallbacks). A test that sets a callback and then calls _resetForTest() in afterEach will leak the mock into the next test.
Consider adding per-kind reset exports (e.g., resetAgentCallbacksForTest()) or consolidating callbacks onto the TaskRegistry instance so _resetForTest covers them.
— qwen3.7-max via Qwen Code /review
…th regex `expect.stringMatching(/subagents\/test-session-id\/agent-...$/)` fails on Windows because path.join() uses backslashes there. Switch to `[\\/]` to accept either separator, matching the existing pattern at line 1741. The test (added in #3970) has been silently red on Windows since that merge; nobody noticed until this PR's rebase made CI run. Caught by the CI loop on PR #3982.
Three real bugs + four polish items + a substantive test coverage
gap for the new monitor-task module.
## Critical (3 real bugs)
1. **BackgroundTasksDialog.tsx**: the per-entry live-refresh effect
gated on `dialogMode !== 'detail'`, silently dropping the
`detail-from-panel` path (the LiveAgentPanel → detail navigation).
Restored to `!isDetailMode` so both detail modes route through the
subscription. Same predicate sibling effects use.
2. **BackgroundTaskViewContext.tsx**: `getTaskByType(target.kind)`
was evaluated as a Promise.resolve() argument; a sync throw from
an unregistered kind escaped the .catch() chain into the React
useCallback handler. Wrap the lookup in try/catch (sync path),
keep the kill() dispatch synchronous-at-call-site (preserves the
"pressing 'x' invokes kill before the next tick" test contract).
3. **monitor-task.ts**: `monitorEmitEvent` called `registry.update`
which fires `fireChange`, churning the dialog list + footer pill
on every monitor stdout line — perf regression vs the old
`MonitorRegistry.emitEvent` which deliberately suppressed per-
event mutations from the status-change callback. Introduce
`TaskRegistry.mutateSilent` for in-place mutations whose change-
listener fanout the dialog doesn't need (id/kind/status don't
shift). Detail-view subscribers still see fresh data because
they re-read on their own tick. Also applied to
`agentAppendActivity` for the same reason (activity bursts
shouldn't churn list-mode renderers).
## Critical (test coverage gap)
4. **monitor-task.ts had zero tests** — the deleted
`monitorRegistry.test.ts` (918 lines) was not replaced. Added
`packages/core/src/tasks/monitor-task.test.ts` covering the
critical paths: cap enforcement, registration shape, silent
per-event mutation, idle-timer reset, auto-stop re-entrancy,
terminal transitions (single fireChange), notify:false /
notify:true cancel ordering, owner-routed notifications +
lifecycle wake, abortAll + reset clearing module-level callback
Maps, and the kind-narrowing helpers. 24 cases, all green.
## Suggestions (4)
1. **config.ts**: `loadPausedBackgroundAgents` /
`resumeBackgroundAgent` return-type imports now point at the
canonical `../tasks/agent-task.js` instead of the deprecated
re-export shim `../agents/background-tasks.js`.
2. **monitor.ts**: cap pre-check now uses the already-exported
`getRunningMonitorTasks(registry)` helper instead of an inline
filter.
3. **monitor-task.ts**: `monitorComplete` / `monitorFail` /
`monitorEmitEvent` (auto-stop branch) no longer issue a second
`registry.update` for status fields — the `settle()` helper
now accepts `{ exitCode, error }` so the terminal transition
fires exactly one `fireChange`. Mirrors the single-update
pattern in `shellComplete` / `shellFail`.
4. **registry.ts** / per-kind modules: added
`_resetAgentTaskModuleStateForTest` and
`_resetMonitorTaskModuleStateForTest` to clear module-level
singletons (callbacks, owner-routed Maps, the agent cap,
messageWaiters). Composed into a single
`_resetTaskKindModuleStateForTest(registry)` helper in
`tasks/index.ts` for `afterEach` use.
## Test infra
Added `mutateSilent` to the registry stubs in `agent.test.ts`,
`shell.test.ts`, `nonInteractiveCli.test.ts`, and
`clearCommand.test.ts` so the new method on `TaskRegistry` doesn't
trip mocks.
## Verification
- `npm run typecheck`: clean across all packages
- `npm run lint:ci`: clean (--max-warnings 0)
- `npm run build && npm run bundle`: clean
- Targeted core suites (`src/tasks`, `src/agents`,
`src/tools/agent`, shell, monitor, send-message, task-stop): 763
passed, 2 skipped
- New monitor-task.test.ts: 24 / 24 passed
|
@wenshao Thanks for the second pass — caught real issues I'd missed. All 9 items addressed in Critical (4)
Suggestions (4)
Test infraAdded Verification
Ready for re-review. |
…ask module Folds PR #4324's per-instance BackgroundTaskRegistry cap into the collapsed tasks/agent-task.ts module as kind-local module state. The new architecture's per-kind module is the natural home for an agent-only behavior — the generic TaskRegistry no longer needs to carry it. What moves into tasks/agent-task.ts: - MAX_CONCURRENT_BACKGROUND_AGENTS and the env-var resolver - agentAssertCanStartBackground(registry) free function - The race-guard inside agentRegister() (skips re-registers of already-running entries so resume can recover) - setAgentBackgroundCapForTest() for test override (mirrors the existing setAgentNotificationCallback pattern) What moves out of agent.ts: - The early preflight calls agentAssertCanStartBackground instead of registry.assertCanStartBackgroundAgent(). - The post-register error path is unchanged structurally — it still aborts the bgAbortController, fires SubagentStop, cleans up the worktree, and returns the cap message to the model. Resume path (background-agent-resume.ts) now wraps agentRegister in the same try/catch #4324 added, but routes to the new free function. Also addresses wenshao's review on #3982: 1. config.ts — moves the four registerTaskKind(...) calls below all imports, instead of dangling between two import groups. 2. dispatcher.ts — comment updated to describe actual init order (module-load side effect in config.ts), removes the fictional registerAllTaskKinds() / Config.initialize() references. 3. monitor.ts — fixes a stale comment that still referred to monitorRegister(registry, ) (dangling comma). 4. monitor-task.ts — monitorAbortAll now clears the module-level owner-routed callbacks so a daemon process recycling sessions doesn't leak handlers to dead owner agents. 5. useBackgroundTaskView.ts — moves the registry shape probe to getAll() BEFORE buildMerged() so activity bursts and monitor event bumps don't pay the listDreamTasks cost. Tests added: - tasks/agent-task.test.ts — covers env resolution, cap enforcement (running-only, foreground exclusion, paused exclusion, resume race exception), and module-level cap reset. Tests fixed (rebase fallout): - nonInteractiveCli.test.ts — six structured-output assertions migrated from mockBackgroundTaskRegistry.abortAll to the module-level mockAgentAbortAll mock. - shell.test.ts — 13 assertions migrated from registry.{cancel,complete,fail} to shellTaskModule.shell{Cancel,Complete,Fail} (called with (registry, ...)). - useBackgroundTaskView.test.ts — sort assertions updated to the new descending-startTime active-bucket order. - agent.test.ts — adds the getStopHookBlockingCap mock that was lost in the auto-merge with PR #4208. - clearCommand.test.ts — drops the SessionStart assertion that was lost in the auto-merge with PR #4115. - background-agent-resume.test.ts — monitorRegistry is now a set of vi.spyOn against the new module-level free functions. - InlineParallelAgentsDisplay.test.tsx — makeRegistryConfig exposes getTaskRegistry (matching the renamed Config method).
c8ebf6b to
a20c24f
Compare
…th regex `expect.stringMatching(/subagents\/test-session-id\/agent-...$/)` fails on Windows because path.join() uses backslashes there. Switch to `[\\/]` to accept either separator, matching the existing pattern at line 1741. The test (added in #3970) has been silently red on Windows since that merge; nobody noticed until this PR's rebase made CI run. Caught by the CI loop on PR #3982.
Three real bugs + four polish items + a substantive test coverage
gap for the new monitor-task module.
## Critical (3 real bugs)
1. **BackgroundTasksDialog.tsx**: the per-entry live-refresh effect
gated on `dialogMode !== 'detail'`, silently dropping the
`detail-from-panel` path (the LiveAgentPanel → detail navigation).
Restored to `!isDetailMode` so both detail modes route through the
subscription. Same predicate sibling effects use.
2. **BackgroundTaskViewContext.tsx**: `getTaskByType(target.kind)`
was evaluated as a Promise.resolve() argument; a sync throw from
an unregistered kind escaped the .catch() chain into the React
useCallback handler. Wrap the lookup in try/catch (sync path),
keep the kill() dispatch synchronous-at-call-site (preserves the
"pressing 'x' invokes kill before the next tick" test contract).
3. **monitor-task.ts**: `monitorEmitEvent` called `registry.update`
which fires `fireChange`, churning the dialog list + footer pill
on every monitor stdout line — perf regression vs the old
`MonitorRegistry.emitEvent` which deliberately suppressed per-
event mutations from the status-change callback. Introduce
`TaskRegistry.mutateSilent` for in-place mutations whose change-
listener fanout the dialog doesn't need (id/kind/status don't
shift). Detail-view subscribers still see fresh data because
they re-read on their own tick. Also applied to
`agentAppendActivity` for the same reason (activity bursts
shouldn't churn list-mode renderers).
## Critical (test coverage gap)
4. **monitor-task.ts had zero tests** — the deleted
`monitorRegistry.test.ts` (918 lines) was not replaced. Added
`packages/core/src/tasks/monitor-task.test.ts` covering the
critical paths: cap enforcement, registration shape, silent
per-event mutation, idle-timer reset, auto-stop re-entrancy,
terminal transitions (single fireChange), notify:false /
notify:true cancel ordering, owner-routed notifications +
lifecycle wake, abortAll + reset clearing module-level callback
Maps, and the kind-narrowing helpers. 24 cases, all green.
## Suggestions (4)
1. **config.ts**: `loadPausedBackgroundAgents` /
`resumeBackgroundAgent` return-type imports now point at the
canonical `../tasks/agent-task.js` instead of the deprecated
re-export shim `../agents/background-tasks.js`.
2. **monitor.ts**: cap pre-check now uses the already-exported
`getRunningMonitorTasks(registry)` helper instead of an inline
filter.
3. **monitor-task.ts**: `monitorComplete` / `monitorFail` /
`monitorEmitEvent` (auto-stop branch) no longer issue a second
`registry.update` for status fields — the `settle()` helper
now accepts `{ exitCode, error }` so the terminal transition
fires exactly one `fireChange`. Mirrors the single-update
pattern in `shellComplete` / `shellFail`.
4. **registry.ts** / per-kind modules: added
`_resetAgentTaskModuleStateForTest` and
`_resetMonitorTaskModuleStateForTest` to clear module-level
singletons (callbacks, owner-routed Maps, the agent cap,
messageWaiters). Composed into a single
`_resetTaskKindModuleStateForTest(registry)` helper in
`tasks/index.ts` for `afterEach` use.
## Test infra
Added `mutateSilent` to the registry stubs in `agent.test.ts`,
`shell.test.ts`, `nonInteractiveCli.test.ts`, and
`clearCommand.test.ts` so the new method on `TaskRegistry` doesn't
trip mocks.
## Verification
- `npm run typecheck`: clean across all packages
- `npm run lint:ci`: clean (--max-warnings 0)
- `npm run build && npm run bundle`: clean
- Targeted core suites (`src/tasks`, `src/agents`,
`src/tools/agent`, shell, monitor, send-message, task-stop): 763
passed, 2 skipped
- New monitor-task.test.ts: 24 / 24 passed
Replaces BackgroundTaskRegistry, BackgroundShellRegistry, and MonitorRegistry with a single thin TaskRegistry over the polymorphic TaskState union, plus per-kind modules that hold each kind's lifecycle helpers as free functions and register a small Task implementation with a polymorphic dispatcher. The CLI side collapses with it: useBackgroundTaskView goes from a four-source merge with dream-signature dedup to one registry subscription plus a thin dream adapter, and BackgroundTaskViewContext's cancel switch becomes a single getTaskByType(target.kind).kill(...) call. Old class files become one-release re-export shims so SDK consumers keep their import paths working through the deprecation window. Net: ~3950 lines deleted, no user-observable behavior change.
The new task abstraction holds shells, monitors, and dreams alongside agents. Living under agents/ misnamed the boundary — three of the four kinds aren't agents at all. Move to packages/core/src/tasks/ to match what the abstraction actually represents.
…ask module Folds PR #4324's per-instance BackgroundTaskRegistry cap into the collapsed tasks/agent-task.ts module as kind-local module state. The new architecture's per-kind module is the natural home for an agent-only behavior — the generic TaskRegistry no longer needs to carry it. What moves into tasks/agent-task.ts: - MAX_CONCURRENT_BACKGROUND_AGENTS and the env-var resolver - agentAssertCanStartBackground(registry) free function - The race-guard inside agentRegister() (skips re-registers of already-running entries so resume can recover) - setAgentBackgroundCapForTest() for test override (mirrors the existing setAgentNotificationCallback pattern) What moves out of agent.ts: - The early preflight calls agentAssertCanStartBackground instead of registry.assertCanStartBackgroundAgent(). - The post-register error path is unchanged structurally — it still aborts the bgAbortController, fires SubagentStop, cleans up the worktree, and returns the cap message to the model. Resume path (background-agent-resume.ts) now wraps agentRegister in the same try/catch #4324 added, but routes to the new free function. Also addresses wenshao's review on #3982: 1. config.ts — moves the four registerTaskKind(...) calls below all imports, instead of dangling between two import groups. 2. dispatcher.ts — comment updated to describe actual init order (module-load side effect in config.ts), removes the fictional registerAllTaskKinds() / Config.initialize() references. 3. monitor.ts — fixes a stale comment that still referred to monitorRegister(registry, ) (dangling comma). 4. monitor-task.ts — monitorAbortAll now clears the module-level owner-routed callbacks so a daemon process recycling sessions doesn't leak handlers to dead owner agents. 5. useBackgroundTaskView.ts — moves the registry shape probe to getAll() BEFORE buildMerged() so activity bursts and monitor event bumps don't pay the listDreamTasks cost. Tests added: - tasks/agent-task.test.ts — covers env resolution, cap enforcement (running-only, foreground exclusion, paused exclusion, resume race exception), and module-level cap reset. Tests fixed (rebase fallout): - nonInteractiveCli.test.ts — six structured-output assertions migrated from mockBackgroundTaskRegistry.abortAll to the module-level mockAgentAbortAll mock. - shell.test.ts — 13 assertions migrated from registry.{cancel,complete,fail} to shellTaskModule.shell{Cancel,Complete,Fail} (called with (registry, ...)). - useBackgroundTaskView.test.ts — sort assertions updated to the new descending-startTime active-bucket order. - agent.test.ts — adds the getStopHookBlockingCap mock that was lost in the auto-merge with PR #4208. - clearCommand.test.ts — drops the SessionStart assertion that was lost in the auto-merge with PR #4115. - background-agent-resume.test.ts — monitorRegistry is now a set of vi.spyOn against the new module-level free functions. - InlineParallelAgentsDisplay.test.tsx — makeRegistryConfig exposes getTaskRegistry (matching the renamed Config method).
The conflict resolution during the rebase added a second
`import { vi, type Mock, type MockInstance } from 'vitest';`
line. Local lint missed it because `npm run lint` doesn't
enforce `--max-warnings 0`, but CI does.
…th regex `expect.stringMatching(/subagents\/test-session-id\/agent-...$/)` fails on Windows because path.join() uses backslashes there. Switch to `[\\/]` to accept either separator, matching the existing pattern at line 1741. The test (added in #3970) has been silently red on Windows since that merge; nobody noticed until this PR's rebase made CI run. Caught by the CI loop on PR #3982.
Three real bugs + four polish items + a substantive test coverage
gap for the new monitor-task module.
## Critical (3 real bugs)
1. **BackgroundTasksDialog.tsx**: the per-entry live-refresh effect
gated on `dialogMode !== 'detail'`, silently dropping the
`detail-from-panel` path (the LiveAgentPanel → detail navigation).
Restored to `!isDetailMode` so both detail modes route through the
subscription. Same predicate sibling effects use.
2. **BackgroundTaskViewContext.tsx**: `getTaskByType(target.kind)`
was evaluated as a Promise.resolve() argument; a sync throw from
an unregistered kind escaped the .catch() chain into the React
useCallback handler. Wrap the lookup in try/catch (sync path),
keep the kill() dispatch synchronous-at-call-site (preserves the
"pressing 'x' invokes kill before the next tick" test contract).
3. **monitor-task.ts**: `monitorEmitEvent` called `registry.update`
which fires `fireChange`, churning the dialog list + footer pill
on every monitor stdout line — perf regression vs the old
`MonitorRegistry.emitEvent` which deliberately suppressed per-
event mutations from the status-change callback. Introduce
`TaskRegistry.mutateSilent` for in-place mutations whose change-
listener fanout the dialog doesn't need (id/kind/status don't
shift). Detail-view subscribers still see fresh data because
they re-read on their own tick. Also applied to
`agentAppendActivity` for the same reason (activity bursts
shouldn't churn list-mode renderers).
## Critical (test coverage gap)
4. **monitor-task.ts had zero tests** — the deleted
`monitorRegistry.test.ts` (918 lines) was not replaced. Added
`packages/core/src/tasks/monitor-task.test.ts` covering the
critical paths: cap enforcement, registration shape, silent
per-event mutation, idle-timer reset, auto-stop re-entrancy,
terminal transitions (single fireChange), notify:false /
notify:true cancel ordering, owner-routed notifications +
lifecycle wake, abortAll + reset clearing module-level callback
Maps, and the kind-narrowing helpers. 24 cases, all green.
## Suggestions (4)
1. **config.ts**: `loadPausedBackgroundAgents` /
`resumeBackgroundAgent` return-type imports now point at the
canonical `../tasks/agent-task.js` instead of the deprecated
re-export shim `../agents/background-tasks.js`.
2. **monitor.ts**: cap pre-check now uses the already-exported
`getRunningMonitorTasks(registry)` helper instead of an inline
filter.
3. **monitor-task.ts**: `monitorComplete` / `monitorFail` /
`monitorEmitEvent` (auto-stop branch) no longer issue a second
`registry.update` for status fields — the `settle()` helper
now accepts `{ exitCode, error }` so the terminal transition
fires exactly one `fireChange`. Mirrors the single-update
pattern in `shellComplete` / `shellFail`.
4. **registry.ts** / per-kind modules: added
`_resetAgentTaskModuleStateForTest` and
`_resetMonitorTaskModuleStateForTest` to clear module-level
singletons (callbacks, owner-routed Maps, the agent cap,
messageWaiters). Composed into a single
`_resetTaskKindModuleStateForTest(registry)` helper in
`tasks/index.ts` for `afterEach` use.
## Test infra
Added `mutateSilent` to the registry stubs in `agent.test.ts`,
`shell.test.ts`, `nonInteractiveCli.test.ts`, and
`clearCommand.test.ts` so the new method on `TaskRegistry` doesn't
trip mocks.
## Verification
- `npm run typecheck`: clean across all packages
- `npm run lint:ci`: clean (--max-warnings 0)
- `npm run build && npm run bundle`: clean
- Targeted core suites (`src/tasks`, `src/agents`,
`src/tools/agent`, shell, monitor, send-message, task-stop): 763
passed, 2 skipped
- New monitor-task.test.ts: 24 / 24 passed
… onto main Port the shell notification feature (#4355) from the old BackgroundShellRegistry class into the shell-task.ts free-function module. Update all call sites (Session.ts, useGeminiStream.ts, chatCompressionService.ts) and their tests to use the new module-level setters and unified TaskRegistry API. Also integrates AgentTerminateMode.SHUTDOWN grouping with CANCELLED (#4410) and fork-subagent feature gate (#4574) into the free-function architecture.
…fication tests
shellAbortAll now passes { notify: false } to shellCancel, matching
the original BackgroundShellRegistry.abortAll behavior that used a
private settleAsCancelled helper to avoid leaking notifications
across /clear boundaries or during session shutdown.
Ports all 14 notification tests from the deleted
backgroundShellRegistry.test.ts into shell-task.test.ts, covering
XML structure, command truncation, escaping, output-tail reading,
symlink security, idempotency, and the abortAll suppression.
a20c24f to
8f9b205
Compare
The registry collapse swapped all four monitor notification call sites from stripDisplayControlChars to stripTerminalControlSequences. The latter strips ANSI/OSC/CSI escape sequences and C0/C1 controls but not Unicode bidi override/isolate characters (U+202A-202E, U+2066-2069), silently dropping the Trojan Source (CVE-2021-42574) defense that the sibling shell notification surface still applies. Add a sanitizeForNotification helper that composes both functions so monitor notifications keep the improved ANSI-sequence stripping and regain bidi stripping, and a regression test covering the description and detail surfaces.
The registry collapse removed the dedicated lifecycle coverage that lived in the deleted background-tasks.test.ts, leaving agent-task.test.ts with only the concurrency-cap suite. The lifecycle state machine was exercised only indirectly by caller tests. Add focused unit coverage for the high-risk paths the collapse touched: terminal transitions and the double-terminal / notified short-circuit guards, the cancel-then-natural-complete race, notify:false suppression, the 5s cancel grace-timer fallback, agentAbandon, and the terminal retention cap eviction (including that running and not-yet-notified cancelled entries are retained).
… mutateSilent The round-2 fixes added a mutateSilent property to the mockTaskRegistry object literal but not to its type annotation, leaving the two inconsistent. (The file is in the tsconfig exclude list for TODO 5691, so this never surfaced as a build error, but the mock should still match its declared type.) Add the property to the annotation and fix the literal's indentation. Also correct the detail-view refresh comment in BackgroundTasksDialog: agent recentActivities is reassigned via mutateSilent, which deliberately does not fire the registry subscription — both agent activity and monitor events rely on the 1s wall-clock tick, not the subscription.
| let mockBackgroundTaskRegistry: { | ||
| setNotificationCallback: ReturnType<typeof vi.fn>; | ||
| setRegisterCallback: ReturnType<typeof vi.fn>; | ||
| let mockTaskRegistry: { |
There was a problem hiding this comment.
[Critical] (Carried from R3 — still present at this SHA.) The mockTaskRegistry type annotation (lines 151–160) is still missing the mutateSilent property that was added to TaskRegistry. This produces a tsc error at the assignment site (line 213): Object literal may only specify known properties, and 'mutateSilent' does not exist in type ....
| let mockTaskRegistry: { | |
| let mockTaskRegistry: { | |
| getAll: ReturnType<typeof vi.fn>; | |
| getByKind: ReturnType<typeof vi.fn>; | |
| get: ReturnType<typeof vi.fn>; | |
| register: ReturnType<typeof vi.fn>; | |
| update: ReturnType<typeof vi.fn>; | |
| evict: ReturnType<typeof vi.fn>; | |
| kill: ReturnType<typeof vi.fn>; | |
| subscribe: ReturnType<typeof vi.fn>; | |
| mutateSilent: ReturnType<typeof vi.fn>; | |
| }; |
— qwen3.7-max via Qwen Code /review
| const entry = registry.get(agentId) as AgentTask | undefined; | ||
| if (!entry || entry.kind !== 'agent' || entry.status !== 'running') return; | ||
|
|
||
| registry.mutateSilent<AgentTask>(agentId, (current) => { |
There was a problem hiding this comment.
[Critical] Behavioral regression: agentAppendActivity uses registry.mutateSilent which does NOT fire fireChange. The detail view's subscription in BackgroundTasksDialog.tsx:974 uses registry.subscribe(...), so it never fires on agent activity bursts. The Progress section in the detail view is now driven solely by the 1-second setInterval wall-clock tick — it no longer updates immediately on each tool call as the old setActivityChangeCallback mechanism did.
The old code had a dedicated BackgroundActivityChangeCallback / setActivityChangeCallback channel that fired immediately on every appendActivity call. This PR removed it and replaced it with registry.subscribe, but mutateSilent was intentionally designed to skip subscribers — creating a gap between the producer and the consumer.
Suggested fix options (either):
- Add a second subscription channel (e.g.
registry.subscribeDetail(id, listener)or per-kindsetAgentActivityChangeCallback) that fires onmutateSilentpaths, restoring the dual-channel behavior without regressing list-mode fan-out. - Have the detail view poll
recentActivities.lengthon the existing 1-second interval — simpler but adds up to 1s latency.
— qwen3.7-max via Qwen Code /review
| @@ -0,0 +1,194 @@ | |||
| /** | |||
There was a problem hiding this comment.
[Critical] (Carried from R1–R3 — partially addressed.) Tests have been added for the concurrency cap, monitor lifecycle, and shell lifecycle, but significant coverage gaps remain:
- No test files exist for
registry.ts(the central new abstraction),dispatcher.ts(thekill()routing table), ordream-task.ts(theMemoryManageradapter). agent-task.test.ts(194 lines) only covers the concurrency cap. 17 exported functions remain untested:agentComplete,agentFail,agentCancel(including the 5s grace timer),agentFinalizeCancelled,agentAppendActivity(rolling buffer cap),agentQueueMessage/agentDrainMessages/agentWaitForMessages(the async waiter mechanism),agentAbortAll,agentReset, andemitNotification.pruneTerminalEntrieseviction logic is untested across all three kinds (agent/shell/monitor).- Monitor idle timeout auto-stop path is untested (no
vi.advanceTimersByTimecall inmonitor-task.test.ts).
The deleted test suites (background-tasks.test.ts: 1561 lines, monitorRegistry.test.ts: 918 lines, backgroundShellRegistry.test.ts: 409 lines) covered 20+ scenarios including cancel grace timers, state machine transitions, and notification emission — all now untested.
— qwen3.7-max via Qwen Code /review
| // Persist the reason so the dialog detail view can show it after | ||
| // settle. Same pattern as the max-events branch in | ||
| // `monitorEmitEvent`. | ||
| registry.update<MonitorTask>(entry.monitorId, (current) => { |
There was a problem hiding this comment.
[Suggestion] The idle timeout path does two registry.update calls (one to set error, one inside settle to set status+endTime), firing fireChange twice for a single terminal transition. The sibling max-events branch at line 344 correctly uses settle with extras.error for a single update.
| registry.update<MonitorTask>(entry.monitorId, (current) => { | |
| settle(registry, entry, 'completed', { error: 'Idle timeout' }); | |
| emitTerminalNotification(entry, 'Idle timeout'); |
— qwen3.7-max via Qwen Code /review
| * Strip C0 control characters (except tab) and C1 control characters from | ||
| * terminal/UI display strings. | ||
| */ | ||
| function stripDisplayControlChars(text: string): string { |
There was a problem hiding this comment.
[Suggestion] This private stripDisplayControlChars is a divergent copy of the shared utility at packages/core/src/utils/terminalSafe.ts. The shared version additionally strips Unicode bidirectional control characters (U+202A–U+202E LRE/RLE/PDF/LRO/RLO and U+2066–U+2069 LRI/RLI/FSI/PDI), defending against Trojan Source attacks (CVE-2021-42574). This private copy omits those ranges.
monitor-task.ts correctly imports stripTerminalControlSequences from the shared utility (line 22). Shell notification XML (command labels, cwd, error messages) passes through the weaker sanitizer.
Not a regression from main (the old BackgroundShellRegistry also had the weaker copy), but the PR is an opportunity to converge on the shared utility.
| function stripDisplayControlChars(text: string): string { | |
| import { stripDisplayControlChars } from '../utils/terminalSafe.js'; |
(Delete the private copy below and keep the private stripOutputControlChars which has no shared equivalent.)
— qwen3.7-max via Qwen Code /review
| } | ||
|
|
||
| /** Emit a terminal notification (completed/failed/cancelled). */ | ||
| function emitTerminalNotification(entry: MonitorTask, detail?: string): void { |
There was a problem hiding this comment.
[Suggestion] emitTerminalNotification never sets entry.notified = true. Both agent-task.ts and shell-task.ts set this flag after emitting their terminal notification, fulfilling the TaskBase contract ("True once the kind's terminal notification has fired"). The monitor kind silently opts out — notified stays false forever on every monitor entry.
The monitor kind's own pruneTerminalEntries filters on status !== 'running' rather than notified === true, so eviction still works today. But any future consumer that reads entry.notified polymorphically across kinds will silently misclassify monitors as "not yet notified."
Suggested fix: Set entry.notified = true inside emitTerminalNotification (or in settle()), mirroring the pattern used by the other two kinds.
— qwen3.7-max via Qwen Code /review
| if (entry.notified) return; | ||
|
|
||
| // Mark notified silently — no need to trigger change listeners. | ||
| entry.notified = true; |
There was a problem hiding this comment.
[Suggestion] emitShellNotification sets entry.notified = true via direct mutation, bypassing registry.update(). The agent-task equivalent (emitNotification in agent-task.ts) uses registry.update<AgentTask>() to set notified, which fires the change listener.
This means the shell's notified transition is invisible to registry subscribers. Today no consumer keys off notified through the subscription, so there is no user-visible bug. But the inconsistency between kinds means a future feature that tracks notified via change events would silently miss shell notifications.
Suggested fix: Either route through registry.update() (matching agent-task) or document the intentional silence on TaskBase.notified so the next engineer doesn't assume symmetry.
— qwen3.7-max via Qwen Code /review
| * points each install exactly one of these per session, and a list | ||
| * would invite drift in error-handling. | ||
| */ | ||
| let notificationCallback: BackgroundNotificationCallback | undefined; |
There was a problem hiding this comment.
[Critical] Cross-session notification leak — the per-Config notification callbacks became process-global singletons, so concurrent ACP sessions collide.
This PR demotes notificationCallback/registerCallback from per-Config-instance fields to module-level lets here (same in monitor-task.ts:161-162 and shell-task.ts:60). On main, ACP registered them per session via this.config.getBackgroundTaskRegistry().setNotificationCallback(...). Now Session.#registerBackgroundNotificationCallbacks() calls the global setAgentNotificationCallback(...), and emitNotification reads the single module slot (agent-task.ts:917) — last writer wins.
AcpAgent keeps sessions: Map<string, Session> and builds a fresh Config per newSession (loadCliConfig → new Session(...)), holding multiple sessions live at once. Each constructor overwrites the global callback; each dispose() nulls it globally. Session B's closure (Session.ts:1655) enqueues onto this.notificationQueue with no agentId ownership check.
Concrete failure (reachable via the Zed ACP integration): Session A launches a background subagent → user opens Session B → A's subagent completes → A's terminal <task-notification> (carrying <result> agent output / shell <output-tail>) is delivered to Session B's client, not A's. If B is disposed first, A's notification is silently dropped.
The Scope/Risk note says "No qwen-code production use case hits this today" — ACP multi-session is that case. Suggest re-scoping the callbacks per TaskRegistry (e.g. a WeakMap<TaskRegistry, Callbacks> keyed off the entry's owning registry, or restore a per-Config field) so a notification fired for one session can only reach that session.
— claude-opus-4-8 via Claude Code /qreview
…essions The agent/monitor/shell notification and register callbacks were demoted to bare module-level singletons during the registry collapse. The ACP agent runs concurrent sessions in one process, each with its own Config and TaskRegistry, so the most-recently-started session's callback overwrote the others — one session's background-task notifications could be delivered to a different session's connection. Key the callbacks by TaskRegistry instance via a WeakMap so each session is isolated again, restoring the per-instance behavior the old per-kind registries had. The owner-scoped monitor maps stay keyed by ownerAgentId (globally unique), so they don't share the collision hazard. Callers in Session, the non-interactive session/CLI, and useGeminiStream now pass the session's registry to the setters. Adds a per-registry isolation regression test for the agent kind.
The registry collapse deleted the three per-kind registry test suites but left the new central modules without direct coverage. Add suites for TaskRegistry's narrow surface (register/get/update/mutateSilent/evict/ subscribe/kill dispatch), the kind dispatcher (registration, lookup, unregistered-kind error), and the dream adapter (filtering, terminal cap, field mapping, subscribe filter, kill delegation).
|
Closing this for now — deprioritizing the task-registry collapse rather than carrying a large refactor in the background-task subsystem at this stage. This is a prioritization call, not a problem with the branch: the work is in good shape and has been pushed (HEAD State at closeAddressed
Verified
Remaining if reopened (suggestion-level, none blocking):
To resume: reopen, rerun CI, optionally address the suggestions above, merge. |
Summary
TaskRegistryplus per-kind modules that hold each kind's lifecycle helpers as free functions and register a smallTaskimplementation with a polymorphic dispatcher. The CLI's background-tasks hook collapses from a four-source merge with dream-signature dedup to a single registry subscription plus a thin dream adapter, and the dialog's cancel switch becomes a single dispatcher call.Configto avoid an import cycle with the registry); whether the per-kind module split keeps each kind's helpers cohesive; and whether the CLI hook's new shape-signature filter behaves correctly across agent activity bursts, monitor event count bumps, and dream metadata updates.Net diff: ~3950 lines deleted, no user-observable behavior change.
Validation
knowledge/qwen-code/e2e-tests/task-registry-unification-pr2.md— exercises foreground/background subagent persistence, headless holdback, dialog cancel routing for each kind, and mixed dialog ordering.running) self-resolved on this build — the meta now correctly transitions tocancelled.knowledge/qwen-code/design/task-registry-unification-pr2-impl.md) for intent, then walk the new task module surface (registry + dispatcher + four kind modules), then spot-check that the migrated consumers (agent tool, shell tool, monitor tool, send-message, task-stop, background-agent-resume, the CLI hook, dialog context) all look right.Scope / Risk
Configinstances concurrently would have the second instance overwrite the first's callbacks. No qwen-code production use case hits this today; flagging in case future SDK multi-session work needs to revisit.scheduleDream()preconditions can't be met in a fresh test workdir. Thekillpath is structurally identical to the verified kinds and is covered by unit tests.@qwen-code/qwen-code-corekeep their imports working through one release via re-export shims in the old paths (scheduled for removal one release after this lands). The threeConfiggetters for the old registries were dropped atomically — consumers that hold aConfigreference and expectedgetBackgroundTaskRegistry()etc. need to switch togetTaskRegistry()plus the per-kind helpers.Testing Matrix
Testing matrix notes: