fix(subagent-announce): defer drain while parent session is busy#71706
Conversation
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
Greptile SummaryThis PR adds an optional Confidence Score: 4/5Safe to merge; no blocking defects, two P2 polish items that don't affect correctness for current callers. Only P2 findings: the shouldDefer overwrite-with-undefined edge case and the double-debounce overhead are both unlikely to trigger in production given how queue keys are scoped, and neither produces wrong behavior for the new subagent use-case. Core logic, test coverage, and backward-compatibility are all sound. src/agents/subagent-announce-queue.ts lines 83–87 (shouldDefer overwrite) and 135–142 (defer loop structure). Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/agents/subagent-announce-queue.ts
Line: 80-87
Comment:
**`shouldDefer` silently cleared when queue is re-used without the hook**
When `getAnnounceQueue` finds an existing entry it unconditionally overwrites `existing.shouldDefer = shouldDefer`. Any caller that omits `shouldDefer` (i.e. passes `undefined`) will silently clear a hook that was already installed on that queue, allowing the drain to proceed without the parent-idle check. If a secondary enqueue ever hits the same queue key without passing `shouldDefer`, queued items could be delivered while the parent is still busy.
A safer guard would skip the update when the new value is `undefined`:
```typescript
if (shouldDefer !== undefined) {
existing.shouldDefer = shouldDefer;
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/agents/subagent-announce-queue.ts
Line: 136-142
Comment:
**Defer loop re-enters debounce wait on every poll cycle**
After the defer sleep the loop hits `continue`, which brings it back to `await waitForQueueDebounce(queue)` before the next `shouldDefer` check. With `debounceMs = 0` this is a no-op, but with a non-zero debounce the poll interval becomes `debounceMs + max(250, debounceMs)` (typically 2× the configured debounce) on every deferred iteration rather than the intended `max(250, debounceMs)`. This makes the parent-idle check slower than advertised when debounce is large.
Resetting `lastEnqueuedAt` before the `continue`, or restructuring so debounce is only waited once per enqueue event, would keep the effective polling cadence at exactly `max(250, debounceMs)`.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(subagent-announce): defer drain whil..." | Re-trigger Greptile |
0cc7185 to
4203b00
Compare
4203b00 to
1b0e5ff
Compare
When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
1b0e5ff to
3c261d1
Compare
|
Confirmed from personal usage: subagent spawn (NEON-SOUL synthesis, 200+ LLM calls) completes while parent session is active — completion announce is silently deferred until next user message. Workaround: manual polling every 5-10 minutes. This breaks natural |
fabianwilliams
left a comment
There was a problem hiding this comment.
Clean addition of shouldDefer to the announce queue — keeps drain cooperative with parent session activity. Tests cover both the busy-parent deferral and the defer-hook-preservation case on queue reuse. Targeted fix without expanding queue surface area. LGTM.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
…nclaw#71706) When a subagent finishes while its parent main session is still running (executing tools or awaiting model output), the announce queue would follow the configured debounce and immediately attempt to deliver the completion event back into the parent session via callGateway. The gateway treats the parent as busy and the announce can either get buffered until the next external user message or surface only as a delayed echo, breaking the natural sessions_spawn -> sessions_yield workflow where the parent expects the result to arrive as the next turn. This change adds an optional shouldDefer hook on the announce queue state. The delivery layer wires it to the existing requester session activity probe (resolveRequesterSessionActivity), so while the parent session is still active the drain loop sleeps for max(250ms, debounceMs) and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally. - Plumbs shouldDefer through getAnnounceQueue / enqueueAnnounce. - Skips drain step in scheduleAnnounceDrain when shouldDefer says the target is still busy, with a bounded re-check sleep. - Updates maybeQueueSubagentAnnounce to pass the activity probe. - Adds a unit test that holds drain while parent is busy and resumes when it goes idle. No behavior change for callers that do not pass shouldDefer.
Summary
When a subagent finishes while its parent (the main session that called
sessions_spawn) is still running — executing tools or awaiting model output — the announce queue currently follows the configured debounce and immediately attempts to deliver the completion event back into the parent viacallGateway. The gateway treats the parent as busy, so the announce can be buffered until the next external user message arrives, or surfaces only as a delayed echo. This breaks the naturalsessions_spawn → sessions_yieldworkflow where the parent expects the subagent result to arrive as the next turn.Observed in practice: a parent agent spawns a child, doesn't yield (continues with other tool calls), the child finishes — the completion announce is queued while the parent is still busy, then doesn't reliably wake the parent until the human types something else.
Fix
Add an optional
shouldDeferhook on the announce queue state. The delivery layer wires it to the existingresolveRequesterSessionActivityprobe, so while the parent session is still active the drain loop sleepsmax(250ms, debounceMs)and re-checks instead of pushing the announce. As soon as the parent goes idle, the queue drains normally.shouldDeferthroughgetAnnounceQueue/enqueueAnnounce.scheduleAnnounceDrain, after the debounce wait, checkshouldDefer(nextItem); if true, sleep and re-loop.maybeQueueSubagentAnnouncepasses the requester activity probe.No behavior change for callers that do not pass
shouldDefer. Default behaviour for non-subagent queues is unchanged.Tests
Added a unit test in
subagent-announce-queue.test.ts:shouldDeferreporting parent busy.shouldDeferto false; expects exactly one send with the original prompt.Ran locally:
Risk / Compatibility
shouldDeferis optional; existing call sites and tests do not pass it, so behaviour is unchanged for them.max(250ms, debounceMs)), so even if the activity probe is wrong about idle/active state, the queue does not livelock — it keeps polling at human-perceptible cadence.Why this matters
This is the root cause of the "subagent completion arrives only after the next user message" class of issues we see when parent agents intentionally don't yield (e.g. iterative coding agents continuing other tool work in parallel). With this fix, the queue waits for the parent to be ready instead of racing it.