Subagents: restore announce chain + fix nested retry/drop regressions#22223
Subagents: restore announce chain + fix nested retry/drop regressions#22223
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 241b86e8cd
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/agents/subagent-announce.ts
Outdated
| if (!requesterIsSubagent && !directOrigin) { | ||
| const { entry } = loadRequesterSessionEntry(targetRequesterSessionKey); | ||
| directOrigin = resolveAnnounceOrigin(entry, targetRequesterOrigin); | ||
| directOrigin = deliveryContextFromSession(entry); |
There was a problem hiding this comment.
Merge direct announce origin with stored delivery context
This branch now uses targetRequesterOrigin as-is whenever it is present, so if a run was spawned from a non-deliverable/internal channel hint (for example webchat) the direct announce call keeps that channel and never falls back to the persisted lastChannel/lastTo route. In that case agent delivery is forced to non-delivery and the completion update is not sent to the user even though the requester session has a valid outbound target; this used to be avoided by always resolving origin against the session entry.
Useful? React with 👍 / 👎.
src/agents/subagent-announce.ts
Outdated
| // actually on and must take priority over the session entry, which may carry | ||
| // stale lastChannel / lastTo values from a previous channel interaction. | ||
| return mergeDeliveryContext(normalizedRequester, normalizedEntry); | ||
| return mergeDeliveryContext(requesterOrigin, deliveryContextFromSession(entry)); |
There was a problem hiding this comment.
Ignore internal channel hints when resolving queued announce origin
resolveAnnounceOrigin now directly prioritizes requesterOrigin without filtering non-deliverable channels, so a queued announce can inherit webchat/internal channel hints and overwrite a valid persisted delivery route from the requester session. When that queued item is sent, the gateway treats the channel as internal and does not deliver outward, causing queued subagent completions to be dropped while the requester run is active.
Useful? React with 👍 / 👎.
|
Merged via squash.
|
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (#22223) (thanks @tyler6204)
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…2223 Commit f555835 ("Channels: add thread-aware model overrides") inadvertently reverted the `resolveAnnounceOrigin` channel guard introduced in fe57bea (openclaw#22223). The revert changed `isInternalMessageChannel` back to `!isDeliverableMessageChannel`, which strips non-standard channel hints (anything not in the built-in deliverable list) from the requester origin. When the stripped origin falls back to the session entry's persisted `lastChannel`/`lastTo`, stale values from a previous interaction (e.g. WhatsApp with no valid E.164 target) cause delivery failures: Error: Delivering to WhatsApp requires target <E.164|group JID> [warn] Subagent announce give up (retry-limit) The fix narrows the guard back to `isInternalMessageChannel` so only the webchat channel is stripped — matching the intent of openclaw#22223. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two issues caused subagent completion announces to silently fail for plugin channels like Gmail: 1. `resolveAnnounceOrigin` used `!isDeliverableMessageChannel()` to strip non-standard channels, inadvertently reverting the narrower `isInternalMessageChannel()` guard from openclaw#22223 (fe57bea). Restored the original guard so only webchat is stripped. 2. `sendSubagentAnnounceDirectly` and `sendAnnounce` passed the origin channel/to verbatim to `callGateway()`, even when the channel was not in the deliverable list. Plugin channels whose adapter ID differs from their registered plugin ID (e.g. "gmail" vs "openclaw-gmail") were unroutable, causing the gateway to time out (15s x 3 retries) and then fall back to stale session routes (typically WhatsApp with no valid E.164 target). Fix: check `isDeliverableMessageChannel()` before passing channel hints. When the channel is unroutable, omit channel/to/accountId so the gateway resolves delivery from connected clients or session state. Symptoms observed: Subagent completion direct announce failed: gateway timeout after 15000ms [warn] Subagent announce give up (retry-limit) retries=3 Error: Delivering to WhatsApp requires target <E.164|group JID> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two issues caused subagent completion announces to silently fail for plugin channels like Gmail: 1. `resolveAnnounceOrigin` used `!isDeliverableMessageChannel()` to strip non-standard channels, inadvertently reverting the narrower `isInternalMessageChannel()` guard from openclaw#22223 (fe57bea). Restored the original guard so only webchat is stripped. 2. `sendSubagentAnnounceDirectly` and `sendAnnounce` passed the origin channel/to verbatim to `callGateway()`, even when the channel was not in the deliverable list. Plugin channels whose adapter ID differs from their registered plugin ID (e.g. "gmail" vs "openclaw-gmail") were unroutable, causing the gateway to time out (15s x 3 retries) and then fall back to stale session routes (typically WhatsApp with no valid E.164 target). Fix: check `isDeliverableMessageChannel()` before passing channel hints. When the channel is unroutable, omit channel/to/accountId so the gateway resolves delivery from connected clients or session state. Symptoms observed: Subagent completion direct announce failed: gateway timeout after 15000ms [warn] Subagent announce give up (retry-limit) retries=3 Error: Delivering to WhatsApp requires target <E.164|group JID> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two issues caused subagent completion announces to silently fail for plugin channels like Gmail: 1. `resolveAnnounceOrigin` used `!isDeliverableMessageChannel()` to strip non-standard channels, inadvertently reverting the narrower `isInternalMessageChannel()` guard from openclaw#22223 (fe57bea). Restored the original guard so only webchat is stripped. 2. `sendSubagentAnnounceDirectly` and `sendAnnounce` passed the origin channel/to verbatim to `callGateway()`, even when the channel was not in the deliverable list. Plugin channels whose adapter ID differs from their registered plugin ID (e.g. "gmail" vs "openclaw-gmail") were unroutable, causing the gateway to time out (15s x 3 retries) and then fall back to stale session routes (typically WhatsApp with no valid E.164 target). Fix: check `isDeliverableMessageChannel()` before passing channel hints. When the channel is unroutable, omit channel/to/accountId so the gateway resolves delivery from connected clients or session state. Symptoms observed: Subagent completion direct announce failed: gateway timeout after 15000ms [warn] Subagent announce give up (retry-limit) retries=3 Error: Delivering to WhatsApp requires target <E.164|group JID> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two issues caused subagent completion announces to silently fail for plugin channels like Gmail: 1. `resolveAnnounceOrigin` used `!isDeliverableMessageChannel()` to strip non-standard channels, inadvertently reverting the narrower `isInternalMessageChannel()` guard from openclaw#22223 (fe57bea). Restored the original guard so only webchat is stripped. 2. `sendSubagentAnnounceDirectly` and `sendAnnounce` passed the origin channel/to verbatim to `callGateway()`, even when the channel was not in the deliverable list. Plugin channels whose adapter ID differs from their registered plugin ID (e.g. "gmail" vs "openclaw-gmail") were unroutable, causing the gateway to time out (15s x 3 retries) and then fall back to stale session routes (typically WhatsApp with no valid E.164 target). Fix: check `isDeliverableMessageChannel()` before passing channel hints. When the channel is unroutable, omit channel/to/accountId so the gateway resolves delivery from connected clients or session state. Symptoms observed: Subagent completion direct announce failed: gateway timeout after 15000ms [warn] Subagent announce give up (retry-limit) retries=3 Error: Delivering to WhatsApp requires target <E.164|group JID> Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…unceOrigin Restores the narrower internal-channel guard from PR #22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: #22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR #22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: #22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835 Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204) (cherry picked from commit fe57bea) # Conflicts: # CHANGELOG.md # docs/concepts/session-tool.md # docs/tools/subagents.md # src/agents/openclaw-tools.subagents.sessions-spawn-depth-limits.test.ts # src/agents/openclaw-tools.subagents.sessions-spawn.lifecycle.e2e.test.ts # src/agents/pi-tools.policy.ts # src/agents/subagent-announce.format.e2e.test.ts # src/agents/subagent-announce.ts # src/agents/subagent-registry.announce-loop-guard.test.ts # src/agents/subagent-registry.ts # src/agents/system-prompt.test.ts # src/agents/tools/subagents-tool.ts # src/config/config.identity-defaults.test.ts # src/config/types.agent-defaults.ts # src/config/zod-schema.agent-defaults.ts # src/cron/isolated-agent/session.test.ts # src/cron/isolated-agent/session.ts
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204) (cherry picked from commit fe57bea) # Conflicts: # CHANGELOG.md # docs/concepts/session-tool.md # docs/tools/subagents.md # src/agents/openclaw-tools.subagents.sessions-spawn-depth-limits.test.ts # src/agents/openclaw-tools.subagents.sessions-spawn.lifecycle.e2e.test.ts # src/agents/pi-tools.policy.ts # src/agents/subagent-announce.ts # src/agents/subagent-registry.announce-loop-guard.test.ts # src/agents/subagent-registry.ts # src/agents/system-prompt.test.ts # src/agents/tools/subagents-tool.ts # src/config/config.identity-defaults.test.ts # src/config/defaults.ts # src/config/types.agent-defaults.ts # src/config/zod-schema.agent-defaults.ts # src/cron/isolated-agent/session.test.ts # src/cron/isolated-agent/session.ts
…openclaw#22223) * Subagents: restore announce flow and fix nested delivery retries * fix: prep subagent announce + docs alignment (openclaw#22223) (thanks @tyler6204)
…unceOrigin Restores the narrower internal-channel guard from PR openclaw#22223 (fe57bea) that was inadvertently reverted by f555835. The original !isDeliverableMessageChannel() check strips the requester's channel whenever it is not in the registered deliverable set. This causes delivery failures for plugin channels whose adapter ID differs from their plugin ID (e.g. "gmail" vs "openclaw-gmail"): the requester origin is discarded and the announce falls back to stale session routes — typically WhatsApp — resulting in a timeout followed by an E.164 format error. Replacing with isInternalMessageChannel() limits stripping to explicitly internal channels (webchat), preserving the requester origin for all external channels regardless of whether they are currently in the deliverable list. Fixes: openclaw#22223 regression introduced in f555835
Summary
This PR restores subagent orchestration behavior to the pre-regression model and fixes the remaining nested announce delivery issue observed in live depth tests.
1) Restore subagent announce to agent-only path
sendcompletion delivery pattern.agentpath only (system-message injection into requester session).2) Fix nested/sub-subagent announce completion timing
3) Fix announce drop under deep nesting
Live failures showed top-level runs hitting retry limit before descendants finished, dropping final announces.
finalizeSubagentCleanup, deferrals caused by active descendants no longer consume retry budget.4) Cron isolated session behavior
5) Default max spawn depth
DEFAULT_SUBAGENT_MAX_SPAWN_DEPTH = 2.Validation
pnpm exec vitest run src/cron/isolated-agent/session.test.ts src/config/config.agent-concurrency-defaults.test.ts src/config/config.identity-defaults.test.ts src/agents/openclaw-tools.subagents.sessions-spawn-depth-limits.test.ts src/agents/subagent-registry.announce-loop-guard.test.ts src/agents/subagent-registry.nested.test.tspnpm exec vitest run --config vitest.e2e.config.ts src/agents/system-prompt.e2e.test.ts src/agents/subagent-announce.format.e2e.test.ts src/agents/openclaw-tools.subagents.sessions-spawn.lifecycle.e2e.test.ts src/agents/pi-tools.policy.e2e.test.ts src/agents/subagent-registry.persistence.e2e.test.tsNotes
Greptile Summary
This PR restores subagent orchestration to a reliable announce-chain model and fixes nested completion delivery regressions observed in deep nesting scenarios.
Key Changes:
sendcalls, ensuring parent LLM owns user-facing voiceDEFAULT_SUBAGENT_MAX_SPAWN_DEPTH = 2constant wired through all depth checks and policy fallbacks for consistencyisLikelyWaitingForDescendantResult) to defer announcing stale intermediate replies and wait for synthesized follow-upsArchitecture:
The PR addresses a critical timing issue where top-level runs would exhaust retry limits before descendants finished, causing final announces to be dropped. The fix introduces descendant-aware deferral logic in
finalizeSubagentCleanupthat doesn't consume retry budget when waiting for active descendants, while still enforcing limits for actual repeated failures.Confidence Score: 4/5
finalizeSubagentCleanupis complex with multiple conditional paths that could benefit from additional integration testing under high-concurrency scenarios.src/agents/subagent-registry.tslines 324-377 (finalizeSubagentCleanup retry logic) andsrc/agents/subagent-announce.tslines 518-564 (descendant detection and placeholder handling)Last reviewed commit: 241b86e