fix: agent-only announce path, BB message IDs, sender identity, SSRF allowlist#23970
fix: agent-only announce path, BB message IDs, sender identity, SSRF allowlist#23970
Conversation
src/agents/subagent-registry.ts
Outdated
| export type SubagentRunRecord = { | ||
| runId: string; | ||
| childSessionKey: string; | ||
| requesterSessionKey: string; | ||
| requesterOrigin?: DeliveryContext; | ||
| requesterDisplayKey: string; | ||
| task: string; | ||
| cleanup: "delete" | "keep"; | ||
| label?: string; | ||
| model?: string; | ||
| runTimeoutSeconds?: number; | ||
| createdAt: number; | ||
| startedAt?: number; | ||
| endedAt?: number; | ||
| outcome?: SubagentRunOutcome; | ||
| archiveAtMs?: number; | ||
| cleanupCompletedAt?: number; | ||
| cleanupHandled?: boolean; | ||
| suppressAnnounceReason?: "steer-restart" | "killed"; | ||
| expectsCompletionMessage?: boolean; | ||
| /** Number of times announce delivery has been attempted and returned false (deferred). */ | ||
| announceRetryCount?: number; | ||
| /** Timestamp of the last announce retry attempt (for backoff). */ | ||
| lastAnnounceRetryAt?: number; | ||
| }; |
There was a problem hiding this comment.
Duplicate type definition conflicts with subagent-registry.types.ts
The inline SubagentRunRecord type here differs from the one exported in subagent-registry.types.ts (which still includes spawnMode, endedReason, and endedHookEmittedAt). This creates a type inconsistency since subagent-registry.store.ts imports from the types file and sets spawnMode on line 104.
| export type SubagentRunRecord = { | |
| runId: string; | |
| childSessionKey: string; | |
| requesterSessionKey: string; | |
| requesterOrigin?: DeliveryContext; | |
| requesterDisplayKey: string; | |
| task: string; | |
| cleanup: "delete" | "keep"; | |
| label?: string; | |
| model?: string; | |
| runTimeoutSeconds?: number; | |
| createdAt: number; | |
| startedAt?: number; | |
| endedAt?: number; | |
| outcome?: SubagentRunOutcome; | |
| archiveAtMs?: number; | |
| cleanupCompletedAt?: number; | |
| cleanupHandled?: boolean; | |
| suppressAnnounceReason?: "steer-restart" | "killed"; | |
| expectsCompletionMessage?: boolean; | |
| /** Number of times announce delivery has been attempted and returned false (deferred). */ | |
| announceRetryCount?: number; | |
| /** Timestamp of the last announce retry attempt (for backoff). */ | |
| lastAnnounceRetryAt?: number; | |
| }; | |
| import type { SubagentRunRecord } from "./subagent-registry.types.js"; |
Then update subagent-registry.types.ts to remove the deprecated fields instead of defining the type inline here.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/agents/subagent-registry.ts
Line: 14-38
Comment:
Duplicate type definition conflicts with `subagent-registry.types.ts`
The inline `SubagentRunRecord` type here differs from the one exported in `subagent-registry.types.ts` (which still includes `spawnMode`, `endedReason`, and `endedHookEmittedAt`). This creates a type inconsistency since `subagent-registry.store.ts` imports from the types file and sets `spawnMode` on line 104.
```suggestion
import type { SubagentRunRecord } from "./subagent-registry.types.js";
```
Then update `subagent-registry.types.ts` to remove the deprecated fields instead of defining the type inline here.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
openclaw/src/agents/subagent-registry.ts
Lines 98 to 102 in 053fb74
startSubagentAnnounceCleanupFlow no longer forwards entry.expectsCompletionMessage (or entry.spawnMode) into runSubagentAnnounceFlow, so completion-mode runs are treated as regular announces. This bypasses completion-specific routing (including resolveSubagentCompletionOrigin/subagent_delivery_target handling) and changes delivery behavior from completion-direct semantics to generic queue-first behavior, which can delay or misroute completion updates for manual/session subagent workflows.
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/agents/subagent-registry.ts
Outdated
| const spawnMode = params.spawnMode === "session" ? "session" : "run"; | ||
| const archiveAtMs = | ||
| spawnMode === "session" ? undefined : archiveAfterMs ? now + archiveAfterMs : undefined; | ||
| const archiveAtMs = archiveAfterMs ? now + archiveAfterMs : undefined; |
There was a problem hiding this comment.
Keep session-mode runs out of archival sweep
registerSubagentRun now always computes archiveAtMs from archiveAfterMinutes, removing the previous session-mode exemption. Persistent session-mode subagents are therefore swept and deleted after the archive window, which breaks long-lived thread-bound subagent sessions that are expected to remain available (for example, focused/persistent agent threads).
Useful? React with 👍 / 👎.
| if (updated > 0) { | ||
| persistSubagentRuns(); | ||
| for (const entry of entriesByChildSessionKey.values()) { | ||
| void emitSubagentEndedHookOnce({ | ||
| entry, | ||
| reason: SUBAGENT_ENDED_REASON_KILLED, | ||
| sendFarewell: true, | ||
| outcome: SUBAGENT_ENDED_OUTCOME_KILLED, | ||
| error: reason, | ||
| inFlightRunIds: endedHookInFlightRunIds, | ||
| persist: persistSubagentRuns, | ||
| }).catch(() => { | ||
| // Hook failures should not break termination flow. | ||
| }); | ||
| } | ||
| } |
There was a problem hiding this comment.
Emit subagent_ended hook when terminating active runs
After marking runs terminated in markSubagentRunTerminated, the function now only persists state and returns; the prior emitSubagentEndedHookOnce call path was removed. In kill/terminate scenarios, subagent_ended hooks no longer fire, so plugin cleanup and downstream automation that depend on kill lifecycle notifications silently stop running.
Useful? React with 👍 / 👎.
nikolasdehor
left a comment
There was a problem hiding this comment.
Good work consolidating these five fixes. I went through all 15 files carefully. The method: "send" removal is the right call and the BB message ID extraction improvements are solid. A few things to flag:
SSRF allowlist — looks safe, one nit
The buildBlueBubblesAttachmentSsrFPolicy approach is sound: it extracts only the hostname from the user-configured serverUrl and pins both allowedHostnames and hostnameAllowlist to that single host. This correctly scopes the SSRF bypass to the exact BB server the user already trusts in their config — it doesn't blanket-allow RFC 1918 ranges or anything like that. The try/catch returning undefined on malformed URLs is a nice defensive fallback.
One thing to verify: does the SSRF guard in fetchRemoteMedia also validate redirect targets against hostnameAllowlist? If BB sends a 3xx redirect to a different internal host, we want to make sure that's still blocked. The comment says "Keep redirects pinned to the same trusted host" but I want to confirm the consumer actually checks hostnameAllowlist on redirect resolution, not just on the initial request. If it does, this is good.
method: "send" removal — correct fix, love the warning comment
The // ⚠️ CRITICAL: DO NOT ADD A "send" PATH HERE comment is excellent given this has regressed 3+ times. The test changes are thorough — every assertion that previously checked sendSpy now correctly checks agentSpy, and the expected message content changed from pre-formatted "✅ Subagent main finished" headers to the raw "Result:" + LLM instruction pattern. This is the right architecture: let the parent LLM process and rephrase the result.
countPendingDescendantRuns — good addition, subtle semantics
The distinction between "active" (not ended) and "pending" (ended but cleanup not completed) is important for avoiding premature announcements. The logic:
const runEnded = typeof entry.endedAt === "number";
const cleanupCompleted = typeof entry.cleanupCompletedAt === "number";
if (!runEnded || !cleanupCompleted) {
count += 1;
}This counts a run as pending if it's either still running OR ended-but-not-cleaned-up. The new test in subagent-registry.nested.test.ts covers this correctly. Good.
Codex bot's P1: session-mode archival sweep
The Codex bot flagged that removing the spawnMode === "session" exemption from archiveAtMs means persistent session-mode subagents will now get swept. Is this intentional? If session-mode subagents are being removed entirely (which this PR suggests — spawnMode is stripped from SubagentRunRecord), then yes, they should be archived. But if anything still relies on long-lived session subagents surviving indefinitely, this is a regression. Can you confirm the session-mode concept is fully removed, including from the store layer? The Greptile bot noted subagent-registry.types.ts still exports the old type with spawnMode — if subagent-registry.store.ts references that stale type, you'll get a mismatch at the persistence boundary.
Codex bot's P1: subagent_ended hook no longer fires on kill
markSubagentRunTerminated no longer calls emitSubagentEndedHookOnce. If any plugins rely on the subagent_ended lifecycle hook for cleanup when a subagent is killed (not just when it completes normally), those plugins will silently stop receiving kill notifications. Is this intentional? If the hook infrastructure is being simplified, that's fine, but it should be documented as a breaking change for plugin authors.
Inbound meta changes — mostly good, one question
Promoting SenderName above SenderE164 in the sender field preference chain and including sender info for DMs (not just groups) is a nice UX improvement. The message_id change to prefer MessageSid and never inject both message_id + message_id_full saves tokens.
One question: resolveSenderLabel now receives an id parameter. Is the function signature updated to accept it? The diff only shows the caller side, not the callee. If resolveSenderLabel doesn't use id in its label construction, passing it is harmless but the test expects label: "Tyler (+15551234567)" which suggests it does use it as a fallback when e164 is missing. Just want to confirm the function body handles the new param.
readLatestSubagentOutput removal — simplification is welcome
The old readLatestSubagentOutput had a complex fallback chain: try readLatestAssistantReply, then fall back to chat.history parsing with extractSubagentOutputText / extractToolResultText / extractInlineTextContent. This PR collapses all of that to just readLatestAssistantReply. The test changes confirm that tool-role messages are no longer used as fallback output — they now produce "(no output)" instead. This is cleaner, but is a behavioral change: subagents that only produce tool output without a final assistant message will now report no output. Is that acceptable for all subagent types?
finalizeSubagentCleanup sync vs async
finalizeSubagentCleanup changed from async to sync, and the .then() call in startSubagentAnnounceCleanupFlow dropped the void prefix. This is fine since the function no longer awaits anything internally, but the emitCompletionEndedHookIfNeeded call that was previously awaited inside it is now gone entirely (related to the lifecycle hook removal above).
Overall: the core fixes (send-path removal, SSRF allowlist, BB message IDs, sender identity) are solid. The main concerns are the two items the Codex bot flagged (session-mode archival, killed-hook removal) plus confirming redirect validation in the SSRF guard. Happy to approve once those are addressed or confirmed intentional.
a5a3785 to
0cd84ea
Compare
0cd84ea to
b7c8922
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b7c8922bde
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
src/agents/subagent-announce.ts
Outdated
| 0, | ||
| countActiveDescendantRuns(canonicalRequesterSessionKey), | ||
| typeof countPendingDescendantRuns === "function" | ||
| ? countPendingDescendantRuns(canonicalRequesterSessionKey) |
There was a problem hiding this comment.
Exclude current run from pending-descendant gate
sendSubagentAnnounceDirectly now uses countPendingDescendantRuns(canonicalRequesterSessionKey) to decide whether completion text can be sent directly, but that pending-count includes the just-finished run itself until cleanupCompletedAt is written. At announce time this run is always still pending cleanup, so pendingDescendantRuns is at least 1 even with no siblings/descendants, which makes shouldSendCompletionDirectly false for non-session paths and effectively disables the direct completion-send branch.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 59ba16a810
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| : messageIdFull && messageIdFull !== messageId | ||
| ? messageIdFull | ||
| : undefined, | ||
| message_id: isDirect ? undefined : resolvedMessageId, |
There was a problem hiding this comment.
Preserve full message IDs when short aliases are emitted
By collapsing MessageSid/MessageSidFull into a single message_id, this drops the full provider ID for channels that intentionally emit a short alias plus full GUID (for example BlueBubbles). In BlueBubbles action flow, resolveBlueBubblesMessageId(..., { requireKnownShortId: true }) throws once the short-ID cache is gone and explicitly asks for MessageSidFull, so after cache expiry/restart the model no longer has the full ID needed to target older messages for replies/reactions. Keeping message_id_full when it differs avoids that failure mode while still preferring short IDs.
Useful? React with 👍 / 👎.
5dcfee3 to
b564a0b
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b564a0bfeb
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (req.method !== "GET" && req.method !== "HEAD") { | ||
| if (!basePath) { | ||
| return false; |
There was a problem hiding this comment.
Enforce method guard before basePath redirect
With a configured basePath, requests to the exact base path (for example POST /openclaw) hit the earlier pathname === basePath redirect branch and return 302 before this guard runs, so non-GET/HEAD methods bypass the intended rejection path. That regresses behavior from before this change (where all non-GET/HEAD were rejected) and is inconsistent with POST /openclaw/, which now returns 405.
Useful? React with 👍 / 👎.
b564a0b to
0573052
Compare
…allowlist (openclaw#23970) * fix(agents): defer announces until descendant cleanup settles * fix(bluebubbles): harden message metadata extraction * feat(contributors): rank by composite score (commits, PRs, LOC, tenure) * refactor(control-ui): move method guard after path checks to improve request handling * fix subagent completion announce when only current run is pending * fix(subagents): keep orchestrator runs active until descendants finish * fix: prepare PR feedback follow-ups (openclaw#23970) (thanks @tyler6204)
…allowlist (openclaw#23970) * fix(agents): defer announces until descendant cleanup settles * fix(bluebubbles): harden message metadata extraction * feat(contributors): rank by composite score (commits, PRs, LOC, tenure) * refactor(control-ui): move method guard after path checks to improve request handling * fix subagent completion announce when only current run is pending * fix(subagents): keep orchestrator runs active until descendants finish * fix: prepare PR feedback follow-ups (openclaw#23970) (thanks @tyler6204)
…allowlist (openclaw#23970) * fix(agents): defer announces until descendant cleanup settles * fix(bluebubbles): harden message metadata extraction * feat(contributors): rank by composite score (commits, PRs, LOC, tenure) * refactor(control-ui): move method guard after path checks to improve request handling * fix subagent completion announce when only current run is pending * fix(subagents): keep orchestrator runs active until descendants finish * fix: prepare PR feedback follow-ups (openclaw#23970) (thanks @tyler6204)
|
I thought this was supposed to fix everything but I have the same issues still. Sequential Spawn doesnt work and that Subagent has finished message still comes and my main agent doesn't do anything. Is it possible I can get some help from you? @tyler6204 |
…allowlist (openclaw#23970) * fix(agents): defer announces until descendant cleanup settles * fix(bluebubbles): harden message metadata extraction * feat(contributors): rank by composite score (commits, PRs, LOC, tenure) * refactor(control-ui): move method guard after path checks to improve request handling * fix subagent completion announce when only current run is pending * fix(subagents): keep orchestrator runs active until descendants finish * fix: prepare PR feedback follow-ups (openclaw#23970) (thanks @tyler6204)
…allowlist (openclaw#23970) * fix(agents): defer announces until descendant cleanup settles * fix(bluebubbles): harden message metadata extraction * feat(contributors): rank by composite score (commits, PRs, LOC, tenure) * refactor(control-ui): move method guard after path checks to improve request handling * fix subagent completion announce when only current run is pending * fix(subagents): keep orchestrator runs active until descendants finish * fix: prepare PR feedback follow-ups (openclaw#23970) (thanks @tyler6204)
…allowlist (openclaw#23970) * fix(agents): defer announces until descendant cleanup settles * fix(bluebubbles): harden message metadata extraction * feat(contributors): rank by composite score (commits, PRs, LOC, tenure) * refactor(control-ui): move method guard after path checks to improve request handling * fix subagent completion announce when only current run is pending * fix(subagents): keep orchestrator runs active until descendants finish * fix: prepare PR feedback follow-ups (openclaw#23970) (thanks @tyler6204)
Summary
Fixes subagent announce reliability and BlueBubbles metadata handling, and includes follow up test coverage and cleanup from /preparepr. Supersedes #23166 (closed).
Changes
Agent only subagent announce delivery
countPendingDescendantRuns) so orchestrator announces wait until descendants fully settle, including cleanup.method: "agent"path so parent sessions process completion events reliably.Completion message cleanup hardening
BlueBubbles message ID extraction
message_idwhen available.messageIdandmessageGuidconsistently for send/reply tracking.Sender identity in direct message context
senderInfometadata for DM context as well as groups.Inbound metadata simplification
message_id(MessageSid ?? MessageSidFull) instead of injecting both IDs.Control UI method guard behavior
Contributor ranking update
Regression fixes
Testing