fix(outbound): thread sessionKey into message_sending + align session.key with agent runtime + document the contract#73706
Conversation
Greptile SummaryThis PR fixes a three-part bug where Confidence Score: 5/5This PR is safe to merge — changes are additive, well-tested, and confined to narrow plumbing points. No P0 or P1 issues found. The No files require special attention. Reviews (2): Last reviewed commit: "fix(outbound): align session.key with ag..." | Re-trigger Greptile |
|
Codex review: needs real behavior proof before merge. Summary Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path. Real behavior proof Next step before merge Review detailsBest possible solution: Retry the Codex review after fixing the execution failure. Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path. Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction. What I checked:
Likely related people:
Remaining risk / open question:
Codex review notes: model gpt-5.5, reasoning high; reviewed against 972b3c673a2e. |
The outbound delivery path in applyMessageSendingHook constructed a
PluginHookMessageContext containing only { channelId, accountId,
conversationId } and dropped the sessionKey that the surrounding
deliverOutboundPayloads scope already had as
sessionKeyForInternalHooks. Plugins receiving message_sending therefore
saw ctx.sessionKey === undefined for every reply.
This breaks any plugin that needs to correlate a per-turn signal
emitted in agent_end with the matching outbound delivery in
message_sending. The concrete observed regression is the
openclaw-provenance developer-mode taint footer: agent_end populates a
finalTaintBySession map keyed by sessionKey, and message_sending looks
it up by sessionKey before appending the footer. With the key missing
on the message_sending side, the lookup always returned undefined and
no footer was ever appended.
PluginHookMessageContext already declares sessionKey?: string and the
auto-reply dispatch path in src/auto-reply/dispatch.ts already
threads it through deriveInboundMessageHookContext +
toPluginMessageContext. This change brings deliver.ts in line.
- applyMessageSendingHook now takes an optional sessionKey and forwards
it on the runMessageSending context.
- The single caller in deliverOutboundPayloads passes
sessionKeyForInternalHooks, which is already resolved earlier in the
function from params.mirror?.sessionKey ?? params.session?.key.
- Two new unit tests cover the present-and-absent cases.
Closes the deliver.ts side of the agent_end → message_sending
correlation gap. No behavior change for plugins that don't read
ctx.sessionKey.
92b5c0b to
df71f5f
Compare
…ment contract
Builds on the previous commit. The original commit threaded the value
already in OutboundSessionContext.key into the message_sending hook
context. That was correct as a plumbing fix, but exposed a deeper
issue: dispatch-from-config.ts:488 was filling session.key with
ctx.SessionKey unconditionally, which is the right value for non-native
chat but the wrong value for native-command-redirect, where the agent
runtime ran against ctx.CommandTargetSessionKey.
The agent runner resolves its sessionKey as
`targetSessionKey || ctx.SessionKey` (see get-reply.ts:198). For
agent_end and message_sending to see the same canonical session key,
the dispatch path must mirror that resolution. We now do.
Changes
=======
src/auto-reply/reply/dispatch-from-config.ts
Compute agentRuntimeSessionKey using the same expression
get-reply.ts uses for the agent run, then pass it as
routeReply({ sessionKey }). For non-native chat (the common case)
this collapses to the existing ctx.SessionKey \u2014 no behavior change.
For native-command-redirect it now correctly follows the redirect
target so message_sending and agent_end agree.
src/infra/outbound/session-context.ts
Tighten the JSDoc on OutboundSessionContext.key + policyKey:
* key MUST equal the agent runtime params.sessionKey for the run
that produced the payload, naming the call sites that should
already be honoring this contract.
* policyKey is the delivery target's session for policy lookups
(silent-reply policy, send rate limits, agent-scoped channel
preferences) when delivery differs from the control session.
src/plugins/hook-message.types.ts
Tighten the JSDoc on PluginHookMessageContext.sessionKey to state
it is the same canonical key used by the agent runtime / agent_end /
llm_input / llm_output, so plugin authors correlating across hooks
know what to expect. Also document runId as the recommended per-turn
correlation field (UUID, stable across LLM iterations and retry
attempts within one end-to-end turn, distinct per inbound message
and per cron/heartbeat/followup run \u2014 more robust than sessionKey
for plugins that need to disambiguate concurrent turns in the same
session).
src/infra/outbound/deliver.ts
Document why sessionKeyForDeliveryDiagnostics falls back to
policyKey (best-effort identifier for telemetry only) and why
sessionKeyForInternalHooks deliberately does NOT fall back to
policyKey (handing the policy key to plugins that correlate against
agent_end would be wrong). Both blocks now have explicit comments.
src/infra/outbound/deliver.test.ts
Add a contract test asserting that when session.key and
session.policyKey differ, message_sending receives session.key
(never policyKey). Brings deliver.test.ts to 56/56 passing
(53 prior + 3 new).
No semantic change for the regular Discord/Slack chat path. The fix
is meaningful for native-command-redirect flows; the JSDoc/comment
updates are the larger value of the change \u2014 the contract was
implicit and only honored by some callers, and downstream plugins
that correlate hooks need a documented invariant they can rely on.
|
@greptile-apps: We have added some documentation to fully explain the hook surface contract regarding session key vs policy key. please re-review. |
The developer-mode taint footer was being silently dropped on outbound
replies. Root cause: the per-session correlation maps used by the
agent_end \u2192 message_sending hook chain were declared inside
`registerSecurityHooks`, which is invoked once per agent context
(tank, narcissus, shiva, smith, main, ...). Each invocation produced
its own closure with its own Map. agent_end SET in one closure's Map
landed invisibly to message_sending GET in another closure's Map for
the same outbound delivery, so `finalTaintBySession.get(sessionKey)`
always returned undefined and the footer was never appended.
Empirical evidence (gateway log, 2026-04-28):
agent_end SET fullKey=agent:tank:discord:tank:direct:1594 mapInstance=d3j2d7fe
message_sending sessionKey=agent:tank:discord:tank:direct:1594 lookupHit=false mapInstance=amk32ibv
message_sending sessionKey=agent:tank:discord:tank:direct:1594 lookupHit=true mapInstance=d3j2d7fe
5 distinct mapInstance ids = 5 plugin closures from 5 agent contexts,
all alive within a single module load. After the fix:
moduleLoadCount=1
agent_end FIRED setting fullKey=agent:tank:discord:tank:direct:1594 mapSize_before=0
message_sending FIRED mapSize=1 mapKeys=["agent:tank:discord:tank:direct:1594"]
\u2192 footer rendered \u2705
Fix:
* Promote four per-session maps from function scope to a single
process-wide state object anchored on globalThis via
`Symbol.for("openclaw.provenance.processState.v1")`. Module scope
alone would already be sufficient at `moduleLoadCount=1`, but
using a globalThis-keyed Symbol is defense-in-depth: even if the
plugin module were ever re-evaluated (CommonJS/ESM dual instances,
agent-scoped node_modules resolution, etc.), all loads would
converge on the same Maps.
* Maps now shared:
- finalTaintBySession agent_end \u2192 message_sending
- turnStartTaintBySession before_prompt_build / inbound_claim \u2192 agent_end
- lastImpactedToolBySession after_tool_call \u2192 agent_end
- blockedToolsBySession before_tool_call \u2192 agent_end clear
* Maps that stay function-scoped (single-instance read+write only):
turnStartTimes, lastLlmNodeBySession, sessionAgentMap.
The full chain also depended on a separate core fix (openclaw/openclaw#73706)
that threads sessionKey into the message_sending hook context via
deliver.ts. Either fix alone leaves the footer broken; together they
restore developer-mode footers on every outbound reply, including the
first one after a fresh gateway boot.
The developer-mode taint footer was being silently dropped on outbound
replies. Root cause: the per-session correlation maps used by the
agent_end \u2192 message_sending hook chain were declared inside
`registerSecurityHooks`, which is invoked once per agent context
(tank, narcissus, shiva, smith, main, ...). Each invocation produced
its own closure with its own Map. agent_end SET in one closure's Map
landed invisibly to message_sending GET in another closure's Map for
the same outbound delivery, so `finalTaintBySession.get(sessionKey)`
always returned undefined and the footer was never appended.
Empirical evidence (gateway log, 2026-04-28):
agent_end SET fullKey=agent:tank:discord:tank:direct:1594 mapInstance=d3j2d7fe
message_sending sessionKey=agent:tank:discord:tank:direct:1594 lookupHit=false mapInstance=amk32ibv
message_sending sessionKey=agent:tank:discord:tank:direct:1594 lookupHit=true mapInstance=d3j2d7fe
5 distinct mapInstance ids = 5 plugin closures from 5 agent contexts,
all alive within a single module load. After the fix:
moduleLoadCount=1
agent_end FIRED setting fullKey=agent:tank:discord:tank:direct:1594 mapSize_before=0
message_sending FIRED mapSize=1 mapKeys=["agent:tank:discord:tank:direct:1594"]
\u2192 footer rendered \u2705
Fix:
* Promote four per-session maps from function scope to a single
process-wide state object anchored on globalThis via
`Symbol.for("openclaw.provenance.processState.v1")`. Module scope
alone would already be sufficient at `moduleLoadCount=1`, but
using a globalThis-keyed Symbol is defense-in-depth: even if the
plugin module were ever re-evaluated (CommonJS/ESM dual instances,
agent-scoped node_modules resolution, etc.), all loads would
converge on the same Maps.
* Maps now shared:
- finalTaintBySession agent_end \u2192 message_sending
- turnStartTaintBySession before_prompt_build / inbound_claim \u2192 agent_end
- lastImpactedToolBySession after_tool_call \u2192 agent_end
- blockedToolsBySession before_tool_call \u2192 agent_end clear
* Maps that stay function-scoped (single-instance read+write only):
turnStartTimes, lastLlmNodeBySession, sessionAgentMap.
The full chain also depended on a separate core fix (openclaw/openclaw#73706)
that threads sessionKey into the message_sending hook context via
deliver.ts. Either fix alone leaves the footer broken; together they
restore developer-mode footers on every outbound reply, including the
first one after a fresh gateway boot.
27516b5 to
b4ecfe2
Compare
…(review feedback) What ---- Address all three findings from Clawsweeper's review of openclaw#73706, plus attach the after-fix real-runtime behavior proof maintainer review asked for. - [P2] Align hook docs with delivered outbound fields: Thread the canonical outbound `sessionKey` into the `message_sent` plugin hook context as well, so plugins observing both `message_sending` and `message_sent` see the same `sessionKey` (and so it matches the value the internal `message:sent` hook already fires with). The value is already computed for the internal hook in `deliverOutboundPayloadsCore`; we just reuse it in `createMessageSentEmitter`. JSDoc on `PluginHookMessageContext` is narrowed to honestly describe what is actually plumbed: `sessionKey` flows into both outbound delivery hooks when delivery has a session attached, and `runId` is currently NOT plumbed through outbound delivery (only inbound + agent-runtime hooks), so plugins must use `sessionKey` for `agent_end` <-> `message_sending` correlation today. - [P3] Add the hook fix to the changelog: Single Plugins/hooks entry under Unreleased that calls out the outbound `sessionKey` threading and the native-redirect alignment. - [P3] Cover native redirect session-key selection: Two focused regression tests in `dispatch-from-config.test.ts` pin the routed reply session-key contract: native redirect with `CommandTargetSessionKey` set must route via the redirect-target session, while non-native (text) commands must keep the inbound `SessionKey` even if `CommandTargetSessionKey` happens to be populated. These guard against future divergence between `agent_end` and `message_sending` on native redirects. After-fix real behavior proof ----------------------------- Added `scripts/proof-73706-message-sending-session-key.ts`, a self-checking real-runtime harness that wires up the production `deliverOutboundPayloads` path against a real `PluginRegistry`, real `getGlobalHookRunner`/`initializeGlobalHookRunner` singleton, and a real channel plugin with `sendText`. It registers actual `message_sending` / `message_sent` hook handlers, exercises three scenarios (with session, without session, native redirect target), and asserts the captured runtime ctx values match the documented contract. Catches the `message_sent` regression I had introduced locally (the unit test mock was happy because vitest mocked the hook runner; the real runtime was not). Output captured under `~/reports/proof-73706/run-output.txt` and pasted into the PR comment. Validation gate --------------- - pnpm tsgo:core: clean - pnpm tsgo:core:test: clean - pnpm vitest run src/infra/outbound/deliver.test.ts: 64/64 passed - pnpm vitest run src/auto-reply/reply/dispatch-from-config.test.ts: 105/105 passed - pnpm vitest run src/hooks/message-hook-mappers.test.ts: 11/11 passed - pnpm oxlint <touched files>: 0 warnings, 0 errors - pnpm plugin-sdk:api:check: OK (regenerated baseline hash for the PluginHookMessageContext JSDoc edit; only the .sha256 changes) - git diff --check: clean - pnpm tsx scripts/proof-73706-message-sending-session-key.ts: all three runtime assertions pass Beads: openclaw-t9l
Round 4 — addressing Clawsweeper review feedbackRound-4 commit: Each finding from the previous review is addressed individually below, followed by the after-fix real-runtime behavior proof maintainer review asked for. [P2] Align hook docs with delivered outbound fields —
|
…ession-key # Conflicts: # docs/.generated/plugin-sdk-api-baseline.sha256
|
@clawsweeper please re-review. Round 4 follow-up commit |
What
Three layers of one bug:
Plumbing fix.
applyMessageSendingHookinsrc/infra/outbound/deliver.tsconstructed thePluginHookMessageContextwith only{ channelId, accountId, conversationId }and dropped thesessionKeythatdeliverOutboundPayloadsalready had in scope assessionKeyForInternalHooks(resolved at deliver.ts:952 frommirror.sessionKey ?? session.key). Plugins observingmessage_sendingsawctx.sessionKey === undefinedfor every reply.Semantic fix.
dispatch-from-config.ts:488was fillingsession.keywithctx.SessionKeyunconditionally. For non-native chat that's correct (matches what the agent runner uses); for native-command-redirect (CommandTargetSessionKeyset), the agent runs against the redirect target, sosession.keymust follow it too. Otherwiseagent_endandmessage_sendingsee different sessionKeys for the same turn. Now mirrorsget-reply.ts:198'sagentSessionKey = targetSessionKey || ctx.SessionKey.Contract documentation. Tightened JSDoc on
OutboundSessionContext.key/policyKeyandPluginHookMessageContext.sessionKey/runIdso the agent_end ↔ message_sending correlation invariant is documented, and added explicit comments indeliver.tsdistinguishing the diagnostics fallback (?? policyKeyallowed) from the internal-hook fallback (?? policyKeydeliberately disallowed).Concrete regression observed
A non-bundled plugin (
openclaw-provenance) correlates a per-turn signal across two hooks:agent_endwrites to a per-session map keyed byctx.sessionKey.message_sendinglooks up the same key and appends a developer-mode taint footer to outbound content.Pre-fix, message_sending's ctx.sessionKey was
undefined(issue 1) so the lookup always missed; post-plumbing-fix but pre-semantic-fix, the lookup would also miss for any native-command-redirect flow (issue 2). Both regressions are addressed; downstream plugins now have a stable invariant (issue 3).Tests
pnpm test src/infra/outbound/deliver.test.ts→ 56/56 (53 prior + 3 new). New tests cover:tsgo --noEmittypecheck clean.Risk
Low. The plumbing fix is additive (
sessionKey?: stringfield already declared in the hook context type). The semantic fix only changes behavior for the native-command-redirect path, which was already broken (no plugin could correlate across hooks there). JSDoc/comment updates are documentation-only. Five files modified at narrow points.Real behavior proof
Behavior or issue addressed: Canonical outbound
sessionKeyis threaded intomessage_sending/message_senthook context, including native redirect session-key selection from PR fix(outbound): thread sessionKey into message_sending + align session.key with agent runtime + document the contract #73706.Real environment tested: Local OpenClaw topic branch
fix/message-sending-session-keyat862bb66603, realdeliverOutboundPayloads, realgetGlobalHookRunner, real registered hooks, real channel plugin path viascripts/proof-73706-message-sending-session-key.ts.Exact steps or command run after this patch: Ran
pnpm tsx scripts/proof-73706-message-sending-session-key.tsafter the patch and captured the runtime hook context values received by the registered hooks.Evidence after fix: Full copied runtime output is in the proof comment: fix(outbound): thread sessionKey into message_sending + align session.key with agent runtime + document the contract #73706 (comment) and saved locally at
~/reports/proof-73706/run-output.txt.Excerpt of copied live output:
Observed result after fix: The actual runtime hook context contains the canonical delivery session key for outbound hooks; native redirect delivery uses the redirect target key instead of the inbound session key.
What was not tested:
runIdoutbound hook correlation is documented as not yet plumbed and was not claimed/tested by this PR.