feat(events): add audience field for hidden runtime-context system events#72201
feat(events): add audience field for hidden runtime-context system events#72201lukeboyett wants to merge 1 commit into
Conversation
Greptile SummaryThis PR adds an optional Confidence Score: 4/5Safe to merge — no correctness issues found; the primitive is well-tested and fully backward-compatible. The implementation is logically sound, the escape/strip round-trip is correct, and the nine new tests cover the default, round-trip, equality, wrap shape, ordering, edge cases, and adversarial delimiter injection. The only findings are P2 style/maintainability observations. No files require special attention. Prompt To Fix All With AIThis is a comment left during a code review.
Path: src/auto-reply/reply/session-system-events.ts
Line: 119-123
Comment:
**Header text duplicated from `LEGACY_INTERNAL_CONTEXT_HEADER`**
The three header lines pushed here are an exact copy of the private `LEGACY_INTERNAL_CONTEXT_HEADER` constant in `src/agents/internal-runtime-context.ts`. They aren't used by the strip logic (which relies on the BEGIN/END delimiters), so a mismatch doesn't cause a runtime bug today — but if the canonical wording ever changes in `internal-runtime-context.ts`, this copy will silently drift. Exporting a shared `INTERNAL_RUNTIME_CONTEXT_HEADER` constant (or a helper that pushes the standard framing) would keep the two sites in sync.
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "fix(events): key consecutive-duplicate s..." | Re-trigger Greptile |
|
Codex review: needs maintainer review before merge. Reviewed May 27, 2026, 7:02 PM ET / 23:02 UTC. Summary PR surface: Source +207, Tests +477. Total +684 across 11 files. Reproducibility: yes. at source/proof level: the PR body's terminal proof drives the production enqueue, drain, and strip path on the PR head, and the merge-result source shows the same path. I did not run the harness locally because this review was read-only. Review metrics: 1 noteworthy metric.
Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Mantis proof suggestion Risk before merge
Maintainer options:
Next step before merge Security Review detailsBest possible solution: Land after maintainer acceptance of the audience primitive and green required checks on the merge ref, then keep additional producer migrations case-by-case under #69492. Do we have a high-confidence way to reproduce the issue? Yes, at source/proof level: the PR body's terminal proof drives the production enqueue, drain, and strip path on the PR head, and the merge-result source shows the same path. I did not run the harness locally because this review was read-only. Is this the best way to solve the issue? Yes, with maintainer approval: the patch keeps existing events user-facing by default, reuses the existing internal runtime-context strip contract, and migrates only the post-delivery cron-awareness producer. The safer long-term path is to keep additional migrations case-by-case rather than batch-converting user-deliverable events. AGENTS.md: found and applied where relevant. Codex review notes: model gpt-5.5, reasoning high; reviewed against 550a9b459a35. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +207, Tests +477. Total +684 across 11 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
c2ae546 to
1ddb5ec
Compare
1ddb5ec to
ad6b10b
Compare
ad6b10b to
6137343
Compare
6137343 to
4a6f1dd
Compare
6e7c5bd to
5aa64a9
Compare
5aa64a9 to
e68ce93
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e68ce9353c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if ((event.audience ?? "user-facing") === "internal") { | ||
| internalLines.push(...eventLines); |
There was a problem hiding this comment.
Avoid downgrading owner auth for hidden events
When an internal event is also trusted: false (the new cron awareness producer does exactly this in src/cron/isolated-agent/delivery-dispatch.ts), these added lines place a System (untrusted): ... line inside the hidden runtime-context block. I checked the consumer path in src/auto-reply/reply/get-reply-run.ts: it tests the entire drained block with UNTRUSTED_SYSTEM_EVENT_LINE_RE before passing senderIsOwner, so even hidden internal context now forces senderIsOwner to false on the next owner reply. That disables owner-only directives/tools after a successful isolated cron delivery even though the untrusted text is not user-facing input; internal events should either not use the visible untrusted prefix for this auth scan or the scan should ignore the wrapped block.
Useful? React with 👍 / 👎.
Status update — rebased onto current
|
| Commit | Effect on this PR |
|---|---|
9180173f9a fix: preserve exec event routing and sanitize tool XML |
Introduced selectGenericSystemEvents + consumeSelectedSystemEventEntries to keep exec-completion events queued for the heartbeat path. Accommodated in the rebase: this PR's audience-routing now layers on top of consumeSelectedSystemEventEntries(...) instead of drainSystemEventEntries(...), so exec-completion events stay on the heartbeat lane while the audience split applies to the remaining generic events. |
6dae3c273d fix: keep runtime prompt context out of system prompt (#77521) |
Refactored internal-runtime-context.ts and added OPENCLAW_RUNTIME_CONTEXT_NOTICE. Earlier prior version of this PR carried a refactor(events): export shared INTERNAL_RUNTIME_CONTEXT_HEADER_LINES commit to dedupe the canonical header text against LEGACY_INTERNAL_CONTEXT_HEADER. That commit was dropped during the rebase since OPENCLAW_RUNTIME_CONTEXT_NOTICE now covers the dedup. Functionally equivalent. |
resetQueueState helper extraction |
Upstream consolidated the post-splice queue state reset into a resetQueueState(key, entry) helper. The audience-tracking commit (e68ce9353c) was rebased onto that helper, so lastAudience is reset alongside lastText/lastContextKey inside the helper rather than inline. |
Upstream's INTERNAL_RUNTIME_CONTEXT_BEGIN/END delimiters and OPENCLAW_RUNTIME_CONTEXT_NOTICE are present, but no upstream commit currently provides a system-event audience-routing primitive — the internal vs user-facing lane split this PR adds remains an additive primitive over upstream. The wrapping site at drainFormattedSystemEvents is still the single integration point.
Re-review
@codex please re-review the rebased commits. Specific concerns to focus on:
- The merge with
consumeSelectedSystemEventEntries— does the audience split correctly compose with the exec-completion exclusion (i.e., are exec-completion events withaudience: 'internal'correctly left in the queue for the heartbeat path, not consumed here)? - The
lastAudiencetracking insideresetQueueState— any edge cases where queue-state reset diverges from the (text, audience) suppression key invariant? - Whether this PR's
internallane should adoptOPENCLAW_RUNTIME_CONTEXT_NOTICEas a single source of truth for the inserted header lines, replacing the inline strings indrainFormattedSystemEvents.
|
Codex Review: Didn't find any major issues. More of your lovely PRs please. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Addresses the fourth codex P2 on openclaw#72201 (against PR head d6f03b8403). Counterpart to the relay-selector filter in 761872d088. selectSystemEventsConsumedByHeartbeat picked up audience: "internal" events whenever isCronWake or contextKey: cron: matched, then consumeSelectedSystemEventEntries removed them from the queue. With the relay-selector skip already in place, the cron prompt template no longer fires on the internal event — but the consume path still drained it silently, so drainFormattedSystemEvents never saw the event and the INTERNAL_RUNTIME_CONTEXT wrap never fired. Net effect: hidden cron-awareness events were getting dropped on the floor on cron-wake heartbeats — exactly the hidden agent-awareness this audience field is meant to preserve. Filter audience: "internal" out of all three branches of selectSystemEventsConsumedByHeartbeat (exec selector, cron selector, fallthrough "all pending events"). Internal events stay queued for the next normal drain, which routes them through the wrap-on-drain integration in session-system-events.ts. Tests: 23 passed (was 22) — added a regression in heartbeat-runner.ghost-reminder.test.ts asserting that an audience: "internal" event with contextKey: "cron:..." queued before an interval wake survives the heartbeat run (peekSystemEventEntries returns it after runHeartbeatOnce). Existing drain-after-run behavior for user-facing cron events is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the fifth codex P2 on openclaw#72201 (against PR head f39342979f). resolveHeartbeatPreflight computes hasTaggedCronEvents from any pending event whose contextKey starts with "cron:" — without filtering audience. Combined with the queue-survival behavior in 1debc560ca (which intentionally leaves audience: "internal" cron-tagged events queued for the wrap-on-drain path), the preflight cron-tag signal would persist across every subsequent interval heartbeat. That bypasses the normal file gates and runs another generic heartbeat each interval until a normal reply finally drains the queue — turning hidden cron-awareness into a persistent heartbeat loop. Apply the same audience-internal filter to the preflight cron-tag detection. Internal events still stay queued; they just no longer trigger the cron-tagged heartbeat preflight signal. Tests: extended the existing "leaves audience: 'internal' events queued" regression in heartbeat-runner.ghost-reminder.test.ts to do a second runHeartbeatOnce on the same queue and assert: - Second run still resolves as Provider: "heartbeat" (not cron-event). - Second heartbeat ctx Body does NOT contain the awareness text. - Internal event is still queued after the second run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the sixth codex P2 on openclaw#72201 (against PR head 45e0157f30, still applicable on the rebased 5999ece). selectGenericSystemEvents in session-system-events.ts excludes any event whose text matches isExecCompletionEvent so the heartbeat exec relay path can consume them. After the audience-aware fixes in 998dd9afd5 (prompt selectors) and 8d72e6e (consume selector) the heartbeat path also skips audience: "internal" events, which combined creates a no-consumer hole: an audience: "internal" event whose text happens to match exec-completion syntax (e.g. cron output literally starting with "Exec finished (...)" or "Exec failed (...)") gets stranded — the generic drain filters it out as exec-shaped, the heartbeat path skips it as internal, and it sits in peekSystemEventEntries indefinitely. Make audience the source of truth: audience: "internal" events ALWAYS go through the generic drain (which routes them to the INTERNAL_RUNTIME_CONTEXT wrap) regardless of text shape. Text-shape only matters for user-facing events, where it correctly keeps exec-completion on the heartbeat lane. Tests: 40 passed (was 39) — added a regression in src/infra/system-events.test.ts asserting that an audience: "internal" event with exec-completion-shaped text drains through the wrap (result contains BEGIN/END delimiters and the event text, queue is empty after drain). Existing "leaves exec completion events queued" test for user-facing exec events unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the seventh codex P2 on openclaw#72201 (against PR head 08bbd85). compactSystemEvent in session-system-events.ts strips heartbeat-noise text (reason periodic, Read HEARTBEAT.md, heartbeat poll, heartbeat wake) from user-facing relays so it does not appear in chat. After consumeSelectedSystemEventEntries removes the event from the queue, a null return from compactSystemEvent silently drops the event with continue. For audience: "internal" events whose text happens to include any of those phrases (e.g. cron output that mentions "reason periodic" or "heartbeat poll" in its summary), this is the same no-consumer hole class as the exec-shape filter from 08bbd85: the queue entry is gone and the wrap-on-drain path never runs. Pass the audience into compactSystemEvent and skip the noise filters when audience === "internal". The Node: line sanitizer (which removes the "last input <path>" suffix) still runs for both audiences — that is a transformer, not a filter, and the safety value applies to both relay surfaces. User-facing events keep being filtered exactly as before. Tests: 42 passed (was 40) — added two regressions in src/infra/system-events.test.ts: - audience: "internal" events containing each of the four heartbeat- noise phrases drain through the wrap (each phrase asserted individually so a regression in any single filter is caught). - user-facing events containing the same phrases still get filtered (confirms the audience-bypass is precisely scoped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses two new P1 codex findings on openclaw#72201 (against PR head 5277cf7). Codex flipped on the security model. Earlier P2 reviews argued that audience: "internal" events should be exempt from the owner-auth downgrade scan because the wrap hides them from user-facing transcripts. The new P1 reviews correctly observe that the wrapped block still flows into the model prompt, so attacker-influenceable content (e.g. a cron payload relayed via queueCronAwarenessSystemEvent with trusted: false) must NOT have owner-only tool access just because the line is later hidden from the user. The trust signal is the source of truth, not visibility. Revert the two owner-auth carve-outs: - src/auto-reply/reply/get-reply-run.ts: drop the stripInternalRuntimeContext call before the UNTRUSTED_SYSTEM_EVENT_LINE_RE scan; scan the full drained block again. - src/infra/heartbeat-runner.ts: drop the audience !== "internal" predicate from hasUntrustedInspectedEvents and hasUntrustedActiveSessionEvents; scan all trusted=false events again. Routing carve-outs (relay selectors, consume selector, preflight cron-tag, generic drain exec-shape exemption, noise-filter exemption) are unchanged — those are about which surface drains/wraps the event, not security. Tests: existing regression cases inverted to assert the new (correct) behavior — audience: "internal" + trusted: false events DO trigger owner downgrade in both the get-reply-run scan and the heartbeat scan. Total test counts unchanged. pnpm tsgo:core:test clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Rebased onto current
Conflicts resolved (one beyond the usual
Resolution preserves @vincentkoc's allocation-trim intent while keeping this PR's audience routing:
Verification:
Branch is now |
c0010a7 to
2f6518a
Compare
Addresses codex P2 on openclaw#72201 + the missing-changelog P3. Internal-runtime-context blocks are runtime-generated and stripped from every user-facing surface, so System (untrusted): lines inside the wrap cannot have come from user input. Without this carve-out, an audience: "internal" event whose payload happens to carry an untrusted text fragment (e.g. cron output relayed for agent awareness via queueCronAwarenessSystemEvent) would silently downgrade senderIsOwner on the next reply turn — stripping owner-only tools from a turn the user never saw the event for. Strip the wrapped block via stripInternalRuntimeContext before applying UNTRUSTED_SYSTEM_EVENT_LINE_RE so the scan only inspects the user-facing portion of the drained system events. Owner-auth downgrade still triggers on untrusted lines outside the wrap; that path is the existing user-impersonation guard and is unchanged. Tests: 48 passed (was 47) — added two regression cases in get-reply-run.media-only.test.ts: - Untrusted lines inside an INTERNAL_RUNTIME_CONTEXT block do NOT downgrade senderIsOwner. - Untrusted lines outside the wrap still DO downgrade (carve-out scoped precisely to wrapped content). CHANGELOG: Unreleased entry covers the audience primitive, the wrap-on-drain integration, and the owner-auth scan fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses codex's 2nd P2 on openclaw#72201 (against PR head 2786604681). Symmetric to the get-reply-run.ts carve-out from cd2719b9fd. The heartbeat path also has its own owner-auth downgrade scan that fires on any pending event with trusted=false (forces ForceSenderIsOwnerFalse on the heartbeat ctx). Without this carve-out, an audience: "internal" event waiting in the queue (e.g. queueCronAwarenessSystemEvent) drained on the next heartbeat run would still disable owner-only tools and directives for that turn — even though the awareness text was already delivered on the cron's own channel and is now hidden runtime context. audience: "internal" events are runtime-generated, sit behind the INTERNAL_RUNTIME_CONTEXT wrap, and are stripped from every user-facing surface. They cannot have come from user input by construction, so they must be exempt from the user-impersonation guard. Untrusted events without audience: "internal" still trigger the downgrade exactly as before. Tests: 21 passed (was 20) — added a regression case in heartbeat-runner.ghost-reminder.test.ts asserting that an audience: "internal" + trusted: false event in the queue does NOT force ForceSenderIsOwnerFalse on the heartbeat ctx. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…c selection Addresses the third codex P2 on openclaw#72201 (against PR head 25a7a265e3). resolveHeartbeatRunPrompt selects pending cron-shaped events for buildCronEventPrompt and pending exec-completion events for the heartbeat exec relay path. Both surfaces format their selected events for user delivery — bypassing the wrap-on-drain integration in session-system-events.ts. Without an audience filter on the selectors, an audience: "internal" cron-awareness event with a cron: contextKey (or queued during a cron wake) was getting pulled into cronEvents and relayed/double-announced to the user as a CRON_EVENT_PROMPT instead of staying hidden runtime context. Filter both selectors by event.audience !== "internal". The intent is that audience: "internal" events route exclusively through the INTERNAL_RUNTIME_CONTEXT_* wrap on the next drain — the cron and exec prompt surfaces are user-facing relay paths and must not pick them up. Tests: 22 passed (was 21) — added a regression case in heartbeat-runner.ghost-reminder.test.ts asserting that an audience: "internal" cron-shaped event with contextKey: "cron:..." queued before an interval wake does NOT trigger CRON_EVENT_PROMPT and its text does NOT appear in the heartbeat ctx Body. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the fourth codex P2 on openclaw#72201 (against PR head d6f03b8403). Counterpart to the relay-selector filter in 761872d088. selectSystemEventsConsumedByHeartbeat picked up audience: "internal" events whenever isCronWake or contextKey: cron: matched, then consumeSelectedSystemEventEntries removed them from the queue. With the relay-selector skip already in place, the cron prompt template no longer fires on the internal event — but the consume path still drained it silently, so drainFormattedSystemEvents never saw the event and the INTERNAL_RUNTIME_CONTEXT wrap never fired. Net effect: hidden cron-awareness events were getting dropped on the floor on cron-wake heartbeats — exactly the hidden agent-awareness this audience field is meant to preserve. Filter audience: "internal" out of all three branches of selectSystemEventsConsumedByHeartbeat (exec selector, cron selector, fallthrough "all pending events"). Internal events stay queued for the next normal drain, which routes them through the wrap-on-drain integration in session-system-events.ts. Tests: 23 passed (was 22) — added a regression in heartbeat-runner.ghost-reminder.test.ts asserting that an audience: "internal" event with contextKey: "cron:..." queued before an interval wake survives the heartbeat run (peekSystemEventEntries returns it after runHeartbeatOnce). Existing drain-after-run behavior for user-facing cron events is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses the fifth codex P2 on openclaw#72201 (against PR head f39342979f). resolveHeartbeatPreflight computes hasTaggedCronEvents from any pending event whose contextKey starts with "cron:" — without filtering audience. Combined with the queue-survival behavior in 1debc560ca (which intentionally leaves audience: "internal" cron-tagged events queued for the wrap-on-drain path), the preflight cron-tag signal would persist across every subsequent interval heartbeat. That bypasses the normal file gates and runs another generic heartbeat each interval until a normal reply finally drains the queue — turning hidden cron-awareness into a persistent heartbeat loop. Apply the same audience-internal filter to the preflight cron-tag detection. Internal events still stay queued; they just no longer trigger the cron-tagged heartbeat preflight signal. Tests: extended the existing "leaves audience: 'internal' events queued" regression in heartbeat-runner.ghost-reminder.test.ts to do a second runHeartbeatOnce on the same queue and assert: - Second run still resolves as Provider: "heartbeat" (not cron-event). - Second heartbeat ctx Body does NOT contain the awareness text. - Internal event is still queued after the second run. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Rebased onto current
Conflict resolution. Two real conflicts beyond the usual CHANGELOG churn:
While resolving, I extended both
Verification:
PR is |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2f6518a60f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // so the heartbeat path can consume and deliver them. | ||
| const queued = consumeSelectedSystemEventEntries( | ||
| params.sessionKey, | ||
| selectGenericSystemEvents(peekSystemEventEntries(params.sessionKey)), |
There was a problem hiding this comment.
Keep heartbeat replies from draining internal events
When a non-isolated heartbeat runs before the next user reply, runHeartbeatOnce passes the main SessionKey into getReplyFromConfig, which still calls this generic drain before the heartbeat consume selector runs. Because internal events are selected here, the hidden cron-awareness event is consumed by the heartbeat prompt; prompt-prelude.ts keeps systemEventBlocks out of transcriptCommandBody, so an OK/silent/no-target heartbeat can drop the awareness before the next normal turn. Fresh evidence beyond the earlier heartbeat-consume finding is this live runHeartbeatOnce -> getReplyFromConfig -> drainFormattedSystemEvents path, which the new heartbeat tests stub out.
Useful? React with 👍 / 👎.
|
Fixed in Confirmed the bug exactly as described: Fix: threaded The caller in Test: new
Verification:
On the real-behavior-proof staleness flag: noted, will refresh the PR body proof section against @clawsweeper please re-review. Re-review progress:
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 25e8f2a5d3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| systemLines.push(INTERNAL_RUNTIME_CONTEXT_BEGIN); | ||
| systemLines.push("OpenClaw runtime context (internal):"); | ||
| systemLines.push( | ||
| "This context is runtime-generated, not user-authored. Keep internal details private.", | ||
| ); | ||
| systemLines.push(""); | ||
| systemLines.push(escapeInternalRuntimeContextDelimiters(internalLines.join("\n"))); | ||
| systemLines.push(INTERNAL_RUNTIME_CONTEXT_END); |
There was a problem hiding this comment.
Preserve hidden events through collect-mode queuing
When an audience: "internal" event is drained while another run is active and the session uses collect-mode queuing, this block is added only to queuedBody (the transcript body intentionally omits systemEventBlocks in src/auto-reply/reply/prompt-prelude.ts). I checked the collect drain path: renderCollectItem uses item.prompt, and the combined follow-up is submitted without a transcriptPrompt (src/auto-reply/reply/queue/drain.ts:133-138,252-258), so the <<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>> block becomes ordinary queued prompt text and can be persisted/displayed instead of staying hidden. The wrapped context needs a transcript-safe counterpart when queued items are collected, or collect rendering should use each item's transcriptPrompt for visible transcript text.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Fixed in 1972afc0e6.
Confirmed the bug exactly as described. renderCollectItem in src/auto-reply/reply/queue/drain.ts used item.prompt verbatim, and the combined follow-up was submitted with { prompt, ... } only — no transcriptPrompt. So when an audience: "internal" event had been drained into queuedBody (e.g. via queueCronAwarenessSystemEvent) and a second message arrived during the active run triggering collect-mode batching, the <<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>>...<<<END_OPENCLAW_INTERNAL_CONTEXT>>> block became plain queued prompt text and would surface in the user-visible transcript.
Fix: scheduleFollowupDrain now builds both bodies:
promptviarenderCollectItem(existing; model-facing, wrap preserved so runtime context still reaches the model).transcriptPromptvia newrenderCollectItemTranscript(prefers each item's owntranscriptPromptset at enqueue time inget-reply-run.ts:975, falls back tostripInternalRuntimeContext(item.prompt)).
The transcript variant is only attached when the combined prompt carries a wrap or any grouped item already split prompt vs transcriptPrompt at enqueue — plain collects pay no extra cost.
Test: two new cases in queue.collect.test.ts:
strips internal runtime-context wrap from the transcriptPrompt of a collected batch— proves wrap-bearingitem.promptyields a combinedpromptthat keeps the wrap, plus atranscriptPromptthat drops the wrap markers AND the wrapped body, and still merges the second queued item's plain text.omits transcriptPrompt when no items carry a wrap or transcript split— proves plain collects stay cost-free.
Note on parallel paths: the announce-queue collect path in src/agents/subagent-announce-queue.ts:213-235 uses a different AnnounceQueueItem schema (no transcriptPrompt field; internal events flow through a dedicated internalEvents channel), so it isn't reachable by this same shape and was left untouched.
Verification:
pnpm test -- src/auto-reply/reply/queue.collect.test.ts src/auto-reply/reply/queue.dedupe.test.ts src/infra/system-events.test.ts src/utils/queue-helpers.test.ts— all passing.pnpm tsgo:core:test— clean.
@clawsweeper please re-review.
|
Rebased onto current Migrated to the structured
Three commits dropped during rebase because they're obsolete under the structured contract:
Net: 9 commits on top of upstream/main, no carve-outs that bypass owner-downgrade for hidden-context events. Untrusted runtime context still triggers owner-tool downgrade exactly as Verification on
Fresh real-behavior proof against @clawsweeper please re-review. Re-review progress:
|
|
ClawSweeper PR egg ✨ Hatched: 🌱 uncommon Clockwork Merge Sprite Hatch commandComment Hatchability rules:
Rarity: 🌱 uncommon. What is this egg doing here?
|
…ents Adds an optional `audience: 'internal' | 'user-facing'` field to `SystemEvent`, defaulting to `'user-facing'` (full backward-compat for every existing caller). Internal-audience events are wrapped at the consumer integration point (`drainFormattedSystemEvents` in session-system-events.ts) in the existing `INTERNAL_RUNTIME_CONTEXT_BEGIN/END` delimiters with the canonical header lines from `formatAgentInternalEventsForPrompt`. The agent runtime sees the content as runtime context; user-facing surfaces strip it via the already-installed `stripInternalRuntimeContext` consumers (`sanitize-user-facing-text.ts`, `memory-host-sdk/host/session-files.ts`, `agents/internal-events.ts`, history readers, Control UI extract). Mirrors the producer-side runtime-context delimiter pattern landed in `e918e5f75c` (openclaw#71761) and `6e985a421d` (webchat runtime context). No new persistence shape, no consumer-side filter (avoids the audit-bypass concern flagged on openclaw#69217), no display-layer per-surface patch (avoids the rejection pattern from openclaw#69366). Heartbeat-path additions: `selectGenericSystemEvents` keeps audience: "internal" events queued when the drain is invoked from a heartbeat reply (whose prompt envelope discards `systemEventBlocks`), so the wrapped runtime-context block is not silently consumed before the next regular reply turn that actually carries it through to the model. The heartbeat exec/cron-prompt selectors also skip internal events so they don't get classified as user-facing relay payloads. `compactSystemEvent` bypasses heartbeat-noise filters ("reason periodic", "Read HEARTBEAT.md", "heartbeat poll/wake") for audience: "internal" events — those go through the wrap-on-drain path and never reach a user-facing surface, so filtering them after consumption would silently drop the event (same no-consumer hole class as the exec-shape filter). Migrated one producer (the canonical justification for the lane): - `queueCronAwarenessSystemEvent` in `src/cron/isolated-agent/delivery-dispatch.ts`. This event is queued onto the main session only after the isolated agent run has already delivered to the user via its own channel (gated on `delivered === true` and `shouldQueueCronAwareness`). The main session needs hidden runtime context so the next turn knows the cron landed, not a user-visible `System: ...` bubble. Scope is intentionally tight. This primitive is the hidden runtime-context lane only; it is NOT a delivery-routing primitive. Events with a positive user delivery contract (exec completion via `notifyOnExit`, cron payloads, heartbeat acks) are not migrated — those have heartbeat-driven explicit-relay paths (`buildExecEventPrompt` / `buildCronEventPrompt`) plus tactical producer-side skips like `bd60df3e53`. Marking them internal would suppress delivery on regular reply turns where the model is instructed to keep wrapped content private. Tests cover: default audience, round-trip preservation, audience equality (consumeSystemEventEntries prefix-match respects audience), wrap-on-drain shape with canonical header, mixed user-facing+internal ordering, the user-facing-only and internal-only edge cases, end-to-end strip via stripInternalRuntimeContext, adversarial delimiter-token escape, heartbeat-path audience: "internal" queueing semantics, and exec-shaped internal events bypassing the text-shape filter. Addresses part of openclaw#69492 (system-event-shape consumer leakage; umbrella openclaw#69208 Track B). Leaves the user-deliverable noise question (exec completion, cron payloads, heartbeat acks) open for case-by-case follow-ups matching the maintainer's revealed pattern (`bd60df3e53`, `3f63ba8fd808`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rebase note — 2026-05-26Force-pushed onto current Decoupled from Squashed the 9-commit codex-iteration ladder into 1 commit. The original chain's incremental design was intra-PR refinement against the pre-refactor base — after the upstream pivot, those commits' individual diffs were a mix of conflict resolution and forceSender decoupling, not standalone units worth preserving. The single commit Validation:
Diff stat is essentially unchanged (713/29 vs prior 718/27; the 5-line net delta is the dropped forceSenderIsOwnerFalse coupling in The "Real behavior proof" section of the PR description remains accurate; the underlying behavior (wrap delimiters + isolated-cron awareness migration) is unchanged. |
|
@clawsweeper re-review No contributor-facing work is open: force-rebased onto current |
|
🦞🧹 I asked ClawSweeper to review this item again. |
|
@clawsweeper re-review The prior re-review run failed mid-pass (clawsweeper/actions/runs/26507462206) and left labels half-stripped (rating flipped to off-meta tidepool, proof: sufficient dropped to supplied, status/mantis removed). Requesting a clean re-review of the current head — no contributor-facing work is open; the PR is MERGEABLE with the audience-field change + real-behavior proof intact. |
|
🦞👀 Command router queued. I will update this comment with the next step. Re-review progress:
|
|
Closing this PR. After thorough investigation we can't reproduce the underlying leak it addresses, and main's shape has moved on:
A refactor to fit the structured internal-events shape may still be warranted, but we'd want to see the leak actually present on current main in the wild before chasing it further. Happy to revisit/reopen if it's observed. Thanks! |
Summary
Addresses part of #69492 (system-event-shape consumer-path leakage; under umbrella #69208 Track B).
Adds an optional
audience: 'internal' | 'user-facing'field toSystemEvent, defaulting to'user-facing'(full backward-compat for every existing caller). Internal-audience events are wrapped at the consumer integration point in the existingINTERNAL_RUNTIME_CONTEXT_BEGIN/ENDdelimiters with the canonical header lines fromformatAgentInternalEventsForPrompt. The agent runtime sees the content as runtime context; user-facing surfaces strip it via the already-installedstripInternalRuntimeContextconsumers (sanitize-user-facing-text.ts,memory-host-sdk/host/session-files.ts,agents/internal-events.ts, history readers, Control UI extract).Mirrors the producer-side runtime-context delimiter pattern landed in:
e918e5f75c([Bug]: All channel messages are injected twice into agent context after 2026.4.24 update (duplicate turns, double token cost) #71761) — agent runtime-context custom entry6e985a421d— webchat runtime-contextNo new persistence shape, no consumer-side filter (avoids the audit-bypass concern flagged on #69217), no display-layer per-surface patch (avoids the rejection pattern from #69366).
Why this scope is tight
This primitive is the hidden runtime-context lane only. It is not a delivery-routing primitive. Events with a positive user delivery contract (exec completion via
notifyOnExit, cron payloads, heartbeat acks) are NOT migrated — those have heartbeat-driven explicit-relay paths (buildExecEventPrompt/buildCronEventPrompt) plus tactical producer-side skips likebd60df3e53(fix: silence cron exec completion noise). Marking them internal would suppress delivery on regular reply turns where the model is told to keep wrapped content private.The user-deliverable noise question stays open; the maintainer's recent pattern is case-by-case (
bd60df3e53,3f63ba8fd808), and that lane should remain.Changes
src/infra/system-events.ts— addSystemEventAudiencetype and optionalaudiencefield onSystemEvent+SystemEventOptions. Default"user-facing". JSDoc explicitly documents the scope (hidden runtime-context, not delivery routing) and the operator-surface caveat (peekSystemEvents/peekSystemEventEntrieskeep showing internal events for debugging).areSystemEventsEqualhonors the field so duplicate-suppression andconsumeSystemEventEntriesprefix-match keep working.src/auto-reply/reply/session-system-events.ts— partition the drain output into user-facing lines and internal lines. User-facing lines emit as plainSystem: ...(unchanged). Internal lines are wrapped inINTERNAL_RUNTIME_CONTEXT_BEGIN/ENDwith the canonical header (matchingformatAgentInternalEventsForPrompt).escapeInternalRuntimeContextDelimitershandles adversarial inner tokens.src/cron/isolated-agent/delivery-dispatch.ts— migratequeueCronAwarenessSystemEventtoaudience: "internal". This event is queued only after the isolated agent run has already delivered to the user via its own channel (gated ondelivered === trueandshouldQueueCronAwareness); the main session needs hidden context, not a fake-userSystem (untrusted): ...bubble.Explicitly NOT migrated
maybeNotifyOnExit(exec completion) —notifyOnExitdelivery contract.executeMainSessionCronJob(cron payload) — user-scheduled reminder content.wake()API — exposes arbitrary user-supplied text via gatewaycron.wake.server-methods/system.ts), node events (server-node-events.ts).JSDoc on
SystemEventAudienceexplicitly excludes these from the lane and points at the existing delivery paths.Follow-up plan (out of scope for this PR)
This PR is intentionally one-step. It introduces the primitive and migrates exactly one producer (
queueCronAwarenessSystemEvent) so the wrap-on-drain integration point can be reviewed against a real call site. No additional audience-PR is pre-committed; future migrations should be evaluated case-by-case as concrete leakage reports surface.Candidates that may earn
audience: "internal"later if a real need is demonstrated: debug- or awareness-style producers that are queued after the user has already received delivery on a different channel (the same shape asqueueCronAwarenessSystemEventhere). Each would need its own justification — no batch migration.Explicitly NOT in any audience-PR sequence (these stay in the maintainer's case-by-case tactical-patch lane, e.g.
bd60df3e53,3f63ba8fd808):maybeNotifyOnExit)executeMainSessionCronJob)A larger abstraction was floated during peer review — a 3-state shape (
prompt-visible|hidden-context|dedicated-delivery) for events that have both an internal-awareness and a user-delivery contract. Not on the roadmap. That redesign collides with the heartbeat-runner explicit-relay paths and the producer-side skips the maintainer has been adding directly; pursuing it would re-open the rejection patterns from #69366. Flagging here only so future readers know it was considered and deliberately not pursued.If maintainer direction differs, happy to follow guidance on #69492 before landing further migrations.
Real behavior proof
queueCronAwarenessSystemEvent, queued onto the main session after the isolated cron has already delivered the result on its own channel) were appearing in the main session as user-visibleSystem: cron run …bubbles, even though their purpose is internal awareness for the agent rather than a user-deliverable message. The fix migrates that producer toaudience: "internal"so the autoreply integration site wraps the event in<<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>>…<<<END_OPENCLAW_INTERNAL_CONTEXT>>>and the existingstripInternalRuntimeContextconsumers remove it from every user-facing surface.feat/system-event-audience-internal-routingHEADd5fc9d9609. Mercury's production OpenClaw gateway daemon (LaunchDaemon, 11-account Matrix deployment) has been running an earlier 2-commit local-patch variant of this PR continuously and shows the wrap-delimiter primitive working in real transcripts (excerpts retained below); the rebased squashed commit shares the same observable behavior and will be re-attested on the next prod reload.git fetch upstream && git checkout d5fc9d9609; (2)CI=true pnpm install(no native rebuild required); (3)pnpm tsgo:core:test(type-check); (4)pnpm vitest run src/infra/system-events.test.ts src/cron/isolated-agent/delivery-dispatch.double-announce.test.ts src/auto-reply/reply/queue.collect.test.ts src/infra/heartbeat-runner.ghost-reminder.test.ts(touched-test surfaces); (5)pnpm tsx scripts/proof-pr72201.ts(end-to-end production-code-path harness against the actualenqueueSystemEvent→drainFormattedSystemEvents→stripInternalRuntimeContextpipeline, no mocks).pnpm tsgo:core:test— clean exit 0.pnpm vitest run …—Test Files: 4 passed (4); Tests: 171 passed (171); Duration ~38s(53 insystem-events.test.tscovering default audience, round-trip, equality, wrap shape, ordering, user-facing-only, internal-only, escape, end-to-end strip, exec-shape filter bypass, heartbeat-noise filter bypass; 118 acrossdelivery-dispatch.double-announce.test.ts+queue.collect.test.ts+heartbeat-runner.ghost-reminder.test.tscovering producer migration, drain semantics, heartbeat-path queueing).scripts/proof-pr72201.tsagainst HEADd5fc9d9609— drives the actual production code paths on a throwaway sessionKey, no network / matrix / cron involvement. Live terminal output (run 2026-05-26T15:49:37Z):audience: "internal"(operator-sidepeekSystemEventEntriesexposes it for debugging); (B) the prompt body the model sees is wrapped in<<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>>…<<<END_OPENCLAW_INTERNAL_CONTEXT>>>with the canonical header lines; (C)stripInternalRuntimeContextreduces the user-facing view to an empty string — nothing from the internal event leaks to the transcript / history reader / channel surface. The harness lives atscripts/proof-pr72201.ts(local artifact, not pushed in this PR — output above is the captured terminal log).~/.openclaw/agents/devin/sessions/3998fa7e-…03d.jsonlshowing the<<<BEGIN_OPENCLAW_INTERNAL_CONTEXT>>>/OpenClaw runtime context (internal):block landing in the user-message context (auto-stripped from user-visible surfaces bystripInternalRuntimeContext):~/.openclaw/cron/runs/4309f6e7-f09d-4011-93ff-21726e3aa76d.jsonl:{"action":"finished","status":"ok","summary":"Top hit: openclaw-docs/automation/cron-jobs … [mercury | claude-sonnet-4-6]","sessionKey":"agent:mercury:cron:4309f6e7-…","durationMs":27268,"model":"gpt-5-mini","provider":"openai"}queueCronAwarenessSystemEventcalls produce the events this PR migrates toaudience: "internal". Cron + main session have been continuously stable on the pre-rebase Mercury build; reload onto the squashed-rebase build is pending and will be re-attested.codex review --base upstream/mainagainst the rebased HEAD reports "no discrete, actionable regressions in the diff against the merge base. The new internal-audience routing is carried through enqueue, drain, heartbeat filtering, and collect-mode transcript handling with matching tests for the touched behavior."tsgo:core:testis clean; all 171 touched-surface vitest tests pass; harness PASS end-to-end againstd5fc9d9609.System: cron run …bubble — the negative-evidence form is hard to capture cleanly because the bubble simply stops appearing. The unit test insrc/cron/isolated-agent/delivery-dispatch.double-announce.test.tsasserts theaudience: "internal"is set at the producer site as the structural regression.mantis: telegram-visible-proofcapture — Mercury's prod channel is Matrix, not Telegram. Happy to capture Telegram visible proof if a maintainer can point at a usable test account, but cannot self-generate it without channel access.queueCronAwarenessSystemEvent) was intentionally not migrated in this PR; that scope is deferred to follow-up case-by-case work per system-event-shape consumer-path leakage; propose per-event audience classification #69492.mercury/local-patchesdeploy, after which the cron-awareness path will be re-exercised on the new code. The pre-rebase prod evidence above stands in for this gap; the new code path is byte-equivalent in observable behavior (only the internal type-level coupling to the removed sender-owner machinery was dropped).Test plan
pnpm tsgo:core:testpasses (clean exit; required for test-touching PRs perfeedback_openclaw_check_gaps.md-style validation gap closure).pnpm vitest run src/infra/system-events.test.tspasses (53/53).pnpm vitest run src/cron/isolated-agent/delivery-dispatch.double-announce.test.ts src/auto-reply/reply/queue.collect.test.ts src/infra/heartbeat-runner.ghost-reminder.test.tspasses (118/118).pnpm tsx scripts/proof-pr72201.tsPASS end-to-end against the rebased HEADd5fc9d9609(see proof section).codex review --base upstream/mainagainst the rebased HEAD reports no actionable regressions.Pre-PR peer review
This proposal was peer-reviewed by
codex exec -m gpt-5.5 -c reasoning_effort=highbefore implementation. The peer review caught a contract-confusion issue between two earlier drafts (selective-drain vs wrap-on-drain) and pushed for the wrap-on-drain design landed here. Round-2 verdict: approve with PR-body wording tweak (avoid GitHub auto-close keywords for a partial fix; "Addresses part of #69492" is the framing applied here).Degree of testing
Lightly tested. Unit tests cover the primitive default, round-trip, equality, the wrap shape, ordering, edge cases, end-to-end strip via
stripInternalRuntimeContext, and adversarial delimiter-token escaping. The single migrated producer has a regression assertion in the existing test. No e2e or live-channel coverage added.🤖 Generated with Claude Code