
fix(agents): drop stale exec approval followups after session rebind#85679

Merged
shakkernerd merged 1 commit into
openclaw:mainfrom
openperf:fix/59349-exec-approval-followup-session-rebind
Jun 8, 2026

Conversation

@openperf

@openperf openperf commented May 23, 2026

Member

Summary

  • Problem: Issue #59349 ("[Bug] Exec approval follow-up can leak into a new session after /new because it rebinds by sessionKey instead of original sessionId") reports that if a session has a pending exec approval and the user runs /new or /reset before resolving it, the eventual approval follow-up can be delivered into the new session. It surfaces as stale approval messages, Exec denied, or continuation text appearing in an unrelated fresh conversation. The bug is observed across elevated and non-elevated approvals on every channel that routes through /v1/chat/completions or the agent-command pipeline.
  • Root Cause: The follow-up is dispatched by logical sessionKey only: the agent follow-up request carries the key, not the originating session instance. The gateway resolves that key to whatever sessionId it currently maps to at follow-up time. /new and /reset rotate the sessionId under the same key, so a follow-up created while the original session was active resolves to the new session and lands there. The session instance active when the approval was requested was never captured or compared, so nothing could detect the rebind. This is a CWE-200 information leak: old approval output reaches a fresh user-facing conversation.
  • Fix: Capture the session UUID active when the approval is requested (the run's sessionId, regenerated on /new and /reset) and thread it from the exec tool through the exec hosts onto the follow-up target. Drop the follow-up once the session key has been rebound to a different session id, on both delivery paths:
    • Agent-run follow-ups (resume path): forward the expected id on the follow-up agent request as a new optional field execApprovalFollowupExpectedSessionId, and drop it at the gateway as an early preflight — before the handler touches the rebound session at all (session-store write, chat/agent run + active-run registration, dedupe, accepted ack), not just before model dispatch. So a stale follow-up leaves the rebound session completely untouched. Because the id rides on the request (not only on the elevated runtime handoff), it covers both elevated and non-elevated approvals.
    • Denied / direct fallback follow-ups (the Exec denied and direct-send path, which never reaches the gateway): resolve the key's current session id from the session store and drop the follow-up before the channel send. The session.store template is threaded through so custom store locations resolve correctly.
  • What changed:
    • src/agents/bash-tools.exec-types.ts: add sessionId and the session.store template to ExecToolDefaults.
    • src/agents/agent-tools.ts: forward the run's session id and store template into the exec tool defaults.
    • src/agents/bash-tools.exec.ts, bash-tools.exec-host-gateway.ts, bash-tools.exec-host-node.ts, bash-tools.exec-host-node.types.ts: thread the session id and store template to buildExecApprovalFollowupTarget.
    • src/agents/bash-tools.exec-host-shared.ts: carry expectedSessionId and the store template on the follow-up target and forward them to the dispatch.
    • src/agents/bash-tools.exec-approval-followup.ts: include execApprovalFollowupExpectedSessionId on the follow-up agent request, and drop denied/direct fallback sends when the key resolves to a different session id.
    • src/agents/bash-tools.exec-approval-followup-state.ts: add the isExecApprovalFollowupSessionRebound decision helper.
    • packages/gateway-protocol/src/schema/agent.ts: add the optional request field (additive only).
    • src/gateway/server-methods/agent.ts: drop the follow-up as an early preflight when the session key was rebound — before any session-store write, run/context/active-run registration, dedupe, or accepted ack.
    • apps/shared/OpenClawKit/Sources/OpenClawProtocol/GatewayModels.swift: protocol model mirror for the new optional field.
    • src/agents/bash-tools.exec-host-shared.test.ts, bash-tools.exec-approval-followup.test.ts: regression coverage for both delivery paths (rebound / unchanged / missing-id).
    • src/gateway/server-methods/agent.test.ts: focused gateway integration test asserting that a stale exec-approval followup is dropped at preflight without any side effect on the rebound session — response is {status: "ok", summary: "exec approval followup dropped: ..."}, dedupe entry is seeded with that payload, and neither mocks.updateSessionStore nor mocks.agentCommand is invoked.
  • What did NOT change (scope boundary):
    • Runtime handoff registration, deterministic follow-up idempotency, and exec execution — untouched. The drop only fires when an expected session id is present and differs from the resolved session id, so non-rebound and id-less follow-ups deliver exactly as before.
    • Steady-state established sessions are unaffected (the expected id matches the resolved id, so nothing is dropped).
    • Gateway protocol change is additive only (one optional request field on agentRequest); no key is retired, no version bump required, existing clients unaffected.
    • No config keys, doctor migrations, plugin SDK, streaming, or delivery behavior changed.
    • No package.json / pnpm-lock.yaml / npm-shrinkwrap.json / pnpm-workspace.yaml edits; Dependency Guard not applicable.
    • No CHANGELOG.md edit (release-owned).
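
The drop rule described above (fire only when an expected session id is present and differs from the resolved one) can be sketched as a small predicate. The helper name isExecApprovalFollowupSessionRebound comes from this PR, but the input shape, the field names, and the handling of an unresolvable key below are assumptions made for illustration, not the merged implementation.

```typescript
// Sketch only: the helper name comes from the PR; the input shape is assumed.
interface FollowupSessionCheck {
  /** Session id captured when the approval was requested (absent for older callers). */
  expectedSessionId?: string;
  /** Session id the session key resolves to when the follow-up fires. */
  resolvedSessionId?: string;
}

function isExecApprovalFollowupSessionRebound(check: FollowupSessionCheck): boolean {
  const expected = check.expectedSessionId?.trim();
  const resolved = check.resolvedSessionId?.trim();
  // Missing-id branch: id-less follow-ups deliver exactly as before.
  if (!expected) return false;
  // Assumption: an unresolvable key is not treated as a rebind in this sketch.
  if (!resolved) return false;
  // Rebound branch: the key now points at a different session instance.
  return expected !== resolved;
}
```

The three branches mirror the rebound / unchanged / missing-id cases that the regression tests are said to cover.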

Reproduction

  1. Start a session for an agent with a stable session key.
  2. Trigger a tool execution that requires approval and leave it pending.
  3. Before approving, send /new or /reset, creating a new sessionId under the same key.
  4. Approve the old request.
  • Before: the new session receives the old approval follow-up (a follow-up run enters the agent pipeline for the rebound session) — stale approval output, Exec denied, or continuation text surfaces in an unrelated fresh conversation.
  • After: the gateway detects the rebind and drops the stale follow-up before any session-store write, run registration, or model dispatch; the new session is not polluted.
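
The four reproduction steps can be modelled with an in-memory map standing in for the session store. This is a simplified illustration under stated assumptions: the Map, the PendingFollowup type, and both resolver functions are hypothetical names, and the real store is on-disk JSON reached through the gateway.

```typescript
// Hypothetical in-memory stand-in for the session store: sessionKey -> current sessionId.
const sessionStore = new Map<string, string>();

interface PendingFollowup {
  sessionKey: string;
  expectedSessionId?: string; // pinned at approval-request time (post-fix only)
}

// Pre-fix behavior: resolve by key only, so the follow-up lands wherever the key points now.
function resolveTargetByKeyOnly(f: PendingFollowup): string | undefined {
  return sessionStore.get(f.sessionKey);
}

// Post-fix behavior: return undefined (drop) when the key was rebound to another session.
function resolveTargetWithGuard(f: PendingFollowup): string | undefined {
  const current = sessionStore.get(f.sessionKey);
  if (f.expectedSessionId && current && f.expectedSessionId !== current) {
    return undefined;
  }
  return current;
}

function runReproduction(): { before: string | undefined; after: string | undefined } {
  const key = "agent:main:main";
  sessionStore.set(key, "session-old"); // step 1: session with a stable key
  const pending: PendingFollowup = {
    sessionKey: key,
    expectedSessionId: sessionStore.get(key), // step 2: approval requested and left pending
  };
  sessionStore.set(key, "session-new"); // step 3: /new rotates the id under the same key
  return {
    before: resolveTargetByKeyOnly(pending), // step 4 on the pre-fix path
    after: resolveTargetWithGuard(pending), // step 4 on the post-fix path
  };
}
```

In this model the key-only resolver targets the rotated session, while the guarded resolver drops the follow-up, matching the Before and After bullets.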

Real behavior proof

  • Behavior addressed: an exec approval follow-up whose session key was rebound by /new or /reset while the approval was pending must be dropped at the gateway (resume path) or in the agents layer (denied/direct path) instead of being delivered into the new session.
  • Real environment tested: a real OpenClaw gateway started in-process via startGatewayServer on loopback (bundled plugins disabled for fast startup; channels/providers/cron skipped), driven over a real operator WebSocket connection. Linux x64, Node 22. Real performGatewaySessionReset + real exec.approval.resolve operator RPC + real on-disk session-store JSON.
  • Exact steps or command run after this patch: seed session key agent:main:main to session id 11111111-...-111111111111; start the gateway and connect an operator; build the gateway exec tool for that session (agent follow-up mode, ask: always) and run a command requiring approval, leaving it pending; call the real reset path performGatewaySessionReset({ key, reason: "reset" }) to rotate the key to a new session id; resolve the pending approval via the exec.approval.resolve operator RPC; observe the gateway log + re-read the session store. Unit-level regression: node scripts/run-vitest.mjs src/agents/bash-tools.exec-approval-followup.test.ts src/agents/bash-tools.exec-host-shared.test.ts.
  • Evidence after fix (verbatim live console output, color codes stripped; [REPRO] = harness script markers, [gateway] / [agents/exec-approval-followup] = OpenClaw runtime logs):
=== SCENARIO 1 — resume path (agent-run follow-up dropped at gateway preflight) ===
[REPRO] exec pending status: approval-pending
[REPRO] approvalId: afb796ab-f635-45f7-b7f1-37224a429aff
[REPRO] after reset: agent:main:main -> e3870fca-ac72-4efc-943b-e4c24c64bce2 reset.ok= true
[REPRO] resolving approval (allow-once) ...
[gateway] Dropping stale exec approval followup afb796ab-f635-45f7-b7f1-37224a429aff: session agent:main:main rebound (expected 11111111-1111-4111-8111-111111111111, resolved e3870fca-ac72-4efc-943b-e4c24c64bce2) before the approval resolved

=== SCENARIO 2 — rebound session untouched (no session-store write happens for the dropped follow-up) ===
[REPRO] after reset: agent:main:main -> 471d713c-b3c8-4910-b7c7-e60f0ecdc024 updatedAt= 1779553025393
[gateway] Dropping stale exec approval followup adfbf4a7-...: session agent:main:main rebound (expected 33333333-..., current 471d713c-...) before the approval resolved
[REPRO] after followup: agent:main:main -> 471d713c-b3c8-4910-b7c7-e60f0ecdc024 updatedAt= 1779553025393
[REPRO] REBOUND SESSION UNTOUCHED (updatedAt unchanged, stale followup dropped before any store write)

=== SCENARIO 3 — denied/direct path (no gateway send; agents-layer drop) ===
[REPRO] exec pending status: approval-pending
[REPRO] approvalId: 47c60a91-b795-458e-8263-907e5c5aa6c2
[REPRO] after reset: agent:main:main -> 0a2007ac-162c-40f1-be21-938cf4ee1126 reset.ok= true
[REPRO] DENYING approval after reset ...
[agents/exec-approval-followup] Dropping stale denied exec approval followup 47c60a91-b795-458e-8263-907e5c5aa6c2: session agent:main:main was rebound before the approval resolved

=== BASELINE (same scenario on the pre-fix base commit) — no drop happens; follow-up enters the run pipeline ===
[REPRO] approvalId: 0fb830d1-adc1-4a4e-9880-fe2563fb13a5
[REPRO] after reset: agent:main:main -> d762ec2c-fd60-49d6-b932-9bc41e6fef27 reset.ok= true
exec approval followup dispatch failed (id=0fb830d1-adc1-4a4e-9880-fe2563fb13a5): Session followup failed: gateway closed (1012): service restart
[ws] res agent errorCode=UNAVAILABLE errorMessage=FailoverError: No API key found for provider "openai" ... runId=exec-approval-followup:0fb830d1-adc1-4a4e-9880-fe2563fb13a5
  • Observed result after fix: the gateway recognized the session rebind (expected original id vs resolved new id), logged the drop at the preflight, and did not dispatch a follow-up run into the new session. The session-store entry for the rebound key is byte-identical before and after the dropped follow-up (updatedAt unchanged), proving no run/store/dedupe/ack side effects on the rebound session. The denied/direct path equivalent is dropped one layer earlier in the agents code so no channel send occurs. The pre-fix baseline shows the same scenario unconditionally dispatching a follow-up agent run (runId=exec-approval-followup:0fb830d1-...) into the rebound session.
  • Code-path coverage: node scripts/run-vitest.mjs src/agents/bash-tools.exec-approval-followup.test.ts src/agents/bash-tools.exec-host-shared.test.ts (86 passed across 4 files) covers rebound / unchanged / missing-id branches on the predicate, the followup-target thread-through, and the denied/direct fallback path. node scripts/run-vitest.mjs src/gateway/server-methods/agent.test.ts -t 'drops a stale exec approval followup at preflight' (2 passed) locks the gateway preflight invariant — drop response shape, dedupe seed, and the absence of updateSessionStore / agentCommand calls — in regression tests. Bite check confirmed: temporarily disabling the preflight makes the new gateway test fail with status: "accepted" (followup dispatched into the rebound session), which is exactly the pre-fix behavior the predicate is supposed to prevent. pnpm exec oxfmt --check and pnpm exec oxlint on the touched TypeScript files are clean.
  • What was not tested: bundled plugins were disabled and no model API key was configured in the in-process gateway test environment; the dropped-follow-up path returns before any model invocation, so model-auth-side concerns are out of scope. A multi-hour production deployment and a third-party channel end-to-end were not exercised. Cross-PR merge order with #85404 (fix(agents): serialize new-session resolution per session key), #86089 ([Fix] Deliver restart recovery replies), and #86764 (fix(agents): persist user turn before attempt failures) was not simulated end-to-end, but the branch rebases cleanly onto current origin/main with the single import-merge conflict resolved.
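
The preflight invariant exercised above (drop response shape, dedupe seed, no store write or run registration) can be illustrated with a toy handler. This is a sketch, not the gateway code: createGatewaySketch, the idempotencyKey field name, and the recorded side-effect labels are assumptions, and the summary text after the colon is left as a placeholder because the PR body elides it.

```typescript
interface AgentRequestSketch {
  sessionKey: string;
  idempotencyKey: string; // assumed field name for the dedupe key
  execApprovalFollowupExpectedSessionId?: string; // optional field added by this PR
}

interface AgentResponseSketch {
  status: "ok" | "accepted";
  summary?: string;
}

function createGatewaySketch(sessions: Map<string, string>) {
  const sideEffects: string[] = []; // stands in for store writes and run registration
  const dedupe = new Map<string, AgentResponseSketch>();

  function handleAgent(req: AgentRequestSketch): AgentResponseSketch {
    const expected = req.execApprovalFollowupExpectedSessionId;
    const resolved = sessions.get(req.sessionKey);

    // Early preflight: runs before anything touches the rebound session.
    if (expected !== undefined && resolved !== undefined && expected !== resolved) {
      const dropped: AgentResponseSketch = {
        status: "ok",
        summary: "exec approval followup dropped: <reason>", // wording after the colon is a placeholder
      };
      dedupe.set(req.idempotencyKey, dropped); // dedupe is seeded with the drop payload
      return dropped;
    }

    // Normal path: side effects happen only for non-stale requests.
    sideEffects.push("session-store write", "run registration");
    const accepted: AgentResponseSketch = { status: "accepted" };
    dedupe.set(req.idempotencyKey, accepted);
    return accepted;
  }

  return { handleAgent, sideEffects, dedupe };
}
```

Placing the check first means a retried stale request would see the same drop payload from the dedupe map, while requests without the optional field fall through to the existing path.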

Risk / Mitigation

  • Risk: dropping a follow-up could suppress a legitimate one if the expected id is wrong. Mitigation: the drop only fires when an expected session id is present and differs from the resolved session id (unit tests cover rebound / unchanged / missing-id branches). Id-less follow-ups deliver unchanged, so backwards compatibility for callers that do not opt in is preserved.
  • Risk: the new protocol request field could be misread by older clients. Mitigation: the field is optional and additive on agentRequest; no key is retired, no version bump required. Existing clients that do not send the field hit the existing delivery path. The Swift OpenClawProtocol mirror is updated in lockstep so iOS clients deserialize cleanly.
  • Risk: the early preflight changes session-state timing for rebound keys (drop happens before any handler write, run registration, or dedupe). Mitigation: that is the documented contract this PR introduces, and it is strictly stronger than the pre-fix behavior — every effect that used to land on the rebound session is now suppressed at one explicit point, with a verbatim log line. Scenario 2 above shows the rebound session's updatedAt is unchanged across the drop.
  • Risk: cross-PR conflict with sibling PRs #85404 (fix(agents): serialize new-session resolution per session key), #86089 ([Fix] Deliver restart recovery replies), and #86764 (fix(agents): persist user turn before attempt failures) in adjacent regions of src/agents/agent-command.ts and bash-tools.*. Mitigation: this branch rebases cleanly onto current origin/main after resolving one import-merge in bash-tools.exec-approval-followup.ts (combine the new isExecApprovalFollowupSessionRebound import with main's renamed embedded-agent-helpers/sanitize-user-facing-text path). Merge order with the sibling PRs is independent.

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • API / contracts

Linked Issue/PR

Fixes #59349

@openclaw-barnacle openclaw-barnacle Bot added app: web-ui App: web-ui gateway Gateway runtime agents Agent runtime and tooling size: S proof: supplied External PR includes structured after-fix real behavior proof. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

Codex review: needs maintainer review before merge. Reviewed June 8, 2026, 12:03 PM ET / 16:03 UTC.

Summary
The PR adds an optional exec approval expected-session-id request field and threads sessionId/session.store through exec approval follow-up dispatch so rebound-session follow-ups are dropped.

PR surface: Source +172, Tests +231, Other +4. Total +407 across 15 files.

Reproducibility: yes. Source inspection shows current main rotates sessionId on reset while exec approval follow-ups carry only sessionKey/idempotency, and the PR body supplies live before/after gateway logs for that path.

Review metrics: 1 noteworthy metric.

  • Protocol request surface: 1 optional field added. The new field is the agents-to-gateway contract for the guard, so maintainers should confirm protocol mirrors stay aligned.

Merge readiness
Overall: 🦞 diamond lobster
Proof: 🦞 diamond lobster
Patch quality: 🦞 diamond lobster
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] The patch intentionally suppresses approval follow-up delivery after a session key rebind; maintainer review has accepted that as the safer isolation behavior, but the merge should preserve that decision plainly.
  • [P1] Nearby open agent/gateway session work touches overlapping functions, so the final exact-head merge should be refreshed against current base and sibling session/message-delivery PRs.

Maintainer options:

  1. Accept Session-Isolation Semantics (recommended)
    Proceed once exact-head checks stay green because maintainer review explicitly accepted dropping stale follow-ups after a session rebind.
  2. Refresh Around Sibling Session PRs
    If nearby agent/gateway session PRs land first, rebase and rerun the focused exec follow-up and gateway tests before merging.

Next step before merge

  • [P2] No repair lane is needed because this active member-authored PR has no discrete actionable finding and should stay in normal maintainer landing review.

Security
Cleared: No concrete security or supply-chain concern found; the diff adds no dependencies or CI changes and narrows a stale session/message exposure path.

Review details

Best possible solution:

Land the focused session-id guard after exact-head checks and merge coordination, keeping the optional protocol field and direct/denied fallback checks covered by regression tests.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main rotates sessionId on reset while exec approval follow-ups carry only sessionKey/idempotency, and the PR body supplies live before/after gateway logs for that path.

Is this the best way to solve the issue?

Yes. Pinning the approval-time sessionId and checking it at the gateway preflight plus direct fallback is the narrow owner-boundary fix; changing session-key semantics globally would be broader and riskier.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 57633c42b647.

Label changes

Label changes:

  • add rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • remove rating: 🐚 platinum hermit: Current PR rating is rating: 🦞 diamond lobster, so this older rating label is no longer current.

Label justifications:

  • P1: The PR fixes stale exec approval output crossing a session reset boundary, which can affect real agent/channel workflows.
  • merge-risk: 🚨 session-state: Merging changes whether a follow-up can touch a session after the session key has been rebound.
  • merge-risk: 🚨 message-delivery: Merging intentionally drops stale approval follow-ups instead of delivering them into the currently resolved route.
  • rating: 🦞 diamond lobster: Overall readiness is 🦞 diamond lobster; proof is 🦞 diamond lobster and patch quality is 🦞 diamond lobster.
  • status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (logs): The PR body includes live in-process gateway/WebSocket proof with runtime logs, session-store observation, and a pre-fix baseline showing the stale follow-up entering the run pipeline.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body includes live in-process gateway/WebSocket proof with runtime logs, session-store observation, and a pre-fix baseline showing the stale follow-up entering the run pipeline.
Evidence reviewed

PR surface:

Source +172, Tests +231, Other +4. Total +407 across 15 files.

| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 11 | 173 | 1 | +172 |
| Tests | 3 | 231 | 0 | +231 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 1 | 4 | 0 | +4 |
| Total | 15 | 408 | 1 | +407 |

What I checked:

Likely related people:

  • Peter Steinberger: Current-main blame for the central exec follow-up and gateway agent handler regions points to the recent session metadata refactor commit; shallow history limits older provenance. (role: recent area contributor; confidence: medium; commits: 538d36eaaaa6; files: src/agents/bash-tools.exec-host-shared.ts, src/gateway/server-methods/agent.ts)
  • shakkernerd: The PR timeline shows the final head was force-pushed by this user and their maintainer review explicitly accepted the session/message isolation behavior. (role: reviewer and branch refresher; confidence: medium; commits: f021633bf316; files: src/agents/bash-tools.exec-approval-followup.ts, src/gateway/server-methods/agent.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

ClawSweeper PR egg

✨ Hatched: 🥚 common Brave Test Hopper

Hatch command

Comment @clawsweeper hatch when this PR is hatchable.

Hatchability rules:

  • Merged PRs are hatchable.
  • Open PRs are hatchable when they are status: 👀 ready for maintainer look, status: 🚀 automerge armed, or labeled clawsweeper:automerge.
  • Closed unmerged PRs are hatchable only when one of those hatchable labels is still present in the durable record.

Rarity: 🥚 common.
Trait: purrs at green checks.
Image traits: location review cove; accessory shell-shaped keyboard; palette rose quartz and slate; mood focused; pose holding its accessory up for inspection; shell frosted glass shell; lighting tiny status-light glow; background miniature CI buoys.
Copy: My PR egg hatched a 🥚 common Brave Test Hopper in ClawSweeper.

What is this egg doing here?
  • Eggs appear after the PR passes real-behavior proof. It is here for vibes, not verdicts: it does not change labels, ratings, merge decisions, or automation.
  • The shell reacts to review momentum: open follow-up work warms it up, re-review makes it wobble, and a clean final review lets it hatch.
  • Hatchability usually comes from sufficient real-behavior proof, no blocking P0/P1/P2 findings, no security attention needed, and clean correctness. A merged PR is already final, so merge makes the egg hatchable independently.
  • The hatch is seeded from this repository and PR number, so the same PR keeps the same creature; the reviewed head SHA can only change safe visual details.
  • Rarity is just collectible sparkle: 🥚 common, 🌱 uncommon, 💎 rare, ✨ glimmer, and 🌈 legendary.

@openperf openperf force-pushed the fix/59349-exec-approval-followup-session-rebind branch from ecd3182 to 10d3060 Compare May 23, 2026 10:01
@openclaw-barnacle openclaw-barnacle Bot added size: M and removed size: S proof: supplied External PR includes structured after-fix real behavior proof. proof: sufficient ClawSweeper judged the real behavior proof convincing. labels May 23, 2026
@clawsweeper

clawsweeper Bot commented May 23, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@openclaw-barnacle openclaw-barnacle Bot added the proof: supplied External PR includes structured after-fix real behavior proof. label May 23, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. status: 🛠️ actively grinding The PR author has acted after the latest ClawSweeper review and work remains. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages. merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state. and removed status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 23, 2026
@openperf openperf force-pushed the fix/59349-exec-approval-followup-session-rebind branch from 10d3060 to 03311e4 Compare May 23, 2026 11:04
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 23, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: 🛠️ actively grinding The PR author has acted after the latest ClawSweeper review and work remains. labels May 23, 2026
@clawsweeper clawsweeper Bot added proof: sufficient ClawSweeper judged the real behavior proof convincing. rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR. and removed rating: 🦐 gold shrimp Decent PR readiness signal, but merge confidence is limited. status: ⏳ waiting on author ClawSweeper has contributor-facing work open and is waiting for author action. labels May 23, 2026
@openperf openperf force-pushed the fix/59349-exec-approval-followup-session-rebind branch from 5b44e01 to a082367 Compare May 24, 2026 00:43
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 24, 2026
@steipete steipete self-assigned this May 24, 2026
@BingqingLyu

This comment was marked as spam.

@openperf openperf force-pushed the fix/59349-exec-approval-followup-session-rebind branch from a082367 to b77003e Compare May 29, 2026 13:51
@clawsweeper clawsweeper Bot added rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels May 29, 2026
@openperf openperf force-pushed the fix/59349-exec-approval-followup-session-rebind branch 2 times, most recently from 460860b to 8ac9521 Compare May 29, 2026 15:37
@clawsweeper clawsweeper Bot added rating: 🦀 challenger crab Exceptional PR readiness: strong proof, clean patch, and convincing validation. and removed rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. labels May 29, 2026
Exec approval followups were dispatched by sessionKey only. When /new or
/reset rotates the sessionId under that key while an approval is pending,
the resolved followup landed in the new session, surfacing stale approval
output (or 'Exec denied' / continuation text) in a fresh conversation.

Capture the session UUID active when the approval is requested and drop the
followup once the key has been rebound to a different sessionId:
- agent-run followups: carry the expected id on the agent request and drop it
  at the gateway as an early preflight, before the handler touches the rebound
  session (session-store write, chat/agent run + active-run registration,
  dedupe, accepted ack) — not just before model dispatch. Covers elevated and
  non-elevated.
- denied / direct fallback followups: resolve the key's current sessionId from
  the session store and drop before the channel send.

Fixes openclaw#59349.
@shakkernerd shakkernerd force-pushed the fix/59349-exec-approval-followup-session-rebind branch from 8ac9521 to f021633 Compare June 8, 2026 15:39
@clawsweeper clawsweeper Bot added rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. and removed rating: 🦀 challenger crab Exceptional PR readiness: strong proof, clean patch, and convincing validation. labels Jun 8, 2026
@shakkernerd

Member

Maintainer review: the stale-follow-up behavior in this PR is intentional and accepted.

After /new or /reset rebinds a session key, an exec approval result from the previous session instance should not be delivered into the new session. Dropping that follow-up is the safer behavior for session/message isolation.

The additive optional gateway protocol field and Swift mirror are acceptable here. CI is green on the rebased head, and the local/Testbox verification covered the touched agent/gateway paths.

@clawsweeper re-review

@clawsweeper

clawsweeper Bot commented Jun 8, 2026

Contributor

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@clawsweeper clawsweeper Bot added rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected. and removed rating: 🐚 platinum hermit Good normal PR readiness with ordinary maintainer review expected. labels Jun 8, 2026
@shakkernerd

Member

Maintainer pre-merge verification for rebase merge:

  • Head checked: f021633bf3162a8e6d0a7564d7a2e8b6dc8ef4a3
  • GitHub merge state: MERGEABLE / CLEAN
  • ClawSweeper state: proof: sufficient, status: 👀 ready for maintainer look, rating: 🦞 diamond lobster
  • Maintainer behavior decision accepted: stale exec approval follow-ups after /new or /reset session-key rebind should be dropped rather than delivered into the new session.
  • Nearby overlap check: Feat/acp hub delegated sessions #91093 is still open and not merged, so it has not invalidated this exact head.

Verification run before push/landing:

  • git diff --check origin/main...HEAD
  • node_modules/.bin/oxfmt --check --threads=1 src/agents/bash-tools.exec-approval-followup-state.ts src/agents/bash-tools.exec-approval-followup.test.ts src/agents/bash-tools.exec-approval-followup.ts src/agents/bash-tools.exec-host-shared.test.ts src/gateway/server-methods/agent.ts
  • Testbox targeted tests: tbx_01ktkxkmcmqj0nfxfapmwtmd3y, corepack pnpm test src/agents/bash-tools.exec-approval-followup.test.ts src/agents/bash-tools.exec-host-shared.test.ts src/gateway/server-methods/agent.test.ts -- --reporter=verbose
  • Testbox changed gate: tbx_01ktkxq67ccqyn8qf1tb6qmx2d, env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed
  • PR checks are passing on the rebased head.

Known proof gap: no extra live end-to-end channel run was performed by me after the rebase; this landing relies on the PR's supplied real-behavior proof, the focused agent/gateway Testbox tests, the changed gate, and the green PR CI.

Proceeding with rebase merge.

@shakkernerd shakkernerd merged commit 2ffbea2 into openclaw:main Jun 8, 2026
178 of 182 checks passed
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request Jun 12, 2026
…26.6.6) (#1040)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.5` → `2026.6.6` |

---

### Release Notes

<details>
<summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary>

### [`v2026.6.6`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202666)

[Compare Source](openclaw/openclaw@v2026.6.5...v2026.6.6)

##### Highlights

- Security boundaries are substantially tighter across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP access, native search policy, elevated sender checks, deleted-agent ACP bypasses, loopback tools, Discord moderation, and Teams group actions; exec approvals now fail closed on timeout. ([#91529](openclaw/openclaw#91529), [#91618](openclaw/openclaw#91618), [#91615](openclaw/openclaw#91615), [#91619](openclaw/openclaw#91619), [#91741](openclaw/openclaw#91741), [#91745](openclaw/openclaw#91745), [#91746](openclaw/openclaw#91746), [#91748](openclaw/openclaw#91748), [#91749](openclaw/openclaw#91749), [#91750](openclaw/openclaw#91750), [#91751](openclaw/openclaw#91751), [#91752](openclaw/openclaw#91752), [#91763](openclaw/openclaw#91763), [#89938](openclaw/openclaw#89938)) Thanks [@joshavant](https://github.com/joshavant), [@pgondhi987](https://github.com/pgondhi987), [@mmaps](https://github.com/mmaps), [@eleqtrizit](https://github.com/eleqtrizit), [@shakkernerd](https://github.com/shakkernerd), and [@drobison00](https://github.com/drobison00).
- Telegram delivery is safer and more coherent: account-scoped topics route to the right agent, streamed text survives tool calls, `/compact` works on generic ingress, callback handling uses concrete APIs, draft chunking is shared, durable dispatch dedupe moved into the SDK, and unauthorized DM text stays out of cache and prompt context. ([#91189](openclaw/openclaw#91189), [#88682](openclaw/openclaw#88682), [#89588](openclaw/openclaw#89588), [#90212](openclaw/openclaw#90212), [#91876](openclaw/openclaw#91876), [#91874](openclaw/openclaw#91874), [#91904](openclaw/openclaw#91904), [#91478](openclaw/openclaw#91478), [#91915](openclaw/openclaw#91915)) Thanks [@codysai001](https://github.com/codysai001), [@alexzhu0](https://github.com/alexzhu0), [@joelnishanth](https://github.com/joelnishanth), [@snowzlm](https://github.com/snowzlm), [@obviyus](https://github.com/obviyus), and [@sallyom](https://github.com/sallyom).
- iMessage recovery and delivery now cover always-on inbound restart, durable echo markers, block streaming, idle approval discovery, hardened outbound transport, and actionable inbound startup diagnostics. ([#91335](openclaw/openclaw#91335), [#91449](openclaw/openclaw#91449), [#88969](openclaw/openclaw#88969), [#88530](openclaw/openclaw#88530), [#91783](openclaw/openclaw#91783), [#91785](openclaw/openclaw#91785)) Thanks [@omarshahine](https://github.com/omarshahine), [@jmissig](https://github.com/jmissig), and [@colmbrogan](https://github.com/colmbrogan).
- Browser and MCP connectivity gained existing-session CDP support, discovered WebSocket validation, default-profile `cdpUrl` handling, safer browser-output boundaries, Streamable HTTP loopback transport, corrected OAuth/SSE authorization handling, and broader schema compatibility. ([#91422](openclaw/openclaw#91422), [#89851](openclaw/openclaw#89851), [#91736](openclaw/openclaw#91736), [#91747](openclaw/openclaw#91747), [#91451](openclaw/openclaw#91451), [#80143](openclaw/openclaw#80143)) Thanks [@pgondhi987](https://github.com/pgondhi987), [@anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia), [@lifuyue](https://github.com/lifuyue), [@eleqtrizit](https://github.com/eleqtrizit), [@LiuwqGit](https://github.com/LiuwqGit), and [@HemantSudarshan](https://github.com/HemantSudarshan).
- Control UI startup and first-reply latency are lower through cached model metadata, removal of the startup catalog wait, lazy slash-command loading, and first-event tracing with slow-reply diagnostics. ([#91531](openclaw/openclaw#91531), [#91538](openclaw/openclaw#91538), [#91568](openclaw/openclaw#91568), [#91583](openclaw/openclaw#91583), [#91598](openclaw/openclaw#91598))
- Provider support expands with OpenRouter OAuth onboarding and Claude Fable 5 adaptive thinking, while Codex sessions keep correct compaction ownership, local models skip guardian review, dynamic tool progress normalizes cleanly, and Gemma 4 reasoning replay is preserved. ([#91830](openclaw/openclaw#91830), [#91882](openclaw/openclaw#91882), [#91590](openclaw/openclaw#91590), [#88630](openclaw/openclaw#88630), [#88768](openclaw/openclaw#88768), [#91696](openclaw/openclaw#91696)) Thanks [@Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@joshavant](https://github.com/joshavant), [@bdjben](https://github.com/bdjben), and [@Coder-Wangyankun](https://github.com/Coder-Wangyankun).

##### Changes

- CLI progress: emit Claude CLI commentary progress events and bridge inter-tool commentary into channel progress without exposing internal protocol scaffolding. ([#89834](openclaw/openclaw#89834), [#90883](openclaw/openclaw#90883)) Thanks [@anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia).
- Observability: allow trusted diagnostics channels to capture tool input/output content, add first-assistant-event traces, and warn on slow initial replies. ([#91256](openclaw/openclaw#91256), [#91568](openclaw/openclaw#91568), [#91583](openclaw/openclaw#91583)) Thanks [@amknight](https://github.com/amknight).
- Plugins/ClawHub: dogfood reusable package publishing, let dry runs skip publish approval, allow declared installed trusted hooks, report managed plugin version drift, and warn instead of failing on retired Skill Workshop configuration. ([#91574](openclaw/openclaw#91574), [#91591](openclaw/openclaw#91591), [#90004](openclaw/openclaw#90004), [#90927](openclaw/openclaw#90927), [#90838](openclaw/openclaw#90838)) Thanks [@Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@brokemac79](https://github.com/brokemac79), and [@lonexreb](https://github.com/lonexreb).
- Memory/providers: move the local llama.cpp runtime into its provider plugin, batch embeddings across files, persist the agent model catalog cache, and keep QMD JSON search one-shot while filtering stale REM recall previews. ([#91324](openclaw/openclaw#91324), [#89138](openclaw/openclaw#89138), [#90457](openclaw/openclaw#90457), [#91837](openclaw/openclaw#91837), [#91851](openclaw/openclaw#91851)) Thanks [@osolmaz](https://github.com/osolmaz), [@mushuiyu886](https://github.com/mushuiyu886), [@ai-hpc](https://github.com/ai-hpc), and [@TurboTheTurtle](https://github.com/TurboTheTurtle).
- Channels/mobile: add the QQBot group mention toggle, improve iPad and iPhone control surfaces, and expose the active connection host in the TUI footer. ([#91423](openclaw/openclaw#91423), [#91557](openclaw/openclaw#91557), [#89909](openclaw/openclaw#89909)) Thanks [@cxyhhhhh](https://github.com/cxyhhhhh), [@Solvely-Colin](https://github.com/Solvely-Colin), and [@baskduf](https://github.com/baskduf).
- Performance: prewarm TUI runtime plugins, deduplicate plugin auto-enable fanout, trim dense text-delta snapshots, and reuse prepared startup model metadata. ([#90782](openclaw/openclaw#90782), [#89978](openclaw/openclaw#89978), [#91580](openclaw/openclaw#91580), [#91531](openclaw/openclaw#91531)) Thanks [@RomneyDa](https://github.com/RomneyDa) and [@ai-hpc](https://github.com/ai-hpc).

##### Fixes

- Agent/session recovery: drop stale approval follow-ups after session rebind, remove drained reply-queue items by identity, recover stale main and visible replies, preserve Codex context-engine compaction ownership, lower the default compaction timeout to 180 seconds while respecting explicit configuration, and keep provider-failure terminal lifecycle state correct. ([#85679](openclaw/openclaw#85679), [#91450](openclaw/openclaw#91450), [#91566](openclaw/openclaw#91566), [#91840](openclaw/openclaw#91840), [#91590](openclaw/openclaw#91590), [#91361](openclaw/openclaw#91361), [#91895](openclaw/openclaw#91895)) Thanks [@openperf](https://github.com/openperf), [@yetval](https://github.com/yetval), [@joshavant](https://github.com/joshavant), [@wangmiao0668000666](https://github.com/wangmiao0668000666), and [@TurboTheTurtle](https://github.com/TurboTheTurtle).
- User-visible content boundaries: suppress Codex/Harmony protocol artifacts, neutralize browser and LanceDB memory media directives, redact transcript images, and preserve native `/compact` replies through source suppression. ([#89151](openclaw/openclaw#89151), [#91422](openclaw/openclaw#91422), [#91425](openclaw/openclaw#91425), [#91529](openclaw/openclaw#91529), [#90212](openclaw/openclaw#90212)) Thanks [@joelnishanth](https://github.com/joelnishanth), [@pgondhi987](https://github.com/pgondhi987), [@joshavant](https://github.com/joshavant), and [@snowzlm](https://github.com/snowzlm).
- Channel delivery: keep WhatsApp captured replies attached to the successor controller after restart, retry Feishu rate limits, preserve Mattermost thread replies, canonicalize LINE webhook paths, restore Discord reply hydration and runtime timeout exports, and show OpenAI Realtime WebRTC assistant transcripts. ([#85823](openclaw/openclaw#85823), [#89659](openclaw/openclaw#89659), [#91684](openclaw/openclaw#91684), [#91649](openclaw/openclaw#91649), [#90263](openclaw/openclaw#90263), [#91686](openclaw/openclaw#91686), [#90426](openclaw/openclaw#90426)) Thanks [@itsuzef](https://github.com/itsuzef), [@ladygege](https://github.com/ladygege), [@jacobtomlinson](https://github.com/jacobtomlinson), [@fuller-stack-dev](https://github.com/fuller-stack-dev), and [@shushushv](https://github.com/shushushv).
- Cron: cancel active task runs cleanly, preserve terminal timeout/cancel state, and recover no-deliver tool warnings instead of silently losing the outcome. ([#90666](openclaw/openclaw#90666), [#90678](openclaw/openclaw#90678)) Thanks [@ai-hpc](https://github.com/ai-hpc).
- Gateway/config/auth: share the approval runtime socket token, replace arrays explicitly in `config.patch`, skip the deleted-agent guard only for valid ACP harness sessions, surface headless LaunchAgent state, verify SQLite auth migration before cleanup, and arm QMD startup maintenance. ([#87105](openclaw/openclaw#87105), [#91551](openclaw/openclaw#91551), [#91219](openclaw/openclaw#91219), [#91614](openclaw/openclaw#91614), [#91740](openclaw/openclaw#91740), [#91978](openclaw/openclaw#91978)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev) and [@scotthuang](https://github.com/scotthuang).
- Providers/Codex: clarify quota errors, restore the Codex synthetic usage line, canonicalize Codex protocol assets, require API-key auth for realtime voice, normalize ACP model refs, preserve Gemma 4 `reasoning_content`, and avoid guardian review for local models. ([#91390](openclaw/openclaw#91390), [#91709](openclaw/openclaw#91709), [#91507](openclaw/openclaw#91507), [#91567](openclaw/openclaw#91567), [#88630](openclaw/openclaw#88630), [#91696](openclaw/openclaw#91696)) Thanks [@hxy91819](https://github.com/hxy91819), [@brokemac79](https://github.com/brokemac79), [@RomneyDa](https://github.com/RomneyDa), [@joshavant](https://github.com/joshavant), and [@Coder-Wangyankun](https://github.com/Coder-Wangyankun).
- Updates/builds: recover package Gateway restarts after refresh failure, expose plugin convergence repair, fall back to Corepack in PATH-less pnpm environments, seed the correct Docker store packages, and keep ClawHub dry-run and publish paths reusable. ([#91581](openclaw/openclaw#91581), [#91599](openclaw/openclaw#91599), [#91547](openclaw/openclaw#91547), [#91591](openclaw/openclaw#91591)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev), [@sallyom](https://github.com/sallyom), and [@Patrick-Erichsen](https://github.com/Patrick-Erichsen).
- UI: require explicit user intent before opening chat sessions and drain restored chat queues after session switches. ([#91480](openclaw/openclaw#91480)) Thanks [@TurboTheTurtle](https://github.com/TurboTheTurtle).
- Android: avoid the `dataSync` foreground-service type for persistent nodes. ([#80082](openclaw/openclaw#80082)) Thanks [@davelutztx](https://github.com/davelutztx).
- Native hooks: bound relay lifetimes so abandoned native hook connections cannot linger indefinitely. ([#91550](openclaw/openclaw#91550)) Thanks [@joshavant](https://github.com/joshavant).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/1040

Labels

  • agents Agent runtime and tooling
  • app: web-ui App: web-ui
  • gateway Gateway runtime
  • merge-risk: 🚨 message-delivery 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • merge-risk: 🚨 session-state 🚨 May lose, corrupt, stale, or mis-associate session, agent, or context state.
  • P1 High-priority user-facing bug, regression, or broken workflow.
  • proof: sufficient ClawSweeper judged the real behavior proof convincing.
  • proof: supplied External PR includes structured after-fix real behavior proof.
  • rating: 🦞 diamond lobster Very strong PR readiness with only minor maintainer review expected.
  • size: M
  • status: 👀 ready for maintainer look ClawSweeper has no concrete contributor-facing blocker left for this PR.

Development

Successfully merging this pull request may close these issues.

[Bug] Exec approval follow-up can leak into a new session after /new because it rebinds by sessionKey instead of original sessionId

4 participants