fix(infra): retry session-event runs on failure and pass sessionKey for exec wakes#31388
fix(infra): retry session-event runs on failure and pass sessionKey for exec wakes#31388Sid-Qin wants to merge 1 commit into
Conversation
…or exec wakes requestHeartbeatNow calls from exec completion and node exec events were missing the sessionKey, causing the heartbeat runner to potentially target the wrong session. Additionally, handler results with status "failed" were silently dropped instead of being retried like thrown errors. Closes openclaw#31354 Made-with: Cursor
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7a3fc5b40a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| (res.status === "skipped" && res.reason === "requests-in-flight") || | ||
| res.status === "failed" |
There was a problem hiding this comment.
Limit retries for non-transient failed wake results
Retrying on res.status === "failed" causes the same wake target to be requeued indefinitely at a 1s cadence, and this now applies to every handled heartbeat failure, not just contention. In runHeartbeatOnce, many persistent errors are surfaced as { status: "failed" } (for example auth/config/provider errors in src/infra/heartbeat-runner.ts catch handling), so a single exec-triggered wake can spin forever and continuously emit retries/log noise instead of failing once. Consider restricting retries to transient failure reasons or adding a bounded retry policy.
Useful? React with 👍 / 👎.
Greptile SummaryThis PR adds retry logic for failed heartbeat handler results, treating Critical issue: The PR description claims to pass
Confidence Score: 1/5
Last reviewed commit: 7a3fc5b |
| if ( | ||
| (res.status === "skipped" && res.reason === "requests-in-flight") || | ||
| res.status === "failed" | ||
| ) { |
There was a problem hiding this comment.
PR description claims sessionKey is now passed to requestHeartbeatNow in bash-tools.exec-runtime.ts (lines 241, 267) and server-node-events.ts (line 577), but those files weren't changed in this PR. The sessionKey parameter is available in those contexts but isn't being passed, meaning exec completion events may still target the wrong session.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/infra/heartbeat-wake.ts
Line: 156-159
Comment:
PR description claims `sessionKey` is now passed to `requestHeartbeatNow` in `bash-tools.exec-runtime.ts` (lines 241, 267) and `server-node-events.ts` (line 577), but those files weren't changed in this PR. The `sessionKey` parameter is available in those contexts but isn't being passed, meaning exec completion events may still target the wrong session.
How can I resolve this? If you propose a fix, please make it concise.
Summary
requestHeartbeatNowcalls from exec completion and node exec events were missing thesessionKeyparameter, causing the heartbeat runner to potentially target the wrong session. Additionally, when a heartbeat handler returned{ status: "failed" }, the wake was silently dropped instead of being retried.sessionKeytorequestHeartbeatNowinbash-tools.exec-runtime.ts(bothnotifyExecExitandemitExecSystemEvent) andserver-node-events.ts(exec event handler){ status: "failed" }handler results the same asrequests-in-flight— requeue the wake and schedule a retry afterDEFAULT_RETRY_MS(1s)sessionKeycorrectly)Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
User-visible / Behavior Changes
Security Impact (required)
NoNoNoNoNoRepro + Verification
Environment
Steps
Expected
Actual
requestHeartbeatNowwithoutsessionKeycould target wrong session;failedstatus was silently droppedsessionKeyis always forwarded;failedresults trigger retry with 1s backoffEvidence
Human Verification (required)
Compatibility / Migration
YesNoNoFailure Recovery (if this breaks)
src/infra/heartbeat-wake.ts,src/agents/bash-tools.exec-runtime.ts,src/gateway/server-node-events.tsRisks and Mitigations