Skip to content

fix: ensure exec notifyOnExit wakes heartbeat as exec-event#41479

Merged
vincentkoc merged 1 commit into
openclaw:mainfrom
rstar327:fix-exec-notify-heartbeat
Apr 5, 2026
Merged

fix: ensure exec notifyOnExit wakes heartbeat as exec-event#41479
vincentkoc merged 1 commit into
openclaw:mainfrom
rstar327:fix-exec-notify-heartbeat

Conversation

@rstar327

@rstar327 rstar327 commented Mar 9, 2026

Copy link
Copy Markdown
Contributor

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: When a backgrounded exec session exits, notifyOnExit requests a heartbeat with reason exec:<session-id>:exit. The heartbeat runner only bypasses the empty HEARTBEAT.md guard for reasons exec-event, wake, hook:*, cron:* — so exec:<id>:exit is skipped and the agent never gets a turn for the system event.
  • Why it matters: Background exec completion events sit in the queue until the next user message (e.g. 70+ minutes), so the agent does not react to exec completion when HEARTBEAT.md is empty.
  • What changed: maybeNotifyOnExit now requests the heartbeat with reason exec-event (same as node exec events), so the bypass applies and the agent runs on background exec exit.
  • What did NOT change (scope boundary): Only the wake reason string; no changes to HEARTBEAT.md semantics, exec-event handling, or other heartbeat reasons.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

Background exec exit now triggers an agent turn even when HEARTBEAT.md is empty (or comments-only). Previously the system event was queued but the heartbeat was skipped, so the agent did not run until the next user message.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Any (code path is platform-agnostic)
  • Runtime/container: Node
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): tools.exec.notifyOnExit: true (default), empty or comments-only HEARTBEAT.md

Steps

  1. Set HEARTBEAT.md to empty or comments only.
  2. Start a background exec (e.g. long-running command with background: true).
  3. Wait for the exec to exit.
  4. Observe heartbeat/agent behavior.

Expected

  • Heartbeat is requested with a reason that bypasses empty HEARTBEAT.md; agent runs and processes the exec completion system event.

Actual (before fix)

  • Heartbeat requested with exec:<id>:exit; bypass list does not match; heartbeat skipped; event stays queued until next user message.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Existing exec notifyOnExit tests updated to expect reason: "exec-event"; they assert the wake is still scoped/unscoped by session key as before.

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: Logic follows existing emitExecSystemEvent use of exec-event; resolveHeartbeatReasonKind and bypass in heartbeat-runner already treat exec-event as event-driven.
  • Edge cases checked: Scoped vs unscoped session key behavior preserved (tests).
  • What you did not verify: Full E2E with real background exec and HEARTBEAT.md on a running gateway (no local pnpm/run).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: Revert commit or set tools.exec.notifyOnExit: false to disable exec-exit notifications entirely.
  • Files/config to restore: N/A
  • Known bad symptoms reviewers should watch for: None expected; change is a single reason string.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • None.

@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS labels Mar 9, 2026
@greptile-apps

greptile-apps Bot commented Mar 9, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes a bug where background exec process completion heartbeats were failing to bypass HEARTBEAT.md empty-guard checks, causing agent turns to not be triggered reliably after a background exec command finishes.

Root cause: The dynamic reason string `exec:${session.id}:exit` used in maybeNotifyOnExit did not match any known pattern in resolveHeartbeatReasonKind, resolving to "other". This caused isExecEventReason to be false and shouldBypassFileGates to be false, so the heartbeat runner would fall through to HEARTBEAT.md gating — and if the file was empty, the heartbeat would be skipped entirely.

The fix: Replaces the dynamic reason with the standardized "exec-event" string (matching the reason already used by emitExecSystemEvent), which correctly resolves to the exec-event kind, sets isExecEventReason: true, and bypasses file gates.

  • src/agents/bash-tools.exec-runtime.ts: maybeNotifyOnExit now uses reason: "exec-event" instead of reason: `exec:${session.id}:exit`
  • src/agents/bash-tools.test.ts: Two test assertions updated to expect "exec-event" instead of the old dynamic string

Confidence Score: 5/5

  • This PR is safe to merge — it's a minimal, well-targeted bug fix with correctly updated tests.
  • The change is a single-line fix replacing a dynamic reason string that was silently mis-classified as "other" with the canonical "exec-event" reason that already exists throughout the codebase. The fix aligns maybeNotifyOnExit with emitExecSystemEvent (which already used the correct reason), has no side effects, and the test updates accurately reflect the new behavior. No other callers of the old dynamic format exist in the codebase.
  • No files require special attention.

Last reviewed commit: d05a712

@rstar327 rstar327 force-pushed the fix-exec-notify-heartbeat branch 3 times, most recently from a0a1c2f to c924cf6 Compare March 9, 2026 23:28
@holgergruenhagen

Copy link
Copy Markdown

Reproduced and validated this live in a Discord group-channel setup.

What we observed before the fix:

  • local background exec completion did enqueue the system event
  • but no autonomous visible follow-up was sent to the originating Discord channel
  • the completion only surfaced later on the next user turn
  • our HEARTBEAT.md was effectively empty/comments-only, so the heartbeat path mattered

After tracing the live runtime, the root cause matched this PR exactly:

  • notifyOnExit was waking heartbeat with exec:<session-id>:exit
  • that did not bypass the empty-HEARTBEAT guard like canonical exec-event does

We then applied the equivalent minimal local fix in our running install:

  • change the wake reason from exec:<session-id>:exit to exec-event
  • reload OpenClaw
  • leave the turn idle so the autonomous heartbeat could run

Result:

  • repeated smoke tests started producing visible completion messages in the originating Discord channel
  • we also validated it with a real background coding task, not just a toy sleep command
  • the important part was the user-visible completion message in Discord, not only the internal System: Exec completed(...) line

In our setup, heartbeat delivery back to the chat was already enabled, so the remaining blocker was specifically the wake-reason mismatch.

So from downstream live validation: this PR matches the minimal fix that resolved the bug for us.

Related broader cleanup / superset approach: #42394

@vincentkoc vincentkoc force-pushed the fix-exec-notify-heartbeat branch from 7896cf1 to 2630905 Compare April 5, 2026 08:24
@vincentkoc vincentkoc self-assigned this Apr 5, 2026
@vincentkoc vincentkoc merged commit 43fe68f into openclaw:main Apr 5, 2026
9 checks passed
lovewanwan pushed a commit to lovewanwan/openclaw that referenced this pull request Apr 28, 2026
ogt-redknie pushed a commit to ogt-redknie/OPENX that referenced this pull request May 2, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 9, 2026
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling size: XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

notifyOnExit wake reason doesn't bypass empty HEARTBEAT.md guard

3 participants