fix(gateway): deliver targeted exec-event wakes when heartbeat.every is disabled#79869
fix(gateway): deliver targeted exec-event wakes when heartbeat.every is disabled#79869xinhuagu wants to merge 2 commits into
Conversation
|
Codex review: needs real behavior proof before merge. Reviewed June 1, 2026, 1:13 AM ET / 05:13 UTC. Summary PR surface: Source +87, Tests +129. Total +216 across 3 files. Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path. Review metrics: none identified. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Risk before merge
Maintainer options:
Next step before merge
Review detailsBest possible solution: Retry the Codex review after fixing the execution failure. Do we have a high-confidence way to reproduce the issue? Unclear. The review failed before ClawSweeper could establish a reproduction path. Is this the best way to solve the issue? Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction. AGENTS.md: unclear because the file could not be read completely. Codex review notes: model gpt-5.5, reasoning high; reviewed against c0195f7ed579. Label changesLabel changes:
Label justifications:
Evidence reviewedPR surface: Source +87, Tests +129. Total +216 across 3 files. View PR surface stats
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 078830765f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return await runOnce({ | ||
| cfg: state.cfg, | ||
| agentId: targetAgentId, | ||
| heartbeat: resolveRequestedHeartbeat(resolveHeartbeatConfig(state.cfg, targetAgentId)), |
There was a problem hiding this comment.
Preserve exec-event cooldown when every is 0m
When a targeted exec-event wake arrives for an agent whose heartbeat.every is 0m, state.agents has no HeartbeatAgentState, so this new fallback calls runOnce directly and skips the evaluateWakeDeferral/recordRunBookkeeping path used by normal targeted exec wakes. That reopens the runaway case documented in heartbeat-cooldown.ts and the scheduler regression test: repeated background process exits for the same session can invoke a fresh heartbeat/model run every time instead of respecting the min-spacing/flood guard, specifically for users who disable interval heartbeats but still want exec-event delivery.
Useful? React with 👍 / 👎.
…t.every is disabled
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e7576ebf33
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const deferral = evaluateWakeDeferral(fallbackAgent, now, reason, intent); | ||
| if (deferral.defer) { | ||
| return { status: "skipped", reason: deferral.reason }; |
There was a problem hiding this comment.
Requeue deferred zero-interval exec events
When heartbeat.every is 0m, this new fallback state is not in state.agents, so a second targeted exec-event that arrives within the 30s cooldown returns skipped/not-due here but is never retried: heartbeat-wake.ts:229-240 only requeues retryable busy reasons, and scheduleNext() still returns early when state.agents.size === 0. In the common case of two background execs finishing close together, the later system event stays queued indefinitely unless some unrelated wake happens after the cooldown.
Useful? React with 👍 / 👎.
The cron `wake` MCP tool currently forwards only `{mode, text}` to the
gateway. Every wake then enqueues a system event with no sessionKey /
agentId, so the cron service falls back to the heartbeat / main
default. Wakes scheduled from a non-main session (Telegram thread,
Discord channel, multi-agent setup) silently route to the wrong
conversation lane — and on CLI runtimes the woken session burns
tokens generating output that no caller can route back.
Origin capture (closes the upstream half of openclaw#46886 and openclaw#64556):
- `src/agents/tools/cron-tool.ts` — the `wake` case now resolves
`opts.agentSessionKey` through `resolveInternalSessionKey` and
`resolveSessionAgentId`, matching the existing `add` action at
L569-583. Explicit `sessionKey` / `agentId` params on the tool
call take precedence over the inferred values so cross-session
wakes remain expressible.
- `src/gateway/protocol/schema/agent.ts` — `WakeParamsSchema` now
declares optional `sessionKey` and `agentId` (NonEmptyString)
so the gateway-level validator types them explicitly. The
schema's `additionalProperties: true` continues to accept
forward-compat metadata unchanged.
- `src/gateway/server-methods/cron.ts` — the wake handler reads
both fields, trims them, and forwards to `context.cron.wake`.
- `src/cron/service.ts` + `src/cron/service/ops.ts` +
`src/cron/service/timer.ts` — `wake()` accepts optional
`sessionKey`/`agentId` and threads them into
`enqueueSystemEvent` and `requestHeartbeatNow`. Missing /
whitespace-only fields fall through to the dep's default so
pre-existing call sites with no origin keep behaving the same
way (backwards compatible).
Tests:
- `src/cron/service/wake-origin.test.ts` (new, 6 tests) —
direct seams on `wake()`: origin forwarded on `mode: "now"`,
queued on `mode: "next-heartbeat"`, default fallback when
origin omitted, whitespace-only origin treated as omitted,
empty text still rejected.
- `src/gateway/protocol/index.test.ts` — extends
`validateWakeParams` coverage with the new optional fields
accepted + empty-string rejected.
Out of scope (deliberate split):
- Channel/thread/topic capture on the job's `delivery` block —
follow-up PR once this contract lands. The minimum-viable fix
here is session routing, which unblocks the CLI runtime's
`--resume <session>` path and the embedded session-resolution
path without a schema rewrite of the delivery contract.
- The 5 stalled wake-related PRs (openclaw#70268, openclaw#57199, openclaw#82767,
openclaw#79869, openclaw#63096) each fix downstream specifics. This PR fixes
the upstream origin-capture they all silently assume.
The cron `wake` MCP tool currently forwards only `{mode, text}` to the
gateway. Every wake then enqueues a system event with no sessionKey /
agentId, so the cron service falls back to the heartbeat / main
default. Wakes scheduled from a non-main session (Telegram thread,
Discord channel, multi-agent setup) silently route to the wrong
conversation lane — and on CLI runtimes the woken session burns
tokens generating output that no caller can route back.
Origin capture (closes the upstream half of openclaw#46886 and openclaw#64556):
- `src/agents/tools/cron-tool.ts` — the `wake` case now resolves
`opts.agentSessionKey` through `resolveInternalSessionKey` and
`resolveSessionAgentId`, matching the existing `add` action at
L569-583. Explicit `sessionKey` / `agentId` params on the tool
call take precedence over the inferred values so cross-session
wakes remain expressible.
- `src/gateway/protocol/schema/agent.ts` — `WakeParamsSchema` now
declares optional `sessionKey` and `agentId` (NonEmptyString)
so the gateway-level validator types them explicitly. The
schema's `additionalProperties: true` continues to accept
forward-compat metadata unchanged.
- `src/gateway/server-methods/cron.ts` — the wake handler reads
both fields, trims them, and forwards to `context.cron.wake`.
- `src/cron/service.ts` + `src/cron/service/ops.ts` +
`src/cron/service/timer.ts` — `wake()` accepts optional
`sessionKey`/`agentId` and threads them into
`enqueueSystemEvent` and `requestHeartbeatNow`. Missing /
whitespace-only fields fall through to the dep's default so
pre-existing call sites with no origin keep behaving the same
way (backwards compatible).
Tests:
- `src/cron/service/wake-origin.test.ts` (new, 6 tests) —
direct seams on `wake()`: origin forwarded on `mode: "now"`,
queued on `mode: "next-heartbeat"`, default fallback when
origin omitted, whitespace-only origin treated as omitted,
empty text still rejected.
- `src/gateway/protocol/index.test.ts` — extends
`validateWakeParams` coverage with the new optional fields
accepted + empty-string rejected.
Out of scope (deliberate split):
- Channel/thread/topic capture on the job's `delivery` block —
follow-up PR once this contract lands. The minimum-viable fix
here is session routing, which unblocks the CLI runtime's
`--resume <session>` path and the embedded session-resolution
path without a schema rewrite of the delivery contract.
- The 5 stalled wake-related PRs (openclaw#70268, openclaw#57199, openclaw#82767,
openclaw#79869, openclaw#63096) each fix downstream specifics. This PR fixes
the upstream origin-capture they all silently assume.
The cron `wake` MCP tool currently forwards only `{mode, text}` to the
gateway. Every wake then enqueues a system event with no sessionKey /
agentId, so the cron service falls back to the heartbeat / main
default. Wakes scheduled from a non-main session (Telegram thread,
Discord channel, multi-agent setup) silently route to the wrong
conversation lane — and on CLI runtimes the woken session burns
tokens generating output that no caller can route back.
Origin capture (closes the upstream half of openclaw#46886 and openclaw#64556):
- `src/agents/tools/cron-tool.ts` — the `wake` case now resolves
`opts.agentSessionKey` through `resolveInternalSessionKey` and
`resolveSessionAgentId`, matching the existing `add` action at
L569-583. Explicit `sessionKey` / `agentId` params on the tool
call take precedence over the inferred values so cross-session
wakes remain expressible.
- `src/gateway/protocol/schema/agent.ts` — `WakeParamsSchema` now
declares optional `sessionKey` and `agentId` (NonEmptyString)
so the gateway-level validator types them explicitly. The
schema's `additionalProperties: true` continues to accept
forward-compat metadata unchanged.
- `src/gateway/server-methods/cron.ts` — the wake handler reads
both fields, trims them, and forwards to `context.cron.wake`.
- `src/cron/service.ts` + `src/cron/service/ops.ts` +
`src/cron/service/timer.ts` — `wake()` accepts optional
`sessionKey`/`agentId` and threads them into
`enqueueSystemEvent` and `requestHeartbeatNow`. Missing /
whitespace-only fields fall through to the dep's default so
pre-existing call sites with no origin keep behaving the same
way (backwards compatible).
Tests:
- `src/cron/service/wake-origin.test.ts` (new, 6 tests) —
direct seams on `wake()`: origin forwarded on `mode: "now"`,
queued on `mode: "next-heartbeat"`, default fallback when
origin omitted, whitespace-only origin treated as omitted,
empty text still rejected.
- `src/gateway/protocol/index.test.ts` — extends
`validateWakeParams` coverage with the new optional fields
accepted + empty-string rejected.
Out of scope (deliberate split):
- Channel/thread/topic capture on the job's `delivery` block —
follow-up PR once this contract lands. The minimum-viable fix
here is session routing, which unblocks the CLI runtime's
`--resume <session>` path and the embedded session-resolution
path without a schema rewrite of the delivery contract.
- The 5 stalled wake-related PRs (openclaw#70268, openclaw#57199, openclaw#82767,
openclaw#79869, openclaw#63096) each fix downstream specifics. This PR fixes
the upstream origin-capture they all silently assume.
|
This pull request has been automatically marked as stale due to inactivity. |
Summary
exec-eventheartbeat wakes to run even whenheartbeat.everyresolves disabled (0m)runHeartbeatOnceProblem
Issue #62505 reports that coding-agent/background work can appear to never complete. One narrow cause is that background exec completion already enqueues an
exec-eventwake, but heartbeat dispatch drops it whenheartbeat.everyis disabled. That blocks the one-shot completion wake even though it is not a periodic heartbeat run.Changes
src/infra/heartbeat-runner.tsrunHeartbeatOnce(...)to proceed forexec-eventwakes when the interval is otherwise disabledexec-eventwakes through the runner even when no interval-backed agent state existssrc/infra/heartbeat-runner.scheduler.test.tssrc/infra/heartbeat-runner.returns-default-unset.test.tsTesting
pnpm exec oxlint src/infra/heartbeat-runner.ts src/infra/heartbeat-runner.scheduler.test.ts src/infra/heartbeat-runner.returns-default-unset.test.tstest/non-isolated-runner.tsthrowsClass extends value undefined is not a constructor or nullCloses #62505