Runtime: stabilize tool/run state transitions under compaction and backpressure by Takhoffman · Pull Request #33826 · openclaw/openclaw

Takhoffman · 2026-03-04T02:56:20Z

Summary

stabilize runtime state transitions for tool/run lifecycle under compaction and long-running handler backpressure
synthesize fixes from fix(anthropic): strip dangling tool_use blocks after compaction #33630 and fix(discord): decouple MESSAGE_CREATE handler from AI run execution to prevent blocking #33583 into one provider-safe runtime path

What changed

add shared run-state machine for busy/activeRuns/heartbeat/deactivation lifecycle handling
use the shared state machine in Discord message handler queue execution
keep Anthropic turn validation resilient by stripping dangling toolUse blocks after compaction/replay
preserve stale-busy recovery semantics in channel health policy/monitor paths
add changelog entry for this synthesized runtime fix
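
A minimal sketch of how a shared run-state machine along these lines could behave, assuming a `publish` callback that receives status snapshots. The names `createRunStateMachineSketch`, `RunStatus`, `publish`, `now`, and `heartbeatMs` are illustrative and not taken from the PR. The 60-second heartbeat default and the initial reset follow the review summary in this thread; ignoring `onRunStart`/`onRunEnd` after deactivation is an assumption.

```typescript
// Hypothetical status shape; the real snapshot fields may differ.
type RunStatus = { activeRuns: number; busy: boolean; lastRunActivityAt?: number };

type RunStateMachineSketch = {
  isActive(): boolean;
  onRunStart(): void;
  onRunEnd(): void;
  deactivate(): void;
};

function createRunStateMachineSketch(opts: {
  publish: (status: RunStatus) => void;
  now?: () => number;
  heartbeatMs?: number;
}): RunStateMachineSketch {
  const now = opts.now ?? Date.now;
  const heartbeatMs = opts.heartbeatMs ?? 60_000;
  let activeRuns = 0;
  let active = true;
  let timer: ReturnType<typeof setInterval> | undefined;

  // Publish the current counters, but never after the lifecycle has ended.
  const publish = (): void => {
    if (!active) return;
    opts.publish({ activeRuns, busy: activeRuns > 0, lastRunActivityAt: now() });
  };

  const stopHeartbeat = (): void => {
    if (timer !== undefined) {
      clearInterval(timer);
      timer = undefined;
    }
  };

  // Immediate reset so a busy flag inherited from an earlier lifecycle is overwritten.
  opts.publish({ activeRuns: 0, busy: false });

  return {
    isActive: () => active,
    onRunStart() {
      if (!active) return;
      activeRuns += 1;
      publish();
      // Heartbeat keeps lastRunActivityAt fresh while a long run is in flight.
      if (timer === undefined) timer = setInterval(publish, heartbeatMs);
    },
    onRunEnd() {
      if (!active) return;
      activeRuns = Math.max(0, activeRuns - 1);
      if (activeRuns === 0) stopHeartbeat();
      publish();
    },
    deactivate() {
      active = false;
      stopHeartbeat();
    },
  };
}
```

Clamping `activeRuns` at zero is a defensive choice in this sketch so that an unmatched `onRunEnd` cannot push the counter negative.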

Regression coverage

compaction + replay idempotency in Anthropic turn validation
Discord queue recovery after a failed long-running run
stale busy/inherited busy recovery in channel health policy + monitor
shared run-state machine unit tests

Verification

pnpm install --frozen-lockfile
pnpm build
pnpm check
pnpm test:macmini

Provenance

synthesized from fix(anthropic): strip dangling tool_use blocks after compaction #33630 and fix(discord): decouple MESSAGE_CREATE handler from AI run execution to prevent blocking #33583

Fixes #33621

When compaction trims conversation history, some tool_use blocks may lose their corresponding tool_result blocks. This causes Anthropic to reject the history with a 'tool_use ids found without tool_result blocks' error.

This change adds stripDanglingAnthropicToolUses(), which:
- Removes tool_use blocks from assistant messages when the following user message doesn't have a matching tool_result (by tool_use_id)
- Preserves non-tool content in assistant messages
- Inserts a '[tool calls omitted]' fallback when all content would be removed
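
The behavior described in this commit message can be sketched roughly as follows. This is not the project's stripDanglingAnthropicToolUses(); the `Block` and `Message` types and the function name `stripDanglingToolUsesSketch` are hypothetical, block field names follow the Anthropic Messages API wire format, and the sketch assumes only assistant turns directly followed by a user turn are inspected.

```typescript
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

type Message = { role: "user" | "assistant"; content: Block[] };

const FALLBACK: Block = { type: "text", text: "[tool calls omitted]" };

function stripDanglingToolUsesSketch(messages: Message[]): Message[] {
  return messages.map((msg, i): Message => {
    if (msg.role !== "assistant") return msg;
    const next = messages[i + 1];
    // Assumption: an assistant turn with no following user turn is left alone.
    if (!next || next.role !== "user") return msg;

    // Collect the tool_use ids that the following user turn actually answers.
    const answered = new Set<string>();
    for (const block of next.content) {
      if (block.type === "tool_result") answered.add(block.tool_use_id);
    }

    // Keep every non-tool block, and only the tool_use blocks that were answered.
    const kept = msg.content.filter(
      (block) => block.type !== "tool_use" || answered.has(block.id),
    );
    if (kept.length === msg.content.length) return msg; // nothing dangling
    // Avoid emitting an empty assistant turn.
    return { ...msg, content: kept.length > 0 ? kept : [FALLBACK] };
  });
}
```

Running the sketch twice on the same history yields the same result, which is the idempotency property the regression coverage in this PR refers to.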

greptile-apps · 2026-03-04T03:02:26Z

Greptile Summary

This PR stabilizes the runtime state machine for Discord message handler runs and Anthropic tool-call lifecycle by synthesizing fixes from #33630 and #33583. It adds a new shared RunStateMachine that tracks busy/activeRuns/heartbeat state and clears stale inherited snapshots on startup, wires per-channel run serialization via KeyedAsyncQueue to prevent concurrent handler races, strips dangling tool_use blocks after compaction/replay in validateAnthropicTurns, and extends the channel health policy with busy/stuck states so that long-running but legitimately active channels are not restarted while truly stale ones are.
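
To illustrate the per-channel serialization idea, here is a small keyed promise-chain queue. It is a sketch under stated assumptions: the class name `KeyedQueueSketch` and its `enqueue(key, task)` method are hypothetical and are not claimed to match the project's KeyedAsyncQueue API.

```typescript
class KeyedQueueSketch {
  // One "tail" promise per key; it settles when the last queued task for that key settles.
  private readonly tails = new Map<string, Promise<void>>();

  enqueue<T>(key: string, task: () => Promise<T>): Promise<T> {
    const prev = this.tails.get(key) ?? Promise.resolve();
    // Start this task only after the previous task for the same key has settled.
    const result = prev.then(() => task());
    // The stored tail never rejects, so one failed run cannot wedge the key.
    const tail = result.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(key, tail);
    // Drop the entry once the key is idle so the map does not grow unbounded.
    void tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}
```

Tasks that share a key run strictly one after another, while tasks under different keys proceed independently. The caller still sees the rejection of a failed task through the returned promise.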

Key changes:

run-state-machine.ts – new shared factory with onRunStart/onRunEnd, abort/deactivate guards, and a 60-second heartbeat; emits an immediate { activeRuns: 0, busy: false } reset on init to overwrite stale inherited snapshots
message-handler.ts – replaces fire-and-forget processDiscordMessage calls with KeyedAsyncQueue keyed on session/channel, so queued messages for the same channel serialize without blocking the event loop for other channels
turns.ts – stripDanglingAnthropicToolUses removes tool_use blocks whose IDs have no corresponding tool_result in the immediately following user message, inserting a [tool calls omitted] fallback when the whole content array would otherwise become empty
channel-health-policy.ts – adds busy/stuck evaluation reasons; the busyStateInitializedForLifecycle guard (lastRunActivityAt >= lastStartAt) prevents stale busy flags inherited across a restart from suppressing stuck-channel recovery
protocol/schema/channels.ts + types.core.ts – extends ChannelAccountSnapshot with the three new run-state fields
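
The busy/stuck distinction and the lifecycle guard can be sketched as a pure function. Only the comparison `lastRunActivityAt >= lastStartAt` comes from the summary above; the `RunSnapshot` type, the function name `evaluateBusySketch`, the `staleAfterMs` threshold, and the treatment of missing timestamps as "not initialized" are assumptions made for illustration.

```typescript
type RunSnapshot = {
  busy?: boolean;
  activeRuns?: number;
  lastRunActivityAt?: number;
  lastStartAt?: number;
};

type BusyVerdict = "not-busy" | "busy" | "stuck";

function evaluateBusySketch(
  snapshot: RunSnapshot,
  now: number,
  staleAfterMs: number, // hypothetical threshold, not a value from the PR
): BusyVerdict {
  const flaggedBusy = snapshot.busy === true || (snapshot.activeRuns ?? 0) > 0;
  if (!flaggedBusy) return "not-busy";

  const activity = snapshot.lastRunActivityAt;
  const start = snapshot.lastStartAt;
  // Guard: a busy flag only counts if run activity was recorded in the current
  // lifecycle, i.e. at or after the most recent start.
  if (activity === undefined || start === undefined || activity < start) {
    return "not-busy"; // inherited flag, ignore it
  }

  // Recent activity means a legitimately long run; old activity means stuck.
  return now - activity >= staleAfterMs ? "stuck" : "busy";
}
```

In this sketch a caller would skip a restart on "busy", restart on "stuck", and fall through to its ordinary health checks on "not-busy".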

Confidence Score: 4/5

Safe to merge; the PR implements a well-structured shared state machine with comprehensive test coverage and appropriate error handling across all changed paths.
The PR synthesizes two prior fixes into a cohesive solution with solid test coverage (run-state-machine unit tests, Discord queue tests, health-policy and health-monitor tests, Anthropic turn-validation tests). Key logic paths — queue-based run serialization, busyStateInitializedForLifecycle guard, and tool_use stripping — are all exercised. No functional issues identified. A score of 4 reflects confidence in the implementation with standard rigor for runtime state machinery.
No files require special attention.

Last reviewed commit: 3299bd5

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3299bd5274

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Takhoffman · 2026-03-04T03:19:12Z

Addressed the current review feedback in eccd84586.

Fixed items:

Guarded Anthropic assistant content normalization so non-array content values no longer throw during dangling toolUse cleanup.
Added regression coverage for malformed/legacy assistant content in validateAnthropicTurns.
Prevented queued Discord runs from executing after lifecycle deactivation/abort by gating queued task execution on runtime lifecycle activity.
Added regression coverage to verify queued follow-up runs are skipped after handler deactivation.
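
The lifecycle gating of queued runs might look something like the wrapper below. This is a sketch, not the handler's actual code: `gateOnLifecycle` and the `isActive` callback are hypothetical names, and returning `undefined` for a skipped run is an assumption.

```typescript
// Wraps a queued task so the lifecycle check happens when the task is
// dequeued and about to run, not when it was originally enqueued.
function gateOnLifecycle<T>(
  isActive: () => boolean,
  task: () => Promise<T>,
): () => Promise<T | undefined> {
  return async () => {
    if (!isActive()) return undefined; // handler was deactivated or aborted: skip
    return task();
  };
}

// Usage: the second run is queued while active, but skipped once deactivated.
async function demoGate(): Promise<string[]> {
  let active = true;
  const ran: string[] = [];
  const first = gateOnLifecycle(() => active, async () => { ran.push("first"); });
  const second = gateOnLifecycle(() => active, async () => { ran.push("second"); });
  await first();
  active = false; // lifecycle ends between the two runs
  await second();
  return ran; // resolves to ["first"]
}
```

Checking at dequeue time matters because a follow-up message can sit behind a long-running run for minutes, during which the handler may have been shut down.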

Validation rerun:

pnpm test -- src/agents/pi-embedded-helpers.validate-turns.test.ts src/discord/monitor/message-handler.queue.test.ts src/channels/run-state-machine.test.ts
pnpm check

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eccd84586e

chatgpt-codex-connector · 2026-03-04T03:27:27Z

+  // First, strip dangling tool_use blocks from assistant messages
+  const stripped = stripDanglingAnthropicToolUses(messages);
+
  return validateTurnsWithConsecutiveMerge({
-    messages,
+    messages: stripped,


Merge user turns before stripping assistant toolUse blocks

validateAnthropicTurns currently runs stripDanglingAnthropicToolUses before mergeConsecutiveUserTurns, so it only inspects the immediately following user message when deciding whether a toolUse is dangling. If the matching toolResult sits in a second consecutive user turn (a case the validator is meant to normalize), the tool call is removed anyway, and the later merged user turn still contains that toolResult without a matching tool call, which can produce Anthropic turn-validation failures and lose valid tool context.
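
The ordering concern raised in this comment can be shown with a small sketch. The `Turn` and `TurnBlock` types and both helper names are hypothetical stand-ins for the validator's internals, and merging first is the reviewer's suggestion rather than a confirmed change in this PR.

```typescript
type TurnBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string }
  | { type: "tool_result"; tool_use_id: string };

type Turn = { role: "user" | "assistant"; content: TurnBlock[] };

// Collapse runs of consecutive user turns into a single user turn.
function mergeConsecutiveUserTurnsSketch(turns: Turn[]): Turn[] {
  const out: Turn[] = [];
  for (const turn of turns) {
    const last = out[out.length - 1];
    if (turn.role === "user" && last !== undefined && last.role === "user") {
      out[out.length - 1] = { ...last, content: [...last.content, ...turn.content] };
    } else {
      out.push(turn);
    }
  }
  return out;
}

// Ids answered by the turn immediately after the given assistant turn.
function answeredIdsAfter(turns: Turn[], assistantIndex: number): Set<string> {
  const ids = new Set<string>();
  const next = turns[assistantIndex + 1];
  if (next !== undefined && next.role === "user") {
    for (const block of next.content) {
      if (block.type === "tool_result") ids.add(block.tool_use_id);
    }
  }
  return ids;
}

const history: Turn[] = [
  { role: "assistant", content: [{ type: "tool_use", id: "t1" }] },
  { role: "user", content: [{ type: "text", text: "still there?" }] },
  { role: "user", content: [{ type: "tool_result", tool_use_id: "t1" }] },
];

// Inspecting before merging misses the result in the second user turn.
console.log(answeredIdsAfter(history, 0).has("t1")); // false
// After merging, the immediately following user turn carries the result.
console.log(answeredIdsAfter(mergeConsecutiveUserTurnsSketch(history), 0).has("t1")); // true
```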

chatgpt-codex-connector · 2026-03-04T03:27:28Z

+      result.push({
+        ...assistantMsg,
+        content: filteredContent,


Preserve legacy assistant content when it is not an array

When assistantMsg.content is a legacy non-array value, originalContent is forced to [] and then written back as content: filteredContent, so the validator silently erases the original assistant content whenever the next message is a user turn. This is a regression from previous behavior (which left these messages intact), and it can both drop prompt context and emit empty assistant turns in replayed histories.
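
One way to address this concern is to return the message untouched whenever its content is not an array. The sketch below assumes a loosely typed message shape; `LooseMessage` and `filterAssistantContentSketch` are hypothetical names, and this is not necessarily how the project resolved the comment.

```typescript
type LooseMessage = { role: "user" | "assistant"; content: unknown };

function filterAssistantContentSketch(
  msg: LooseMessage,
  keep: (block: unknown) => boolean,
): LooseMessage {
  // Legacy string (or otherwise non-array) content is passed through as-is
  // instead of being coerced to [] and written back.
  if (!Array.isArray(msg.content)) return msg;
  const filtered = msg.content.filter(keep);
  // Only allocate a new message when something was actually removed.
  return filtered.length === msg.content.length ? msg : { ...msg, content: filtered };
}

// Usage: a legacy string-content message survives unchanged.
const legacy: LooseMessage = { role: "assistant", content: "plain legacy text" };
console.log(filterAssistantContentSketch(legacy, () => false) === legacy); // true
```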

Kevin Shenghui and others added 11 commits March 3, 2026 20:24

fix(discord): decouple MESSAGE_CREATE handler from AI run execution t…

33d4b9d

…o prevent blocking (#33570)

fix(discord): preserve handler ordering and stale busy recovery

f63aac0

chore: fix formatting for blocking listener health changes

ab43221

test: remove redundant deferred type arguments

0550787

fix: reset and heartbeat discord run busy state

4153086

fix: gate discord run status to active lifecycle

ba7cf6f

fix: ignore inherited busy flags before run status init

016404c

fix: deactivate discord status publishing on lifecycle exit

c95f184

runtime: share run-state machine and add replay recovery tests

4d99ed4

changelog: add runtime tool-state synthesis notes

3299bd5

openclaw-barnacle Bot added labels Mar 4, 2026: channel: discord (Channel integration: discord), app: web-ui (App: web-ui), gateway (Gateway runtime), agents (Agent runtime and tooling), size: XL, maintainer (Maintainer-authored PR)

chatgpt-codex-connector Bot reviewed Mar 4, 2026

Comment thread src/agents/pi-embedded-helpers/turns.ts Outdated

Comment thread src/discord/monitor/message-handler.ts

runtime: address review feedback for tool content and lifecycle queueing

eccd845

Takhoffman merged commit 9889c6d into main Mar 4, 2026
29 checks passed

Takhoffman deleted the task/workstream-5-state-machine branch March 4, 2026 03:25

chatgpt-codex-connector Bot reviewed Mar 4, 2026

This was referenced Mar 4, 2026

fix(anthropic): strip dangling tool_use blocks after compaction #33630

Closed

fix(discord): decouple MESSAGE_CREATE handler from AI run execution to prevent blocking #33583

Closed

clawsweeper Bot mentioned this pull request Apr 29, 2026

Test/run state machine clamp guard #64649

Open

7 tasks

Takhoffman merged 12 commits into main from task/workstream-5-state-machine

2 participants
