[Fix] Throttle agent event fanout #80335

Merged
jalehman merged 7 commits into openclaw:main from samzong:fix/agent-event-throttle
May 14, 2026

Conversation

@samzong

samzong commented May 10, 2026

Contributor

Summary

  • Problem: High-volume Gateway agent text events could fan out too frequently while a run was streaming, and the throttle path could drop an intermediate buffered delta when a later post-window delta arrived.
  • Why it matters: Streaming clients should receive fewer redundant events without losing assistant or thinking text bytes.
  • What changed: Added stream-specific Gateway agent-event throttling, cumulative text-delta buffering, lifecycle-safe first text emission, and cleanup for the new throttle state.
  • Follow-up fix: Flush pending assistant/thinking agent deltas before sending a post-window delta, and wire the real chat.abort request path to clear agent throttle buffers.
  • What did NOT change: No protocol shape change, no UI behavior change, no plugin/channel-specific policy, and no live model/provider configuration change.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • This PR fixes a bug or regression

Real behavior proof (required for external PRs)

External contributors must show after-fix evidence from a real OpenClaw setup. Unit tests, mocks, lint, typechecks, snapshots, and CI are supplemental only. Screenshots are encouraged even for CLI, console, text, or log changes; terminal screenshots and copied live output count. Be mindful of private information like IP addresses, API keys, phone numbers, non-public endpoints, or other private details when providing evidence.

  • Behavior or issue addressed: Gateway WebSocket agent-event fanout for assistant text streams must throttle bursts without dropping the buffered middle delta when a later post-window delta arrives.
  • Real environment tested: Local isolated OpenClaw Gateway WebSocket server from this worktree, started with loopback binding, external channels/providers skipped, and a real GatewayClient connected as an auto-approved operator device with operator.read, operator.write, and operator.admin scopes.
  • Exact steps or command run after this patch:
    1. Ran pnpm exec tsx - with a temporary proof script that starts startGatewayServer, connects GatewayClient, registers sessionKey: "main", emits assistant deltas Hel, lo, and ! with 50ms/170ms timing around the 150ms throttle window, and asserts the WebSocket client receives all three deltas in order.
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): Copied live console output from the Gateway proof script.
    REAL_BEHAVIOR_PROOF agent event throttle
    gateway=ws://127.0.0.1:56626
    authScopes=["operator.admin","operator.read","operator.write"]
    runId=proof-run-fb786c05-5cff-46c4-bb83-873f4ba2b492
    observedAssistantDeltas=["Hel","lo","!"]
    observedAssistantSeqs=[1,2,3]
    result=middle delta delivered before post-window delta; no assistant text loss
    
  • Observed result after fix: The real Gateway WebSocket client received the buffered middle assistant delta (lo) before the later post-window delta (!), preserving assistant text order and content across the throttle boundary.
  • What was not tested: External provider credentials, external channel delivery, and UI screenshots; the changed surface is server-side Gateway event fanout and abort cleanup.
  • Before evidence (optional but encouraged): The previous implementation deleted the buffered assistant delta when the next post-window delta arrived, so a Hel / lo / ! burst could be observed as Hel / !.
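
The before/after behavior described above can be replayed with a small deterministic model. This is a minimal sketch, not the Gateway implementation: `createThrottle`, `replay`, and the `flushBeforeSend` flag are hypothetical names, and the timestamps 0, 50, and 170 ms are an assumed reading of the "50ms/170ms timing around the 150ms throttle window" wording.

```typescript
// Hypothetical model of the throttle boundary case; not OpenClaw code.
type Emit = (text: string) => void;

interface ThrottleState {
  lastSentAt: number; // timestamp of the last emitted delta
  pending: string; // text buffered since the last emit
}

const WINDOW_MS = 150; // throttle window quoted in the PR description

function createThrottle(emit: Emit, flushBeforeSend: boolean) {
  let state: ThrottleState | undefined;

  return {
    // Timestamps are passed in explicitly so the replay needs no real timers.
    push(text: string, now: number): void {
      const current = state;
      if (current === undefined) {
        emit(text); // first delta goes out immediately and opens the window
        state = { lastSentAt: now, pending: "" };
        return;
      }
      if (now - current.lastSentAt < WINDOW_MS) {
        current.pending += text; // in-window: accumulate instead of sending
        return;
      }
      // Post-window delta: the fix flushes the buffered text first.
      if (flushBeforeSend && current.pending !== "") {
        emit(current.pending);
      }
      current.pending = ""; // with flushBeforeSend=false this drops the buffer
      emit(text);
      current.lastSentAt = now;
    },
  };
}

function replay(flushBeforeSend: boolean): string[] {
  const seen: string[] = [];
  const throttle = createThrottle((text) => {
    seen.push(text);
  }, flushBeforeSend);
  throttle.push("Hel", 0);
  throttle.push("lo", 50); // inside the window, buffered
  throttle.push("!", 170); // after the window
  return seen;
}

console.log(JSON.stringify(replay(false))); // ["Hel","!"]
console.log(JSON.stringify(replay(true))); // ["Hel","lo","!"]
```

With `flushBeforeSend` off, the model reproduces the Hel / ! loss described in the before evidence; with it on, all three deltas arrive in order, matching the copied proof output.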

Root Cause (if applicable)

  • Root cause: The throttle state originally treated buffered text as replaceable latest-state, so the next post-window text event deleted the pending buffered delta instead of flushing it first. The direct chat.abort request path also did not receive the new agent throttle maps, so it could not clear stale assistant/thinking throttle state.
  • Missing detection / guardrail: Gateway event tests covered burst coalescing but did not assert the boundary case where an in-window buffered delta is followed by a post-window delta.
  • Contributing context (if known): The event path mixes lifecycle and text payloads under the same agent event channel, so throttle state needs to distinguish stream family from run lifecycle and be cleaned through every run-abort path.
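
The cleanup requirement above can be pictured as one registry keyed by run and stream family, with a single `clearRun` entry point that every abort or completion path calls. This is a rough sketch under assumptions: the `runId:stream` key format, `AgentThrottleRegistry`, and `clearRun` are hypothetical and are not taken from the patch.

```typescript
// Hypothetical registry of stream-specific throttle state; not OpenClaw code.
const TEXT_STREAMS = ["assistant", "thinking"] as const;
type TextStream = (typeof TEXT_STREAMS)[number];

interface BufferedDelta {
  text: string; // text buffered during the current window
  lastSentAt: number; // timestamp of the last emitted delta
}

// One entry per (run, stream) so assistant and thinking text throttle independently.
const throttleKey = (runId: string, stream: TextStream): string => `${runId}:${stream}`;

class AgentThrottleRegistry {
  private readonly buffers = new Map<string, BufferedDelta>();

  set(runId: string, stream: TextStream, delta: BufferedDelta): void {
    this.buffers.set(throttleKey(runId, stream), delta);
  }

  has(runId: string, stream: TextStream): boolean {
    return this.buffers.has(throttleKey(runId, stream));
  }

  count(): number {
    return this.buffers.size;
  }

  // Intended to be called from every path that ends a run
  // (abort request, completion, maintenance sweep).
  clearRun(runId: string): number {
    let removed = 0;
    for (const stream of TEXT_STREAMS) {
      if (this.buffers.delete(throttleKey(runId, stream))) removed += 1;
    }
    return removed;
  }
}

const registry = new AgentThrottleRegistry();
registry.set("run-1", "assistant", { text: "lo", lastSentAt: 0 });
registry.set("run-1", "thinking", { text: "hm", lastSentAt: 0 });
registry.set("run-2", "assistant", { text: "ok", lastSentAt: 0 });
const removedForRun1 = registry.clearRun("run-1"); // 2: both stream entries for run-1
```

Routing every abort path through one function like this is one way to avoid the situation described in the root cause, where a request path never received the throttle maps and so could not clear them.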

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/gateway/server-chat.agent-events.test.ts, src/gateway/server-methods/chat.abort-authorization.test.ts, src/gateway/chat-abort.test.ts, src/gateway/server-maintenance.test.ts, src/gateway/server-startup-early.test.ts, src/gateway/server-request-context.test.ts
  • Scenario the test should lock in: High-frequency assistant/thinking text bursts are coalesced with cumulative deltas, buffered deltas are flushed before post-window deltas, lifecycle/start does not seed the text throttle window, and abort/maintenance cleanup removes stream-specific throttle state.
  • Why this is the smallest reliable guardrail: The affected behavior is the Gateway server event fanout path and its internal throttle state; focused Gateway tests exercise that path without requiring provider credentials or a live model.
  • Existing test that already covers this (if any): Existing burst coverage existed but was not specific enough for the post-window buffered-delta flush and real abort wiring.
  • If no new test is added, why not: N/A, new coverage is included.

User-visible / Behavior Changes

Gateway clients may receive fewer redundant high-frequency assistant/thinking text agent events during streaming bursts, while receiving the same cumulative text content after throttled flushes.

Diagram (if applicable)

Before:
[agent text burst] -> [stream throttle] -> [drop pending buffered delta on post-window send]

After:
[agent text burst] -> [stream throttle] -> [flush pending buffered delta before post-window send]

Security Impact (required)

  • New permissions/capabilities? (Yes/No): No
  • Secrets/tokens handling changed? (Yes/No): No
  • New/changed network calls? (Yes/No): No
  • Command/tool execution surface changed? (Yes/No): No
  • Data access scope changed? (Yes/No): No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS local development worktree
  • Runtime/container: Node 22+ with repo pnpm scripts
  • Model/provider: N/A; the real proof uses Gateway WebSocket agent-event fanout directly and does not require provider credentials
  • Integration/channel (if any): Gateway server event fanout
  • Relevant config (redacted): Isolated loopback Gateway state; external channels/providers skipped

Steps

  1. Run the real Gateway WebSocket proof script described above.
  2. Run focused Gateway event, abort, maintenance, startup, and request-context tests.
  3. Run changed-code checks, format validation, and diff validation.

Expected

  • Assistant/thinking text bursts are throttled without losing intermediate deltas.
  • Lifecycle/start events do not consume the text throttle window.
  • Abort and maintenance cleanup remove stream-specific throttle entries.

Actual

  • Real Gateway WebSocket proof observed assistant deltas ["Hel","lo","!"] with seqs [1,2,3].
  • Focused Gateway tests passed: Test Files 6 passed (6), Tests 85 passed (85).
  • pnpm check:changed, pnpm check:changed --staged, pnpm format:check -- <changed gateway files>, and git diff --check passed locally.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: Real Gateway WebSocket assistant delta fanout, cumulative assistant text delta buffering, thinking stream coalescing, lifecycle/start first-text behavior, media/replace non-coalescing behavior, abort cleanup, and maintenance cleanup.
  • Edge cases checked: Stream-specific throttle keys for assistant and thinking text, non-text lifecycle events, media payloads, replace payloads, and completed-run cleanup.
  • What you did not verify: External provider credentials, external channel delivery, and UI screenshots.
  • AI-assisted: Yes, Codex assisted with implementation and review.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No): Yes
  • Config/env changes? (Yes/No): No
  • Migration needed? (Yes/No): No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: Event timing changes can hide regressions if future Gateway event payloads are incorrectly classified as coalescible text.
    • Mitigation: Coalescing is limited to assistant/thinking text deltas, excludes media and replace payloads, and is covered by focused Gateway tests.
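
The classification risk above comes down to a narrow predicate that decides which events may be coalesced. The sketch below is illustrative only: the `AgentEvent` shape and its `delta`, `replace`, and `mediaUrls` fields are assumptions, and the real Gateway payload may be structured differently.

```typescript
// Hypothetical event shape for illustration; not the Gateway protocol type.
interface AgentEvent {
  stream: string; // e.g. "assistant", "thinking", "lifecycle"
  data: {
    delta?: string; // incremental text
    replace?: boolean; // payload replaces earlier text instead of appending
    mediaUrls?: string[]; // media attachments
  };
}

// Only plain assistant/thinking text deltas are safe to merge by concatenation.
function isCoalescibleText(event: AgentEvent): boolean {
  if (event.stream !== "assistant" && event.stream !== "thinking") return false;
  if (typeof event.data.delta !== "string" || event.data.delta === "") return false;
  if (event.data.replace === true) return false;
  if (event.data.mediaUrls !== undefined && event.data.mediaUrls.length > 0) return false;
  return true;
}

const samples: AgentEvent[] = [
  { stream: "assistant", data: { delta: "Hel" } }, // coalescible
  { stream: "thinking", data: { delta: "hm" } }, // coalescible
  { stream: "lifecycle", data: {} }, // not text
  { stream: "assistant", data: { delta: "x", replace: true } }, // replace payload
  { stream: "assistant", data: { delta: "x", mediaUrls: ["a.png"] } }, // media payload
];
const verdicts = samples.map(isCoalescibleText); // [true, true, false, false, false]
```

Defaulting to "not coalescible" for anything the predicate does not recognize keeps a future payload type from being silently merged.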

@openclaw-barnacle openclaw-barnacle Bot added the gateway, size: L, and triage: mock-only-proof labels May 10, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 98ec3eeef4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/gateway/server-chat.ts Outdated
Comment thread src/gateway/chat-abort.ts Outdated
@clawsweeper

clawsweeper Bot commented May 10, 2026

Contributor

Codex review: needs changes before merge.

Summary
The PR adds Gateway assistant/thinking agent-event throttling with buffered delta flushing and abort/maintenance cleanup, plus focused Gateway tests and a plugin runtime config-scope update.

Reproducibility: Partially: current-main source gives a high-confidence path for burst fanout because non-tool agent events broadcast immediately while chat deltas are throttled. I did not execute a live current-main repro in this read-only pass, and the PR body supplies after-fix GatewayClient output for the ordering case.

Real behavior proof
Sufficient (live_output): The PR body supplies copied live output from a real local Gateway WebSocket server and GatewayClient showing the after-fix assistant delta ordering for the changed Gateway path.

Next step before merge
A narrow automated repair can preserve the existing deprecated config helper wrappers, add the missing assertions, and leave the Gateway throttle implementation intact.

Security
Cleared: No concrete security or supply-chain concern found; the diff stays in Gateway/plugin runtime TypeScript and tests without changing dependencies, workflows, permissions, secrets, downloads, or command execution surfaces.

Review findings

  • [P2] Keep deprecated config helpers scoped — src/plugins/registry.ts:2402
Review details

Best possible solution:

Land the Gateway throttle after preserving scoped wrappers for deprecated config helpers and adding coverage that proves all runtime config helpers run under the owning plugin scope.

Do we have a high-confidence way to reproduce the issue?

Partially: current-main source gives a high-confidence path for burst fanout because non-tool agent events broadcast immediately while chat deltas are throttled. I did not execute a live current-main repro in this read-only pass, and the PR body supplies after-fix GatewayClient output for the ordering case.

Is this the best way to solve the issue?

Not yet: the Gateway design is narrow and consistent with the affected fanout path, but the plugin registry update should keep the deprecated load/write wrappers while adding scoped current/mutate/replace wrappers.

Full review comments:

  • [P2] Keep deprecated config helpers scoped — src/plugins/registry.ts:2402
    The returned config object now only wraps current, mutateConfigFile, and replaceConfigFile, so loadConfig and writeConfigFile come from the spread and no longer run inside runWithPluginScope. Those deprecated helpers are still public compatibility surfaces, and their warning key/source attribution reads the plugin runtime scope; legacy plugins will now lose per-plugin diagnostics. Please keep the existing deprecated wrappers while adding wrappers for the newer helpers.
    Confidence: 0.88
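
The spread-versus-wrapper distinction in this finding can be shown with a simplified model. Only the helper names (`runWithPluginScope`, `loadConfig`, `writeConfigFile`, `current`) come from the review; every signature, the scope mechanism, and the `partialScope` / `fullScope` functions below are assumptions made for illustration.

```typescript
// Hypothetical model of plugin-scoped config helpers; signatures are assumed.
let activePluginId: string | undefined;

function runWithPluginScope<T>(pluginId: string, fn: () => T): T {
  const previous = activePluginId;
  activePluginId = pluginId;
  try {
    return fn();
  } finally {
    activePluginId = previous; // restore the outer scope even if fn throws
  }
}

interface ConfigHelpers {
  current(): string;
  loadConfig(): string; // deprecated but still public
  writeConfigFile(): string; // deprecated but still public
}

// Each base helper reports which plugin scope it observed when called.
const baseHelpers: ConfigHelpers = {
  current: () => `current:${activePluginId ?? "unscoped"}`,
  loadConfig: () => `loadConfig:${activePluginId ?? "unscoped"}`,
  writeConfigFile: () => `writeConfigFile:${activePluginId ?? "unscoped"}`,
};

// Regression shape: only the newer helper is wrapped; the deprecated
// helpers arrive through the spread and run without a plugin scope.
function partialScope(pluginId: string): ConfigHelpers {
  return {
    ...baseHelpers,
    current: () => runWithPluginScope(pluginId, baseHelpers.current),
  };
}

// Requested shape: every helper, deprecated or not, runs inside the scope.
function fullScope(pluginId: string): ConfigHelpers {
  return {
    current: () => runWithPluginScope(pluginId, baseHelpers.current),
    loadConfig: () => runWithPluginScope(pluginId, baseHelpers.loadConfig),
    writeConfigFile: () => runWithPluginScope(pluginId, baseHelpers.writeConfigFile),
  };
}

const leaked = partialScope("demo").loadConfig(); // "loadConfig:unscoped"
const scoped = fullScope("demo").loadConfig(); // "loadConfig:demo"
```

In this model the unscoped call is the analogue of a deprecation warning that can no longer be attributed to the plugin that triggered it.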

Overall correctness: patch is incorrect
Overall confidence: 0.86

Acceptance criteria:

  • node scripts/run-vitest.mjs src/plugins/registry.runtime-config.test.ts
  • node scripts/run-vitest.mjs src/gateway/server-chat.agent-events.test.ts src/gateway/chat-abort.test.ts src/gateway/server-methods/chat.abort-authorization.test.ts src/gateway/server-maintenance.test.ts src/gateway/server-startup-early.test.ts src/gateway/server-request-context.test.ts
  • git diff --check

What I checked:

  • Current-main fanout path: Current main throttles the chat projection at the 150 ms delta window but still broadcasts visible non-tool agent events immediately, so the central Gateway fanout problem is not already solved on main. (src/gateway/server-chat.ts:778, e44b915dbf6b)
  • PR throttle implementation: The PR patch adds stream-specific assistant/thinking throttle keys, buffered agent-event state, flushing before later sends, and abort/maintenance cleanup wiring. (src/gateway/server-chat.ts:658, 1f42171e9a8b)
  • Regression coverage: The patch adds Gateway agent-event tests for coalescing, lifecycle flush, post-window Hel/lo/! ordering, cross-stream ordering, thinking streams, media/replace events, abort cleanup, and maintenance cleanup. (src/gateway/server-chat.agent-events.test.ts:401, 1f42171e9a8b)
  • Prior review comments addressed: The earlier bot review comments asked for buffered-delta flush before post-window sends and required abort cleanup maps; the current patch includes both changes. (src/gateway/server-chat.ts:770, 1f42171e9a8b)
  • Plugin config regression: Current main deliberately wraps deprecated runtime config helpers in plugin scope; the PR replaces those explicit wrappers with current/mutate/replace wrappers while leaving loadConfig/writeConfigFile only through the spread. (src/plugins/registry.ts:2402, e44b915dbf6b)
  • Deprecated helper contract: loadConfig and writeConfigFile remain public deprecated PluginRuntime config methods, and runtime-config warning attribution reads the plugin runtime scope, so dropping the wrappers loses per-plugin/source diagnostics. (src/plugins/runtime/types-core.ts:163, e44b915dbf6b)

Likely related people:

  • steipete: Recent current-main commits cover the same Gateway chat delta/event path and the plugin runtime config deprecation attribution path implicated by the review finding. (role: recent area contributor; confidence: high; commits: a6497b175905, 150bebcd0ce7, 4d8aec82106a; files: src/gateway/server-chat.ts, src/gateway/server-chat.agent-events.test.ts, src/plugins/registry.ts)
  • samzong: Besides proposing this PR, samzong has current-main Gateway delta and agent broadcast payload history in the same streaming surface. (role: recent Gateway delta contributor; confidence: high; commits: 10315ce21593, 443ca4865d61; files: src/gateway/server-chat.ts, src/gateway/server-chat.agent-events.test.ts)
  • jalehman: Josh Lehman has current-main Gateway chat/tool display history and authored several head commits in this PR, including the latest plugin registry adjustment that needs follow-up. (role: recent adjacent contributor; confidence: medium; commits: 4bfd7416f0f9, 30018bddc611, 1f42171e9a8b; files: src/gateway/server-chat.ts, src/gateway/talk-realtime-relay.test.ts, src/plugins/registry.ts)

Remaining risk / open question:

  • The latest head includes a plugin runtime config change that is not covered by the supplied Gateway real-behavior proof.
  • Focused tests were not run in this read-only review; the verdict is based on source, patch, PR discussion, and supplied proof inspection.

Codex review notes: model gpt-5.5, reasoning high; reviewed against e44b915dbf6b.

@samzong samzong force-pushed the fix/agent-event-throttle branch from 98ec3ee to 240d48c Compare May 10, 2026 17:10
@samzong samzong requested a review from a team as a code owner May 10, 2026 17:10
@openclaw-barnacle openclaw-barnacle Bot added the app: web-ui, triage: blank-template, triage: needs-real-behavior-proof, and proof: supplied labels and removed the triage: mock-only-proof and triage: needs-real-behavior-proof labels May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@jalehman jalehman self-assigned this May 10, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@jalehman jalehman force-pushed the fix/agent-event-throttle branch from 97e3934 to 13f6e8a Compare May 10, 2026 21:59
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 10, 2026
@jalehman jalehman force-pushed the fix/agent-event-throttle branch from 13f6e8a to 2f5f574 Compare May 10, 2026 22:12
@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@samzong samzong force-pushed the fix/agent-event-throttle branch from 2f5f574 to 8071fc3 Compare May 11, 2026 15:34
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@samzong samzong force-pushed the fix/agent-event-throttle branch 2 times, most recently from 3daef21 to 2f5f574 Compare May 11, 2026 15:47
@samzong

samzong commented May 11, 2026

Contributor Author

@jalehman sorry for the branch churn here. I accidentally rebased and force-pushed over your latest work while trying to clean up the PR branch.

I restored the PR head back to your last pushed commit, 2f5f574. Apologies for the disruption.

@clawsweeper clawsweeper Bot added the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 11, 2026
@samzong samzong force-pushed the fix/agent-event-throttle branch from 2f5f574 to 7af45dc Compare May 13, 2026 16:45
@openclaw-barnacle openclaw-barnacle Bot removed the proof: sufficient ClawSweeper judged the real behavior proof convincing. label May 13, 2026
@samzong

samzong commented May 13, 2026

Contributor Author

@codex review
@clawsweeper re-review

@github-actions github-actions Bot added the dependencies-changed PR changes dependency-related files label May 14, 2026
@jalehman jalehman force-pushed the fix/agent-event-throttle branch from 8fe89bf to 5dddb40 Compare May 14, 2026 05:20
@github-actions github-actions Bot removed the dependencies-changed PR changes dependency-related files label May 14, 2026
@openclaw-barnacle openclaw-barnacle Bot added the dependencies-changed and size: L labels and removed the docs, app: ios, scripts, commands, docker, agents, extensions: codex, and size: XL labels May 14, 2026
@jalehman jalehman merged commit bb8aa0c into openclaw:main May 14, 2026
109 of 113 checks passed
@jalehman

Contributor

Merged via squash.

Thanks @samzong!

github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request May 24, 2026
Merged via squash.

Prepared head SHA: 5dddb40
Co-authored-by: samzong <13782141+samzong@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
jameslcowan pushed a commit to jameslcowan/openclaw that referenced this pull request Jun 2, 2026
Merged via squash.

Prepared head SHA: 5dddb40
Co-authored-by: samzong <13782141+samzong@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
ericksoninco pushed a commit to ericksoninco/openclaw that referenced this pull request Jun 9, 2026
Merged via squash.

Prepared head SHA: 5dddb40
Co-authored-by: samzong <13782141+samzong@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman
sablehead pushed a commit to sablehead/openclaw that referenced this pull request Jun 10, 2026
Merged via squash.

Prepared head SHA: 5dddb40
Co-authored-by: samzong <13782141+samzong@users.noreply.github.com>
Co-authored-by: jalehman <550978+jalehman@users.noreply.github.com>
Reviewed-by: @jalehman

Labels

app: web-ui App: web-ui dependencies-changed PR changes dependency-related files gateway Gateway runtime proof: sufficient ClawSweeper judged the real behavior proof convincing. proof: supplied External PR includes structured after-fix real behavior proof. size: L triage: blank-template Candidate: PR template appears mostly untouched.

2 participants