fix(agents): trim dense text delta snapshots by vincentkoc · Pull Request #91580 · openclaw/openclaw

vincentkoc · 2026-06-09T03:08:37Z

Summary

Drops full accumulated assistant-message snapshots from plain text_delta events for OpenAI-compatible, OpenAI Responses, and Ollama streams.
Keeps full snapshots on stream start/end/done plus tool/thinking events, so final state and structured events still have the full message.
Updates the agent loop to reconstruct current assistant snapshots from lightweight text deltas.
Adds regression coverage for OpenAI-compatible, Responses, Ollama, and agent-loop consumers.

Linked context

Requested by maintainer follow-up on #86599.

Maintainer contract decision

Accept the exported stream contract change: plain AssistantMessageEvent text_delta events may omit partial to avoid retaining one full assistant snapshot per token. Consumers that need the live assistant message should replay delta from the latest start/end partial checkpoint or use the reconstructed message_update.message; text_start, text_end, thinking, tool, terminal, and final events still carry full snapshots.

Real behavior proof (required for external PRs)

Behavior or issue addressed: local/OpenAI-compatible dense streams no longer carry the full answer-so-far inside every plain text delta.
Real environment tested: Blacksmith Testbox / Crabbox on Linux, using a fake OpenAI-compatible local backend that emits 30,000 one-character SSE chunks.
Exact steps or command run after this patch: built the CLI through tsdown/runtime-postbuild, then ran node dist/entry.js infer model run --local --model local/dense --prompt hi --json against the fake backend.
Evidence after fix:

OpenClaw 2026.6.2 (aef1fad)
curl_elapsed=0.05 maxrss_kb=11264
openclaw_local_elapsed=3.51 maxrss_kb=658044
4830164 /tmp/oc86599runtime/curl.sse
  30204 /tmp/oc86599runtime/local.json
served_chunks=30000 elapsed_ms=56
served_chunks=30000 elapsed_ms=52

Observed result after fix: the fake backend served the dense stream in about 52-56ms, raw curl completed in 0.05s, and the built OpenClaw local model run completed in 3.51s without carrying 30k full partial snapshots.
What was not tested: Windows desktop with a real Ollama/llama.cpp model.
Proof limitations or environment constraints: Gateway infer model run --gateway was attempted in isolated Testbox state, but the client was blocked by Gateway device-pairing scope approval. Full pnpm build reached tsdown and runtime-postbuild, then the sparse GWT/Testbox omitted ui/config/control-ui-chunking.ts, so the UI build was not counted as passing proof.
Before evidence: prior current-main Crabbox repro for the same fake 30k-chunk backend measured raw curl at 0.03s, local OpenClaw at 4.11s, and gateway at 10.31s; the root issue was the repeated full partial payload work rather than backend generation time.

Tests and validation

Exact-head changed gate: node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox --timing-json -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed on tbx_01ktn8y7eph04qd9x71f4ez3we / violet-crab: passed in 9m21s.
Review-fix targeted proof: crabbox run --provider blacksmith-testbox --timing-json --shell -- 'git diff --check && node scripts/run-vitest.mjs run packages/agent-core/src/agent-loop.test.ts src/agents/openai-transport-stream.test.ts extensions/ollama/src/stream-runtime.test.ts' on tbx_01ktn8tztvfyjc9azkj8ayhpf3 / silver-shrimp: passed 347 tests in 54.3s.
Earlier exact-head targeted run before the review fix: same 3 Vitest shards passed on tbx_01ktn5hrk4akdp5558h8f5k8ws / swift-hermit.
ClawSweeper/autoreview on 9569e072e535d677a0dfa2c747a0b12a4ad8c667: ready for maintainer look, proof sufficient, no concrete contributor-facing blocker left.
GitHub checks on the exact PR head: pass or skip only; no pending/failing checks at merge review time.

Regression coverage added:

agent loop rebuilds message_update.message for immutable, partial-less text_delta snapshots.
OpenAI-compatible and Responses text deltas omit accumulated partial snapshots.
Ollama text deltas omit accumulated partial snapshots while text_end and done keep final content.

Risk checklist

Did user-visible behavior change? Yes

Did config, environment, or migration behavior change? No

Did security, auth, secrets, network, or tool execution behavior change? No

What is the highest-risk area?

Stream consumers that previously assumed every text delta carried partial.

How is that risk mitigated?

The type now marks only plain text-delta partial as optional and documents the replay contract.
Agent-loop reconstruction keeps message updates populated from lightweight deltas.
Tool calls, thinking blocks, text start/end, and final done events still carry snapshots.
Existing and added provider/agent tests cover the touched stream surfaces.

vincentkoc · 2026-06-09T03:08:43Z

@clawsweeper review

clawsweeper · 2026-06-09T03:08:45Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/27181344064
Updated: 2026-06-09T03:23:47.032Z

clawsweeper · 2026-06-09T03:10:34Z

Codex review: needs maintainer review before merge. Reviewed June 9, 2026, 12:13 AM ET / 04:13 UTC.

Summary
The PR makes plain text stream deltas omit accumulated assistant-message snapshots for OpenAI-compatible, Responses, and Ollama streams, then updates agent-loop consumption and regression tests.

PR surface: Source +40, Tests +149. Total +189 across 7 files.

Reproducibility: yes. from source and PR proof: dense streams currently carry accumulated snapshots on every plain text delta, and the PR body reports a 30,000-chunk Testbox reproduction. I did not rerun it in this read-only review.

Review metrics: 1 noteworthy metric.

Stream contract change: 1 exported event field changed from required to optional. AssistantMessageEvent.text_delta.partial is part of the stream/event contract consumed by agent and plugin code.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🦞 diamond lobster
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

none.

Risk before merge

[P1] Existing plugin or internal stream consumers that read assistantMessageEvent.partial on every text_delta will now see it missing for the affected providers unless they replay deltas or use the reconstructed message.
[P1] The PR body reports strong Testbox proof, but maintainers should still confirm the exact current head has green required checks before merge.

Maintainer options:

Accept the optional text-delta contract (recommended)
If maintainers agree text deltas may be lightweight, this PR’s producer/agent-loop/test shape is the right owner boundary for the performance fix.
Keep snapshots in the public contract
If external stream consumers must keep seeing partial on every text delta, replace this with an internal lightweight event or a separate dense-stream mode.
Pause until exact-head checks are green
If the current waiting-on-author state reflects unresolved checks, hold merge until the exact head has green targeted and changed validation.

Next step before merge

[P2] The remaining action is maintainer review of the exported stream-contract compatibility risk plus exact-head checks, not an automated repair.

Security
Cleared: The diff changes TypeScript stream handling and tests only; no dependency, workflow, secret, permission, or code-download surface is introduced.

Review details

Best possible solution:

Land the lightweight-delta path after maintainers accept the exported stream-contract change and the exact PR head passes the targeted and changed checks.

Do we have a high-confidence way to reproduce the issue?

Yes from source and PR proof: dense streams currently carry accumulated snapshots on every plain text delta, and the PR body reports a 30,000-chunk Testbox reproduction. I did not rerun it in this read-only review.

Is this the best way to solve the issue?

Yes, pending maintainer acceptance of the contract change: dropping repeated full snapshots at the provider stream boundary and reconstructing in the agent loop is narrower than adding config, throttling, or per-provider fallback branches.

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against f05e9873c6d2.

Label changes

Label changes:

add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
add status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body provides after-fix Testbox terminal output for a built CLI run against a fake dense OpenAI-compatible backend, plus before/current-main timing context.
remove rating: 🦪 silver shellfish: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
remove status: ⏳ waiting on author: Current PR status label is status: 👀 ready for maintainer look.

Label justifications:

P1: The PR targets a P1 local-model responsiveness bug and changes a core stream contract used by agent/session consumers.
merge-risk: 🚨 compatibility: The exported AssistantMessageEvent.text_delta.partial shape changes from required to optional, which can affect existing stream consumers.
merge-risk: 🚨 session-state: Partial-less text deltas require consumers to reconstruct assistant message state correctly during active sessions.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🦞 diamond lobster and patch quality is 🐚 platinum hermit.
status: 👀 ready for maintainer look: ClawSweeper has no concrete contributor-facing blocker left for this PR. Sufficient (terminal): The PR body provides after-fix Testbox terminal output for a built CLI run against a fake dense OpenAI-compatible backend, plus before/current-main timing context.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body provides after-fix Testbox terminal output for a built CLI run against a fake dense OpenAI-compatible backend, plus before/current-main timing context.

Evidence reviewed

PR surface:

Source +40, Tests +149. Total +189 across 7 files.

View PR surface stats

Area	Files	Added	Removed	Net
Source	4	51	11	+40
Tests	3	157	8	+149
Docs	0	0	0	0
Config	0	0	0	0
Generated	0	0	0	0
Other	0	0	0	0
Total	7	208	19	+189

What I checked:

Repository policy applied: Root plus scoped agent and extension AGENTS.md files were read fully; the review applied the OpenClaw guidance that agent/plugin stream contracts and bundled-provider behavior are compatibility-sensitive. (AGENTS.md:27, f05e9873c6d2)
Public SDK contract change: openclaw/plugin-sdk/llm exports AssistantMessageEvent and AssistantMessageEventStreamContract, so making text_delta.partial optional is plugin-facing, not only an internal implementation detail. (src/plugin-sdk/llm.ts:20, f05e9873c6d2)
Optional text-delta partial contract: The PR head documents and types plain text_delta as allowing partial?: AssistantMessage, while other start/end/thinking/tool events keep required snapshots. (packages/llm-core/src/types.ts:369, 9569e072e535)
Agent-loop reconstruction: The second PR commit replaces the previous event.partial ?? partialMessage fallback with delta replay, so message_update events advance from partial-less text deltas instead of staying stale. (packages/agent-core/src/agent-loop.ts:69, 9569e072e535)
Provider producers trimmed dense snapshots: Responses and completions still maintain the internal output object but omit partial only from plain text_delta pushes. (src/agents/openai-transport-stream.ts:1577, 9569e072e535)
Ollama producer trimmed dense snapshots: The Ollama stream keeps start/end snapshots and omits partial from each visible text_delta after computing the incremental delta. (extensions/ollama/src/stream.ts:1325, 9569e072e535)

Likely related people:

vincentkoc: Current blame and -S history show this person on the LLM stream event contract, agent-loop update path, OpenAI transport seam, and the current PR commits. (role: recent area contributor; confidence: high; commits: 1019b591d55f, 4798e125f40c, 9569e072e535; files: packages/llm-core/src/types.ts, packages/agent-core/src/agent-loop.ts, src/agents/openai-transport-stream.ts)
Eva H: History search ties this person to Ollama thinking/streaming support that shares the touched Ollama stream runtime. (role: adjacent owner; confidence: medium; commits: d7bf97adb330; files: extensions/ollama/src/stream.ts)
Jakub Rusz: History search ties this person to the Ollama streaming text event implementation changed by this PR. (role: adjacent owner; confidence: medium; commits: 8f44bd642660; files: extensions/ollama/src/stream.ts)
steipete: History shows provider runtime movement and repeated touches across the central agent/provider stream files, so this is a useful routing candidate for owner-boundary review. (role: adjacent owner; confidence: medium; commits: 64bf80d4d50c; files: extensions/ollama/src/stream.ts, src/agents/openai-transport-stream.ts)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

vincentkoc · 2026-06-09T04:16:06Z

Land-ready verification for 9569e072e535d677a0dfa2c747a0b12a4ad8c667:

ClawSweeper/autoreview: ready for maintainer look, proof sufficient, no concrete contributor-facing blocker left.
Crabbox/Testbox changed gate: node scripts/crabbox-wrapper.mjs run --provider blacksmith-testbox --timing-json -- env OPENCLAW_CHECK_CHANGED_REMOTE_CHILD=1 OPENCLAW_CHANGED_LANES_RAW_SYNC=1 corepack pnpm check:changed on tbx_01ktn8y7eph04qd9x71f4ez3we / violet-crab, passed in 9m21s.
Targeted review-fix proof: git diff --check && node scripts/run-vitest.mjs run packages/agent-core/src/agent-loop.test.ts src/agents/openai-transport-stream.test.ts extensions/ollama/src/stream-runtime.test.ts on tbx_01ktn8tztvfyjc9azkj8ayhpf3 / silver-shrimp, passed 347 tests in 54.3s.
GitHub checks: pass or skip only at merge review time; no pending or failing checks.

Maintainer decision recorded in the PR body: accept optional partial on plain text_delta events, with consumers replaying deltas or using reconstructed message_update.message when they need the live assistant snapshot.

osolmaz · 2026-06-09T08:10:15Z

Adding attribution/context now that this landed and closed #86599:

This is directly related to the earlier root-cause analysis in #86599 (comment) and the broader prior implementation attempt in #87558.

The merged shape here is smaller and cleaner than #87558: it changes the stream contract so plain text_delta events may omit the full accumulated partial, and reconstructs live assistant state in the agent loop instead of adding the original createAssistantStreamAccumulator helper. But it addresses the same core issue diagnosed there: dense local-provider streams were repeatedly carrying/reprocessing the full answer-so-far on every text delta.

So I would treat #87558 and the issue analysis as relevant precursor work, while #91580 is the production-ready implementation shape that actually landed.

fix(agents): trim dense text delta snapshots

4ff4989

vincentkoc self-assigned this Jun 9, 2026

openclaw-barnacle Bot added agents Agent runtime and tooling extensions: ollama size: S maintainer Maintainer-authored PR labels Jun 9, 2026

fix(agents): advance lightweight text deltas

9569e07

openclaw-barnacle Bot added size: M and removed size: S labels Jun 9, 2026

vincentkoc marked this pull request as ready for review June 9, 2026 04:16

vincentkoc merged commit 6fcc945 into main Jun 9, 2026
231 of 239 checks passed

vincentkoc deleted the fix-86599-dense-stream-overhead branch June 9, 2026 04:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(agents): trim dense text delta snapshots#91580

fix(agents): trim dense text delta snapshots#91580
vincentkoc merged 2 commits into
mainfrom
fix-86599-dense-stream-overhead

vincentkoc commented Jun 9, 2026 •

edited

Loading

Uh oh!

vincentkoc commented Jun 9, 2026

Uh oh!

clawsweeper Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented Jun 9, 2026 •

edited

Loading

Uh oh!

vincentkoc commented Jun 9, 2026

Uh oh!

Uh oh!

osolmaz commented Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

vincentkoc commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Linked context

Maintainer contract decision

Real behavior proof (required for external PRs)

Tests and validation

Risk checklist

Uh oh!

vincentkoc commented Jun 9, 2026

Uh oh!

clawsweeper Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clawsweeper Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vincentkoc commented Jun 9, 2026

Uh oh!

Uh oh!

osolmaz commented Jun 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

vincentkoc commented Jun 9, 2026 •

edited

Loading

clawsweeper Bot commented Jun 9, 2026 •

edited

Loading

clawsweeper Bot commented Jun 9, 2026 •

edited

Loading