fix(agents): preserve reasoning_content replay for Gemma 4 openai-completions models #91696

Merged
vincentkoc merged 2 commits into openclaw:main from Coder-Wangyankun:fix/gemma4-reasoning-content-tool-replay on Jun 10, 2026

@Coder-Wangyankun Coder-Wangyankun commented Jun 9, 2026

Contributor

Summary

Fixes #91645 — Gemma 4 models via openai-completions (vLLM, OpenRouter, etc.) were silently losing in-turn reasoning_content during multi-turn tool replay, causing degraded tool-call quality (arguments collapse to {}, repeated identical tool calls).

Root Cause

Two places explicitly excluded Gemma 4 from reasoning replay:

  1. shouldTrustReasoningContentReplayMetadata() in openai-transport-stream.ts — skipped Gemma 4 by model ID, causing sanitizeCompletionsReasoningReplayFields to strip reasoning_content from the outgoing request.

  2. buildOpenAICompatibleReplayPolicy() in provider-replay-helpers.ts — forced dropReasoningFromHistory=true for Gemma 4, discarding reasoning blocks during history sanitization before they even reached the transport layer.
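
A minimal sketch of how the two gates interact, under assumed shapes. The function names echo the ones named above, but the signatures, the types, the `provider === "openai"` check and the `gemma-?4` ID pattern are hypothetical simplifications, not the repository's code. It only illustrates why either exclusion on its own was enough to lose reasoning_content.

```typescript
// Hypothetical, simplified types; the real OpenClaw types are not shown in this PR.
type ModelRefSketch = { provider: string; id: string; reasoning: boolean };
type AssistantReplayMessage = {
  role: "assistant";
  content: string;
  reasoning_content?: string;
};

// Assumed stand-in for the model-ID check the PR calls isGemma4ModelId.
function isGemma4ModelIdSketch(id: string): boolean {
  return /gemma-?4/i.test(id);
}

// Gate 1 (transport layer): may reasoning_content be replayed to this model?
// legacy = true models the behavior before this PR, false the patched behavior.
function shouldTrustReplaySketch(model: ModelRefSketch, legacy: boolean): boolean {
  if (model.provider === "openai") return false; // stock OpenAI still strips
  if (!model.reasoning) return false; // non-reasoning models are unaffected
  if (legacy && isGemma4ModelIdSketch(model.id)) return false; // removed exclusion
  return true;
}

// Gate 2 (history sanitization): is reasoning dropped before the transport sees it?
function dropsReasoningFromHistorySketch(model: ModelRefSketch, legacy: boolean): boolean {
  return legacy && isGemma4ModelIdSketch(model.id); // removed forced override
}

// Reasoning survives only if neither gate rejects it.
function replayAssistantSketch(
  message: AssistantReplayMessage,
  model: ModelRefSketch,
  legacy: boolean,
): AssistantReplayMessage {
  const keepReasoning =
    !dropsReasoningFromHistorySketch(model, legacy) && shouldTrustReplaySketch(model, legacy);
  if (keepReasoning) return message;
  return { role: message.role, content: message.content };
}
```

With `legacy` set to `true`, a Gemma 4 model on a non-OpenAI provider comes back without reasoning_content. With `false` the field survives, while the stock OpenAI and non-reasoning branches behave the same in both modes.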

Changes

| File | Change |
|---|---|
| src/agents/openai-transport-stream.ts | Remove isGemma4ModelId exclusion from shouldTrustReasoningContentReplayMetadata |
| src/plugins/provider-replay-helpers.ts | Remove forced dropReasoningFromHistory override for Gemma 4 |
| src/agents/openai-transport-stream.test.ts | Add test fixture and case for Gemma 4 reasoning_content preservation |
| src/plugins/provider-replay-helpers.test.ts | Update assertion: Gemma 4 with explicit dropReasoningFromHistory: false is now respected |

Safety

Existing safeguards remain intact:

  • Stock OpenAI providers still strip reasoning fields
  • Non-reasoning models are unaffected
  • OpenRouter Anthropic/xAI model IDs are still correctly gated

Linked context

Which issue does this close?

Closes #91645

Which issues, PRs, or discussions are related?

Related: #88071 (prior reasoning replay metadata work that introduced the Gemma 4 exclusion)

Was this requested by a maintainer or owner?

No.

Real behavior proof (required for external PRs)

  • Behavior or issue addressed: Gemma 4 openai-completions reasoning models (vLLM, OpenRouter) lost in-turn reasoning_content during multi-turn tool replay. The fix removes two explicit Gemma 4 exclusions so that reasoning is preserved through the existing provider/metadata gate, matching the contract used by DeepSeek, Xiaomi, Kimi, and custom reasoning providers already on the allowlist.
  • Real environment tested: local OpenClaw worktree on Windows, branch fix/gemma4-reasoning-content-tool-replay. Unit test validation via Vitest covers the serialization path exercised by vLLM and OpenRouter Gemma 4 configurations.
  • Exact steps or command run after this patch:
# Unit-level regression coverage for the transport serialization path
node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts   src/agents/openai-transport-stream.test.ts   src/plugins/provider-replay-helpers.test.ts
  • Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output):
# openai-transport-stream.test.ts
✓ preserves reasoning_content replay for Gemma 4 openai-completions models
  - assistant.reasoning_content === "Need to answer politely."
  - reasoning_details, reasoning, reasoning_text absent

# provider-replay-helpers.test.ts
✓ drops historical reasoning for OpenAI-compatible chat completions replay
  - Gemma 4 with dropReasoningFromHistory: false → not.toHaveProperty("dropReasoningFromHistory")
  - Gemma 3 with dropReasoningFromHistory: false → not.toHaveProperty("dropReasoningFromHistory")
  (Gemma 3 and Gemma 4 now have consistent behavior; Gemma 4 is no longer force-overridden)
  • Observed result after fix: the new Gemma 4 test passes alongside all existing reasoning replay tests. The two Gemma 4 exclusions are removed from the codebase and the transport serializer now treats Gemma 4 consistently with other non-OpenAI reasoning models.
  • What was not tested: a live vLLM + Gemma-4-12B multi-turn agent run with wire-level request capture to confirm reasoning_content is present in replayed assistant messages. This requires GPU hardware (RTX 3090 or cloud GPU) and was not completed as part of this PR.
  • Proof limitations or environment constraints: this PR is a targeted code fix validated through code-path analysis and unit test coverage. The root cause is unambiguous (two model-id gating points that explicitly excluded Gemma 4 from reasoning replay). Full live validation with vLLM + Gemma 4 requires a GPU-equipped environment.
  • Before evidence (optional but encouraged): the issue reporter ([Bug]: In-turn reasoning dropped on multi-turn tool replay for non-400 openai models (gemma4/vLLM) — silent agentic-quality regression #91645) captured TCP-tee evidence between OpenClaw and vLLM showing reasoning_content present on 0/3 in-turn assistant tool-call messages, with 13 exec calls exhibiting empty arguments and repeated identical tool calls within sessions.
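
The "0/3" figure above is the kind of number a small checker over a captured request body can produce. The following is a hedged sketch of such a checker, not tooling from this PR or the issue: the function names are hypothetical, and the message shape is a reduced form of an OpenAI-compatible chat-completions request, assuming reasoning_content sits directly on assistant messages.

```typescript
// Reduced shape of an OpenAI-compatible chat-completions request message.
type CapturedToolCall = { id: string; function: { name: string; arguments: string } };
type CapturedMessage = {
  role: "system" | "user" | "assistant" | "tool";
  content?: string | null;
  tool_calls?: CapturedToolCall[];
  reasoning_content?: string;
};

type ReplayCoverage = { withReasoning: number; toolCallMessages: number };

// Counts assistant tool-call messages and how many carry non-empty reasoning_content.
function reasoningReplayCoverage(messages: CapturedMessage[]): ReplayCoverage {
  const toolCallMessages = messages.filter(
    (m) => m.role === "assistant" && (m.tool_calls?.length ?? 0) > 0,
  );
  const withReasoning = toolCallMessages.filter(
    (m) => typeof m.reasoning_content === "string" && m.reasoning_content.trim() !== "",
  ).length;
  return { withReasoning, toolCallMessages: toolCallMessages.length };
}

// Parses a captured request body (JSON text) and reports coverage as "n/m".
function summarizeCapturedRequest(bodyJson: string): string {
  const parsed = JSON.parse(bodyJson) as { messages?: CapturedMessage[] };
  const coverage = reasoningReplayCoverage(parsed.messages ?? []);
  return `${coverage.withReasoning}/${coverage.toolCallMessages}`;
}
```

Run against a redacted request body captured between the client and the backend, a result such as "0/3" would match the reported regression, and a result where both numbers agree would be the after-fix evidence the proof section asks for.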

Tests and validation

  • node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts src/agents/openai-transport-stream.test.ts
  • node scripts/run-vitest.mjs run --config test/vitest/vitest.unit.config.ts src/plugins/provider-replay-helpers.test.ts

The test suite covers:

| Scenario | Test |
|---|---|
| Gemma 4 reasoning_content preserved | preserves reasoning_content replay for Gemma 4 openai-completions models |
| Gemma 4 dropReasoningFromHistory respected | drops historical reasoning for OpenAI-compatible chat completions replay |
| Stock OpenAI still strips reasoning | strips %s from stock OpenAI Chat Completions assistant replay |
| Custom reasoning models preserved | preserves reasoning_content replay for custom reasoning model metadata |
| DeepSeek/Xiaomi/Z.AI preserved | preserves native %s reasoning_content replay |
| Tier suffixes handled | preserves reasoning_content replay despite the %s tier suffix |

Risk checklist

Did user-visible behavior change? (Yes/No)

No — reasoning_content was already being captured during streaming; the fix only stops stripping it during replay. Models that never produced reasoning_content are unaffected.

Did config, environment, or migration behavior change? (Yes/No)

No.

Did security, auth, secrets, network, or tool execution behavior change? (Yes/No)

No.

What is the highest-risk area?

A custom provider that sends reasoning_content and does NOT expect it back in assistant messages could receive unexpected fields. This risk is mitigated by the existing provider gating (stock OpenAI still strips, non-reasoning models skip).

How is that risk mitigated?

The fix uses the existing shouldTrustReasoningContentReplayMetadata gate which already correctly handles stock OpenAI, OpenRouter Anthropic/xAI, and non-reasoning models. Gemma 4 was the only model family explicitly excluded despite matching the same criteria as other preserved models.

…pletions models (openclaw#91645)

Gemma 4 models via openai-completions (vLLM, OpenRouter, etc.) were
silently losing in-turn reasoning_content during multi-turn tool replay.
Unlike DeepSeek/Xiaomi which 400 on the missing field, these providers
accept the request but produce degraded tool-call quality — arguments
collapse to {} and identical tool calls repeat.

Root cause: two places explicitly excluded Gemma 4 from reasoning replay:

1. shouldTrustReasoningContentReplayMetadata() in openai-transport-stream
   skipped Gemma 4 by model ID, causing sanitizeCompletionsReasoningReplayFields
   to strip reasoning_content before sending.

2. buildOpenAICompatibleReplayPolicy() in provider-replay-helpers forced
   dropReasoningFromHistory=true for Gemma 4, discarding reasoning blocks
   during history sanitization before they reached the transport layer.

Remove both exclusions. Gemma 4 on non-OpenAI providers now preserves
in-turn reasoning through the existing provider/metadata checks that
already gate this correctly for other reasoning models.
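
Because these providers accept the request instead of returning a 400, the regression only shows up as symptoms in the tool calls themselves. A minimal sketch of a detector for the two symptoms named above (arguments collapsing to {} and identical calls repeating) might look like this; the types and function names are hypothetical and are not part of this PR.

```typescript
// Hypothetical record of a tool call the agent issued during a session.
type IssuedToolCall = { name: string; arguments: string };

type DegradationReport = { emptyArguments: number; repeatedIdentical: number };

// Treats "", "{}" and whitespace-padded variants as empty arguments.
function hasEmptyArguments(call: IssuedToolCall): boolean {
  const compact = call.arguments.replace(/\s+/g, "");
  return compact === "" || compact === "{}";
}

// Counts empty-argument calls and calls that exactly repeat an earlier one.
function detectToolCallDegradation(calls: IssuedToolCall[]): DegradationReport {
  const seen = new Set<string>();
  let emptyArguments = 0;
  let repeatedIdentical = 0;
  for (const call of calls) {
    if (hasEmptyArguments(call)) emptyArguments += 1;
    // Same tool name and same raw argument string counts as an identical call.
    const key = JSON.stringify([call.name, call.arguments]);
    if (seen.has(key)) repeatedIdentical += 1;
    else seen.add(key);
  }
  return { emptyArguments, repeatedIdentical };
}
```

A nonzero count is only a hint: some tools legitimately take no arguments, and some agents repeat a call on purpose, so the numbers are best compared between a run with reasoning replay and a run without it.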
@openclaw-barnacle openclaw-barnacle Bot added agents Agent runtime and tooling size: XS triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels Jun 9, 2026
@clawsweeper

clawsweeper Bot commented Jun 9, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 9, 2026, 10:08 AM ET / 14:08 UTC.

Summary
The PR removes Gemma 4-specific reasoning_content replay exclusions from the OpenAI transport and shared OpenAI-compatible provider replay helper, with focused unit test updates.

PR surface: Source -3, Tests +24. Total +21 across 4 files.

Reproducibility: yes, at source level. Current main excludes Gemma 4 from transport reasoning replay and forces Gemma 4 provider policies to drop reasoning. I did not run a live vLLM/Gemma replay, and the PR body says that live proof was not completed.

Review metrics: 1 noteworthy metric.

  • Gemma 4 replay policy gates: 2 removed. The branch removes one transport gate and one shared provider-helper gate, which changes both request serialization and provider replay policy behavior.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🧂 unranked krab
Result: blocked until real behavior proof from a real setup is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • Reconcile the Gemma 4 policy for SGLang and canonical OpenAI-compatible replay, then update the affected tests.
  • [P1] Add redacted after-fix live provider/proxy proof showing Gemma 4 replay carries non-empty reasoning_content.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body supplies unit-test output but explicitly says no live vLLM plus Gemma 4 replay capture was completed, so external-PR real behavior proof is still needed before merge. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • [P1] Merging as-is changes Gemma 4 replay behavior for SGLang and the canonical OpenAI-compatible replay helper even though current tests explicitly expect those paths to drop reasoning.
  • [P1] The external PR proof is mock/unit-level only; there is no redacted live vLLM/OpenRouter/SGLang request-body capture showing after-fix non-empty in-turn reasoning_content on replay.

Maintainer options:

  1. Reconcile Gemma 4 Provider Policy (recommended)
    Decide whether SGLang and the canonical OpenAI-compatible helper should preserve Gemma 4 reasoning, then update the helper and all affected provider/SDK tests before merge.
  2. Narrow The Provider Scope
    Keep the shared provider helper's Gemma 4 drop behavior and limit the fix to the reported transport/provider paths if maintainers do not want a broad SGLang policy change.
  3. Accept The Broader Change With Proof
    Maintainers can intentionally accept Gemma 4 reasoning replay for all affected OpenAI-compatible providers, but should require live request-body proof and updated compatibility notes first.

Next step before merge

  • [P1] Human review is needed to decide the Gemma 4 provider replay policy and require live after-fix proof; this is not a safe automated repair lane yet.

Security
Cleared: The diff only changes replay conditionals and unit tests; it does not alter dependencies, CI, package scripts, auth, secrets, permissions, or other supply-chain surfaces.

Review findings

  • [P1] Reconcile Gemma 4 replay policy before removing the override — src/plugins/provider-replay-helpers.ts:55-56
Review details

Best possible solution:

Make a maintainer-visible Gemma 4 provider replay policy decision, update every affected helper/provider test to match it, and require redacted live replay proof before merge.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main excludes Gemma 4 from transport reasoning replay and forces Gemma 4 provider policies to drop reasoning. I did not run a live vLLM/Gemma replay, and the PR body says that live proof was not completed.

Is this the best way to solve the issue?

No, not as submitted: removing both exclusions may be a plausible fix, but it is not yet the best mergeable path because it contradicts SGLang/SDK replay-policy tests and lacks live provider proof. The safer path is to reconcile the provider policy first, then prove the intended behavior on a real OpenAI-compatible Gemma 4 backend.

Full review comments:

  • [P1] Reconcile Gemma 4 replay policy before removing the override — src/plugins/provider-replay-helpers.ts:55-56
    This removes the Gemma 4 override from the shared OpenAI-compatible helper, but current main still has SGLang and SDK tests asserting that Gemma 4 chat-completions replay drops reasoning (extensions/sglang/index.test.ts and src/plugin-sdk/provider-model-shared.test.ts). After this branch, those provider policies no longer return dropReasoningFromHistory, so the PR changes existing provider behavior without updating the adjacent contract/tests or recording the policy decision.
    Confidence: 0.92
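
The policy change this finding describes can be pictured with a small sketch. Everything here is an assumption made for illustration: the input and output shapes, the helper name, and the `gemma-?4` pattern are hypothetical. Only the `dropReasoningFromHistory: false` case is grounded in the test output quoted in the PR body, and the handling of an explicit `true` is a guess.

```typescript
// Hypothetical, reduced shapes for the shared replay-policy helper.
type ReplayPolicySketch = { dropReasoningFromHistory?: boolean };
type ReplayPolicyInput = { modelId: string; dropReasoningFromHistory?: boolean };

// Assumed stand-in for the Gemma 4 model-ID check.
function looksLikeGemma4(modelId: string): boolean {
  return /gemma-?4/i.test(modelId);
}

// forceGemma4Drop = true models current main as the review describes it;
// false models this branch, where the override is removed.
function buildReplayPolicySketch(
  input: ReplayPolicyInput,
  forceGemma4Drop: boolean,
): ReplayPolicySketch {
  const policy: ReplayPolicySketch = {};
  if (forceGemma4Drop && looksLikeGemma4(input.modelId)) {
    // The override wins even when the caller passed false.
    policy.dropReasoningFromHistory = true;
  } else if (input.dropReasoningFromHistory === true) {
    // Assumption: an explicit true is passed through; false leaves the key absent.
    policy.dropReasoningFromHistory = true;
  }
  return policy;
}
```

Under this sketch, a provider test that expects a Gemma 4 policy to contain `dropReasoningFromHistory: true` passes with the override and fails without it, which is the mismatch the finding attributes to the SGLang and SDK tests.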

Overall correctness: patch is incorrect
Overall confidence: 0.86

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 98d5c465308a.

Label changes

Label justifications:

  • P2: The PR targets a bounded but real agent/provider replay quality bug without evidence of a core outage or security emergency.
  • merge-risk: 🚨 compatibility: The diff changes shipped Gemma 4 replay behavior for shared OpenAI-compatible provider policy paths, including SGLang expectations on current main.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🧂 unranked krab.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body supplies unit-test output but explicitly says no live vLLM plus Gemma 4 replay capture was completed, so external-PR real behavior proof is still needed before merge. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.
Evidence reviewed

PR surface:

Source -3, Tests +24. Total +21 across 4 files.

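| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 2 | 2 | 5 | -3 |
| Tests | 2 | 25 | 1 | +24 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 4 | 27 | 6 | +21 |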

What I checked:

Likely related people:

  • Josh Avant: Blame and history show the current Gemma 4 transport/provider replay exclusions and the matching SGLang/SDK tests came from commit 9fdd56d. (role: recent area contributor; confidence: high; commits: 9fdd56da2106; files: src/agents/openai-transport-stream.ts, src/plugins/provider-replay-helpers.ts, extensions/sglang/index.test.ts)
  • chengzhichao-xydt: A related closed unmerged PR proposed dropReasoningFromHistory configurability for OpenAI-compatible providers, so they have adjacent context but not current-main ownership. (role: related prior proposer; confidence: low; files: src/agents/transcript-policy.ts, src/plugins/provider-replay-helpers.ts)
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. labels Jun 9, 2026
@openclaw-barnacle openclaw-barnacle Bot added proof: supplied External PR includes structured after-fix real behavior proof. and removed triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. labels Jun 9, 2026
@vincentkoc vincentkoc self-assigned this Jun 10, 2026
@vincentkoc

Member

Maintainer verification complete.

  • Rebased onto current origin/main.
  • Focused proof: node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts src/plugins/provider-replay-helpers.test.ts — 261 tests passed.
  • Fresh autoreview: clean, no accepted/actionable findings.
  • Best-fix judgment: yes. This keeps the change at the provider replay-policy boundary and retains the stock OpenAI / unsupported-family safeguards.

@vincentkoc vincentkoc merged commit 78a5e3e into openclaw:main Jun 10, 2026
158 checks passed
github-actions Bot pushed a commit to Desicool/openclaw that referenced this pull request Jun 11, 2026
…pletions models (openclaw#91645) (openclaw#91696)

Co-authored-by: wangyk <prowangyankun@foxmail.com>
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request Jun 12, 2026
…26.6.6) (#1040)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.6.5` → `2026.6.6` |

---

### Release Notes

<details>
<summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary>

### [`v2026.6.6`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#202666)

[Compare Source](openclaw/openclaw@v2026.6.5...v2026.6.6)

##### Highlights

- Security boundaries are substantially tighter across transcripts, sandbox binds, host environment inheritance, MCP stdio, Codex HTTP access, native search policy, elevated sender checks, deleted-agent ACP bypasses, loopback tools, Discord moderation, and Teams group actions; exec approvals now fail closed on timeout. ([#&#8203;91529](openclaw/openclaw#91529), [#&#8203;91618](openclaw/openclaw#91618), [#&#8203;91615](openclaw/openclaw#91615), [#&#8203;91619](openclaw/openclaw#91619), [#&#8203;91741](openclaw/openclaw#91741), [#&#8203;91745](openclaw/openclaw#91745), [#&#8203;91746](openclaw/openclaw#91746), [#&#8203;91748](openclaw/openclaw#91748), [#&#8203;91749](openclaw/openclaw#91749), [#&#8203;91750](openclaw/openclaw#91750), [#&#8203;91751](openclaw/openclaw#91751), [#&#8203;91752](openclaw/openclaw#91752), [#&#8203;91763](openclaw/openclaw#91763), [#&#8203;89938](openclaw/openclaw#89938)) Thanks [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;pgondhi987](https://github.com/pgondhi987), [@&#8203;mmaps](https://github.com/mmaps), [@&#8203;eleqtrizit](https://github.com/eleqtrizit), [@&#8203;shakkernerd](https://github.com/shakkernerd), and [@&#8203;drobison00](https://github.com/drobison00).
- Telegram delivery is safer and more coherent: account-scoped topics route to the right agent, streamed text survives tool calls, `/compact` works on generic ingress, callback handling uses concrete APIs, draft chunking is shared, durable dispatch dedupe moved into the SDK, and unauthorized DM text stays out of cache and prompt context. ([#&#8203;91189](openclaw/openclaw#91189), [#&#8203;88682](openclaw/openclaw#88682), [#&#8203;89588](openclaw/openclaw#89588), [#&#8203;90212](openclaw/openclaw#90212), [#&#8203;91876](openclaw/openclaw#91876), [#&#8203;91874](openclaw/openclaw#91874), [#&#8203;91904](openclaw/openclaw#91904), [#&#8203;91478](openclaw/openclaw#91478), [#&#8203;91915](openclaw/openclaw#91915)) Thanks [@&#8203;codysai001](https://github.com/codysai001), [@&#8203;alexzhu0](https://github.com/alexzhu0), [@&#8203;joelnishanth](https://github.com/joelnishanth), [@&#8203;snowzlm](https://github.com/snowzlm), [@&#8203;obviyus](https://github.com/obviyus), and [@&#8203;sallyom](https://github.com/sallyom).
- iMessage recovery and delivery now cover always-on inbound restart, durable echo markers, block streaming, idle approval discovery, hardened outbound transport, and actionable inbound startup diagnostics. ([#&#8203;91335](openclaw/openclaw#91335), [#&#8203;91449](openclaw/openclaw#91449), [#&#8203;88969](openclaw/openclaw#88969), [#&#8203;88530](openclaw/openclaw#88530), [#&#8203;91783](openclaw/openclaw#91783), [#&#8203;91785](openclaw/openclaw#91785)) Thanks [@&#8203;omarshahine](https://github.com/omarshahine), [@&#8203;jmissig](https://github.com/jmissig), and [@&#8203;colmbrogan](https://github.com/colmbrogan).
- Browser and MCP connectivity gained existing-session CDP support, discovered WebSocket validation, default-profile `cdpUrl` handling, safer browser-output boundaries, Streamable HTTP loopback transport, corrected OAuth/SSE authorization handling, and broader schema compatibility. ([#&#8203;91422](openclaw/openclaw#91422), [#&#8203;89851](openclaw/openclaw#89851), [#&#8203;91736](openclaw/openclaw#91736), [#&#8203;91747](openclaw/openclaw#91747), [#&#8203;91451](openclaw/openclaw#91451), [#&#8203;80143](openclaw/openclaw#80143)) Thanks [@&#8203;pgondhi987](https://github.com/pgondhi987), [@&#8203;anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia), [@&#8203;lifuyue](https://github.com/lifuyue), [@&#8203;eleqtrizit](https://github.com/eleqtrizit), [@&#8203;LiuwqGit](https://github.com/LiuwqGit), and [@&#8203;HemantSudarshan](https://github.com/HemantSudarshan).
- Control UI startup and first-reply latency are lower through cached model metadata, removal of the startup catalog wait, lazy slash-command loading, and first-event tracing with slow-reply diagnostics. ([#&#8203;91531](openclaw/openclaw#91531), [#&#8203;91538](openclaw/openclaw#91538), [#&#8203;91568](openclaw/openclaw#91568), [#&#8203;91583](openclaw/openclaw#91583), [#&#8203;91598](openclaw/openclaw#91598))
- Provider support expands with OpenRouter OAuth onboarding and Claude Fable 5 adaptive thinking, while Codex sessions keep correct compaction ownership, local models skip guardian review, dynamic tool progress normalizes cleanly, and Gemma 4 reasoning replay is preserved. ([#&#8203;91830](openclaw/openclaw#91830), [#&#8203;91882](openclaw/openclaw#91882), [#&#8203;91590](openclaw/openclaw#91590), [#&#8203;88630](openclaw/openclaw#88630), [#&#8203;88768](openclaw/openclaw#88768), [#&#8203;91696](openclaw/openclaw#91696)) Thanks [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;bdjben](https://github.com/bdjben), and [@&#8203;Coder-Wangyankun](https://github.com/Coder-Wangyankun).

##### Changes

- CLI progress: emit Claude CLI commentary progress events and bridge inter-tool commentary into channel progress without exposing internal protocol scaffolding. ([#&#8203;89834](openclaw/openclaw#89834), [#&#8203;90883](openclaw/openclaw#90883)) Thanks [@&#8203;anagnorisis2peripeteia](https://github.com/anagnorisis2peripeteia).
- Observability: allow trusted diagnostics channels to capture tool input/output content, add first-assistant-event traces, and warn on slow initial replies. ([#&#8203;91256](openclaw/openclaw#91256), [#&#8203;91568](openclaw/openclaw#91568), [#&#8203;91583](openclaw/openclaw#91583)) Thanks [@&#8203;amknight](https://github.com/amknight).
- Plugins/ClawHub: dogfood reusable package publishing, let dry runs skip publish approval, allow declared installed trusted hooks, report managed plugin version drift, and warn instead of failing on retired Skill Workshop configuration. ([#&#8203;91574](openclaw/openclaw#91574), [#&#8203;91591](openclaw/openclaw#91591), [#&#8203;90004](openclaw/openclaw#90004), [#&#8203;90927](openclaw/openclaw#90927), [#&#8203;90838](openclaw/openclaw#90838)) Thanks [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen), [@&#8203;brokemac79](https://github.com/brokemac79), and [@&#8203;lonexreb](https://github.com/lonexreb).
- Memory/providers: move the local llama.cpp runtime into its provider plugin, batch embeddings across files, persist the agent model catalog cache, and keep QMD JSON search one-shot while filtering stale REM recall previews. ([#&#8203;91324](openclaw/openclaw#91324), [#&#8203;89138](openclaw/openclaw#89138), [#&#8203;90457](openclaw/openclaw#90457), [#&#8203;91837](openclaw/openclaw#91837), [#&#8203;91851](openclaw/openclaw#91851)) Thanks [@&#8203;osolmaz](https://github.com/osolmaz), [@&#8203;mushuiyu886](https://github.com/mushuiyu886), [@&#8203;ai-hpc](https://github.com/ai-hpc), and [@&#8203;TurboTheTurtle](https://github.com/TurboTheTurtle).
- Channels/mobile: add the QQBot group mention toggle, improve iPad and iPhone control surfaces, and expose the active connection host in the TUI footer. ([#&#8203;91423](openclaw/openclaw#91423), [#&#8203;91557](openclaw/openclaw#91557), [#&#8203;89909](openclaw/openclaw#89909)) Thanks [@&#8203;cxyhhhhh](https://github.com/cxyhhhhh), [@&#8203;Solvely-Colin](https://github.com/Solvely-Colin), and [@&#8203;baskduf](https://github.com/baskduf).
- Performance: prewarm TUI runtime plugins, deduplicate plugin auto-enable fanout, trim dense text-delta snapshots, and reuse prepared startup model metadata. ([#&#8203;90782](openclaw/openclaw#90782), [#&#8203;89978](openclaw/openclaw#89978), [#&#8203;91580](openclaw/openclaw#91580), [#&#8203;91531](openclaw/openclaw#91531)) Thanks [@&#8203;RomneyDa](https://github.com/RomneyDa) and [@&#8203;ai-hpc](https://github.com/ai-hpc).

##### Fixes

- Agent/session recovery: drop stale approval follow-ups after session rebind, remove drained reply-queue items by identity, recover stale main and visible replies, preserve Codex context-engine compaction ownership, lower the default compaction timeout to 180 seconds while respecting explicit configuration, and keep provider-failure terminal lifecycle state correct. ([#&#8203;85679](openclaw/openclaw#85679), [#&#8203;91450](openclaw/openclaw#91450), [#&#8203;91566](openclaw/openclaw#91566), [#&#8203;91840](openclaw/openclaw#91840), [#&#8203;91590](openclaw/openclaw#91590), [#&#8203;91361](openclaw/openclaw#91361), [#&#8203;91895](openclaw/openclaw#91895)) Thanks [@&#8203;openperf](https://github.com/openperf), [@&#8203;yetval](https://github.com/yetval), [@&#8203;joshavant](https://github.com/joshavant), [@&#8203;wangmiao0668000666](https://github.com/wangmiao0668000666), and [@&#8203;TurboTheTurtle](https://github.com/TurboTheTurtle).
- User-visible content boundaries: suppress Codex/Harmony protocol artifacts, neutralize browser and LanceDB memory media directives, redact transcript images, and preserve native `/compact` replies through source suppression. ([#&#8203;89151](openclaw/openclaw#89151), [#&#8203;91422](openclaw/openclaw#91422), [#&#8203;91425](openclaw/openclaw#91425), [#&#8203;91529](openclaw/openclaw#91529), [#&#8203;90212](openclaw/openclaw#90212)) Thanks [@&#8203;joelnishanth](https://github.com/joelnishanth), [@&#8203;pgondhi987](https://github.com/pgondhi987), [@&#8203;joshavant](https://github.com/joshavant), and [@&#8203;snowzlm](https://github.com/snowzlm).
- Channel delivery: keep WhatsApp captured replies attached to the successor controller after restart, retry Feishu rate limits, preserve Mattermost thread replies, canonicalize LINE webhook paths, restore Discord reply hydration and runtime timeout exports, and show OpenAI Realtime WebRTC assistant transcripts. ([#&#8203;85823](openclaw/openclaw#85823), [#&#8203;89659](openclaw/openclaw#89659), [#&#8203;91684](openclaw/openclaw#91684), [#&#8203;91649](openclaw/openclaw#91649), [#&#8203;90263](openclaw/openclaw#90263), [#&#8203;91686](openclaw/openclaw#91686), [#&#8203;90426](openclaw/openclaw#90426)) Thanks [@&#8203;itsuzef](https://github.com/itsuzef), [@&#8203;ladygege](https://github.com/ladygege), [@&#8203;jacobtomlinson](https://github.com/jacobtomlinson), [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev), and [@&#8203;shushushv](https://github.com/shushushv).
- Cron: cancel active task runs cleanly, preserve terminal timeout/cancel state, and recover no-deliver tool warnings instead of silently losing the outcome. ([#&#8203;90666](openclaw/openclaw#90666), [#&#8203;90678](openclaw/openclaw#90678)) Thanks [@&#8203;ai-hpc](https://github.com/ai-hpc).
- Gateway/config/auth: share the approval runtime socket token, replace arrays explicitly in `config.patch`, skip the deleted-agent guard only for valid ACP harness sessions, surface headless LaunchAgent state, verify SQLite auth migration before cleanup, and arm QMD startup maintenance. ([#&#8203;87105](openclaw/openclaw#87105), [#&#8203;91551](openclaw/openclaw#91551), [#&#8203;91219](openclaw/openclaw#91219), [#&#8203;91614](openclaw/openclaw#91614), [#&#8203;91740](openclaw/openclaw#91740), [#&#8203;91978](openclaw/openclaw#91978)) Thanks [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev) and [@&#8203;scotthuang](https://github.com/scotthuang).
- Providers/Codex: clarify quota errors, restore the Codex synthetic usage line, canonicalize Codex protocol assets, require API-key auth for realtime voice, normalize ACP model refs, preserve Gemma 4 `reasoning_content`, and avoid guardian review for local models. ([#&#8203;91390](openclaw/openclaw#91390), [#&#8203;91709](openclaw/openclaw#91709), [#&#8203;91507](openclaw/openclaw#91507), [#&#8203;91567](openclaw/openclaw#91567), [#&#8203;88630](openclaw/openclaw#88630), [#&#8203;91696](openclaw/openclaw#91696)) Thanks [@&#8203;hxy91819](https://github.com/hxy91819), [@&#8203;brokemac79](https://github.com/brokemac79), [@&#8203;RomneyDa](https://github.com/RomneyDa), [@&#8203;joshavant](https://github.com/joshavant), and [@&#8203;Coder-Wangyankun](https://github.com/Coder-Wangyankun).
- Updates/builds: recover package Gateway restarts after refresh failure, expose plugin convergence repair, fall back to Corepack in PATH-less pnpm environments, seed the correct Docker store packages, and keep ClawHub dry-run and publish paths reusable. ([#&#8203;91581](openclaw/openclaw#91581), [#&#8203;91599](openclaw/openclaw#91599), [#&#8203;91547](openclaw/openclaw#91547), [#&#8203;91591](openclaw/openclaw#91591)) Thanks [@&#8203;fuller-stack-dev](https://github.com/fuller-stack-dev), [@&#8203;sallyom](https://github.com/sallyom), and [@&#8203;Patrick-Erichsen](https://github.com/Patrick-Erichsen).
- UI: require explicit user intent before opening chat sessions and drain restored chat queues after session switches. ([#&#8203;91480](openclaw/openclaw#91480)) Thanks [@&#8203;TurboTheTurtle](https://github.com/TurboTheTurtle).
- Android: avoid the `dataSync` foreground-service type for persistent nodes. ([#&#8203;80082](openclaw/openclaw#80082)) Thanks [@&#8203;davelutztx](https://github.com/davelutztx).
- Native hooks: bound relay lifetimes so abandoned native hook connections cannot linger indefinitely. ([#&#8203;91550](openclaw/openclaw#91550)) Thanks [@&#8203;joshavant](https://github.com/joshavant).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19-->

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/1040

Labels

  • agents (Agent runtime and tooling)
  • merge-risk: 🚨 compatibility (🚨 May break existing users, config, migrations, defaults, or upgrade paths.)
  • P2 (Normal backlog priority with limited blast radius.)
  • proof: supplied (External PR includes structured after-fix real behavior proof.)
  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • size: XS
  • status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.)

Development

Successfully merging this pull request may close these issues.

[Bug]: In-turn reasoning dropped on multi-turn tool replay for non-400 openai models (gemma4/vLLM) — silent agentic-quality regression

2 participants