[Bug]: In-turn reasoning dropped on multi-turn tool replay for non-400 openai models (gemma4/vLLM) — silent agentic-quality regression

### Bug type
Behavior bug (incorrect output/state without crash)

### Summary
For openai-completions **reasoning** models whose API does **not** require `reasoning_content` on assistant messages (e.g. **gemma4 / Gemma-4-12B on vLLM**), OpenClaw drops the model's **in-turn** reasoning when replaying multi-turn tool conversations. Since the provider doesn't 400 on the missing field (unlike DeepSeek/Moonshot/Xiaomi), the failure is **silent**: no error, just degraded multi-step tool behavior — tool-call `arguments` intermittently collapse to `{}`, and the model re-issues identical tool calls because it can't see its own prior reasoning.

This is distinct from the known API-compat issues (#70392, #81419, #91558, #89660, #91106): those inject an **empty** `reasoning_content: ""` to satisfy strict provider schemas and avoid a 400. Here the provider is fine without the field — what's missing for *quality* is the **real** in-turn reasoning text, so the model keeps continuity across tool calls.

### Evidence (captured off the wire)
TCP tee between OpenClaw and vLLM; one multi-turn tool exchange via `openclaw agent --local --thinking high`. Deepest request OpenClaw sent (system, user, 3× assistant tool-call, 3× tool):

- ✅ structured `tool_calls` preserved
- ✅ tool-call / tool-result IDs match exactly
- ❌ **`reasoning_content` present on 0 / 3 in-turn assistant tool-call messages**

The Gemma 4 chat template re-injects in-turn reasoning (`reasoning`/`reasoning_content` → `<|channel>thought…<channel|>`, gated to messages after the last user turn — it correctly keeps in-turn reasoning and drops completed-turn reasoning). OpenClaw sends nothing, so re-injection never fires. Across stored trajectories on this install: 13 `exec` calls with empty `arguments` (exec requires a `command`), plus repeated identical tool calls within sessions.

### Suspected root cause
`reasoningContent` appears empty at replay for these models, so the existing populate path never runs:
```js
if (reasoningContent.length > 0) assistantMsg.reasoning_content = reasoningContent.join("\n");
else if (allowReasoningContentReplay) /* inject empty placeholder */
```
`requiresReasoningContentOnAssistantMessages` only governs the `else` (empty-field API-compat shim) and so does **not** restore real reasoning. The gap looks upstream: in-turn reasoning isn't carried from the session store into `reasoningContent` for openai-format models that aren't on the DeepSeek/Xiaomi detect list. (Same populate code in 2026.6.1 and 2026.6.5-beta — not a regression.)

### Proposed direction
Preserve **real in-turn reasoning** (reasoning generated since the last user message) on replayed assistant tool-call messages for openai reasoning models, independent of the empty-field API-compat path. This matches what the Gemma model card + chat template assume, and the community reports of the same symptom (Gemma-4-12B tool-calling PSA on r/LocalLLaMA; Qwen3.6 analogue: earendil-works/pi#3325 — "after 2-3 turns every tool call collapses to `arguments: {}`").

### Steps to reproduce
1. Configure a gemma4 model via `api: openai-completions` on vLLM, `compat.thinkingFormat: "openai"`, `reasoning: true`.
2. Run a multi-step agent task that needs 2+ sequential tool calls, `--thinking high`.
3. Capture the request bodies OpenClaw sends to the model (LLM proxy / tee).
4. Observe: in-turn assistant tool-call messages carry `tool_calls` but no `reasoning_content`; over several turns, arguments degrade / tool calls repeat.

### Environment
- OpenClaw 2026.6.1
- Gemma-4-12B (NVFP4) via vLLM 0.22, `api: openai-completions`, `compat.thinkingFormat: "openai"`, `reasoning: true`, `--thinking high`

### Workaround
Make each tool action self-contained (a single call returns the final value — e.g. server-computed totals, or a `gog | jq | date` one-liner) so turns don't depend on cross-turn reasoning. Restores correctness for those flows but doesn't help genuinely multi-step reasoning.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: In-turn reasoning dropped on multi-turn tool replay for non-400 openai models (gemma4/vLLM) — silent agentic-quality regression #91645

Bug type

Summary

Evidence (captured off the wire)

Suspected root cause

Proposed direction

Steps to reproduce

Environment

Workaround

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: In-turn reasoning dropped on multi-turn tool replay for non-400 openai models (gemma4/vLLM) — silent agentic-quality regression #91645

Description

Bug type

Summary

Evidence (captured off the wire)

Suspected root cause

Proposed direction

Steps to reproduce

Environment

Workaround

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions