Ollama provider: missing response-level reasoning stripper for Kimi models causes inline reasoning leak to chat #86129

## Summary
When using `ollama/kimi-k2.6:cloud` (also observed with `kimi-k2.5:cloud`) with `Think: off`, the model's inline reasoning text leaks into the visible chat output. The gateway correctly sends `think: false` (native Ollama) and `thinking: { type: "disabled" }` (Moonshot wrapper) on outgoing requests, but the model still emits reasoning text inline, separated from the actual response by a boundary delimiter. Unlike the `opencode-go` provider, which has `stripOpencodeGoKimiReasoningPayload`, the Ollama provider has no response-level stripper for this inline reasoning.

## Environment
- **OpenClaw version:** 2026.5.22 (a374c3a)
- **Provider:** `ollama`
- **Model:** `ollama/kimi-k2.6:cloud` (also observed with `kimi-k2.5:cloud`)
- **Runtime:** `Think: off`
- **OS:** Windows 10.0.26200 (x64)
- **Ollama base URL:** `http://192.168.1.72:11434`

## Reproduction

### Config (relevant excerpt)
```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://192.168.1.72:11434",
        api: "ollama",
        models: [
          {
            id: "kimi-k2.6:cloud",
            name: "kimi-k2.6:cloud",
            reasoning: false,
            params: {
              num_ctx: 262144
            }
          }
        ]
      }
    }
  },
  agents: {
    defaults: {
      model: { primary: "ollama/kimi-k2.6:cloud" }
    }
  }
}
```

### Steps
1. Set primary model to `ollama/kimi-k2.6:cloud`
2. Ensure `thinking` / `Think` is `off`
3. Send any message that triggers the agent
4. Observe the assistant response in chat
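
To narrow the reproduction to Ollama itself, the same prompt can be sent straight to the Ollama `/api/chat` endpoint with `think: false`, bypassing the gateway. The sketch below is a hypothetical probe, not part of OpenClaw: the helper names are made up, and it assumes the bare model id `kimi-k2.6:cloud` (without the `ollama/` prefix) is what Ollama expects.

```typescript
// Hypothetical probe script; helper names are illustrative, not OpenClaw APIs.
interface OllamaChatRequest {
  model: string;
  messages: { role: string; content: string }[];
  think: boolean;
  stream: boolean;
}

// Build a non-streaming chat request with thinking explicitly disabled.
function buildProbeRequest(model: string, prompt: string): OllamaChatRequest {
  return {
    model,
    messages: [{ role: "user", content: prompt }],
    think: false,
    stream: false,
  };
}

// Send the probe and return the visible content plus any separate thinking field.
async function probeOllama(
  baseUrl: string,
  model: string,
  prompt: string,
): Promise<{ content: string; thinking: string }> {
  const response = await fetch(`${baseUrl}/api/chat`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildProbeRequest(model, prompt)),
  });
  if (!response.ok) {
    throw new Error(`Ollama returned HTTP ${response.status}`);
  }
  const data = (await response.json()) as {
    message?: { content?: string; thinking?: string };
  };
  return {
    content: data.message?.content ?? "",
    thinking: data.message?.thinking ?? "",
  };
}
```

Calling `probeOllama("http://192.168.1.72:11434", "kimi-k2.6:cloud", "What projects have we worked on?")` and inspecting `content` would show whether the reasoning is already inline in Ollama's own response. If it is, the leak originates below the gateway and only a response-side fix can address it.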

### Actual behavior
The visible response contains the model's internal reasoning monologue, followed by a boundary delimiter, then the actual response. Example from session history:

```
"The user is asking what projects we've worked on so far. Based on my memory files...

...Let me provide a clear summary. ️ So far we've got **3 active projects** tracked:"
```

The text before ` ️` is raw reasoning that should be internal only.

### Expected behavior
Only the text after the reasoning delimiter should be visible to the user. The reasoning block should be stripped, discarded, or stored as internal metadata — never rendered as chat content.

## Root cause analysis

1. **Request side is correct** — `createConfiguredOllamaCompatStreamWrapper` applies both:
   - `createOllamaThinkingWrapper(..., false)` → sets `think: false` on native Ollama payload
   - `createMoonshotThinkingWrapper(..., "disabled")` → sets `thinking: { type: "disabled" }`

2. **Model ignores the disable signal** — `kimi-k2.6` still outputs reasoning inline, likely because the Ollama API passthrough doesn't propagate the disable parameter correctly to the underlying model, or the model inherently emits reasoning regardless.

3. **Missing response stripper** — The `opencode-go` provider has `stripOpencodeGoKimiReasoningPayload` which:
   - Deletes `reasoning`, `reasoning_details`, `reasoning_content`, `reasoning_text` fields
   - Filters out `type: "thinking"` / `type: "reasoning"` content parts from messages
   - Replaces stripped content with `[assistant reasoning omitted]`

   The `ollama` provider has **no equivalent** response sanitizer for Kimi models.
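
For reference, the behaviour described for `stripOpencodeGoKimiReasoningPayload` can be approximated as follows. This is a sketch written from the description above, not the actual implementation: the message types and the function name `stripReasoningPayload` are hypothetical, and it assumes the placeholder is inserted only when stripping leaves the content empty.

```typescript
// Hypothetical message shapes; the real OpenClaw types may differ.
type ContentPart = { type: string; text?: string; [key: string]: unknown };
type AssistantMessage = {
  role: string;
  content: string | ContentPart[];
  [key: string]: unknown;
};

const REASONING_FIELDS = [
  "reasoning",
  "reasoning_details",
  "reasoning_content",
  "reasoning_text",
];
const REASONING_PART_TYPES = new Set(["thinking", "reasoning"]);
const OMITTED_PLACEHOLDER = "[assistant reasoning omitted]";

// Return a copy of the message with reasoning fields and reasoning parts removed.
function stripReasoningPayload(message: AssistantMessage): AssistantMessage {
  const cleaned: AssistantMessage = { ...message };
  for (const field of REASONING_FIELDS) {
    delete cleaned[field];
  }
  if (Array.isArray(cleaned.content)) {
    const kept = cleaned.content.filter(
      (part) => !REASONING_PART_TYPES.has(part.type),
    );
    // Assumption: the placeholder is used only when nothing else remains.
    cleaned.content =
      kept.length > 0 ? kept : [{ type: "text", text: OMITTED_PLACEHOLDER }];
  }
  return cleaned;
}
```

Note that a field-based stripper like this would not catch the Ollama case on its own, because the leaked reasoning arrives inside the plain `content` string rather than in a separate field or typed content part.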

## Cross-references

- **#81988** — `opencode-go/kimi-k2.6`: `reasoning` field leaks through passthrough replay policy (same family, different provider)
- **#83812** — `opencode-go/kimi-k2.6` sends unsupported `reasoning_details` in replayed messages (request-side fix for opencode-go)
- **#6470** — Discord: reasoning content posted as regular messages (general reasoning leak class)
- **ollama/ollama#10456** — Ollama-level discussion on disabling thinking mode

## Suggested fix

Add a Kimi-specific response sanitizer in the Ollama provider, analogous to what exists for opencode-go:

**Option A (provider-level):** In `createConfiguredOllamaCompatStreamWrapper`, when `isOllamaCloudKimiModelRef(modelId)` is true, wrap the stream with a response interceptor that strips inline reasoning text from assistant message content before it reaches the user.
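
A rough sketch of what such an interceptor could look like for streamed text is shown here. Everything in it is hypothetical: the class name `InlineReasoningFilter` is invented, the delimiter is passed in as a parameter because its exact value would have to be confirmed against real model output, and the filter fails open (emits the buffered text unchanged) if the stream ends without a delimiter.

```typescript
// Hypothetical streaming filter; not an existing OpenClaw class.
class InlineReasoningFilter {
  private readonly delimiter: string;
  private buffer = "";
  private passedDelimiter: boolean;

  constructor(delimiter: string) {
    this.delimiter = delimiter;
    // An empty delimiter disables filtering entirely.
    this.passedDelimiter = delimiter.length === 0;
  }

  // Feed one streamed chunk; returns the text that is safe to show so far.
  push(chunk: string): string {
    if (this.passedDelimiter) return chunk;
    // Buffer everything until the delimiter appears, so a delimiter that is
    // split across two chunks is still detected.
    this.buffer += chunk;
    const index = this.buffer.indexOf(this.delimiter);
    if (index === -1) return "";
    this.passedDelimiter = true;
    const rest = this.buffer.slice(index + this.delimiter.length);
    this.buffer = "";
    return rest;
  }

  // Call at end of stream; fails open if no delimiter was ever seen.
  flush(): string {
    const rest = this.passedDelimiter ? "" : this.buffer;
    this.buffer = "";
    return rest;
  }
}
```

In the stream wrapper, one filter instance per assistant message would be created only when `isOllamaCloudKimiModelRef(modelId)` is true, with each text delta routed through `push` and `flush` called when the message completes. The trade-off is latency: nothing is shown to the user until the delimiter arrives or the stream ends.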

**Option B (gateway-level):** Add a general `stripInlineReasoningFromAssistantText` utility in the message processing pipeline that recognizes the ` ️` (or equivalent) delimiter and splits/omits the reasoning portion.
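
For complete (non-streamed) messages, the utility could be as small as the sketch below. The function name comes from the proposal above, but the signature is an assumption: the delimiter is a parameter rather than a hard-coded constant, and the split happens at the first occurrence so that a delimiter appearing later in the real answer is preserved.

```typescript
// Sketch only; the delimiter value must be confirmed against real model output.
function stripInlineReasoningFromAssistantText(
  text: string,
  delimiter: string,
): string {
  if (delimiter.length === 0) return text;
  const index = text.indexOf(delimiter);
  // Fail open: without a delimiter there is no safe way to tell reasoning
  // from the answer, so the text is returned unchanged.
  if (index === -1) return text;
  return text.slice(index + delimiter.length).trimStart();
}
```

Failing open keeps normal replies intact for models that never emit the delimiter, at the cost of still leaking reasoning whenever the model omits it.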

**Option C (model registry):** Mark `kimi-k2.6:cloud` and `kimi-k2.5:cloud` under the Ollama provider as `reasoning: true` with a `reasoningOutputMode: "inline"` so the gateway knows to apply stripping regardless of what the model claims.
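
One way the registry change could be modelled is sketched below. The `reasoningOutputMode` field is the one proposed above; the entry type, the `"structured"` alternative value, and the helper `shouldStripInlineReasoning` are hypothetical names used only for illustration.

```typescript
// Hypothetical registry shape; only `reasoning` and `reasoningOutputMode: "inline"`
// come from the proposal, the rest is illustrative.
type ReasoningOutputMode = "inline" | "structured";

interface ModelRegistryEntry {
  id: string;
  reasoning: boolean;
  reasoningOutputMode?: ReasoningOutputMode;
}

const ollamaKimiEntries: ModelRegistryEntry[] = [
  { id: "kimi-k2.6:cloud", reasoning: true, reasoningOutputMode: "inline" },
  { id: "kimi-k2.5:cloud", reasoning: true, reasoningOutputMode: "inline" },
];

// The gateway would strip whenever the registry says reasoning arrives inline,
// independent of the user's Think setting or what the model reports.
function shouldStripInlineReasoning(entry: ModelRegistryEntry): boolean {
  return entry.reasoning && entry.reasoningOutputMode === "inline";
}
```

Keeping the decision in the registry means the stripping logic from Option A or B is driven by data, so new models with the same behaviour need only a registry entry.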

## Workarounds

- Switch to a non-reasoning model (e.g., `llama3.2:3b`, `gemma4`)
- Modify the model's Ollama Modelfile to inject a system prompt forbidding reasoning output

## Impact

- **Severity:** Medium-High — leaks internal decision-making and planning to user-visible chat
- **Affected channels:** All (webchat, Discord, Telegram, etc.)
- **Frequency:** Every multi-turn assistant message with `ollama/kimi-k2.6:cloud`

<img width="452" height="679" alt="Image" src="https://github.com/user-attachments/assets/648db7c4-abd0-478a-b53f-d785091202b8" />
