
Gemini text-tag reasoning conflicts with native thinking — produces unclosed <think>, empty post-tool turn, payloads=0 #69220

@mrbrl

Description

Summary

OpenClaw 2026.4.14 (and prior 2026.4.x releases) hardcodes `google-generative-ai` to the `tagged` reasoning output mode in dist/provider-utils-CWYcvxhN.js:

const BUILTIN_REASONING_OUTPUT_MODES = { "google-generative-ai": "tagged" };

tagged mode injects this directive into the system prompt (via dist/system-prompt-mHpoeHEN.js:358-367):

ALL internal reasoning MUST be inside <think>...</think>.
Format every reply as <think>...</think> then <final>...</final>, with no other text.

But Gemini 2.5-pro / 2.5-flash also receive thinkingConfig: { includeThoughts: true, thinkingBudget|thinkingLevel: ... } from resolveGoogleThinkingConfig in dist/anthropic-vertex-stream-YyrYtvts.js:5385, which enables native thought parts in the response.
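The conflict can be sketched as follows. This is a hypothetical reconstruction of the request shape, not OpenClaw's actual internals: `buildGeminiRequest` and its field layout are illustrative, though `thinkingConfig`, `includeThoughts`, and `thinkingBudget` match the Gemini API fields named above.

```javascript
// Hypothetical reconstruction of the conflicting request. Field names are
// illustrative; only thinkingConfig/includeThoughts/thinkingBudget are real
// Gemini API fields.
const TAGGED_DIRECTIVE =
  "ALL internal reasoning MUST be inside <think>...</think>.\n" +
  "Format every reply as <think>...</think> then <final>...</final>, with no other text.";

function buildGeminiRequest(systemPrompt, userText) {
  return {
    // tagged mode appends the <think>/<final> directive to the system prompt...
    systemInstruction: { parts: [{ text: systemPrompt + "\n" + TAGGED_DIRECTIVE }] },
    contents: [{ role: "user", parts: [{ text: userText }] }],
    // ...while resolveGoogleThinkingConfig independently enables native thoughts:
    generationConfig: {
      thinkingConfig: { includeThoughts: true, thinkingBudget: 8192 },
    },
  };
}

const req = buildGeminiRequest("You are a PM agent.", "Run the cron loop.");
// The same request now asks for reasoning in two incompatible places:
const hasTagDirective = req.systemInstruction.parts[0].text.includes("<think>");
const hasNativeThinking = req.generationConfig.thinkingConfig.includeThoughts;
```

The model is being told to put reasoning in `<think>` text tags while the API is simultaneously configured to emit reasoning as native `thought:true` parts.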

Failure mode

With long, persona-heavy prompts (PM cron loops, DevSecOps loops), the model:

  1. Emits a native thought:true part (3000+ chars)
  2. Then opens a <think> text block per the directive
  3. Pivots to a tool call before closing </think> and never emits <final>
  4. Post-tool-result assistant turn comes back with content: []

Gateway then logs:

[agent/embedded] incomplete turn detected: stopReason=stop payloads=0 — surfacing error to user

The user sees an error message, but the model has already burned ~140K input tokens for nothing. In our environment this added up to ~$20 USD of wasted spend in a few hours on Gemini 2.5 Pro before we caught it.
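The four steps above can be sketched as data. This is an illustrative reconstruction of the part sequence from our session logs, with structure simplified; the part shapes are assumptions, not exact OpenClaw types.

```javascript
// Illustrative shape of the broken turn, reconstructed from session logs.
// Part shapes are simplified assumptions.
const brokenAssistantTurn = [
  { thought: true, text: "…3000+ chars of native reasoning…" }, // step 1: native thought part
  { text: "<think>Now I should call the tool" },                // step 2: <think> opened, never closed
  { functionCall: { name: "run_task", args: {} } },             // step 3: pivots to a tool call, no <final>
];

// Step 4: the post-tool-result assistant turn arrives empty.
const postToolTurn = { role: "assistant", content: [] };

// With no closed <final> and an empty content array, the gateway finds
// nothing to surface as visible output — hence `payloads=0`.
const payloads = postToolTurn.content.length;
```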

Reproduction

  • OpenClaw 2026.4.14
  • google/gemini-2.5-pro or google/gemini-2.5-flash with thinkingDefault: medium
  • Long persona system prompt + a tool call request

We have several reproducible session JSONL files if helpful. The pattern is consistent: one thinking content block, then a text block beginning `<think>` (no closing tag, no `<final>`), then a toolCall, then an empty assistant turn.

Proposed fix

Drop google-generative-ai from BUILTIN_REASONING_OUTPUT_MODES. Gemini already exposes native thinking via thoughtSignature / thought:true parts and the thinkingConfig request field — there is no need for tagged reasoning, and combining both confuses the model.

We've patched the dist file locally ({"google-generative-ai": "native"}) and confirmed:

  • Smoke tests pass with clean text output
  • No `<think>` text-tag pollution in cron sessions
  • Native thinking still works
  • No regression in other Gemini behaviour

Happy to send a PR if useful.

Workaround for others hit by this

If you can't patch the dist file:

  1. Author a tiny provider plugin for `google-generative-ai` that returns `"native"` from `resolveReasoningOutputMode` (hook documented in docs/plugins/sdk-provider-plugins.md; see also #17, "config: support clawdis.json path for rebranding")
  2. Or move primaries off Gemini until fixed

Related code

  • `dist/provider-utils-CWYcvxhN.js:4` — `BUILTIN_REASONING_OUTPUT_MODES`
  • `dist/system-prompt-mHpoeHEN.js:358-367` — tagged-mode directive
  • `dist/anthropic-vertex-stream-YyrYtvts.js:5385` — `resolveGoogleThinkingConfig`
  • `dist/pi-embedded-utils-ZJNz7jk_.js:101` — `splitThinkingTaggedText` returns null on unclosed tag (the parser path that produces the empty visible text)
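For reviewers without the dist bundle handy, the last item's behaviour can be reproduced with a minimal stand-in. This is a reimplementation of the *described* contract (null on unclosed tag), not the actual dist code:

```javascript
// Minimal stand-in for the described splitThinkingTaggedText contract:
// split "<think>…</think> visible" into parts, or return null when the
// closing tag is missing — which is what leaves the turn with no visible text.
function splitThinkingTaggedText(text) {
  const open = text.indexOf("<think>");
  if (open === -1) return { thinking: "", visible: text };
  const close = text.indexOf("</think>", open);
  if (close === -1) return null; // unclosed tag: caller gets nothing to show
  return {
    thinking: text.slice(open + "<think>".length, close),
    visible: text.slice(close + "</think>".length).trim(),
  };
}
```

Feeding it the failure-mode text block (`<think>` opened, then a pivot to a tool call with no closing tag) yields `null`, matching the `payloads=0` symptom.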
