[Bug]: CLI inference wrapper sends legacy thinkingBudget instead of thinkingLevel for gemini-flash-latest #84688
Closed
Labels
P2 (Normal backlog priority with limited blast radius.)
bug (Something isn't working)
clawsweeper:linked-pr-open (ClawSweeper found an open linked pull request for this issue.)
clawsweeper:no-new-fix-pr (ClawSweeper does not recommend queueing a new automated fix PR for this issue.)
clawsweeper:source-repro (ClawSweeper found a high-confidence source-level issue reproduction.)
impact:auth-provider (Auth, provider routing, model choice, or SecretRef resolution may break.)
issue-rating: 🦞 diamond lobster (Very strong issue quality with high-confidence source-level or clear reproduction.)
Bug type
Behavior bug (incorrect output/state without crash)
Beta release blocker
No
Summary
When running openclaw infer model run with --thinking adaptive, OpenClaw incorrectly sends the legacy thinkingBudget: -1 parameter to the Google Gemini API for gemini-flash-latest instead of the required thinkingLevel parameter.
Steps to reproduce
Expected behavior
OpenClaw should recognize gemini-flash-latest (a Gemini 3.5 model) as supporting dynamic/adaptive thinking and natively omit thinkingBudget, or pass thinkingLevel as appropriate, matching the behavior of gemini-3.1-pro-preview and Gemini API documentation.
Actual behavior
The debug proxy payload sent to Google contains thinkingBudget: -1 instead of thinkingLevel.
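A captured request body can be checked for this mismatch mechanically. The sketch below is a hypothetical helper, not part of OpenClaw: the type names and `classifyThinkingConfig` are invented for illustration, and the payload shape is taken from the capture in this report.

```typescript
// Hypothetical helper (not OpenClaw code): classify how a captured Gemini
// request body expresses its thinking configuration.
type ThinkingConfig = {
  includeThoughts?: boolean;
  thinkingBudget?: number; // legacy token-budget field
  thinkingLevel?: string;  // Gemini 3 style field
};

type GeminiRequest = {
  generationConfig?: {
    maxOutputTokens?: number;
    thinkingConfig?: ThinkingConfig;
  };
};

type ThinkingStyle = "level" | "legacy-budget" | "both" | "none";

function classifyThinkingConfig(payload: GeminiRequest): ThinkingStyle {
  const cfg = payload.generationConfig?.thinkingConfig;
  const hasBudget = cfg?.thinkingBudget !== undefined;
  const hasLevel = cfg?.thinkingLevel !== undefined;
  if (hasBudget && hasLevel) return "both";
  if (hasLevel) return "level";
  if (hasBudget) return "legacy-budget";
  return "none";
}

// Relevant part of the gemini-flash-latest payload captured in this report.
const flashPayload: GeminiRequest = {
  generationConfig: {
    maxOutputTokens: 65536,
    thinkingConfig: { includeThoughts: true, thinkingBudget: -1 },
  },
};

// Relevant part of the gemini-3.1-pro-preview payload (working correctly).
const proPayload: GeminiRequest = {
  generationConfig: {
    maxOutputTokens: 65536,
    thinkingConfig: { includeThoughts: true, thinkingLevel: "HIGH" },
  },
};

console.log(classifyThinkingConfig(flashPayload)); // "legacy-budget"
console.log(classifyThinkingConfig(proPayload));   // "level"
```

Running this against each decompressed capture makes the difference between the two models easy to assert in a regression test.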
OpenClaw version
2026.5.18 (50a2481)
Operating system
Linux 6.12.75+rpt-rpi-2712 (arm64) / Node.js v24.14.0
Install method
npm global
Model
google/gemini-flash-latest
Provider / routing chain
openclaw -> google
Additional provider/model setup details
Tested using the openclaw infer model run command, which routes through the generic inference (ZAI) wrapper rather than the primary agent gateway routing.
Logs, screenshots, and evidence
Output from zcat on the captured gemini-flash-latest payload:
    {
      "contents": [{"parts": [{"text": "test"}], "role": "user"}],
      "generationConfig": {
        "maxOutputTokens": 65536,
        "thinkingConfig": {
          "includeThoughts": true,
          "thinkingBudget": -1
        }
      }
    }
Output from zcat on the captured gemini-3.1-pro-preview payload (working correctly):
    {
      "contents": [{"parts": [{"text": "test"}], "role": "user"}],
      "generationConfig": {
        "maxOutputTokens": 65536,
        "thinkingConfig": {
          "includeThoughts": true,
          "thinkingLevel": "HIGH"
        }
      }
    }
Impact and severity
Affected: Users relying on the CLI inference wrappers or ZAI transports with gemini-flash-latest or gemini-3.5-flash while using adaptive thinking.
Severity: Medium (Sends legacy/deprecated parameters to the Google API, which Google documentation warns may result in unexpected performance).
Frequency: 100% reproducible for this specific model alias and thinking flag via the infer command.
Consequence: The Google API evaluates the request using legacy token budget constraints rather than modern Gemini 3 thinkingLevel semantics.
Additional information
Based on source code observation in dist/provider-stream-D4qSxrOO.js, the generic wrapper's supportsAdaptiveThinking(modelId) function only returns true for Claude 3.7/3.6 models, forcing a fallback to token budgets (-1). Furthermore, resolveGoogleGemini3ThinkingLevel in the Google provider adapter (dist/provider-stream-shared-CPv67a5n.js) fails to rewrite this legacy budget parameter to a thinkingLevel for Flash models (it only handles Pro models).
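One possible shape for a fix is a normalization step that rewrites the dynamic budget to a thinking level for Gemini 3 family models, including Flash and the `gemini-flash-latest` alias. The sketch below is a simplified model under stated assumptions, not the actual OpenClaw source: `usesThinkingLevel`, `normalizeThinkingConfig`, and the pattern list are hypothetical, and mapping `-1` to `"HIGH"` is an assumption that simply mirrors the gemini-3.1-pro-preview payload.

```typescript
// Simplified sketch, NOT the actual OpenClaw implementation.
type ThinkingConfig = {
  includeThoughts: boolean;
  thinkingBudget?: number;
  thinkingLevel?: string;
};

// Assumption: model ids and aliases that should receive thinkingLevel.
// The list is illustrative; gemini-flash-latest is included because this
// report treats it as a Gemini 3.5 model.
const THINKING_LEVEL_PATTERNS: RegExp[] = [
  /^gemini-3(\.\d+)?-/,    // e.g. gemini-3.1-pro-preview, gemini-3.5-flash
  /^gemini-flash-latest$/, // alias
];

function usesThinkingLevel(modelId: string): boolean {
  const id = modelId.replace(/^google\//, ""); // accept provider-prefixed ids
  return THINKING_LEVEL_PATTERNS.some((pattern) => pattern.test(id));
}

function normalizeThinkingConfig(
  modelId: string,
  cfg: ThinkingConfig,
  dynamicLevel: string = "HIGH", // assumption: level used for adaptive thinking
): ThinkingConfig {
  // Only the dynamic budget (-1) is rewritten; how fixed budgets should map
  // to levels is not specified in this report, so they pass through unchanged.
  if (!usesThinkingLevel(modelId) || cfg.thinkingBudget !== -1) return cfg;
  const { thinkingBudget: _legacyBudget, ...rest } = cfg;
  return { ...rest, thinkingLevel: rest.thinkingLevel ?? dynamicLevel };
}

const fixed = normalizeThinkingConfig("google/gemini-flash-latest", {
  includeThoughts: true,
  thinkingBudget: -1,
});
console.log(fixed); // { includeThoughts: true, thinkingLevel: "HIGH" }
```

Keeping the model check separate from the rewrite means the same predicate could, in principle, be shared by the generic wrapper and the Google provider adapter, so the two layers do not disagree about which models take `thinkingLevel`.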
Temporary workaround: explicitly define the model in openclaw.json with a fixed thinkingLevel: "high".