[Bug]: thinkingLevel is silently cosmetic for openai/openai-codex Responses — createOpenAIThinkingLevelWrapper misses the existingReasoning === undefined branch #70904

@douglferreira977

Description

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

createOpenAIThinkingLevelWrapper in proxy-stream-wrappers.ts never injects body.reasoning when pi-ai leaves payloadObj.reasoning as undefined, so thinkingDefault / thinkingLevel is silently cosmetic for openai/gpt-5.x and openai-codex/gpt-5.x (Responses and Codex-Responses APIs). Confirmed on 2026.4.22 via A/B runtime instrumentation.

Steps to reproduce

  1. On a fresh OpenClaw 2026.4.22 install (npm global, Ubuntu 24.04, Node 22.22.2), configure:
    {
      agents: {
        defaults: {
          thinkingDefault: "high",
          model: { primary: "openai-codex/gpt-5.5" }
        }
      }
    }
    (Same behavior reproduces with gpt-5.4, gpt-5.4-mini, and openai/gpt-5.x via API key — the wrapper path is the same.)
  2. Sign in with Codex OAuth (openclaw models auth login openai-codex).
  3. Start the gateway and send any message through the agent (Telegram or direct).
  4. Inspect the payload entering streamWithPayloadPatch inside createOpenAIThinkingLevelWrapper (line ~188 in the bundled proxy-stream-wrappers-*.js). payloadObj.reasoning is undefined, none of the three branches mutates it, and the HTTP request goes to chatgpt.com/backend-api/codex with no reasoning field.

Minimal instrumentation used to confirm (added, then removed, around line 189):

const existingReasoning = payloadObj.reasoning;
console.error(`[DBG] thinkingLevel=${thinkingLevel} existingReasoning=${JSON.stringify(existingReasoning)} model=${model?.provider}/${model?.api}`);
// ...
// after the last branch:
console.error(`[DBG] final=${JSON.stringify(payloadObj.reasoning)}`);

Expected behavior

With thinkingDefault: "high" and shouldApplyOpenAIReasoningCompatibility(model) === true, payloadObj.reasoning should reach the endpoint as { effort: "high" } (or the mapped value from mapThinkingLevelToReasoningEffort), regardless of whether pi-ai pre-populated payloadObj.reasoning or left it undefined. This matches the semantics already shipped for the proxy path: normalizeProxyReasoningPayload (same file, line 319) explicitly handles the !existingReasoning case and sets payloadObj.reasoning = { effort: mapThinkingLevelToReasoningEffort(thinkingLevel) }.
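The expected invariant can be sketched as follows. Only the "off" → "none" mapping of mapThinkingLevelToReasoningEffort is confirmed in this report; treating other levels as pass-through is an illustrative assumption, not the real mapping table.

```typescript
// Sketch of the expected end state. Only "off" -> "none" is confirmed by this
// report; pass-through for other levels is an assumption for illustration.
function mapThinkingLevelToReasoningEffort(level: string): string {
  return level === "off" ? "none" : level;
}

// What payloadObj.reasoning should look like when it reaches the endpoint,
// independent of whether pi-ai pre-populated it or left it undefined.
function expectedReasoning(
  existing: unknown,
  thinkingLevel: string
): { effort: string } | undefined {
  if (thinkingLevel === "off") return undefined; // reasoning deleted/absent
  return { effort: mapThinkingLevelToReasoningEffort(thinkingLevel) };
}

console.log(expectedReasoning(undefined, "high")); // { effort: 'high' }
```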

Actual behavior

createOpenAIThinkingLevelWrapper only reacts to three prior states of payloadObj.reasoning:

  1. thinkingLevel === "off" → deletes reasoning (if present).
  2. existingReasoning === "none" → creates { effort }.
  3. existingReasoning && typeof === "object" && !Array.isArray → mutates effort in place.

Because pi-ai (openai-codex-responses.js and openai-responses.js) only sets body.reasoning when options.reasoningEffort !== undefined, and OpenClaw does not pass reasoningEffort through to pi-ai, payloadObj.reasoning is undefined in every real run — and none of the three branches fires. body.reasoning reaches OpenAI as absent/unset, so server-side defaults apply regardless of user configuration.
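The fall-through can be seen in a condensed sketch of the three branches (control flow reconstructed from the description above; variable names follow the report, not the verbatim bundle):

```typescript
// Condensed sketch of the wrapper's branch logic as described in this report
// (not the verbatim source).
function applyThinkingLevel(
  payloadObj: { reasoning?: unknown },
  thinkingLevel: string,
  mapEffort: (level: string) => string
): void {
  const existingReasoning = payloadObj.reasoning;
  if (thinkingLevel === "off") {
    delete payloadObj.reasoning; // branch 1
  } else if (existingReasoning === "none") {
    payloadObj.reasoning = { effort: mapEffort(thinkingLevel) }; // branch 2
  } else if (
    existingReasoning &&
    typeof existingReasoning === "object" &&
    !Array.isArray(existingReasoning)
  ) {
    (existingReasoning as { effort?: string }).effort = mapEffort(thinkingLevel); // branch 3
  }
  // existingReasoning === undefined: no branch fires, reasoning stays unset.
}

const payload: { reasoning?: unknown } = {}; // pi-ai left reasoning unset
applyThinkingLevel(payload, "high", () => "high");
console.log(payload.reasoning); // undefined -- the bug
```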

Captured A/B (2026-04-24, same session/message, only the wrapper toggled):

# Without the fix (v4.22 upstream, createOpenAIThinkingLevelWrapper unchanged)
[DBG] thinkingLevel=high existingReasoning=undefined model=openai-codex/openai-codex-responses
[DBG] final=undefined

# With the fix applied (branch added)
[DBG] thinkingLevel=high existingReasoning=undefined model=openai-codex/openai-codex-responses
[DBG] final={"effort":"high"}

The trace.metadata.model.reasoningLevel field in the trajectory still reads "off" in both cases because it is populated upstream of the wrapper, so it is not a reliable user-facing signal of the bug. The symptom is only visible via HTTP payload inspection or runtime instrumentation.

OpenClaw version

2026.4.22

Operating system

Ubuntu 24.04 (Linux 6.8.0-107)

Install method

npm global (npm install -g openclaw)

Model

openai-codex/gpt-5.5 (also reproduced with openai-codex/gpt-5.4, openai-codex/gpt-5.4-mini, and would also affect openai/gpt-5.x via direct API key — same wrapper code path)

Provider / routing chain

openclaw -> chatgpt.com/backend-api/codex (Codex OAuth PI runner, api = openai-codex-responses)

Additional provider/model setup details

Config excerpt:

{
  agents: {
    defaults: {
      thinkingDefault: "high",
      model: { primary: "openai-codex/gpt-5.5" },
      models: {
        "openai-codex/gpt-5.5": { params: { fastMode: true, serviceTier: "priority" } }
      }
    }
  }
}

Auth profile: openai-codex:<email> (OAuth via openclaw models auth login openai-codex).
shouldApplyOpenAIReasoningCompatibility(model) returns true for this provider/api combo (confirmed via resolveOpenAIRequestCapabilities(model).supportsOpenAIReasoningCompatPayload which is true for provider ∈ {openai, openai-codex, azure-openai, azure-openai-responses} and api ∈ {openai-completions, openai-responses, openai-codex-responses, azure-openai-responses}).
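The membership check described above can be sketched as follows; the provider/api sets are copied from this report, while the real resolveOpenAIRequestCapabilities presumably returns a richer capabilities object.

```typescript
// Sketch of the compatibility predicate using the provider/api sets listed
// above. The actual resolveOpenAIRequestCapabilities may compute more fields.
const COMPAT_PROVIDERS = new Set([
  "openai", "openai-codex", "azure-openai", "azure-openai-responses",
]);
const COMPAT_APIS = new Set([
  "openai-completions", "openai-responses",
  "openai-codex-responses", "azure-openai-responses",
]);

function supportsOpenAIReasoningCompatPayload(model: {
  provider: string;
  api: string;
}): boolean {
  return COMPAT_PROVIDERS.has(model.provider) && COMPAT_APIS.has(model.api);
}

// The combination from this bug report:
console.log(
  supportsOpenAIReasoningCompatPayload({
    provider: "openai-codex",
    api: "openai-codex-responses",
  })
); // true
```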

Logs, screenshots, and evidence

Minimal runtime A/B with temporary console.error inside the wrapper (full log lines above in "Actual behavior"). Same test session, same config, same message, only line 195 (if (existingReasoning === "none") vs if (existingReasoning === void 0 || existingReasoning === null || existingReasoning === "none")) toggled. Gateway restarted between toggles so the bundle re-imports.

For comparison, normalizeProxyReasoningPayload already implements the correct semantics for the proxy path:

// existing — proxy-stream-wrappers.ts line 319
} else if (!existingReasoning) payloadObj.reasoning = { effort: mapThinkingLevelToReasoningEffort(thinkingLevel) };

The same !existingReasoning branch is missing in createOpenAIThinkingLevelWrapper.

Impact and severity

  • Affected: every user running openai/gpt-5.x or openai-codex/gpt-5.x via the built-in PI runner with a non-"off" thinkingDefault/thinkingLevel. Covers the default OpenAI API-key path and the Codex OAuth subscription path — likely the majority of GPT-5 users on OpenClaw.
  • Severity: High for a silent config failure. Users configure thinkingDefault: "high" (or "max", etc.) expecting stronger reasoning; the model runs with server-side default effort instead, and there is no user-visible error, log warning, or status indicator.
  • Frequency: 100% of runs on the affected paths (confirmed across multiple runs on the same session).
  • Consequence: quality regression (shorter/shallower reasoning than configured), wasted thinking-level tuning effort, and misleading observability (trajectory metadata shows thinkLevel: "high" while the request has no reasoning field). Also affects follow-through on complex multi-tool tasks where high reasoning materially improves outcomes.

Additional information

  • Last known version where thinkingLevel was effective for this path: NOT_ENOUGH_INFO (the bug predates 2026.4.21 in our observation — we started patching this locally on 2026-04-23 after first catching it; haven't bisected older versions).
  • Proposed fix (three-line patch, same file, function createOpenAIThinkingLevelWrapper):
    - if (existingReasoning === "none") {
    + // Cover the common case where pi-ai leaves payloadObj.reasoning unset
    + // (options.reasoningEffort is undefined on the OpenClaw path, so pi-ai does not
    + // initialize body.reasoning before the wrapper runs).
    + if (existingReasoning === void 0 || existingReasoning === null || existingReasoning === "none") {
          payloadObj.reasoning = { effort: mapThinkingLevelToReasoningEffort(thinkingLevel) };
          return;
      }
    This aligns createOpenAIThinkingLevelWrapper with the semantics already shipped in normalizeProxyReasoningPayload.
  • Related surface: mapThinkingLevelToReasoningEffort("off") === "none", so "none" is already a string branch the wrapper handles; adding undefined/null to that same branch is a strict extension.
  • Affected bundled file on 2026.4.22 (npm global install): proxy-stream-wrappers-Cnx4hyoz.js. Function createOpenAIThinkingLevelWrapper starts at line 183.
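With the proposed condition applied, the same undefined input yields the expected payload. This sketch condenses the patched branch logic (reconstructed from the diff above, not the verbatim bundle):

```typescript
// Sketch of the patched wrapper branch: undefined/null now take the same
// path as "none" and create the reasoning object.
function applyThinkingLevelFixed(
  payloadObj: { reasoning?: unknown },
  thinkingLevel: string,
  mapEffort: (level: string) => string
): void {
  const existingReasoning = payloadObj.reasoning;
  if (thinkingLevel === "off") {
    delete payloadObj.reasoning;
  } else if (
    existingReasoning === undefined ||
    existingReasoning === null ||
    existingReasoning === "none"
  ) {
    payloadObj.reasoning = { effort: mapEffort(thinkingLevel) };
  } else if (typeof existingReasoning === "object" && !Array.isArray(existingReasoning)) {
    (existingReasoning as { effort?: string }).effort = mapEffort(thinkingLevel);
  }
}

const fixedPayload: { reasoning?: unknown } = {}; // reasoning unset, as in every real run
applyThinkingLevelFixed(fixedPayload, "high", (l) => l);
console.log(fixedPayload.reasoning); // { effort: 'high' }
```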
