
fix(pi-embedded): inject body.reasoning when payload has no prior reasoning (#70904)#70911

Closed
hclsys wants to merge 2 commits into openclaw:main from hclsys:fix/openai-thinking-level-wrapper-undefined-reasoning-70904

Conversation

@hclsys
Contributor

@hclsys hclsys commented Apr 24, 2026

Summary

Closes #70904.

`createOpenAIThinkingLevelWrapper` (`src/agents/pi-embedded-runner/openai-stream-wrappers.ts`) only reacted to three prior states of `payloadObj.reasoning`:

  1. `thinkingLevel === "off"` → delete
  2. `existingReasoning === "none"` → replace with `{ effort }`
  3. existing `{ effort }` object → mutate in place

pi-ai's Responses / Codex-Responses clients only set `body.reasoning` when `options.reasoningEffort` is threaded through, which OpenClaw does not do for this wrapper. So `payloadObj.reasoning` is `undefined` in every real run, none of the three branches fires, and `thinkingDefault` / `thinkingLevel` is silently cosmetic for `openai-codex/gpt-5.x` and `openai/gpt-5.x` — the server-side default applies regardless of user configuration.
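The fall-through can be sketched as follows (a hedged sketch — the identifiers mirror the PR description, not the actual OpenClaw source; the compatibility gate is elided):

```typescript
// Pre-fix branch logic of the thinking-level wrapper, as described above.
// Note how an undefined payload.reasoning falls through every branch.
type Reasoning = { effort: string } | "none" | undefined;

function applyThinkingLevelPreFix(
  payload: { reasoning?: Reasoning },
  thinkingLevel: string,
): void {
  const existingReasoning = payload.reasoning;
  if (thinkingLevel === "off") {
    delete payload.reasoning; // branch 1: explicit off → delete
  } else if (existingReasoning === "none") {
    payload.reasoning = { effort: thinkingLevel }; // branch 2: "none" sentinel → replace
  } else if (typeof existingReasoning === "object") {
    existingReasoning.effort = thinkingLevel; // branch 3: existing object → mutate in place
  }
  // undefined matches none of the three branches: nothing is injected.
}

const payload: { reasoning?: Reasoning } = {};
applyThinkingLevelPreFix(payload, "high");
console.log(payload.reasoning); // undefined — the reported bug
```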

The reporter confirmed the bug via A/B runtime instrumentation on v2026.4.22.

Change

Add the missing `!existingReasoning` branch, mirroring `normalizeProxyReasoningPayload`'s already-shipped handling in `proxy-stream-wrappers.ts:43`. Still gated behind the existing `shouldApplyOpenAIReasoningCompatibility` check, so non-reasoning-capable models are untouched.
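A minimal sketch of the wrapper with the added branch (hedged: `mapThinkingLevelToReasoningEffort` is stubbed here as the identity map, which is an assumption — the real mapping and the `shouldApplyOpenAIReasoningCompatibility` gate live in OpenClaw):

```typescript
type Effort = "low" | "medium" | "high";
type Reasoning = { effort: Effort } | "none" | null | undefined;

// Stub: the real helper maps thinking levels to reasoning efforts in OpenClaw.
const mapThinkingLevelToReasoningEffort = (level: Effort): Effort => level;

function applyThinkingLevelFixed(
  payload: { reasoning?: Reasoning },
  thinkingLevel: Effort | "off",
): void {
  const existingReasoning = payload.reasoning;
  if (thinkingLevel === "off") {
    delete payload.reasoning;
  } else if (existingReasoning === undefined || existingReasoning === null) {
    // new branch: payload arrived with no reasoning field at all
    payload.reasoning = { effort: mapThinkingLevelToReasoningEffort(thinkingLevel) };
  } else if (existingReasoning === "none") {
    payload.reasoning = { effort: mapThinkingLevelToReasoningEffort(thinkingLevel) };
  } else {
    existingReasoning.effort = mapThinkingLevelToReasoningEffort(thinkingLevel);
  }
}

const body: { reasoning?: Reasoning } = {};
applyThinkingLevelFixed(body, "medium");
console.log(body.reasoning); // { effort: 'medium' }
```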

Test plan

…soning (openclaw#70904)

`createOpenAIThinkingLevelWrapper` only reacted to three prior states of
`payloadObj.reasoning`: "off" (delete), `"none"` string (replace), and an
existing `{ effort }` object (mutate in place). Because pi-ai's
Responses/Codex-Responses clients only populate `body.reasoning` when
`options.reasoningEffort` is threaded through — which OpenClaw does not do
for this wrapper — `payloadObj.reasoning` is `undefined` in every real run.

Result: `thinkingDefault` / `thinkingLevel` was silently cosmetic for
`openai-codex/gpt-5.x` and `openai/gpt-5.x` on the wrapper path, and server-
side defaults applied regardless of user configuration.

Mirror `normalizeProxyReasoningPayload`'s `!existingReasoning` branch
(proxy-stream-wrappers.ts:43): when the payload arrives without a
reasoning field and thinkingLevel is active, populate
`{ effort: mapThinkingLevelToReasoningEffort(thinkingLevel) }`.

Gated behind the existing `shouldApplyOpenAIReasoningCompatibility` check,
so non-reasoning-capable models are untouched.

Added regression test:
- "injects reasoning on reasoning-capable model when payload has no prior reasoning (openclaw#70904)"

Closes openclaw#70904
openclaw-barnacle Bot added the agents (Agent runtime and tooling) and size: XS labels Apr 24, 2026
@greptile-apps
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR adds the missing !existingReasoning branch to createOpenAIThinkingLevelWrapper so that thinkingLevel/thinkingDefault is actually forwarded to openai-codex and openai-responses endpoints when the pi-ai client never populated body.reasoning in the first place. The fix is correct and mirrors the existing normalizeProxyReasoningPayload pattern in proxy-stream-wrappers.ts.

  • The existing test "does not add reasoning for non-reasoning models without existing reasoning payload" (line 82) uses openaiModel (api: "openai-responses", provider: "openai", no baseUrl), which resolves to supportsOpenAIReasoningCompatPayload: true — the same gate the fix passes through. This test will assert undefined but the fixed code will now produce { effort: "medium" }, causing a test failure.

Confidence Score: 4/5

The runtime fix is correct, but a pre-existing test contradicts the new behavior and will fail in CI.

One P1 finding: the test at line 82–88 is rendered incorrect by the new branch and will fail, since openaiModel without a baseUrl is reasoning-capable. The production code change itself is sound.

src/agents/pi-embedded-runner/openai-stream-wrappers.test.ts — the stale 'non-reasoning models' test at line 82 needs updating.

Comments Outside Diff (1)

  1. src/agents/pi-embedded-runner/openai-stream-wrappers.test.ts, line 82-88 (link)

    P1 Stale test will fail after the fix

    openaiModel (api: "openai-responses", provider: "openai", no baseUrl) is a reasoning-capable model: endpointClass resolves to "default" (no configured base URL), so usesExplicitProxyLikeEndpoint is false, and supportsOpenAIReasoningCompatPayload evaluates to true in provider-attribution.ts. The new if (existingReasoning === undefined || existingReasoning === null) branch will now fire for it, injecting { effort: "medium" } — so the assertion expect(payloads[0]?.reasoning).toBeUndefined() fails.

    If this test is meant to cover a truly non-reasoning model (e.g. a proxy route with a custom baseUrl), the fixture needs to match. If it's meant to confirm that openaiModel without pre-existing reasoning also gets injection after this fix, it should be updated to toEqual({ effort: "medium" }) with a descriptive rename.

```suggestion
  it("does not add reasoning for proxy-routed models without existing reasoning payload", () => {
    const { baseStreamFn, payloads } = createPayloadCapture();
    const wrapped = createOpenAIThinkingLevelWrapper(baseStreamFn, "medium");
    void wrapped(
      {
        api: "openai-responses",
        provider: "openai",
        id: "gpt-5.2",
        baseUrl: "https://proxy.example.com/v1",
      } as Model<"openai-responses">,
      { messages: [] },
      {},
    );

    expect(payloads[0]?.reasoning).toBeUndefined();
  });
```


Reviews (1). Last reviewed commit: "fix(pi-embedded): inject body.reasoning ..."


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 23ed33c6e6


Comment on lines +103 to +105
void wrapped(codexModel, { messages: [] }, {});

expect(payloads[0]?.reasoning).toEqual({ effort: "high" });


P1: Reconcile conflicting missing-reasoning expectations

This new case asserts that absent payload.reasoning should be populated, but the existing test "does not add reasoning for non-reasoning models without existing reasoning payload" in the same file still expects undefined for openaiModel when reasoning is absent. Because shouldApplyOpenAIReasoningCompatibility covers both openai/openai-responses and openai-codex/openai-codex-responses, these expectations diverge after this change and will leave the test suite red unless the older fixture/expectation is updated.


@douglferreira977

Reporter here. Confirming the runtime change in openai-stream-wrappers.ts is exactly the fix — mirrors normalizeProxyReasoningPayload at proxy-stream-wrappers.ts:319 with the same shouldApplyOpenAIReasoningCompatibility gate.

Have had the equivalent manual patch active on the bundled proxy-stream-wrappers-*.js since v2026.4.21 (re-applied cleanly through the v2026.4.22 upgrade). Three agents running openai-codex/gpt-5.4 family with thinkingDefault high/medium/high: A/B on identical prompts confirms body.reasoning reaches chatgpt.com/backend-api/codex as { effort: "<level>" } with the branch and is absent without it. No regressions observed.

On the 3 checks-node-core-* failures: Greptile's diagnosis on openai-stream-wrappers.test.ts:82 is right. openaiModel has no baseUrl → endpointClass === "default" → supportsOpenAIReasoningCompatPayload === true, so the new branch now injects { effort: "medium" } and breaks toBeUndefined(). The test was passing only because the bug suppressed injection. Minimal fix: rename it to something like "proxy-routed models without prior reasoning" and add baseUrl: "https://proxy.example.com/v1" to the fixture so it covers a genuinely non-compat route. Happy to push the test update if that unblocks things.

…rl (openclaw#70904)

The existing test asserted "does not add reasoning for non-reasoning models
without existing reasoning payload" but the fixture used
{ api: "openai-responses", provider: "openai", id: "gpt-5.2" } with no
baseUrl — resolveOpenAIRequestCapabilities treats that as the default
endpoint class, which has supportsOpenAIReasoningCompatPayload === true.
The test was passing only because the pre-fix wrapper silently skipped
undefined payload.reasoning.

After the openclaw#70904 fix, the new !existingReasoning branch correctly injects
{ effort: "medium" } for that compat-capable route, so the assertion
breaks.

Rename the test to reflect what it should actually cover (proxy routes
that fail the compat gate) and add baseUrl: "https://proxy.example.com/v1"
to force shouldApplyOpenAIReasoningCompatibility === false. With that, the
wrapper returns early and payload.reasoning stays undefined, as intended.

Diagnosis + suggested fix from the openclaw#70904 reporter (@douglferreira977)
and Greptile.
@hclsys
Contributor Author

hclsys commented Apr 24, 2026

Thanks for the runtime A/B confirmation and the Greptile diagnosis — both spot-on. Pushed your suggested test fix in 4dbf49a: renamed to "proxy-routed models without prior reasoning" and added baseUrl: "https://proxy.example.com/v1" so the fixture exercises a genuine non-compat route. CI should go green on the next run.

Contributor

@steipete steipete left a comment


Codex review: verified current main still has the bug from #70904: createOpenAIThinkingLevelWrapper only patches body.reasoning when it is "none" or an existing object, so the common undefined case never gets reasoning.effort (src/agents/pi-embedded-runner/openai-stream-wrappers.ts:246).

This PR is the right minimal fix. It adds the missing undefined/null branch and the stale non-reasoning fixture was already corrected to a proxy-routed model. Current check rollup is green. I would land this.

Contributor

@steipete steipete left a comment


Codex follow-up: I’m retracting my earlier “land this” recommendation unless we can identify a real broken runtime path.

I traced the normal embedded-agent path more deeply:

  • OpenClaw passes the selected/default think level into createAgentSession as thinkingLevel: mapThinkingLevel(params.thinkLevel) (src/agents/pi-embedded-runner/run/attempt.ts:1235).
  • pi-agent-core forwards that session thinking level into model calls as options.reasoning (node_modules/@mariozechner/pi-agent-core/dist/agent.js:280).
  • pi-ai OpenAI Responses maps options.reasoning to reasoningEffort before calling buildParams (node_modules/@mariozechner/pi-ai/dist/providers/openai-responses.js:105-109), then emits params.reasoning = { effort, summary } when reasoningEffort is present (:169-175).
  • pi-ai OpenAI Codex Responses does the same at node_modules/@mariozechner/pi-ai/dist/providers/openai-codex-responses.js:197-200 and :231-235.

That means the wrapper’s missing existingReasoning === undefined branch does not prove production breakage. In the normal Pi run path, the upstream provider should already create the reasoning object from options.reasoning; the OpenClaw wrapper then only has to override existing reasoning or the "none" sentinel.
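The traced propagation can be illustrated like this (assumed shapes, not the actual pi-ai source): when options.reasoning is present, the provider builds body.reasoning before any OpenClaw wrapper runs.

```typescript
// Illustrative pi-ai-style param building: reasoning effort in the stream
// options is serialized into the outbound Responses body.
interface StreamOptions { reasoning?: "low" | "medium" | "high" }
interface ResponsesParams { reasoning?: { effort: string; summary: string } }

function buildParams(options: StreamOptions): ResponsesParams {
  if (options.reasoning) {
    return { reasoning: { effort: options.reasoning, summary: "auto" } };
  }
  return {};
}

console.log(buildParams({ reasoning: "high" }).reasoning);
// On this path the wrapper sees an existing object to mutate, not undefined —
// which is the core of the argument against treating the missing branch as a bug.
```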

The current PR test exercises a bare wrapper with a synthetic base stream that does not model pi-agent-core/pi-ai option propagation, so it proves the helper’s isolated behavior but not the reported runtime bug.

Could you point to the concrete production code path that is broken? Specifically: where OpenClaw has a selected/default think level, but the OpenAI/Codex stream call reaches pi-ai without options.reasoning, resulting in an outbound request with no body.reasoning. A captured outbound payload or a production-shaped regression test through applyExtraParamsToAgent + the real Pi options would make this landable.

Until then I would not merge this as a bug fix. At most it is defensive hardening, and it risks masking a test that is currently not representative of the actual run path.

@steipete
Contributor

Thanks for the follow-up and the runtime A/B notes. I dug through the production path and the upstream Pi layers before deciding this.

The normal embedded OpenAI/Codex path is already covered by the session-level reasoning path:

  • src/agents/pi-embedded-runner/run/attempt.ts passes the selected thinkLevel into createAgentSession as thinkingLevel.
  • @mariozechner/pi-agent-core converts non-off session thinking into options.reasoning for the stream call.
  • @mariozechner/pi-ai then serializes that options.reasoning into the outgoing OpenAI/Codex Responses payload as reasoning.effort.

I added a maintainer-side regression contract on main in 0211280 (test: cover OpenAI thinking payload contract) that proves the path directly:

  • OpenAI + Codex sessions with thinkingLevel: high forward options.reasoning: "high".
  • streamSimpleOpenAIResponses serializes that to reasoning: { effort: "high", summary: "auto" }.
  • streamSimpleOpenAICodexResponses serializes that to the same payload shape.
  • thinkingLevel: off is pinned too: Codex leaves reasoning absent, OpenAI Responses emits the existing explicit disabled payload.
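The contract those bullets pin can be sketched roughly as follows (hedged: `toStreamOptions` and `toBody` are stand-ins for the pi-agent-core and pi-ai steps, the real test in 0211280 exercises the actual layers, and only the Codex-style "off → absent" case is modeled — OpenAI Responses emits an explicit disabled payload instead):

```typescript
type ThinkingLevel = "off" | "low" | "medium" | "high";

// Session thinking level → options.reasoning (non-off only), pi-agent-core-style.
function toStreamOptions(level: ThinkingLevel): { reasoning?: string } {
  return level === "off" ? {} : { reasoning: level };
}

// options.reasoning → outbound body, pi-ai-style (Codex: absent when off).
function toBody(options: { reasoning?: string }): { reasoning?: { effort: string; summary: string } } {
  return options.reasoning
    ? { reasoning: { effort: options.reasoning, summary: "auto" } }
    : {};
}

console.log(toBody(toStreamOptions("high")).reasoning);
console.log(toBody(toStreamOptions("off")).reasoning); // reasoning stays absent
```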

Verification on current main:

  • pnpm test src/agents/openai-thinking-contract.test.ts => 8 passed
  • pnpm check:changed => green before commit

Given that, I do not want to land this wrapper change as-is. It makes createOpenAIThinkingLevelWrapper compensate for a missing options.reasoning path that we still cannot reproduce in the normal embedded runtime, and it would make the wrapper responsible for papering over a lower-level contract break instead of exposing it.

If you can show the concrete production code path where a selected/default thinking level exists but pi-ai receives no options.reasoning, please open a fresh PR with that repro/trace or a failing test. That would be the right thing to fix. Closing this PR for now to avoid a speculative payload mutation.

@steipete steipete closed this Apr 24, 2026
@hclsys
Contributor Author

hclsys commented Apr 24, 2026

Thanks for the detailed path trace and the regression contract in 0211280 — that's the right place to guarantee the behavior. Agreed the wrapper change would have hidden a lower-level contract break. Not reopening; will leave the wrapper alone and file a fresh PR only if I can produce a concrete repro on the embedded runtime path that reaches the wrapper with missing options.reasoning.


Labels

agents Agent runtime and tooling size: XS

Projects

None yet

3 participants