fix: dedupe OpenAI strict schema downgrade diagnostics by galiniliev · Pull Request #82933 · openclaw/openclaw

galiniliev · 2026-05-17T05:33:14Z

Summary

Problem: OpenAI/Azure OpenAI strict tool schema downgrade diagnostics repeated on nearly every request when the visible tool inventory was not strict-compatible.
Why it matters: repeated debug lines hid new provider diagnostics and made the same known downgrade look like fresh failures.
What changed: the downgrade diagnostic is now keyed by provider/model/transport/tool-violation signature and logged once per unique signature per process.
What did NOT change (scope boundary): this does not alter strict-mode selection or repair incompatible schemas; incompatible inventories still send strict=false.

Linked Issue/PR

Closes [Bug]: OpenAI strict tool schema downgrade diagnostic repeats on nearly every request #82930
Related Azure OpenAI Responses stalls before first event when memory tools are exposed #80926
This PR fixes a bug or regression

Real behavior proof (required for external PRs)

Behavior addressed: repeated "OpenAI responses tool schema strict mode downgraded to strict=false" diagnostics for the same Azure OpenAI Responses provider/model/tool-violation signature.
Real environment tested: local OpenClaw worktree, Node v24.15.0, focused OpenAI transport test file exercising the request-build diagnostic seam with a mocked OpenAI transport logger.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts --reporter=verbose
Evidence after fix (screenshot, recording, terminal capture, console output, redacted runtime log, linked artifact, or copied live output): copied terminal capture:

RUN  v4.1.6

✓ |agents-core| ../../src/agents/openai-transport-stream.test.ts > openai transport stream > deduplicates repeated OpenAI strict schema downgrade diagnostics 256ms
✓ |agents-support| ../../src/agents/openai-transport-stream.test.ts > openai transport stream > deduplicates repeated OpenAI strict schema downgrade diagnostics 180ms

Test Files  2 passed (2)
Tests  338 passed (338)
Duration  16.04s

Observed result after fix: the new regression imports the transport module in isolation with debug logging enabled, calls buildOpenAIResponsesParams twice for the same strict-incompatible provider/model/tool signature, verifies both payloads keep strict=false, and verifies the mocked logger receives exactly one downgrade diagnostic.
What was not tested: no live Azure OpenAI Responses rerun was performed because the evidence came from a private affected gateway log/session; private deployment names, session identifiers, local paths, and job names were redacted and not reused publicly.
Before evidence (optional but encouraged): redacted local log evidence showed 2,068 lines matching "strict mode downgraded"; lines 227, 238, 247, 259, and 268 repeated, within seconds of each other, the message "OpenAI responses tool schema strict mode downgraded to strict=false for azure-openai-responses/gpt-5.3-codex because 13 tool schema(s) are not strict-compatible".

Root Cause (if applicable)

Root cause: resolveOpenAIStrictToolFlagWithDiagnostics logged every debug-enabled strict downgrade without remembering that the same provider/model/transport/tool-violation set had already been reported.
Missing detection / guardrail: there was no regression coverage for repeated downgrade diagnostics.
Contributing context (if known): OpenAI strict schema compatibility is inventory-wide, so one incompatible visible tool inventory can downgrade every request until schemas are fixed or the tool surface changes.
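
For readers who want the shape of the fix without opening the diff, here is a minimal sketch of signature-keyed, once-per-process logging. It is not the PR's code: `DowngradeContext`, `downgradeSignature`, and `shouldLogDowngrade` are hypothetical names, the key is a plain JSON string rather than the hash the patch uses, and sorting the violations to make the key order-independent is an assumption.

```typescript
// Hypothetical shape for illustration; the real types live in the OpenClaw source.
interface DowngradeContext {
  transport: string;
  provider: string;
  model: string;
  violations: string[]; // one entry per strict-incompatible tool schema
}

// Process-local memory of signatures that have already been reported.
const reportedSignatures = new Set<string>();

// Assumption: violations are sorted so the same set always yields the same key.
function downgradeSignature(ctx: DowngradeContext): string {
  const sortedViolations = [...ctx.violations].sort();
  return JSON.stringify([ctx.transport, ctx.provider, ctx.model, sortedViolations]);
}

// Returns true only the first time a signature is seen in this process.
// The caller still sends strict=false either way; only the log line is gated.
function shouldLogDowngrade(ctx: DowngradeContext): boolean {
  const key = downgradeSignature(ctx);
  if (reportedSignatures.has(key)) {
    return false;
  }
  reportedSignatures.add(key);
  return true;
}
```

A changed provider, model, transport, or violation set produces a different key, so it is reported again.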

Regression Test Plan (if applicable)

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/agents/openai-transport-stream.test.ts
Scenario the test should lock in: building the same native OpenAI Responses request twice with the same strict-incompatible tool signature keeps strict=false while emitting one downgrade diagnostic.
Why this is the smallest reliable guardrail: the failure is in request-build diagnostic emission, before provider generation; an isolated module import with a mocked logger exercises the production buildOpenAIResponsesParams path without adding test-only source exports or requiring live provider credentials.
Existing test that already covers this (if any): none.
If no new test is added, why not: N/A.
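
The scenario above can be sketched outside the repository as follows. This is an illustration only: `buildParamsSketch` is a hypothetical stand-in for the production `buildOpenAIResponsesParams`, `mockLogger` stands in for the mocked subsystem logger, and the provider, model, and tool names are placeholders.

```typescript
// Stand-in for the mocked subsystem logger: it only records debug messages.
const debugCalls: string[] = [];
const mockLogger = {
  debug: (message: string): void => {
    debugCalls.push(message);
  },
};

// Signatures already reported by this sketch.
const seenSignatures = new Set<string>();

// Hypothetical stand-in for the request builder: any incompatible tool
// downgrades the whole inventory to strict=false, and the diagnostic is
// emitted once per signature.
function buildParamsSketch(
  provider: string,
  model: string,
  incompatibleTools: string[],
): { strict: boolean } {
  const strict = incompatibleTools.length === 0;
  if (!strict) {
    const key = JSON.stringify([provider, model, [...incompatibleTools].sort()]);
    if (!seenSignatures.has(key)) {
      seenSignatures.add(key);
      mockLogger.debug(`strict mode downgraded to strict=false for ${provider}/${model}`);
    }
  }
  return { strict };
}

// Build the same request twice, as the regression test does.
const firstPayload = buildParamsSketch("openai", "example-model", ["example_tool"]);
const secondPayload = buildParamsSketch("openai", "example-model", ["example_tool"]);
```

Both payloads keep `strict` set to `false`, while `debugCalls` holds a single entry. That is the pair of assertions the regression locks in.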

User-visible / Behavior Changes

OpenAI strict schema downgrade debug logs are less noisy for repeated identical tool-schema incompatibilities. Provider payload strictness behavior is unchanged.

Diagram (if applicable)

Before:
[same incompatible tool inventory] -> [strict=false] -> [debug line on every request]

After:
[same incompatible tool inventory] -> [strict=false] -> [debug line once per unique signature]

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: local source checkout; original affected OS unknown (not enough information in the redacted evidence)
Runtime/container: Node v24.15.0
Model/provider: azure-openai-responses/gpt-5.3-codex in before evidence; local regression uses native OpenAI Responses model metadata
Integration/channel (if any): OpenAI transport
Relevant config (redacted): native Azure OpenAI Responses route with strict tool shaping enabled and 13 strict-incompatible schemas in the visible tool inventory

Steps

Enable debug logging for OpenAI transport/tool schema diagnostics.
Send repeated requests using the same native OpenAI/Azure OpenAI Responses model and same strict-incompatible tool inventory.
Observe repeated downgrade diagnostics before the patch; after the patch, identical signatures are suppressed after the first log.

Expected

Identical strict downgrade diagnostics are emitted once per unique provider/model/transport/tool-violation signature.

Actual

Before the patch, the same downgrade diagnostic repeated on nearly every request.
After the patch, the focused regression test proves identical signatures are deduplicated.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios: focused OpenAI transport test file passes after adding the behavior-level dedupe regression.
Edge cases checked: the repeated same-signature request path keeps strict=false for both payload builds while suppressing only the duplicate debug diagnostic.
What you did not verify: live Azure OpenAI Responses rerun with the private affected session/provider setup.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: a later identical downgrade diagnostic is suppressed even if an operator wanted repeated per-request reminders.
- Mitigation: changed provider/model/transport/tool-violation signatures still log, and strict=false behavior is unchanged.

clawsweeper · 2026-05-17T05:34:20Z

Codex review: needs real behavior proof before merge.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
Adds a process-local keyed cache to log OpenAI strict tool-schema downgrade diagnostics once per provider/model/transport/violation signature, plus focused transport coverage and a changelog entry.

Reproducibility: yes, at source level: current main emits the downgrade log every time the strict-false debug branch is reached, and the linked report supplies repeated real log lines. I did not establish a live Azure reproduction in this read-only review.

PR rating
Overall: 🦪 silver shellfish
Proof: 🦪 silver shellfish
Patch quality: 🦞 diamond lobster
Summary: The implementation is focused and well-covered at the seam, but mock-only after-fix proof keeps the PR below normal merge readiness.

Rank-up moves:

Post redacted after-fix runtime output, terminal capture, copied live output, or logs from a real OpenClaw/Azure or native OpenAI run showing one downgrade diagnostic and suppression on a repeat.

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs real behavior proof before merge: The PR body provides before logs and a focused mocked test run, but no after-fix live OpenClaw/Azure or native OpenAI runtime output showing suppression in a real setup. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

No after-fix live OpenClaw/Azure or native OpenAI runtime output has been posted showing the repeated diagnostic suppressed outside the focused mocked test seam.
The process-local cache intentionally suppresses identical debug reminders until process restart or the 256-key cache clears, so maintainers should accept that operator-facing logging behavior before merge.
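
To make the operator-facing consequence concrete, here is a minimal sketch of a bounded "seen" set. The class name `BoundedSeenSet` is hypothetical, and the eviction policy shown (drop every key once the bound is reached) is an assumption read from the phrase "the 256-key cache clears"; the actual patch may differ.

```typescript
// Bound taken from the review text above; everything else is illustrative.
const MAX_SIGNATURES = 256;

class BoundedSeenSet {
  private readonly keys = new Set<string>();
  private readonly limit: number;

  constructor(limit: number = MAX_SIGNATURES) {
    this.limit = limit;
  }

  // Returns true when the key is new (caller should log), false for a repeat.
  markIfNew(key: string): boolean {
    if (this.keys.has(key)) {
      return false;
    }
    // Assumed policy: clear everything once the bound is reached, so memory
    // stays capped at the cost of occasionally re-logging an old signature.
    if (this.keys.size >= this.limit) {
      this.keys.clear();
    }
    this.keys.add(key);
    return true;
  }

  get size(): number {
    return this.keys.size;
  }
}
```

Under this policy a signature is silent until either the process restarts or enough distinct signatures arrive to trigger a clear, after which it is reported once more.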

Maintainer options:

Decide the mitigation before merge
Merge only after redacted after-fix runtime proof is added or a maintainer explicitly overrides the proof gate, while preserving the existing strict=false payload behavior.
Pause or close
Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge
Manual review is needed because the PR has a protected maintainer label and needs contributor or maintainer-approved real runtime proof, not an automated code repair.

Security
Cleared: the diff adds in-memory diagnostic state plus test/changelog coverage, with no new dependency, network, secret, permission, or execution surface.

Review details

Best possible solution:

Merge only after redacted after-fix runtime proof is added or a maintainer explicitly overrides the proof gate, while preserving the existing strict=false payload behavior.

Do we have a high-confidence way to reproduce the issue?

Yes, at source level: current main emits the downgrade log every time the strict-false debug branch is reached, and the linked report supplies repeated real log lines. I did not establish a live Azure reproduction in this read-only review.

Is this the best way to solve the issue?

Yes: the bounded signature cache is a narrow maintainable fix for repeated diagnostics and leaves strict-mode selection unchanged. The remaining blocker is proof and protected-label maintainer handling, not an obvious code defect.

Acceptance criteria:

node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts --reporter=verbose
Redacted after-fix OpenClaw/Azure or native OpenAI runtime proof with debug logging showing one downgrade diagnostic for an identical signature and suppression on repeat

What I checked:

Current main repeats the downgrade diagnostic: On current main, resolveOpenAIStrictToolFlagWithDiagnostics calls log.debug every time strict mode is requested, resolves to false, and debug logging is enabled; there is no process-local dedupe in that branch. (src/agents/openai-transport-stream.ts:971, 2ab3a4e422a0)
PR branch adds bounded diagnostic dedupe: The PR head adds a 256-entry process-local Set and hashes transport, provider, model, and strict schema diagnostics before deciding whether to emit the debug line; it returns the same strict value either way. (src/agents/openai-transport-stream.ts:973, eebb15ed0bce)
PR branch covers the request-build seam: The added test imports the OpenAI transport with a mocked subsystem logger, builds the same strict-incompatible Responses payload twice, verifies both payloads keep strict=false, and expects one downgrade debug call. (src/agents/openai-transport-stream.test.ts:3018, eebb15ed0bce)
Strict-mode selection remains separate: The existing strict tool setting resolver enables native OpenAI/Azure strict shaping independently of the new diagnostic cache, so the PR does not change provider payload strictness selection. (src/agents/openai-strict-tool-setting.ts:42, 2ab3a4e422a0)
Proof gate remains unresolved: The PR body includes before-log evidence and a post-patch focused test run, but explicitly says no live Azure OpenAI Responses rerun was performed; the live PR context also carries the "maintainer" and "status: 📣 needs proof" labels. (eebb15ed0bce)
Feature-history signal: A bounded blame/history pass ties the current main strict downgrade diagnostic branch to commit 6899eff in the available local history; older authorship is not exposed by the checked-out history. (src/agents/openai-transport-stream.ts:944, 6899eff155ec)

Likely related people:

@shakkernerd: The available main-branch blame for the strict downgrade diagnostic resolver and adjacent OpenAI transport code points to this commit as the current implementation history signal; the trail is broad, so confidence is limited. (role: current-main code history; confidence: low; commits: 6899eff155ec; files: src/agents/openai-transport-stream.ts, src/agents/openai-tool-schema.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 2ab3a4e422a0.

clawsweeper · 2026-05-20T02:33:24Z

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?

The egg game starts only after the PR passes the real-behavior proof check.
Before that, no creature or rarity is rolled. The treat waits for real proof.
This is still just collectible flavor: proof affects review readiness, not creature quality.

galiniliev · 2026-05-20T03:48:09Z

Maintainer landing proof for eebb15ed0bce067078f36ca6ac6d40a7be316f9d:

Local focused proof after rebase: node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts --reporter=verbose
- Result: 2 files passed, 338 tests passed.
Azure Crabbox proof: provider azure, lease cbx_bb323ff13ba1, slug blue-shrimp, remote HEAD eebb15ed0bce067078f36ca6ac6d40a7be316f9d.
- Remote command: node scripts/run-vitest.mjs src/agents/openai-transport-stream.test.ts --reporter=verbose
- Result lines included both agents-core and agents-support passing deduplicates repeated OpenAI strict schema downgrade diagnostics, with Test Files 2 passed (2) and Tests 338 passed (338).
GitHub CI at landing check: merge state CLEAN, mergeable MERGEABLE, Real behavior proof SUCCESS, CI SUCCESS, CodeQL/security/workflow checks success, skipped, or neutral only.

Known proof gap: no live Azure OpenAI Responses rerun with the private affected credentials/session was performed. I’m accepting the focused OpenAI transport behavior proof here because the bug is the request-build diagnostic emission seam, the test exercises production buildOpenAIResponsesParams twice with debug logging enabled, and the patch leaves strict payload behavior unchanged.

…026.5.20) (#615) This PR contains the following updates: | Package | Update | Change | |---|---|---| | [ghcr.io/openclaw/openclaw](https://openclaw.ai) ([source](https://github.com/openclaw/openclaw)) | patch | `2026.5.19` → `2026.5.20` | --- > ⚠️ **Warning** > > Some dependencies could not be looked up. Check the [Dependency Dashboard](issues/567) for more information. --- ### Release Notes <details> <summary>openclaw/openclaw (ghcr.io/openclaw/openclaw)</summary> ### [`v2026.5.20`](https://github.com/openclaw/openclaw/blob/HEAD/CHANGELOG.md#2026520) [Compare Source](openclaw/openclaw@v2026.5.19...v2026.5.20) ##### Changes - Exec approvals: remove the old `cat SKILL.md && printf ... && <skill-wrapper>` allowlist compatibility path so skill files must be loaded with the read tool and only the real skill executable is auto-allowed. - Discord: let voice sessions follow configured Discord users into voice channels, with allowed-channel checks, multi-user handoff, bounded reconciliation, and DAVE recovery preservation. ([#84264](openclaw/openclaw#84264)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev). - Discord/voice: include bounded `IDENTITY.md`, `USER.md`, and `SOUL.md` profile context in realtime voice session instructions by default, with `voice.realtime.bootstrapContextFiles: []` available to disable it. ([#84499](openclaw/openclaw#84499)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev). - Dependencies: bump the bundled Codex harness to `@openai/codex` `0.132.0` and refresh the app-server model-list docs for the new catalog. - CLI/policy: add the bundled Policy plugin for policy-backed channel conformance checks, doctor lint findings, and opt-in workspace repair. ([#80407](openclaw/openclaw#80407)) Thanks [@giodl73-repo](https://github.com/giodl73-repo). - Agents/config: allow `agents.list[].experimental.localModelLean` so lean local-model mode can be enabled for one configured agent instead of globally. 
- Providers/xAI: add device-code OAuth login so remote and headless setups can authorize xAI without a localhost browser callback. ([#84005](openclaw/openclaw#84005)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev). - Providers/OpenRouter: honor provider-level `params.provider` routing policy for OpenRouter requests, with model and agent params overriding the defaults. Thanks [@amknight](https://github.com/amknight). ##### Fixes - CLI/tasks: include stale-running task maintenance decisions in `openclaw tasks maintenance --json` so retained and reconcile candidates explain backing-session, cron, CLI, and wedged-subagent state. ([#84691](openclaw/openclaw#84691)) Thanks [@efpiva](https://github.com/efpiva). - Codex app-server: keep system-prompt reports working when bootstrap hooks provide workspace files with only a path and content, so hook-supplied SOUL/IDENTITY/TOOLS/USER context still reports injected characters correctly. ([#84736](openclaw/openclaw#84736)) Thanks [@JARVIS-Glasses](https://github.com/JARVIS-Glasses). - Providers/MiniMax music: stop advertising `durationSeconds` control and remove prompt-injected duration hints, so `music_generate` reports MiniMax duration as an unsupported override instead of suggesting MiniMax can enforce track length. Fixes [#84508](openclaw/openclaw#84508). Thanks [@neeravmakwana](https://github.com/neeravmakwana). - Doctor: warn when sandbox tool policy hides configured MCP server tools before provider requests. ([#84699](openclaw/openclaw#84699)) Thanks [@nxmxbbd](https://github.com/nxmxbbd). - WhatsApp: update Baileys to `7.0.0-rc12`. - Build: suppress per-locale `rolldown-plugin-dts:fake-js` CommonJS dts warnings emitted while bundling the intentionally-inlined `zod/v4/locales/*.d.cts` files, so `pnpm build` output stays readable after the 0.25.1 plugin bump. Thanks [@romneyda](https://github.com/romneyda). 
- CLI/nodes: route lazy plugin-registration logs to stderr for JSON-mode `openclaw nodes` commands so stdout stays parseable. ([#84684](openclaw/openclaw#84684)) Thanks [@TurboTheTurtle](https://github.com/TurboTheTurtle). - Approvals: route manual `/approve` decisions through the trusted approval runtime so active exec and plugin approvals no longer look unknown or expired. - Mac app: update the About settings copyright year to 2026. ([#84385](openclaw/openclaw#84385)) Thanks [@pejmanjohn](https://github.com/pejmanjohn). - Dependencies: update `@openclaw/fs-safe` to `0.2.7` so OpenClaw's default Python-helper-off policy keeps best-effort Node write fallbacks for private stores, secret writes, run logs, and media attachments on Linux/macOS. - Infra/secrets: restore the fail-closed contract for `tryReadSecretFileSync` so credential loaders that pass `rejectSymlink: true` (Telegram, LINE, Zalo, IRC, Nextcloud Talk tokens) refuse symlinked credential files instead of silently accepting them, and the infra-state CI shard's secret-file symlink test passes again. Thanks [@romneyda](https://github.com/romneyda). - Browser: honor the configured image sanitization limit for screenshots and labeled snapshots so browser-captured images follow the same resize policy as other image results. ([#84595](openclaw/openclaw#84595)) - Doctor: remove unrecognized `models.providers.*.models[*].compat.thinkingFormat` values during `doctor --fix` so stale provider model config can validate after upgrade. Fixes [#77803](openclaw/openclaw#77803). - Doctor: warn when `openclaw.json` stores plaintext secret-bearing config fields, including model provider API keys and sensitive provider headers. ([#84718](openclaw/openclaw#84718)) Thanks [@lukaIvanic](https://github.com/lukaIvanic). - Status: show the configured default, session-selected model, reason, clear hint, and docs link when a session remains pinned to a model that differs from `agents.defaults.model.primary`. 
- WebChat: clear stale typing indicators when session change events mark the active chat run complete. - Mac app: keep local packaging signed with a stable app identity for permission testing and fix Control UI production builds under current Vite/Highlight.js exports. - macOS app: update the embedded Peekaboo bridge to 3.2.1 so OpenClaw-hosted UI automation works with current Peekaboo CLI capture flows. - Cron: deliver preferred final assistant output for successful scheduled runs when trailing plain tool warnings remain in diagnostics instead of marking the run failed. - fix(mattermost): fail closed on missing channel type \[AI]. ([#84091](openclaw/openclaw#84091)) Thanks [@pgondhi987](https://github.com/pgondhi987). - Recheck rebuilt system.run argv \[AI]. ([#84090](openclaw/openclaw#84090)) Thanks [@pgondhi987](https://github.com/pgondhi987). - CLI: keep the private QA subcommand out of exported command descriptors unless `OPENCLAW_ENABLE_PRIVATE_QA_CLI=1`, so root help and subcommand markers match runtime registration. ([#84519](openclaw/openclaw#84519)) - CLI/cron: bound `openclaw cron show` job lookup pagination so non-advancing or unbounded `cron.list` responses fail instead of hanging the command. Fixes [#83856](openclaw/openclaw#83856). ([#83989](openclaw/openclaw#83989)) - Agents/messages: stop message-tool-only turns after a successful source-channel `message` send while keeping transcript mirrors under the session write lock. ([#84289](openclaw/openclaw#84289)) - Agents: filter silent heartbeat response-tool transcript artifacts out of embedded context snapshots so later user turns are not polluted by heartbeat no-op messages. ([#83477](openclaw/openclaw#83477)) Thanks [@fuller-stack-dev](https://github.com/fuller-stack-dev). - Agents/OpenAI: log repeated strict tool-schema downgrade diagnostics once per provider/model/tool signature, reducing duplicate debug noise while preserving `strict=false` fallback behavior. 
Fixes [#82930](openclaw/openclaw#82930). ([#82933](openclaw/openclaw#82933)) Thanks [@galiniliev](https://github.com/galiniliev). - Agents/code mode: spell out the `exec` tool's JavaScript/TypeScript, no Node module, and catalog-bridge constraints in model-visible schema text so agents can use enabled tools without trial-and-error. ([#84269](openclaw/openclaw#84269)) Thanks [@Kaspre](https://github.com/Kaspre). - Codex: give `image_generate` dynamic-tool calls a 120s default watchdog when no per-call or configured image timeout is set, so image generation no longer falls back to the generic 30s bridge timeout. ([#84254](openclaw/openclaw#84254)) Thanks [@moritzmmayerhofer](https://github.com/moritzmmayerhofer). - Codex: avoid duplicate dynamic tool terminal diagnostics while large diagnostic backlogs drain without blocking tool responses. ([#82937](openclaw/openclaw#82937)) Thanks [@galiniliev](https://github.com/galiniliev). - CLI/message: include a stable top-level `messageId` in `openclaw message --json` output when channel sends return one. ([#84191](openclaw/openclaw#84191)) Thanks [@100menotu001](https://github.com/100menotu001). - Cron: preserve legacy top-level array `jobs.json` stores when loading or adding scheduled jobs so old cron jobs are no longer treated as an empty store during upgrade. Fixes [#60799](openclaw/openclaw#60799). ([#84433](openclaw/openclaw#84433)) Thanks [@IWhatsskill](https://github.com/IWhatsskill). - Gateway/agents: use an agent's `identity.name` in Gateway agent summaries when `agents.list[].name` is unset, so configured agent labels remain visible in clients. ([#84355](openclaw/openclaw#84355); refs [#57835](openclaw/openclaw#57835)) Thanks [@luoyanglang](https://github.com/luoyanglang). - Channels/replies: keep normal `/verbose` failed-tool progress compact in message-tool replies and prevent late text-only tool output from appearing after the final answer. 
([#84303](openclaw/openclaw#84303)) Thanks [@VACInc](https://github.com/VACInc). - Plugins/hooks: apply a default 30-second timeout to `before_compaction` and `after_compaction` hooks so a hung plugin handler no longer blocks compaction completion. ([#84153](openclaw/openclaw#84153)) - Discord: preserve disabled presentation buttons when adapting and rendering Discord message controls. ([#84188](openclaw/openclaw#84188)) Thanks [@100menotu001](https://github.com/100menotu001). - Twitch: add a test-only client-manager registry reset helper so non-isolated Twitch tests can clear cached managers between cases. Fixes [#83887](openclaw/openclaw#83887). ([#84244](openclaw/openclaw#84244)) Thanks [@hclsys](https://github.com/hclsys). - Cron: run main-session scheduled work on a cron-owned wake lane while preserving reply delivery context, so background cron turns no longer block human main-session chat. Fixes [#82766](openclaw/openclaw#82766). ([#82767](openclaw/openclaw#82767)) Thanks [@galiniliev](https://github.com/galiniliev). - Cron: use structured embedded-run denial metadata for isolated scheduled tasks so blocked exec requests fail the job without treating ordinary assistant prose as a denial. ([#84067](openclaw/openclaw#84067)) Thanks [@abnershang](https://github.com/abnershang). - Cron: keep recovered tool warnings diagnostic for successful scheduled runs so final cron output is delivered instead of being replaced by a post-processing warning. ([#84045](openclaw/openclaw#84045)) Thanks [@abnershang](https://github.com/abnershang). - Plugins/perf: thread explicit plugin discovery results through `loadBundledCapabilityRuntimeRegistry`, `resolveBundledPluginSources`, and `listChannelCatalogEntries` so callers that already hold a discovery result skip redundant filesystem walks. Thanks [@SebTardif](https://github.com/SebTardif). - harden update restart script creation \[AI]. ([#84088](openclaw/openclaw#84088)) Thanks [@pgondhi987](https://github.com/pgondhi987). 
- Docker: keep the bundled Codex plugin in official release image keep lists so the default OpenAI agent harness remains available after Docker pruning. Fixes [#83613](openclaw/openclaw#83613). ([#83626](openclaw/openclaw#83626)) Thanks [@YuanHanzhong](https://github.com/YuanHanzhong). - CLI/channels: preserve the first line of `openclaw channels logs` output when the rolling tail window starts exactly on a line boundary, mirroring the already-fixed `readLogSlice` behavior in `src/logging/log-tail.ts`. - Control UI: treat terminal session status as authoritative over stale active-run flags so completed terminal runs stop showing abort/live UI. ([#84057](openclaw/openclaw#84057)) - CLI: preserve embedded equals signs in inline root option values instead of truncating after the second separator. ([#83995](openclaw/openclaw#83995)) Thanks [@ThiagoCAltoe](https://github.com/ThiagoCAltoe). - Matrix/config: accept `messages.queue.byChannel.matrix` queue overrides and keep queue provider schema/type keys aligned for Matrix, Google Chat, and Mattermost. Thanks [@bdjben](https://github.com/bdjben). - CLI: format `openclaw acp client` failures through the shared error formatter so object-shaped errors stay readable instead of printing `[object Object]`. Fixes [#83904](openclaw/openclaw#83904). ([#84080](openclaw/openclaw#84080)) - Providers/Ollama: default unknown-capabilities models to tool-capable so discovered native Ollama models can use tools when `/api/show` omits capabilities. ([#84055](openclaw/openclaw#84055)) Thanks [@dutifulbob](https://github.com/dutifulbob). - Installer/Windows: launch `install.ps1` onboarding as an attached child process so fresh native Windows installs do not freeze visibly at `Starting setup...` or corrupt the wizard's terminal rendering. 
- CLI/update: keep restart health checks working across one-version CLI/Gateway protocol skew and use the managed Gateway service Node for all follow-up commands even when the package root is unchanged, so `openclaw update` no longer silently switches the gateway to a different Node binary when multiple Node installations are present. Thanks [@amknight](https://github.com/amknight).
- CLI/gateway: include the running Gateway version in `gateway status` JSON output, preserving existing server metadata while falling back to status RPC data for read probes. Fixes [#56222](openclaw/openclaw#56222). Thanks [@galiniliev](https://github.com/galiniliev).
- Memory/search: close local embedding providers when active-memory searches time out so pending local model loads and embedding contexts are aborted and released. ([#83858](openclaw/openclaw#83858)) Thanks [@brokemac79](https://github.com/brokemac79).
- CLI/nodes: request pending node surface approval scopes before `openclaw nodes approve` so exec-capable node approval can use admin-scoped Gateway credentials instead of failing with `missing scope: operator.admin`. ([#84392](openclaw/openclaw#84392)) Thanks [@joshavant](https://github.com/joshavant).
- Gateway: reject slow node event sends before outbound buffers grow unbounded and log the rejected payload diagnostic. ([#84387](openclaw/openclaw#84387)) Thanks [@samzong](https://github.com/samzong).
- Agents: include bounded trajectory queued-writer diagnostics in `pi-trajectory-flush` timeout warnings so flush stalls show pending writes, queued bytes, and append state. Fixes [#82961](openclaw/openclaw#82961). ([#82962](openclaw/openclaw#82962)) Thanks [@galiniliev](https://github.com/galiniliev).
- Agents/subagents: recover stale completion announces by retrying unsupported transcript-wait wakes without transcript waiting and forcing a message-tool handoff when the requester run is already stale. Fixes [#83699](openclaw/openclaw#83699). ([#83700](openclaw/openclaw#83700)) Thanks [@galiniliev](https://github.com/galiniliev).
- Agents/subagents: constrain wildcard subagent target allowlists to configured agents while preserving explicitly listed compatibility targets. Fixes [#84040](openclaw/openclaw#84040). ([#84357](openclaw/openclaw#84357)) Thanks [@joshavant](https://github.com/joshavant).
- Providers/Anthropic: route Anthropic model refs selected with Claude CLI auth through the Claude CLI runtime so shorthand refs such as `anthropic/opus-4.7` no longer fall back to embedded Anthropic billing. Fixes [#84222](openclaw/openclaw#84222). ([#84374](openclaw/openclaw#84374)) Thanks [@joshavant](https://github.com/joshavant).
- Agents: honor explicit `models.providers.<id>.timeoutSeconds` values above the default idle watchdog for cloud and self-hosted providers, so long first-token waits no longer fall back at \~120s when the provider timeout is higher. ([#83979](openclaw/openclaw#83979)) Thanks [@yujiawei](https://github.com/yujiawei).
- Agents/Codex: keep encrypted Responses reasoning replay provenance-bound so stale mirrored Codex transcripts drop invalid encrypted content before request assembly while preserving matching same-session replay. Fixes [#83836](openclaw/openclaw#83836). ([#84367](openclaw/openclaw#84367)) Thanks [@joshavant](https://github.com/joshavant).
- Agents/subagents: skip stale embedded-run wake probes for dormant completion requesters, so late subagent completions go straight to requester-agent/direct handoff instead of producing `reason=no_active_run` queue noise. ([#82964](openclaw/openclaw#82964)) Thanks [@galiniliev](https://github.com/galiniliev).
- CLI: retry config snapshot reads after a transient failure so one rejected read no longer poisons later commands in the same process. ([#83931](openclaw/openclaw#83931)) Thanks [@honor2030](https://github.com/honor2030).
- Media: decode URL path basenames before using them as remote media fallback filenames, so files like `My%20Report.pdf` are surfaced as `My Report.pdf`. Fixes [#84050](openclaw/openclaw#84050). ([#84052](openclaw/openclaw#84052)) Thanks [@jbetala7](https://github.com/jbetala7).
- WhatsApp: clarify inbound group diagnostics so observed but unregistered groups point to `channels.whatsapp.groups` without changing routing or sender authorization. ([#83846](openclaw/openclaw#83846)) Thanks [@neeravmakwana](https://github.com/neeravmakwana).
- WhatsApp: drain pending outbound deliveries on a 30s periodic timer in addition to the reconnect handler, so messages enqueued while the provider is already connected no longer wait for the next reconnect to send. ([#79083](openclaw/openclaw#79083)) Thanks [@Oviemudiaga](https://github.com/Oviemudiaga).
- CLI/TUI: include gateway plugin slash commands in TUI autocomplete, so connected sessions can suggest plugin-owned commands exposed by the running Gateway. ([#83640](openclaw/openclaw#83640)) Thanks [@se7en-agent](https://github.com/se7en-agent).
- Gateway/mobile: restore QR setup-code handoff of bounded operator tokens for iOS and Android onboarding while keeping admin and pairing scopes out of bootstrap. ([#83684](openclaw/openclaw#83684)) Thanks [@ngutman](https://github.com/ngutman).
- iOS: repair Release archive compilation for the TestFlight build. ([#84255](openclaw/openclaw#84255)) Thanks [@ngutman](https://github.com/ngutman).
- Agents/compaction: bound plugin-owned CLI transcript compaction with the host safety timeout so a hung context engine can no longer stall post-turn cleanup. ([#84083](openclaw/openclaw#84083)) Thanks [@100yenadmin](https://github.com/100yenadmin).
- Control UI/usage: truncate long context skill, tool, and file names in the usage panel while keeping the full name available on hover. ([#42197](openclaw/openclaw#42197)) Thanks [@Rain120](https://github.com/Rain120).
- Codex: respect explicit `models auth order set` and `config.auth.order` precedence over stale `lastGood` in `/codex account`, and show `no working credential` when every explicit-order profile is ineligible instead of marking a lower-ranked profile as active. Fixes [#84386](openclaw/openclaw#84386). ([#84412](openclaw/openclaw#84412)) Thanks [@openperf](https://github.com/openperf).
- Agents: honor `messages.suppressToolErrors` for mutating tool failures so configured chat surfaces do not receive separate warning payloads. ([#81561](openclaw/openclaw#81561)) Thanks [@moeedahmed](https://github.com/moeedahmed).
- Agents/fallback: surface billing guidance for mixed rate-limit plus billing fallback exhaustion instead of generic failure copy. Fixes [#79396](openclaw/openclaw#79396). ([#79489](openclaw/openclaw#79489)) Thanks [@aayushprsingh](https://github.com/aayushprsingh).

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/615

* fix: dedupe openai strict schema downgrade logs
* test: align openai transport helper export
* test: cover openai downgrade log behavior
* docs: note openai downgrade diagnostic dedupe

---------

Co-authored-by: Galin Iliev <Galin.Iliev@microsoft.com>
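
For readers who have not opened the diff: the PR summary describes the fix as keying the downgrade diagnostic by provider/model/transport/tool-violation signature and logging it once per unique signature per process. A minimal sketch of that idea follows. Every name in it (`DowngradeContext`, `downgradeSignature`, `logDowngradeOnce`, the log text) is hypothetical and is not taken from the OpenClaw source; the merged implementation may be structured differently.

```typescript
// Hypothetical sketch: these names are illustrative, not the OpenClaw API.
type DowngradeContext = {
  provider: string;
  model: string;
  transport: string;
  // Tool name mapped to the strict-schema violations found for that tool.
  violations: Record<string, string[]>;
};

// Build a stable key. Tools and their violations are sorted so that two
// requests with the same inventory in a different order share one signature.
function downgradeSignature(ctx: DowngradeContext): string {
  const tools = Object.entries(ctx.violations)
    .map(([tool, reasons]) => `${tool}:${[...reasons].sort().join(",")}`)
    .sort();
  return JSON.stringify([ctx.provider, ctx.model, ctx.transport, tools]);
}

// Module-level set: it lives as long as the process, so each signature
// is reported at most once per process.
const seenSignatures = new Set<string>();

// Returns true when the line was emitted, false when it was suppressed.
function logDowngradeOnce(
  ctx: DowngradeContext,
  log: (line: string) => void,
): boolean {
  const signature = downgradeSignature(ctx);
  if (seenSignatures.has(signature)) {
    return false;
  }
  seenSignatures.add(signature);
  // Illustrative message text only.
  log(`strict mode downgraded to strict=false for ${ctx.provider}/${ctx.model}`);
  return true;
}

// Usage: the second call with the same signature is suppressed.
const exampleLines: string[] = [];
const exampleCtx: DowngradeContext = {
  provider: "azure-openai",
  model: "example-model",
  transport: "responses",
  violations: { memory_search: ["missing additionalProperties=false"] },
};
logDowngradeOnce(exampleCtx, (line) => exampleLines.push(line));
logDowngradeOnce(exampleCtx, (line) => exampleLines.push(line));
```

Only logging is deduplicated in this sketch; as the summary states, strict-mode selection itself is unchanged and incompatible inventories still send strict=false on every request.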

openclaw-barnacle Bot added agents Agent runtime and tooling size: S maintainer Maintainer-authored PR labels May 17, 2026

clawsweeper Bot mentioned this pull request May 17, 2026

[Bug]: OpenAI strict tool schema downgrade diagnostic repeats on nearly every request #82930

Closed

clawsweeper Bot added the P2 Normal backlog priority with limited blast radius. label May 17, 2026

galiniliev assigned steipete and galiniliev and unassigned steipete May 20, 2026

clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. labels May 20, 2026

openclaw-barnacle Bot added size: M and removed size: S labels May 20, 2026

galiniliev force-pushed the bug-006-openai-strict-downgrade-log branch from a4441d2 to 10c6bcb Compare May 20, 2026 03:16

openclaw-barnacle Bot added size: S and removed size: M labels May 20, 2026

galiniliev and others added 4 commits May 20, 2026 03:20

fix: dedupe openai strict schema downgrade logs

c9121fc

test: align openai transport helper export

5902ef0

test: cover openai downgrade log behavior

5dd892d

docs: note openai downgrade diagnostic dedupe

eebb15e

galiniliev force-pushed the bug-006-openai-strict-downgrade-log branch from 10c6bcb to eebb15e Compare May 20, 2026 03:20

galiniliev removed the status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. label May 20, 2026

galiniliev merged commit c982358 into openclaw:main May 20, 2026
101 checks passed

galiniliev merged 4 commits into openclaw:main from galiniliev:bug-006-openai-strict-downgrade-log
