Skip to content

fix: clarify custom provider HTML responses#84703

Open
zerone0x wants to merge 1 commit into
openclaw:mainfrom
zerone0x:fix/custom-provider-html-response
Open

fix: clarify custom provider HTML responses#84703
zerone0x wants to merge 1 commit into
openclaw:mainfrom
zerone0x:fix/custom-provider-html-response

Conversation

@zerone0x

Copy link
Copy Markdown
Contributor

Summary

  • Reject custom-provider verification probes that return successful-looking non-JSON responses, with a /v1 base URL hint.
  • Surface a clearer OpenAI-compatible runtime error when the provider returns HTML instead of an API response.
  • Add regression coverage for onboarding verification and OpenAI-compatible streaming HTML responses.

Fixes #84697

Verification

  • node scripts/run-vitest.mjs src/commands/onboard-custom.test.ts src/commands/onboard-custom-config.test.ts src/agents/openai-transport-stream.test.ts
  • git diff --check

Real behavior proof

Behavior addressed: OpenAI-compatible custom providers that point at a web root can return text/html; OpenClaw now rejects that during setup and reports a base URL/API-path hint at runtime instead of falling through to a generic response failure.
Real environment tested: Local checkout with a loopback HTTP server in src/agents/openai-transport-stream.test.ts returning text/html to the OpenAI-compatible Chat Completions request path.
Exact steps or command run after this patch: node scripts/run-vitest.mjs src/commands/onboard-custom.test.ts src/commands/onboard-custom-config.test.ts src/agents/openai-transport-stream.test.ts
Evidence after fix: The new runtime regression asserts the error includes "returned HTML instead of an API response" and "baseUrl includes the provider API path, such as /v1"; the onboarding regression asserts a successful HTTP response with invalid JSON is rejected with the same base URL guidance.
Observed result after fix: 4 test files passed; 396 tests passed.
What was not tested: Live spanagent.xyz provider traffic; the regression uses a local HTTP server to reproduce the HTML response class safely.

@openclaw-barnacle openclaw-barnacle Bot added commands Command implementations agents Agent runtime and tooling size: S triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI. labels May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

Codex review: needs real behavior proof before merge.

Latest ClawSweeper review: 2026-05-22 15:57 UTC / May 22, 2026, 11:57 AM ET.

Workflow note: Future ClawSweeper reviews update this same comment in place.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Summary
The branch rejects successful-looking non-JSON custom-provider probes, adds OpenAI-compatible HTML-response baseUrl hints and tests, and also changes Codex app-server node/sandbox tool routing plus the changelog.

Reproducibility: yes. Source inspection shows current main accepts any 2xx custom-provider verification response and the OpenAI SDK posts Chat Completions to /chat/completions, while the PR head clearly removes the current-main node-host Codex routing guard and tests.

PR rating
Overall: 🧂 unranked krab
Proof: 🦪 silver shellfish
Patch quality: 🧂 unranked krab
Summary: The provider fix has focused tests, but mock-only proof plus an unrelated Codex execution regression make the current PR not quality-ready.

Rank-up moves:

  • Rebase on current main and restore the Codex host=node routing code, tests, and changelog entry.
  • Add redacted real OpenClaw/custom-provider proof showing the new setup or runtime baseUrl hint after the patch.
What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

Real behavior proof
Needs real behavior proof before merge: The PR body reports loopback Vitest coverage only; it needs redacted real OpenClaw/custom-provider terminal output, logs, screenshot, recording, or linked artifact after the patch before merge unless maintainers override proof. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Risk before merge

  • Merging the current head would undo current-main node-host Codex execution routing, so /exec host=node can lose OpenClaw exec/process and fall back to native app-server Code Mode on the gateway instead of the selected node.
  • The PR body only shows loopback Vitest coverage; it does not show a real OpenClaw/custom-provider setup surfacing the new baseUrl hint after the patch.
  • The branch is not focused on the custom-provider bug anymore because it also changes Codex execution routing and removes the corresponding changelog entry from current main.

Maintainer options:

  1. Rebase and restore node-host routing (recommended)
    Update the branch onto current main and keep the isEffectiveExecHostNode plus addNodeShellDynamicToolsIfNeeded behavior and regression tests before considering the custom-provider fix.
  2. Split the unrelated Codex change out
    If the contributor intended to change Codex app-server tool routing, move that into a separate maintainer-reviewed PR with explicit rationale and proof.
  3. Require proof after repair
    After the code regression is fixed, require redacted terminal/log/screenshot proof from a real OpenClaw custom-provider run or an explicit maintainer proof override.

Next step before merge
The remaining action is contributor or maintainer handling: rebase away the unrelated Codex regression and provide real behavior proof or an explicit proof override.

Security
Needs attention: The branch removes a current-main guard that keeps node-host shell execution on the OpenClaw-selected node path instead of native Codex app-server execution.

Review findings

  • [P1] Restore node-host Codex tool routing — extensions/codex/src/app-server/run-attempt.ts:3669-3675
Review details

Best possible solution:

Rebase on current main, keep the custom-provider JSON/HTML diagnostic changes, restore the node-host Codex routing fix and its tests/changelog, then add redacted real OpenClaw/custom-provider proof or get an explicit maintainer proof override.

Do we have a high-confidence way to reproduce the issue?

Yes. Source inspection shows current main accepts any 2xx custom-provider verification response and the OpenAI SDK posts Chat Completions to /chat/completions, while the PR head clearly removes the current-main node-host Codex routing guard and tests.

Is this the best way to solve the issue?

No for the current branch. The custom-provider diagnostic approach is narrow in isolation, but the force-pushed Codex/changelog regression must be removed before this is a safe fix path.

Label changes:

  • add P1: The current PR head can regress a recently fixed command execution routing path for users relying on Codex app-server with exec host=node.
  • add merge-risk: 🚨 compatibility: The force-pushed branch removes current-main behavior that preserves existing host=node exec routing setups.
  • add merge-risk: 🚨 security-boundary: Dropping the node-host guard can run shell-capable native Code Mode on the gateway rather than through OpenClaw's selected node execution path.
  • add merge-risk: 🚨 availability: Node-targeted Codex runs can lose the OpenClaw exec/process tools needed to complete remote-node command workflows.
  • add rating: 🧂 unranked krab: Current PR rating is 🧂 unranked krab because proof is 🦪 silver shellfish, patch quality is 🧂 unranked krab, and The provider fix has focused tests, but mock-only proof plus an unrelated Codex execution regression make the current PR not quality-ready.
  • remove P2: Current review triage priority is P1, so this older priority label is no longer current.
  • remove rating: 🦪 silver shellfish: Current PR rating is rating: 🧂 unranked krab, so this older rating label is no longer current.

Label justifications:

  • P1: The current PR head can regress a recently fixed command execution routing path for users relying on Codex app-server with exec host=node.
  • merge-risk: 🚨 compatibility: The force-pushed branch removes current-main behavior that preserves existing host=node exec routing setups.
  • merge-risk: 🚨 security-boundary: Dropping the node-host guard can run shell-capable native Code Mode on the gateway rather than through OpenClaw's selected node execution path.
  • merge-risk: 🚨 availability: Node-targeted Codex runs can lose the OpenClaw exec/process tools needed to complete remote-node command workflows.
  • rating: 🧂 unranked krab: Current PR rating is 🧂 unranked krab because proof is 🦪 silver shellfish, patch quality is 🧂 unranked krab, and The provider fix has focused tests, but mock-only proof plus an unrelated Codex execution regression make the current PR not quality-ready.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body reports loopback Vitest coverage only; it needs redacted real OpenClaw/custom-provider terminal output, logs, screenshot, recording, or linked artifact after the patch before merge unless maintainers override proof. After adding proof, update the PR body; ClawSweeper should re-review automatically. If it does not, the PR author or someone with repository write access can comment @clawsweeper re-review.

Full review comments:

  • [P1] Restore node-host Codex tool routing — extensions/codex/src/app-server/run-attempt.ts:3669-3675
    Current main added isEffectiveExecHostNode and addNodeShellDynamicToolsIfNeeded so /exec host=node disables native Codex Code Mode and still exposes OpenClaw exec/process. This branch removes that path, so node-targeted Codex runs can execute on the gateway/native app-server surface instead of the selected node and lose the tools that route through OpenClaw. Please rebase and keep the current-main node-host routing behavior and tests.
    Confidence: 0.93

Overall correctness: patch is incorrect
Overall confidence: 0.91

Security concerns:

  • [high] Node-host shell routing guard removed — extensions/codex/src/app-server/run-attempt.ts:3669
    By removing the host=node native-surface guard and OpenClaw exec/process reinjection, the PR can route shell-capable Codex turns to the wrong execution boundary.
    Confidence: 0.9

What I checked:

  • Current main custom-provider verification behavior: Current main still treats any 2xx custom-provider verification probe as successful, which matches the linked report that an HTML web-root response can pass setup without a useful baseUrl hint. (src/commands/onboard-custom.ts:110, 5cc0dbce8628)
  • OpenAI SDK endpoint contract: The OpenAI Node SDK Chat Completions resource posts to /chat/completions, so OpenAI-compatible providers served under /v1 need that path in the configured baseURL.
  • PR custom-provider implementation: The branch adds successful-response JSON validation for onboarding probes and a runtime OpenAI-compatible HTML-response hint that includes a redacted configured baseUrl. (src/commands/onboard-custom.ts:95, 0bba7f652d68)
  • Regression introduced by current PR head: The PR head removes current-main node-host handling by calling only addSandboxShellDynamicToolsIfAvailable, so node-targeted Codex runs no longer re-add OpenClaw exec and process after native Code Mode is disabled. (extensions/codex/src/app-server/run-attempt.ts:3669, 0bba7f652d68)
  • Current-main node-host fix provenance: Current main commit 5cc0dbc added isEffectiveExecHostNode, addNodeShellDynamicToolsIfNeeded, regression tests, and a changelog entry for routing /exec host=node through OpenClaw tools instead of native Codex Code Mode. (extensions/codex/src/app-server/run-attempt.ts:3730, 5cc0dbce8628)
  • Diff sanity check: The diff is whitespace-clean, but it includes unrelated Codex and changelog changes beyond the custom-provider diagnostic fix. (0bba7f652d68)

Likely related people:

  • vincentkoc: Authored current-main commit 5cc0dbc that added the Codex node-host routing behavior this PR head now removes. (role: recent area contributor; confidence: high; commits: 5cc0dbce8628; files: extensions/codex/src/app-server/run-attempt.ts, extensions/codex/src/app-server/run-attempt.test.ts, CHANGELOG.md)
  • steipete: Blame on the custom-provider onboarding, guarded fetch, and surrounding Codex app-server code in this shallow checkout points mostly to Peter Steinberger's recent baseline commit. (role: adjacent owner; confidence: medium; commits: f4bdfd46a99a; files: src/commands/onboard-custom.ts, src/agents/provider-transport-fetch.ts, src/agents/openai-transport-stream.ts)
  • openperf: Git history for node-host execution includes prior fixes around exec host=node routing, making this a useful routing candidate if maintainers need more context. (role: related historical contributor; confidence: medium; commits: d98eaba4c3b3, 77509024b8c6; files: extensions/codex/src/app-server/run-attempt.ts)

Codex review notes: model gpt-5.5, reasoning high; reviewed against 5cc0dbce8628.

@clawsweeper clawsweeper Bot added rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. P2 Normal backlog priority with limited blast radius. labels May 20, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Copy link
Copy Markdown
Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@zerone0x zerone0x force-pushed the fix/custom-provider-html-response branch from 294c81d to 0bba7f6 Compare May 22, 2026 15:50
@clawsweeper clawsweeper Bot added rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. P1 High-priority user-facing bug, regression, or broken workflow. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. and removed rating: 🦪 silver shellfish Thin PR readiness signal; proof, validation, or implementation needs work. P2 Normal backlog priority with limited blast radius. labels May 22, 2026
@osolmaz osolmaz self-assigned this May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agents Agent runtime and tooling commands Command implementations merge-risk: 🚨 availability 🚨 May cause crashes, hangs, restart loops, stalls, or process outages. merge-risk: 🚨 compatibility 🚨 May break existing users, config, migrations, defaults, or upgrade paths. merge-risk: 🚨 security-boundary 🚨 May affect sandboxing, authorization, credentials, or sensitive data. P1 High-priority user-facing bug, regression, or broken workflow. rating: 🧂 unranked krab Not merge-ready due to missing proof or serious correctness/safety concerns. size: S status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. triage: mock-only-proof Candidate: PR proof only shows tests, mocks, snapshots, lint, typecheck, or CI.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Custom OpenAI-compatible provider with baseUrl without /v1 fails with cryptic 'incomplete terminal response' error

2 participants