
fix: sanitize OpenAI compat gateway images and surface streaming errors#62070

Open
htplbc wants to merge 5 commits into
openclaw:mainfrom
htplbc:fix/openai-compat-image-sanitization-and-streaming-error

@htplbc htplbc commented Apr 6, 2026

Summary

Fixes #59913

  • Bug 1: Gateway-provided images (via OpenAI compat /v1/chat/completions image_url parts) bypassed the sanitization/resize pipeline. detectAndLoadPromptImages() early-returned existingImages unsanitized when no image refs were found in the prompt text. Large images (e.g. Mac Retina screenshots >5MB) were forwarded to Anthropic at full resolution, hitting the 5MB per-image limit.
  • Bug 2: In streaming mode (stream: true), when a lifecycle "error" event fired before the async catch block could run, the stream was closed with only data: [DONE] — the error was completely swallowed and the client received no error content.
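Bug 1 and its fix can be sketched roughly as follows. This is a minimal illustration, not the code in images.ts: `ImagePart`, `LoadResult`, `sanitizeImages`, and `detectAndLoadImages` are hypothetical stand-ins for the real types, `sanitizeImagesWithLog()`, and `detectAndLoadPromptImages()`, and the "resize" only scales metadata instead of re-encoding pixels.

```typescript
// Hypothetical shapes; the real types in images.ts are not shown in this PR.
interface ImagePart {
  data: Uint8Array;
  width: number;
  height: number;
}

interface LoadResult {
  images: ImagePart[];
}

// Stand-in for sanitizeImagesWithLog(): caps the longest side at maxDimensionPx.
// A real implementation would re-encode the image; this only scales the metadata.
function sanitizeImages(images: ImagePart[], maxDimensionPx: number): ImagePart[] {
  return images.map((img) => {
    const longest = Math.max(img.width, img.height);
    if (longest <= maxDimensionPx) return img;
    const scale = maxDimensionPx / longest;
    return {
      ...img,
      width: Math.round(img.width * scale),
      height: Math.round(img.height * scale),
    };
  });
}

// Stand-in for detectAndLoadPromptImages(), reduced to the early-return path.
function detectAndLoadImages(
  promptRefs: string[],
  existingImages: ImagePart[],
  maxDimensionPx: number,
): LoadResult {
  if (promptRefs.length === 0) {
    // Before the fix this path returned existingImages untouched.
    return { images: sanitizeImages(existingImages, maxDimensionPx) };
  }
  // Loading of prompt-referenced images is omitted from this sketch.
  return { images: sanitizeImages(existingImages, maxDimensionPx) };
}
```

With no prompt refs, a 4000x2000 gateway image and a 2000 px cap come back as 2000x1000 instead of passing through at full size.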

Changes

  • src/agents/pi-embedded-runner/run/images.ts: Apply sanitizeImagesWithLog() to existingImages on the early-return path (no prompt image refs detected)
  • src/gateway/openai-http.ts: Write error content chunk in the lifecycle "error" handler before closing the stream, when no assistant content was already sent
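The streaming change can be illustrated with a small sketch. It assumes a simplified chunk shape (real chat.completion.chunk objects also carry fields such as `id` and `created`), and `Sink`, `LifecycleEvent`, `createStreamHandler`, and the `Error: ...` text are hypothetical names chosen for illustration, not the handler in openai-http.ts.

```typescript
// Minimal writable sink so the sketch runs without an HTTP server.
interface Sink {
  write(chunk: string): void;
  end(): void;
}

type LifecycleEvent = { phase: "start" | "end" | "error"; message?: string };

function createStreamHandler(sink: Sink, model: string) {
  let closed = false;
  let sawAssistantDelta = false;

  // Writes one SSE frame carrying a simplified chat.completion.chunk payload.
  const writeChunk = (content: string): void => {
    const chunk = {
      object: "chat.completion.chunk",
      model,
      choices: [{ index: 0, delta: { content }, finish_reason: null }],
    };
    sink.write(`data: ${JSON.stringify(chunk)}\n\n`);
  };

  return {
    onAssistantDelta(text: string): void {
      if (closed) return;
      sawAssistantDelta = true;
      writeChunk(text);
    },
    onLifecycle(evt: LifecycleEvent): void {
      if (closed || evt.phase === "start") return;
      if (evt.phase === "error" && !sawAssistantDelta) {
        // Surface the failure instead of ending with a bare [DONE].
        writeChunk(`Error: ${evt.message ?? "unknown error"}`);
      }
      closed = true;
      sink.write("data: [DONE]\n\n");
      sink.end();
    },
  };
}
```

The `closed` flag makes a repeated lifecycle event a no-op, so the stream carries at most one error chunk and exactly one `[DONE]`.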

Test plan

  • New test: existingImages are sanitized even when prompt has no image references (images.test.ts)
  • New test: error content appears in SSE stream when lifecycle error fires without prior assistant deltas (openai-http.test.ts)
  • All 32 existing image tests pass
  • All 6 existing gateway openai-http tests pass

AI-assisted: Built with Claude Code. Fully tested locally. I understand what the code does.

🤖 Generated with Claude Code

…ompat endpoint

Gateway-provided images (via OpenAI compat /v1/chat/completions) bypassed
the sanitization/resize pipeline because detectAndLoadPromptImages() early-
returned existingImages unsanitized when no image refs were found in prompt
text. Large images (e.g. Mac Retina screenshots) hit Anthropic's 5MB limit.

Additionally, in streaming mode, when a lifecycle "error" event fired before
the async catch block, the stream was closed with only [DONE] and no error
content — silently swallowing the failure.

Fixes openclaw#59913

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@openclaw-barnacle openclaw-barnacle Bot added the gateway (Gateway runtime), agents (Agent runtime and tooling), and size: S labels Apr 6, 2026
@greptile-apps

greptile-apps Bot commented Apr 6, 2026

Contributor

Greptile Summary

Fixes two bugs: gateway-provided images now run through sanitizeImagesWithLog on the early-return path in detectAndLoadPromptImages (instead of being forwarded unsanitized), and the streaming lifecycle "error" handler now writes an error content chunk before closing the SSE stream so clients receive a visible error instead of a bare [DONE]. Both fixes are internally consistent with their surrounding patterns and have appropriate new test coverage.

Confidence Score: 5/5

Safe to merge — both fixes are correct and the only open item is a minor test-reliability suggestion.

No P0 or P1 issues found. The image sanitization fix passes the same limits as the non-early-return path (only maxDimensionPx; maxBytes is also omitted in the existing path). The streaming error fix correctly relies on the existing closed flag and the if (!closed) guard in the finally block to prevent double-writes and double-closes. Single P2 comment is about replacing a wall-clock sleep with Promise.resolve() for more reliable event-loop ordering in the new test.

src/gateway/openai-http.test.ts — consider replacing the 50 ms setTimeout with await Promise.resolve() in the new lifecycle-error streaming test.

Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/gateway/openai-http.test.ts
Line: 861-862

Comment:
**Timing-dependent test synchronization**

The 50 ms `setTimeout` races against event-loop scheduling — slow CI could let the lifecycle error handler fire *after* the `await`, making this test intermittently miss the fix. `emitAgentEvent` is synchronous and its listeners run in the same tick, so `await Promise.resolve()` is sufficient to yield without a wall-clock delay.

```suggestion
      // Yield the microtask queue so the synchronous event handler runs before we return
      await Promise.resolve();
```

How can I resolve this? If you propose a fix, please make it concise.
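The ordering claim in this comment can be checked with a small sketch. `SyncEmitter` and `runAgentStub` are hypothetical stand-ins for `emitAgentEvent` and the mocked agent run; whether a single microtask yield is enough in the real test depends on how the handler is wired, which this sketch does not model.

```typescript
type Listener = (payload: string) => void;

// Tiny synchronous emitter standing in for emitAgentEvent() and its listeners.
class SyncEmitter {
  private listeners: Listener[] = [];

  on(listener: Listener): void {
    this.listeners.push(listener);
  }

  emit(payload: string): void {
    // Listeners run inline, before emit() returns.
    for (const listener of this.listeners) listener(payload);
  }
}

const order: string[] = [];
const emitter = new SyncEmitter();
emitter.on((payload) => {
  order.push(`listener:${payload}`);
});

// Stand-in for the mocked agent run in the test.
async function runAgentStub(): Promise<void> {
  emitter.emit("error");
  order.push("after-emit");
  // Yield once to the microtask queue; no wall-clock delay involved.
  await Promise.resolve();
  order.push("after-yield");
}
```

Calling `runAgentStub()` records `listener:error` before `after-emit`, so the listener has already run by the time the function reaches its `await`.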

Reviews (1): Last reviewed commit: "fix: sanitize gateway images and surface..." | Re-trigger Greptile

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7e29768fa0

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/gateway/openai-http.ts
Comment thread src/gateway/openai-http.test.ts Outdated
htplbc and others added 3 commits April 6, 2026 11:41
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…or path

Set sawAssistantDelta before emitting the lifecycle error event in the catch
block, so the lifecycle handler skips writing a second error chunk.
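A minimal sketch of that ordering, assuming the catch block writes its own error chunk first. `written`, `onLifecycleError`, and `handleRunFailure` are hypothetical names for illustration; only `sawAssistantDelta` comes from the commit message.

```typescript
const written: string[] = [];
let sawAssistantDelta = false;
let closed = false;

// Lifecycle handler: writes an error chunk only if nothing was sent yet.
function onLifecycleError(message: string): void {
  if (closed) return;
  if (!sawAssistantDelta) written.push(`error:${message}`);
  closed = true;
  written.push("[DONE]");
}

// Catch-block path: write the error, mark content as sent, then emit.
function handleRunFailure(err: Error): void {
  written.push(`error:${err.message}`);
  // Set the flag BEFORE emitting so the handler skips its own error chunk.
  sawAssistantDelta = true;
  onLifecycleError(err.message);
}

handleRunFailure(new Error("upstream 400"));
// written is now ["error:upstream 400", "[DONE]"]: one error chunk, not two.
```

If the flag were set after the emit instead, the handler would see `sawAssistantDelta === false` and append a duplicate error entry.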

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@htplbc

htplbc commented Apr 6, 2026

Author

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Can't wait for the next one!

@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@clawsweeper

clawsweeper Bot commented May 2, 2026

Contributor

Codex review: needs real behavior proof before merge. Reviewed June 6, 2026, 12:45 AM ET / 04:45 UTC.

Summary
Review failed before ClawSweeper could summarize the requested change.

PR surface: Source +16, Tests +51. Total +67 across 4 files.

Reproducibility: unclear. The review failed before ClawSweeper could establish a reproduction path.

Review metrics: none identified.

Merge readiness
Overall: 🌊 off-meta tidepool
Proof: 🌊 off-meta tidepool
Patch quality: 🌊 off-meta tidepool
Result: rating does not apply to this item.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Risk before merge

  • [P1] No close action taken because the review did not complete.

Maintainer options:

  1. Decide the mitigation before merge
    Retry the Codex review after fixing the execution failure.
  2. Pause or close
    Do not merge this PR until maintainers decide whether the risk is worth taking.

Next step before merge

  • [P1] Review did not complete, so no work-lane recommendation was made.

Review details

Best possible solution:

Retry the Codex review after fixing the execution failure.

Do we have a high-confidence way to reproduce the issue?

Unclear. The review failed before ClawSweeper could establish a reproduction path.

Is this the best way to solve the issue?

Unclear. Retry the review first so ClawSweeper can evaluate the actual issue and fix direction.

AGENTS.md: unclear because the file could not be read completely.

Codex review notes: model gpt-5.5, reasoning high; reviewed against 9313471fa579.

Label changes

Label changes:

  • remove P2: Current review triage priority is none.
  • remove merge-risk: 🚨 compatibility: Current PR review selected no merge-risk labels.
  • remove merge-risk: 🚨 message-delivery: Current PR review selected no merge-risk labels.

Label justifications:

  • rating: 🌊 off-meta tidepool: Overall readiness is 🌊 off-meta tidepool; proof is 🌊 off-meta tidepool and patch quality is 🌊 off-meta tidepool.

Evidence reviewed

PR surface:

Source +16, Tests +51. Total +67 across 4 files.

View PR surface stats
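| Area | Files | Added | Removed | Net |
|---|---|---|---|---|
| Source | 2 | 17 | 1 | +16 |
| Tests | 2 | 51 | 0 | +51 |
| Docs | 0 | 0 | 0 | 0 |
| Config | 0 | 0 | 0 | 0 |
| Generated | 0 | 0 | 0 | 0 |
| Other | 0 | 0 | 0 | 0 |
| Total | 4 | 68 | 1 | +67 |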

What I checked:

  • failure reason: codex execution failed.
  • codex failure detail: Codex review failed for this PR with exit 1.
  • codex stdout: Per-item Codex failure; continuing with the rest of the shard.

Likely related people:

  • unknown: Codex failed before it could trace repository history. (role: review did not complete; confidence: low)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label May 3, 2026
@clawsweeper clawsweeper Bot added labels May 19, 2026:

  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.
  • P2: Normal backlog priority with limited blast radius.
  • merge-risk: 🚨 compatibility: 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
@openclaw-barnacle openclaw-barnacle Bot added the triage: needs-real-behavior-proof Candidate: external PR needs after-fix proof from a real setup. label May 19, 2026
@clawsweeper

clawsweeper Bot commented May 20, 2026

Contributor

ClawSweeper PR egg

🎁 Pass real behavior proof to wake the egg and unlock a hatchable treat.

Where did the egg go?
  • The egg game starts only after the PR passes the real-behavior proof check.
  • Before that, no creature or rarity is rolled. The treat waits for real proof.
  • This is still just collectible flavor: proof affects review readiness, not creature quality.

@openclaw-barnacle

This pull request has been automatically marked as stale due to inactivity.
Please add updates or it will be closed.

@openclaw-barnacle openclaw-barnacle Bot added the stale Marked as stale due to inactivity label Jun 5, 2026
@clawsweeper clawsweeper Bot added the rating: 🌊 off-meta tidepool label (PR readiness rating does not apply to this item.) and removed the rating: 🧂 unranked krab label (Not merge-ready due to missing proof or serious correctness/safety concerns.) Jun 5, 2026
@clawsweeper clawsweeper Bot removed the status: 📣 needs proof The PR needs real behavior proof before ClawSweeper can clear the contributor ask. label Jun 5, 2026
@openclaw-barnacle openclaw-barnacle Bot removed the stale Marked as stale due to inactivity label Jun 6, 2026

Labels

  • agents: Agent runtime and tooling
  • gateway: Gateway runtime
  • merge-risk: 🚨 compatibility: 🚨 May break existing users, config, migrations, defaults, or upgrade paths.
  • merge-risk: 🚨 message-delivery: 🚨 May drop, duplicate, misroute, suppress, or wrongly target messages.
  • P2: Normal backlog priority with limited blast radius.
  • rating: 🌊 off-meta tidepool: PR readiness rating does not apply to this item.
  • size: S
  • triage: needs-real-behavior-proof: Candidate: external PR needs after-fix proof from a real setup.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OpenAI compat gateway images bypass sanitization/resize, hit Anthropic 5MB limit

1 participant