OpenAI compat gateway images bypass sanitization/resize, hit Anthropic 5MB limit #59913

@htplbc

Summary

Images sent via the OpenAI-compatible /v1/chat/completions endpoint (as image_url content parts) bypass the image sanitization/resize pipeline. Large images (e.g., Mac Retina screenshots) are forwarded to Anthropic at full resolution and size, hit Anthropic's 5MB per-image limit, and the API rejects the request with "Could not process image" or "image exceeds 5 MB maximum".

A secondary issue: in streaming mode (stream: true), the Anthropic API error is completely swallowed — the client receives only data: [DONE] with no error content.

Steps to Reproduce

  1. Run an OpenClaw gateway with an Anthropic provider
  2. Send a large image (>5MB decoded) via the OpenAI compat endpoint:
curl -X POST https://AGENT_HOST/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "model": "openclaw",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "Describe this image"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,<large-base64-png>"}}
      ]
    }],
    "stream": false
  }'
  3. Non-streaming: response is 200 OK with the error text in content: LLM request rejected: messages.0.content.1.image.source.base64: image exceeds 5 MB maximum
  4. Streaming (same request with "stream": true): response is 200 OK with only data: [DONE]; the error is completely lost

Root Cause

Bug 1: Gateway images skip sanitization

In src/agents/pi-embedded-runner/run/images.ts, detectAndLoadPromptImages() has an early return when no image references are found in the prompt text:

// images.ts lines 500-507
const allRefs = detectImageReferences(params.prompt);

if (allRefs.length === 0) {
    return {
      images: params.existingImages ?? [],  // ← returned WITHOUT sanitization
      detectedRefs: [],
      loadedCount: 0,
      skippedCount: 0,
    };
}

When images arrive via the OpenAI compat gateway (openai-http.ts), they are extracted from image_url parts and passed as existingImages. The prompt text contains only the user's text (no file path references), so allRefs.length === 0 and the images skip sanitizeImagesWithLog() entirely.

The sanitization pipeline (via sanitizeContentBlocksImages → resizeImageBase64IfNeeded) would normally resize images exceeding 1200px to JPEG and enforce size limits. Without it, large PNGs pass through to Anthropic at full resolution.
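To make the size limit concrete, here is a minimal sketch of how the decoded size of a base64 data URL could be checked without decoding it. This is not the gateway's code: decodedBase64Length, dataUrlExceedsLimit, and MAX_IMAGE_BYTES are hypothetical names, and the sketch assumes the limit is 5 * 1024 * 1024 bytes measured on the decoded image, as the reproduction steps in this issue suggest.

```typescript
// Assumption: "5 MB" means 5 * 1024 * 1024 bytes of decoded image data.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;

/** Decoded byte length of a base64 string, computed without decoding it. */
function decodedBase64Length(b64: string): number {
  const clean = b64.replace(/\s+/g, "");
  if (clean.length === 0) return 0;
  // Each "=" of padding removes one byte from the final 3-byte group.
  const padding =
    clean.slice(-2) === "==" ? 2 : clean.slice(-1) === "=" ? 1 : 0;
  return Math.floor((clean.length * 3) / 4) - padding;
}

/** True when a base64 data URL decodes to more than `limit` bytes. */
function dataUrlExceedsLimit(
  url: string,
  limit: number = MAX_IMAGE_BYTES,
): boolean {
  const marker = ";base64,";
  const idx = url.indexOf(marker);
  // Not a base64 data URL: nothing to measure in this sketch.
  if (url.slice(0, 5) !== "data:" || idx === -1) return false;
  return decodedBase64Length(url.slice(idx + marker.length)) > limit;
}
```

A check like this only detects the problem; the sanitization pipeline described above is what would actually shrink the image.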

Affected path:

  1. Gateway: openai-http.ts → parseImageUrlToSource() → extractImageContentFromSource() → ImageContent
  2. Agent command: attempt-execution.ts:458 passes images to runEmbeddedPiAgent
  3. Embedded runner: attempt.ts:1477 calls detectAndLoadPromptImages({ existingImages: params.images })
  4. Early return at line 502 → images go directly to activeSession.prompt(text, { images }) unsanitized
  5. pi-ai convertMessages() converts to Anthropic format → 400 error from API

Working path (for comparison): Images detected from prompt text (file paths, media URIs) go through the full pipeline: loadImageFromRef → mergePromptAttachmentImages → sanitizeImagesWithLog → resize to 1200px JPEG.
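For a sense of scale, the sketch below computes the dimensions a resize step would target for a Retina screenshot. It is illustrative only: fitWithin is a hypothetical helper, and it assumes the 1200px cap applies to the longer side with the aspect ratio preserved, which this issue does not confirm.

```typescript
interface Size {
  width: number;
  height: number;
}

/**
 * Scale (width, height) down so the longer side is at most maxDim,
 * preserving aspect ratio. Images already within the cap are unchanged.
 */
function fitWithin(width: number, height: number, maxDim: number): Size {
  const longest = Math.max(width, height);
  if (longest <= maxDim) return { width, height };
  const scale = maxDim / longest;
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}

// A 5120x2880 screenshot would be targeted at 1200x675 under these assumptions.
const target = fitWithin(5120, 2880, 1200);
console.log(`${target.width}x${target.height}`);
```

Under these assumptions the pixel count drops by a factor of about 18, which is why images on the working path stay well under the provider limit.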

Bug 2: Streaming error swallowed

In src/gateway/openai-http.ts, there's a race between the lifecycle event listener and the async catch block:

// Event listener (lines 573-581)
if (evt.stream === "lifecycle") {
    const phase = evt.data?.phase;
    if (phase === "end" || phase === "error") {
        closed = true;       // ← sets closed before catch runs
        unsubscribe();
        writeDone(res);
        res.end();
    }
}

// Async catch block (lines 613-628)
} catch (err) {
    if (closed) return;      // ← sees closed=true, skips error content
    writeAssistantContentChunk(res, {
        content: "Error: internal error",  // ← never reached
        ...
    });
}

When the lifecycle "error" event fires first, it closes the stream. The catch block then sees closed = true and returns without writing any error content.

Expected Behavior

  1. Gateway-provided images should go through the same sanitizeImagesWithLog pipeline as prompt-detected images (resize to max 1200px, JPEG conversion, size enforcement)
  2. Streaming mode should surface errors to the client (either as an SSE error event or as content before [DONE])

Suggested Fix

For Bug 1

In detectAndLoadPromptImages(), apply sanitization to existingImages even on the early-return path:

if (allRefs.length === 0) {
    const sanitized = await sanitizeImagesWithLog(
      params.existingImages ?? [],
      "prompt:images",
      { maxDimensionPx: params.maxDimensionPx },
    );
    return {
      images: sanitized,
      detectedRefs: [],
      loadedCount: 0,
      skippedCount: 0,
    };
}

For Bug 2

In the streaming handler, ensure the catch block writes the error before the lifecycle listener can close the stream, or write the error in the lifecycle error handler itself.
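One way to realize the second option is to route both the lifecycle listener and the catch block through a single idempotent "finish" function, so whichever fires first writes the error and the other becomes a no-op. The following is a self-contained sketch, not the code in openai-http.ts: SseSink, createStreamFinisher, and onLifecycle are hypothetical names, and it assumes the lifecycle "error" event can supply an error message.

```typescript
// Minimal stand-in for the HTTP response used by the streaming handler.
interface SseSink {
  write(chunk: string): void;
  end(): void;
}

function createStreamFinisher(sink: SseSink) {
  let closed = false;
  return {
    isClosed: (): boolean => closed,
    /** Closes the stream exactly once, writing the error first if given. */
    finish(errorMessage?: string): void {
      if (closed) return; // a later caller (listener or catch) is a no-op
      closed = true;
      if (errorMessage !== undefined) {
        const payload = {
          choices: [
            {
              index: 0,
              delta: { content: `Error: ${errorMessage}` },
              finish_reason: null,
            },
          ],
        };
        sink.write(`data: ${JSON.stringify(payload)}\n\n`);
      }
      sink.write("data: [DONE]\n\n");
      sink.end();
    },
  };
}

type StreamFinisher = ReturnType<typeof createStreamFinisher>;

// Lifecycle listener: an "error" phase surfaces content before [DONE].
// Assumption: the event carries (or can be mapped to) an error message.
function onLifecycle(
  finisher: StreamFinisher,
  phase: string,
  errorMessage?: string,
): void {
  if (phase === "error") finisher.finish(errorMessage ?? "internal error");
  else if (phase === "end") finisher.finish();
}
```

With this shape the catch block can simply call finisher.finish("internal error"); if the lifecycle listener has already closed the stream, the client has still received the error chunk before [DONE].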

Environment

  • OpenClaw version: latest (as of 2026-04-02)
  • Provider: Anthropic (claude-sonnet-4-5-20250929)
  • pi-ai: 0.64.0
  • Anthropic SDK: 0.81.0
  • Hosting: GKE Autopilot (gVisor), accessed via OpenAI compat gateway

Impact

Any client sending images >5MB (decoded) via the OpenAI-compatible API will get silent failures in streaming mode. Common scenario: Mac Retina screenshots (5120×2880 PNG, typically 3-10MB) pasted into a chat UI that uses the OpenAI format.
