media: hardcoded image size limits (bytes/input-pixels) are not user-configurable across sanitize layers #67031
Closed
Labels
- `P2`: Normal backlog priority with limited blast radius.
- `clawsweeper:needs-maintainer-review`: ClawSweeper marked this issue as needing maintainer review before automation.
- `clawsweeper:needs-product-decision`: ClawSweeper marked this issue as needing a product or behavior decision.
- `clawsweeper:no-new-fix-pr`: ClawSweeper does not recommend queueing a new automated fix PR for this issue.
- `clawsweeper:source-repro`: ClawSweeper found a high-confidence source-level issue reproduction.
- `impact:message-loss`: Channel message delivery can be lost, duplicated, or misrouted.
- `issue-rating: 🦞 diamond lobster`: Very strong issue quality with high-confidence source-level or clear reproduction.
- `stale`: Marked as stale due to inactivity.
Bug type
Feature request (configurability gap)
Summary
OpenClaw has at least 10 hardcoded image size limits scattered across different bundles, and only one of them (`DEFAULT_IMAGE_MAX_DIMENSION_PX` in `src/agents/tool-images.ts`) is user-configurable (via `agents.defaults.imageMaxDimensionPx`). The corresponding byte limits for tool-result image sanitization, the generic media/mime layer, and the OpenAI-compat HTTP client outbound path are all hardcoded, so operators running OpenClaw against vision models that comfortably handle high-resolution photos (Qwen3-VL, GPT-4o, etc.) still get rejection errors on routine inputs like smartphone camera photos.

This is a companion request to the sandbox-only #40880 / #40950, which only covers `src/media/store.ts`.

Hardcoded limits I could find in `dist/` (v2026.4.14)

| Bundle | Constant / setting | Notes |
| --- | --- | --- |
| `mime-*.js` | `MAX_IMAGE_BYTES` | `src/media/constants.ts` |
| `tool-images-*.js` | `DEFAULT_IMAGE_MAX_BYTES` | |
| `tool-images-*.js` | `DEFAULT_IMAGE_MAX_DIMENSION_PX` | `agents.defaults.imageMaxDimensionPx` |
| `image-ops-*.js` | `MAX_IMAGE_INPUT_PIXELS` | `limitInputPixels` hard cap (rejects input before resize) |
| `store-*.js` | `MEDIA_MAX_BYTES` | |
| `constants-*.js` | `DEFAULT_WEB_MEDIA_BYTES` | |
| `input-files-*.js` | `DEFAULT_INPUT_IMAGE_MAX_BYTES` | |
| `openai-http-*.js` | `DEFAULT_OPENAI_MAX_TOTAL_IMAGE_BYTES` | |
| `dispatch-acp-*.js` | `ACP_ATTACHMENT_MAX_BYTES` | |
| `subagent-spawn-*.js` | `attachments.maxTotalBytes` fallback | `attachments.maxTotalBytes` config |

Why the current situation is painful
- 25 MP `sharp` cap in `image-ops`: iPhone 14 Pro Max (48 MP), Samsung S24 Ultra (200 MP), and Xiaomi/OPPO/Vivo flagships (50–200 MP) routinely exceed 25 MP. These photos get rejected by `exceedsImagePixelLimit()` in `image-ops.ts` before any resize is attempted.
- 5 MB `DEFAULT_IMAGE_MAX_BYTES` in `tool-images`: the resize loop (`resizeImageBase64IfNeeded`) targets 5 MB post-resize. The iteration over `sideGrid × IMAGE_REDUCE_QUALITY_STEPS` always fits typical JPEGs, but there is no config to loosen the target when the downstream provider can handle 10–20 MB (GPT-4o, Gemini 1.5, Qwen3-VL).
- 1200 px `DEFAULT_IMAGE_MAX_DIMENSION_PX`: a reasonable default, but too small for vision models that prefer 2048 px native input (the Qwen3-VL / Qwen2-VL mmproj reports `image_max_pixels: 4194304` = 2048×2048). This one is already configurable, but discoverability is poor: there's no `docs/reference/config.md` entry I could find.
- Outbound `openai-http` limits (10 MB per image, 20 MB total): when OpenClaw acts as an OpenAI-compat client and forwards `image_url` parts to the underlying provider, these limits trigger with error strings like `Total image payload too large (…; limit 20971520)`. There is no obvious config path; `gateway.http.endpoints.chatCompletions.images` accepts `allowUrl` but not `maxBytes` / `maxTotalBytes`.

Proposed schema
A single top-level `media.image.*` block in `openclaw.json`, threaded to all the layers above (similar to how #40950 adds `media.maxBytes`), with per-scope overrides for advanced users:

    {
      "media": {
        "image": {
          // applies to every image-sizing layer unless overridden below
          "maxBytes": 33554432,                // 32 MB
          "maxInputPixels": 150000000,         // 150 MP (covers modern smartphone cameras)
          "maxDimensionPx": 2048,              // post-resize target
          "maxTotalBytesPerRequest": 67108864, // 64 MB

          // per-scope overrides (optional)
          "scopes": {
            "channel":    { "maxBytes": 33554432 },  // replaces MAX_IMAGE_BYTES
            "toolResult": { "maxBytes": 16777216, "maxDimensionPx": 2048 },
            "openaiHttp": { "maxBytes": 33554432, "maxTotalBytes": 67108864 },
            "sandbox":    { "maxBytes": 33554432 }   // overlaps with #40950 `media.maxBytes`
          }
        }
      }
    }

Minimum ask: expose `media.image.maxBytes` + `media.image.maxInputPixels` + `media.image.maxDimensionPx` and have each layer read them at boot with sensible defaults (the current hardcoded values). `agents.defaults.imageMaxDimensionPx` can be kept as an alias for back-compat.

Why this matters to us
We're running OpenClaw against a local Qwen3-VL 8B (llama.cpp + mmproj) that happily decodes up to 48 MP JPEG inputs when called directly. But OpenClaw's built-in tool-images sanitizer forces all agent-observable images through a 5 MB / 1200 px funnel, so our agents see low-resolution thumbnails when the model could see the real pixels. We worked around it by sed-patching the bundles after every `docker compose build`, but that's obviously not sustainable.

Related
- … (`MEDIA_MAX_BYTES`)
- … (`tools.media.image.maxBytes` defined but not applied to inbound)

Environment
- … (`ghcr.io/openclaw/openclaw:2026.4.14`)
- … (`image_max_pixels: 4194304`)
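
As a rough illustration of the lookup order the proposed `media.image.*` block implies (per-scope override first, then the shared `media.image.*` value, then the layer's current hardcoded default), a minimal TypeScript sketch follows. None of these names exist in OpenClaw today: `ImageLimits`, `MediaImageConfig`, `resolveImageLimits`, and `exceedsPixelLimit` are hypothetical, the field names simply mirror the schema proposed above, and the default values are assumptions taken from the figures quoted in this issue (5 MB read as 5 × 1024 × 1024, 25 MP read as 25,000,000). `maxTotalBytes` handling is left out for brevity.

```typescript
// Hypothetical shapes; field names mirror the schema proposed in this issue.
interface ImageLimits {
  maxBytes: number;
  maxInputPixels: number;
  maxDimensionPx: number;
}

type Scope = "channel" | "toolResult" | "openaiHttp" | "sandbox";

interface MediaImageConfig extends Partial<ImageLimits> {
  scopes?: Partial<Record<Scope, Partial<ImageLimits>>>;
}

// Assumed stand-ins for the current tool-result defaults (5 MB, 25 MP, 1200 px).
const TOOL_RESULT_DEFAULTS: ImageLimits = {
  maxBytes: 5 * 1024 * 1024,
  maxInputPixels: 25000000,
  maxDimensionPx: 1200,
};

// Precedence: scopes.<scope>.<key>, then media.image.<key>, then the built-in default.
function resolveImageLimits(
  defaults: ImageLimits,
  config: MediaImageConfig | undefined,
  scope: Scope,
): ImageLimits {
  const scoped: Partial<ImageLimits> = config?.scopes?.[scope] ?? {};
  const pick = (key: keyof ImageLimits): number =>
    scoped[key] ?? config?.[key] ?? defaults[key];
  return {
    maxBytes: pick("maxBytes"),
    maxInputPixels: pick("maxInputPixels"),
    maxDimensionPx: pick("maxDimensionPx"),
  };
}

// Pre-resize pixel check against the resolved limit instead of a fixed constant.
function exceedsPixelLimit(width: number, height: number, limits: ImageLimits): boolean {
  return width * height > limits.maxInputPixels;
}

// Example: a shared block plus one tool-result override.
const exampleConfig: MediaImageConfig = {
  maxBytes: 33554432,
  maxInputPixels: 150000000,
  maxDimensionPx: 2048,
  scopes: { toolResult: { maxBytes: 16777216 } },
};

const untouched = resolveImageLimits(TOOL_RESULT_DEFAULTS, undefined, "toolResult");
const tuned = resolveImageLimits(TOOL_RESULT_DEFAULTS, exampleConfig, "toolResult");

// An 8064 x 6048 (about 48.8 MP) photo fails the assumed 25 MP default
// and passes once maxInputPixels is raised to 150 MP.
console.log(exceedsPixelLimit(8064, 6048, untouched)); // true
console.log(exceedsPixelLimit(8064, 6048, tuned)); // false
```

With no config present the resolver returns the defaults unchanged, which is the back-compat behavior the minimum ask describes; aliasing `agents.defaults.imageMaxDimensionPx` would be one more fallback step in `pick`.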