fix(vision): clamp image dimensions before inline base64 encode #25838
Open
yoniebans wants to merge 8 commits into
🔎 Lint report:
    with Image.open(image_path) as img:
        return (img.width, img.height)
except Exception as exc:
    logger.debug("Could not read image dimensions for %s: %s", image_path, exc)
7650717 to ffa991c
Anthropic's Messages API rejects any image whose width or height
exceeds 8000 px with a non_retryable_client_error 400:
messages.N.content.M.image.source.base64.data:
At least one of the image dimensions exceed max allowed size: 8000 pixels
The native vision fast path inlined oversized screenshots (e.g. tall
or panoramic captures from browser_vision / vision_analyze) directly
into the tool-result envelope before any size check. Once present in
the message history, every subsequent request replayed the same
oversized image and got the same 400 — permanently bricking the
session, since the error is non-retryable. Recovery required manually
editing the session JSON to drop the poisoned tool result.
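The manual recovery described above could be scripted roughly as follows. This is a minimal sketch that assumes the history uses the Anthropic Messages API block shape (`tool_result` blocks whose `content` is a list that may contain `image` blocks); the project's actual session JSON layout is not shown here, and `drop_tool_result_images` and the placeholder text are hypothetical names.

```python
import copy

# Hypothetical placeholder that stands in for a removed image block.
PLACEHOLDER = {"type": "text", "text": "[image removed: exceeded provider dimension limit]"}


def drop_tool_result_images(messages):
    """Return a deep copy of `messages` with images inside tool results replaced by text.

    Assumes Anthropic-style content blocks; messages with plain string
    content are left untouched.
    """
    cleaned = copy.deepcopy(messages)
    for message in cleaned:
        content = message.get("content")
        if not isinstance(content, list):
            continue
        for block in content:
            if not isinstance(block, dict) or block.get("type") != "tool_result":
                continue
            inner = block.get("content")
            if isinstance(inner, list):
                block["content"] = [
                    dict(PLACEHOLDER)
                    if isinstance(item, dict) and item.get("type") == "image"
                    else item
                    for item in inner
                ]
    return cleaned
```

The sketch replaces every image in a tool result rather than only oversized ones, which keeps it free of any image-decoding dependency at the cost of dropping images that were fine.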
Fix:
* Add _MAX_IMAGE_DIMENSION = 7999 (one px under Anthropic's cap).
* Add _get_image_dimensions / _image_exceeds_pixel_cap helpers
(header-only Pillow read, no full decode).
* _resize_image_for_vision now clamps proportionally to the cap
before any byte-size work.
* Three call sites (native fast path + legacy path initial check)
trigger resize on dimension overflow as well as byte overflow.
Pillow remains a soft dependency: when missing, the dimension check
returns False and the existing byte-size guard remains the last line
of defence (same behaviour as today).
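A minimal sketch of what the two helpers might look like, based on the lint-report fragment and the description above. The function names here drop the leading underscore to mark them as illustrative, and the lazy import is an assumption about how the soft dependency is handled. `Image.open` only parses the file header; pixel data is not decoded until it is needed.

```python
import logging

logger = logging.getLogger(__name__)

MAX_IMAGE_DIMENSION = 7999  # value stated in the commit message


def get_image_dimensions(image_path):
    """Return (width, height), or None if Pillow is missing or the file is unreadable."""
    try:
        from PIL import Image  # soft dependency, imported lazily (assumption)
    except ImportError:
        return None
    try:
        # Image.open reads the header only; no full decode happens here.
        with Image.open(image_path) as img:
            return (img.width, img.height)
    except Exception as exc:
        logger.debug("Could not read image dimensions for %s: %s", image_path, exc)
        return None


def image_exceeds_pixel_cap(image_path, cap=MAX_IMAGE_DIMENSION):
    """Return True only when dimensions are known and one side exceeds `cap`."""
    dims = get_image_dimensions(image_path)
    if dims is None:
        # Unknown dimensions: fall back to the existing byte-size guard.
        return False
    return max(dims) > cap
```

Returning `False` on unknown dimensions mirrors the fallback described above: without Pillow, behaviour is unchanged from before the fix.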
Adds TestPixelDimensionCap covering the helpers, the Pillow-missing
fallback, and the 10000x100 / 100x10000 regression cases. All 125
tests pass across vision_tools, vision_native_fast_path,
image_shrink_recovery, and image_rejection_fallback.
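The proportional clamp behind the regression cases can be sketched as pure arithmetic. This is an illustration only: `clamped_size` is a hypothetical name, and the rounding mode is an assumption, so the real `_resize_image_for_vision` may differ by a pixel on the short side.

```python
def clamped_size(width, height, cap=7999):
    """Scale (width, height) so the longer side is at most `cap`, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= cap:
        return (width, height)  # already within the cap, nothing to do
    # Scale both sides by cap / longest; never let a side collapse to 0 px.
    new_width = min(cap, max(1, round(width * cap / longest)))
    new_height = min(cap, max(1, round(height * cap / longest)))
    return (new_width, new_height)


print(clamped_size(10000, 100))   # (7999, 80)
print(clamped_size(100, 10000))   # (80, 7999)
print(clamped_size(8500, 100))    # (7999, 94)
```

Clamping before the byte-size work means the byte-size loop always starts from an image that already satisfies the dimension limit.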
ffa991c to 2882899
Anthropic is the only major provider that hard-rejects >8000 px images. Clamping unconditionally silently downscaled images for OpenAI/Gemini/custom hosts that could handle larger inputs. Gate the clamp on the active provider and add an opt-in clamp_dimensions kwarg to _resize_image_for_vision.
Manual script that hits real Anthropic API to confirm: (1) >8000 px images are still rejected with the same error message, (2) our clamp produces an image Anthropic accepts. Run when threshold drift is suspected.
…mpression paths
- Broaden _is_anthropic_provider to cover claude/claude-code aliases and aggregators that proxy Claude (openrouter, nous, vertex, bedrock, anthropic-vertex, google-vertex), the same set as _supports_media_in_tool_results.
- Wire clamp_dimensions through browser_tool screenshot resize and conversation_compression image-shrink recovery, both of which were bypassing the clamp.
- Promote Pillow-missing log to warning when clamp was requested.
- Add parametrized tests for _is_anthropic_provider covering 19 cases.
…script
- Inline the provider check via _ANTHROPIC_IMAGE_PROVIDERS frozenset instead of duplicating the predicate logic in a function body.
- Drop scripts/verify_anthropic_pixel_cap.py; it was a one-off development probe, not a repeatable utility. Moved to local workspace.
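A frozenset-based provider gate along these lines might look like the sketch below. The member names come from the commit messages; including `"anthropic"` itself, normalising case and whitespace, and passing the provider name as an argument are assumptions made for illustration (the PR's own predicate appears to be called without arguments).

```python
# Providers assumed to enforce Anthropic's 8000 px limit (names from the commit text).
ANTHROPIC_IMAGE_PROVIDERS = frozenset({
    "anthropic", "claude", "claude-code",
    "openrouter", "nous", "vertex", "bedrock",
    "anthropic-vertex", "google-vertex",
})


def is_anthropic_provider(provider):
    """Return True if `provider` names a host that routes images to Claude."""
    if not provider:
        return False  # unset provider: do not clamp
    return provider.strip().lower() in ANTHROPIC_IMAGE_PROVIDERS


# The result would then feed the opt-in kwarg, e.g. clamp_dimensions=is_anthropic_provider(name).
print(is_anthropic_provider("OpenRouter"))  # True
print(is_anthropic_provider("openai"))      # False
```

A frozenset keeps the membership test constant-time and makes the provider list a single place to update if another Claude-routing aggregator appears.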
…ature
The _fake_resize mock in test_image_shrink_recovery.py predates the clamp_dimensions kwarg on _resize_image_for_vision. Add it to keep the mock signature aligned.
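Why a stale mock signature matters can be shown with a small sketch. Only the `clamp_dimensions` keyword comes from the PR; the other parameter names, the return values, and the function names are hypothetical.

```python
def fake_resize_old(image_path, max_bytes=None):
    """Mock written before the kwarg existed (hypothetical signature)."""
    return "resized-bytes"


def fake_resize_new(image_path, max_bytes=None, clamp_dimensions=False):
    """Mock aligned with the new keyword argument."""
    return "resized-bytes"


def call_resize(resize_fn):
    """Stand-in for production code that now always passes clamp_dimensions."""
    return resize_fn("shot.png", max_bytes=1_000_000, clamp_dimensions=True)


try:
    call_resize(fake_resize_old)
except TypeError as exc:
    # The old mock rejects the unexpected keyword, so the test would fail
    # for a reason unrelated to the behaviour under test.
    print("old mock fails:", exc)

print(call_resize(fake_resize_new))  # resized-bytes
```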
This was referenced May 31, 2026
What does this PR do?
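Fixes a session-bricking bug on Anthropic vision. The Messages API rejects images >8000 px on either axis with a non-retryable 400:
messages.N.content.M.image.source.base64.data:
At least one of the image dimensions exceed max allowed size: 8000 pixels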
The native vision fast path inlined oversized images (e.g. tall page screenshots) into the tool-result envelope before any size check. Once the bad image lands in message history, every subsequent call hits the same error — session unrecoverable without manually editing the JSON.
The fix proportionally clamps oversized images to 7999 px before sending. Critically, the clamp is gated on Anthropic-shaped providers only (native Anthropic + aliases + Claude-routing aggregators: openrouter, nous, vertex, bedrock, anthropic-vertex, google-vertex). Other providers (OpenAI, Gemini, custom hosts) auto-downscale server-side; clamping universally would silently degrade them.
Related Issue
Fixes #25837
Type of Change
Changes Made
* tools/vision_tools.py: _MAX_IMAGE_DIMENSION = 7999; _get_image_dimensions / _image_exceeds_pixel_cap helpers (header-only Pillow read); _ANTHROPIC_IMAGE_PROVIDERS frozenset + _is_anthropic_provider() predicate; _resize_image_for_vision(..., clamp_dimensions=False) kwarg that proportionally shrinks before the byte-size halving loop
* tools/vision_tools.py: three call sites in _vision_analyze_native and vision_analyze_tool now pass clamp_dimensions=_is_anthropic_provider()
* tools/browser_tool.py: screenshot resize now gated on Anthropic provider
* agent/conversation_compression.py: image-shrink recovery path now gated on Anthropic provider
* tests/tools/test_vision_tools.py: 33 new tests (cap helpers, proportional resize, parametrized provider matrix, wiring regressions)
* tests/run_agent/test_image_shrink_recovery.py: mock signature aligned with the new kwarg
How to Test
browser_vision on any page producing a screenshot taller/wider than 8000 px (long articles, wide dashboards).
Local verification: 188/188 tests pass across test_vision_tools.py, test_image_shrink_recovery.py, test_image_rejection_fallback.py, test_image_routing.py. End-to-end live-verified against the Anthropic Messages API (raw 8500×100 PNG rejected with the expected error; clamped 7999×94 accepted).
Checklist
Code
fix(vision):, test(vision):, refactor(vision):)
Documentation & Housekeeping
cli-config.yaml.example
CONTRIBUTING.md / AGENTS.md
vision_analyze / browser_vision external behaviour unchanged