Skip to content

fix: auto-compress screenshots to reduce context window size#4751

Open
PHclaw wants to merge 2 commits into
browser-use:mainfrom
PHclaw:fix/screenshot-compression
Open

fix: auto-compress screenshots to reduce context window size#4751
PHclaw wants to merge 2 commits into
browser-use:mainfrom
PHclaw:fix/screenshot-compression

Conversation

@PHclaw

@PHclaw PHclaw commented Apr 28, 2026

Copy link
Copy Markdown

Summary

Without llm_screenshot_size configured, screenshots were passed through at full resolution (~500KB-2MB for 1920x1080 PNG). Multiple screenshots per conversation quickly exhaust the context window, causing API 400/413 errors.

Changes

_resize_screenshot() (prompts.py)

Before: Returned screenshot as-is when llm_screenshot_size was None.

After: Always compresses images:

  • Default max width: 1280px (preserves visual detail, ~95% size reduction)
  • JPEG format for non-transparent images (~85% compression vs PNG)
  • PNG only for images with transparency (RGBA/LA/P modes)
  • Return type changed to uple[str, str] (base64, media_type)

Call site (prompts.py)

Updated to unpack the tuple and use dynamic media_type in the data URL:
`python
processed_screenshot, media_type = self._resize_screenshot(screenshot)

data:{media_type};base64,{processed_screenshot}

`

Impact

Before After
1920x1080 PNG: ~500KB-2MB 1920x1080 → 1280px JPEG: ~50-100KB
Multiple screenshots = context overflow Multiple screenshots fit in context
API 400/413 errors Normal operation

Closes #4742


Summary by cubic

Auto-compress screenshots to shrink LLM context payloads and stop 400/413 errors with multiple images. Always resize and set the correct image format and media type.

  • Bug Fixes
    • Always resize/compress screenshots; respect llm_screenshot_size when set, otherwise scale to 1280px width (typical 1920×1080 drops to ~50–100KB).
    • Use JPEG (quality 85) for non-transparent images; keep PNG for images with transparency.
    • _resize_screenshot now returns (base64, media_type); call site uses data:{media_type};base64,....
    • Graceful fallback: on processing errors, keep the original image and set media type to image/png.

Written for commit 3dea9d6. Summary will update on new commits. Review in cubic

Problem: Without llm_screenshot_size configured, screenshots were passed
through at full resolution (~500KB-2MB for 1920x1080 PNG). Multiple screenshots
per conversation quickly exhaust the context window, causing API 400/413 errors.

Fix:
- _resize_screenshot() now always compresses images (not just when
  llm_screenshot_size is configured)
- Default max width: 1280px (preserves ~95% size reduction for typical screenshots)
- JPEG format for non-transparent images (~85% compression vs PNG)
- PNG only for images with transparency (RGBA/LA/P modes)
- Return type changed to tuple(base64_str, media_type) for correct MIME type
- Call site updated to use dynamic media type in data URL

Closes browser-use#4742
@CLAassistant

CLAassistant commented Apr 28, 2026

Copy link
Copy Markdown

CLA assistant check
All committers have signed the CLA.

@PHclaw

PHclaw commented Apr 28, 2026

Copy link
Copy Markdown
Author

This PR fixes the context overflow by always compressing screenshots to 1280px JPEG by default. A 1920x1080 screenshot goes from ~1MB PNG to ~80KB JPEG, a 90%+ reduction.

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="browser_use/agent/prompts.py">

<violation number="1" location="browser_use/agent/prompts.py:400">
P0: Unterminated string literal causes a module-level SyntaxError, preventing this file from importing.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread browser_use/agent/prompts.py Outdated
@github-actions

Copy link
Copy Markdown

👋 This PR has been automatically marked as stale because it hasn't had activity for 45 days.

To keep this PR open:

  • Rebase against the latest main branch
  • Address any review feedback or merge conflicts
  • Add a comment explaining the current status
  • Add the work-in-progress label if you're still actively working on this

This will be automatically closed in 14 days if no further activity occurs.

Thanks for contributing to browser-use! 🤖

@github-actions github-actions Bot added the stale label Jun 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Screenshot blob in tool result poisons conversation context → API 400 on all subsequent turns

2 participants