fix(browser): persist remote browser_vision screenshots locally #11752

Open
sgaofen wants to merge 1 commit into NousResearch:main from sgaofen:codex/browser-vision-remote-screenshot

sgaofen commented Apr 17, 2026

Contributor

Summary

  • stop passing Hermes-local screenshot output paths to remote cloud browser sessions
  • copy the temp screenshot returned by agent-browser back into Hermes' managed cache before vision analysis
  • add regression coverage for both cloud screenshot capture and unchanged local-mode behavior

Root Cause

browser_vision() always passed a local ~/.hermes/cache/screenshots/... path into the screenshot command and then immediately expected that same host path to exist. That works for local Chromium, but remote Browserbase/CDP sessions cannot write into WSL or other host-local filesystem paths.
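
To illustrate the shape of the fix described above, here is a minimal sketch, not the code from this PR. The function names `screenshot_args` and `persist_screenshot`, the `["screenshot", ...]` argument shape, and the cache file naming are hypothetical and chosen only for illustration. The idea is that a remote session is never handed a host-local path, and whatever temp file the backend reports is copied into the managed cache before vision analysis reads it.

```python
import shutil
import uuid
from pathlib import Path
from typing import Optional


def screenshot_args(local_target: Path, is_remote: bool) -> list[str]:
    """Build screenshot command arguments (hypothetical argument shape)."""
    # Local Chromium shares the host filesystem, so it can write to the
    # target path directly. A remote session cannot, so the path is omitted
    # and the backend is left to choose its own temp location.
    if is_remote:
        return ["screenshot"]
    return ["screenshot", str(local_target)]


def persist_screenshot(returned_path: Optional[str], cache_dir: Path) -> Path:
    """Copy the file reported by the backend into the managed cache."""
    if not returned_path:
        raise FileNotFoundError("backend did not report a screenshot path")
    src = Path(returned_path)
    if not src.is_file():
        raise FileNotFoundError(f"screenshot not found: {src}")
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: a unique name avoids clobbering earlier captures.
    dest = cache_dir / f"browser_screenshot_{uuid.uuid4().hex}{src.suffix or '.png'}"
    shutil.copy2(src, dest)
    return dest
```

In this sketch the caller would pass the path returned by `persist_screenshot` to the vision step, so local and remote modes end up reading from the same cache directory.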

Testing

  • python3 -m py_compile /Users/stephenyu/Documents/hermes-agent-wt-11729/tools/browser_tool.py /Users/stephenyu/Documents/hermes-agent-wt-11729/tests/tools/test_browser_console.py
  • uv run --directory /Users/stephenyu/Documents/hermes-agent-wt-11729 --extra dev pytest -o addopts='' /Users/stephenyu/Documents/hermes-agent-wt-11729/tests/tools/test_browser_console.py -q

Platform Tested

  • macOS

Contribution Guide Notes

sgaofen commented Apr 17, 2026

Contributor Author

CI note: the failing test check appears to be an unrelated upstream failure, not a regression in this browser_vision patch. This PR and #11750 are both red on the same test: tests/hermes_cli/test_opencode_go_in_model_list.py::test_opencode_go_appears_when_api_key_set, with the opencode-go model order coming back as ['kimi-k2.5', 'glm-5.1', 'glm-5', ...] instead of the asserted order. The browser-specific targeted regression suite for this change still passes locally (27 passed).

alt-glitch added the following labels on Apr 24, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), tool/browser (Browser automation (CDP, Playwright)), tool/vision (Vision analysis and image generation)
Development

Successfully merging this pull request may close these issues.

Bug: browser_vision screenshot fails on WSL2 with Browserbase remote backend

2 participants