Summary
screenshot (and snapshot -a / snapshot -H) default to fullPage: true. On any page taller than ~2000px scroll height, the resulting PNG exceeds the Anthropic vision API's 2000px-per-side limit for many-image requests. The failing image stays in conversation history, so every subsequent turn re-fails on the same envelope — the entire session is bricked, not just one tool call.
This is a soft footgun: defaults are otherwise excellent (viewport 1280×720, deviceScaleFactor 1), so the user has no signal that "long page" is the boundary condition until they hit it.
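To make the boundary condition concrete, here is a rough sketch of the arithmetic. The helper names are hypothetical and not gstack code. It assumes the full-page PNG is roughly viewport width by scroll height, each multiplied by deviceScaleFactor, which is my understanding of Playwright's default device-pixel scaling. Pages that also scroll horizontally would come out wider.

```typescript
// Hypothetical helpers for illustration; not part of gstack.
// 2000 is the per-side limit quoted in the error message from the repro.
const API_MAX_SIDE_PX = 2000;

interface CaptureSize {
  width: number;
  height: number;
}

// Estimate the pixel size of a full-page capture.
// Assumption: output pixels = CSS pixels * deviceScaleFactor.
function estimateFullPageSize(
  viewportWidth: number,
  scrollHeight: number,
  deviceScaleFactor = 1,
): CaptureSize {
  return {
    width: Math.round(viewportWidth * deviceScaleFactor),
    height: Math.round(scrollHeight * deviceScaleFactor),
  };
}

// True when either side is over the limit.
function tripsLimit(size: CaptureSize, limit = API_MAX_SIDE_PX): boolean {
  return Math.max(size.width, size.height) > limit;
}
```

With the default 1280×720 viewport, a viewport-sized capture stays under the limit, while the same page at 5000px scroll height does not. A deviceScaleFactor of 2 would push even a 1500px page over.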
Repro
1. browse goto https://example.com/some-long-doc (any page with body height > ~2000px)
2. browse screenshot /tmp/shot.png
3. Have Claude Read the resulting PNG
4. Claude returns: image exceeds 2000 pixels on the longest edge
5. The PNG is now stuck in the transcript, so every following turn fails
Where in code
Affected sites (refs at commit 6209163):
- browse/src/snapshot.ts:419 (annotate): await page.screenshot({ path, fullPage: true })
- browse/src/snapshot.ts:539 (heatmap): same call
- The bare screenshot command: I didn't trace the exact line, but the docstring at commands.ts:138 documents it as full-page by default
Proposed fix
Three options, in order of how invasive:
1. Cheapest: in browser-manager.ts add a post-capture guard. After every page.screenshot(...), read PNG dims (zero-dep: parse IHDR chunk, ~20 lines) and either downscale or split if either side > 1800px. Emit a [browse] log line so the agent sees what happened. No new deps.
2. Behavioral: flip the default. Make screenshot capture the visible viewport unless --full-page is passed explicitly. The current --viewport flag becomes the default; add --full-page as opt-in. Mirrors how Playwright's own API treats fullPage: false as the default.
3. Config knob: ~/.gstack/config.yaml exposes screenshot_max_height: 1800 (default), and the screenshot command auto-splits above it. Lets power users opt out by setting it to 0 / Infinity.
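For reference, the zero-dep dimension check from the first option and the split planning from the third could look roughly like this. It is a minimal sketch, not gstack code: readPngSize and planSlices are hypothetical names. It relies only on the PNG layout (8-byte signature, then the IHDR chunk with big-endian width and height at byte offsets 16 and 20). The actual cropping or downscaling is left out.

```typescript
// Hypothetical helpers for illustration; not part of gstack.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

interface PngSize {
  width: number;
  height: number;
}

// Read width/height from the IHDR chunk without decoding the image.
function readPngSize(bytes: Uint8Array): PngSize {
  if (bytes.length < 24) {
    throw new Error("too short to contain a PNG IHDR chunk");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  for (let i = 0; i < PNG_SIGNATURE.length; i++) {
    if (view.getUint8(i) !== PNG_SIGNATURE[i]) {
      throw new Error("not a PNG: bad signature");
    }
  }
  // Bytes 8..11 are the chunk length, bytes 12..15 the chunk type.
  const chunkType = String.fromCharCode(
    view.getUint8(12),
    view.getUint8(13),
    view.getUint8(14),
    view.getUint8(15),
  );
  if (chunkType !== "IHDR") {
    throw new Error("first chunk is not IHDR");
  }
  // DataView reads big-endian by default, which matches PNG.
  return { width: view.getUint32(16), height: view.getUint32(20) };
}

// Plan vertical slices no taller than maxHeight.
// A maxHeight of 0 or Infinity means "do not split" (the opt-out case).
function planSlices(
  totalHeight: number,
  maxHeight: number,
): Array<{ y: number; height: number }> {
  if (!(maxHeight > 0) || !isFinite(maxHeight) || totalHeight <= maxHeight) {
    return [{ y: 0, height: totalHeight }];
  }
  const slices: Array<{ y: number; height: number }> = [];
  for (let y = 0; y < totalHeight; y += maxHeight) {
    slices.push({ y, height: Math.min(maxHeight, totalHeight - y) });
  }
  return slices;
}
```

A 5000px-tall capture with a 1800px cap would be planned as three slices (1800, 1800, 1400). Whether each slice is produced by cropping the PNG or by re-capturing with a clip region is a separate design choice.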
I'd vote for (1) + (2) together: agents can't accidentally produce poison images, and --full-page stays available for users who explicitly want it.
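The flipped default could be as small as the sketch below. This is an assumption about shape only: resolveScreenshotMode is a hypothetical name, gstack's real argument parsing may look different, and --full-page is the flag proposed here rather than one that exists today. The resulting fullPage value would be passed straight to page.screenshot({ path, fullPage }).

```typescript
// Hypothetical option resolver for illustration; not part of gstack.
interface ScreenshotMode {
  fullPage: boolean;
}

function resolveScreenshotMode(args: readonly string[]): ScreenshotMode {
  const wantsFullPage = args.indexOf("--full-page") !== -1;
  const wantsViewport = args.indexOf("--viewport") !== -1;
  if (wantsFullPage && wantsViewport) {
    throw new Error("--full-page and --viewport are mutually exclusive");
  }
  // Viewport capture is the default; --viewport stays accepted as a no-op
  // so existing invocations keep working.
  return { fullPage: wantsFullPage };
}
```

Keeping --viewport as an accepted no-op means existing scripts would not break when the default changes.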
Why this matters
Once one bad screenshot lands in conversation history, the user has to either clear context or ask Claude to forget. Both are friction. Since gstack is explicitly designed for "agent QA-ing a site", the failure mode lands almost entirely on agent users, which is precisely the audience.
Happy to send a PR if the maintainers agree on which option to take.
— Reported via Claude Opus 4.7, gstack v1.12.2.0