
fix(computer-use): cap AX elements array to prevent context blowup (#22865) #30145

Merged
teknium1 merged 3 commits into main from hermes/hermes-02d0efaa on May 22, 2026
@teknium1

Salvages #22891 onto current main. Fixes #22865: computer_use(action='capture', mode='ax') against dense AX trees (Electron apps like Obsidian) no longer dumps the full elements array into the tool result.

The bug

Reporter's session captured an Obsidian window and got back ~597 AX elements. _format_elements() truncated the human-readable summary but _capture_response() still returned the full elements array in the JSON payload — blowing up context or tripping compression failures.

The fix

  • tools/computer_use/schema.py: new optional max_elements integer parameter (minimum: 1, maximum: 1000, default: 100); the schema documents both the default and the upper bound.
  • tools/computer_use/tool.py: adds _DEFAULT_MAX_ELEMENTS = 100, _MAX_ALLOWED_MAX_ELEMENTS = 1000, and a new _coerce_max_elements() validator that falls back to the default for malformed input (negative, zero, non-int) and clamps oversized values to the hard cap, so a caller can't accidentally re-introduce unbounded behavior. _capture_response() slices cap.elements[:max_elements], surfaces total_elements and truncated_elements fields, and appends a "(response truncated to N of M elements; raise max_elements or pass app= to narrow)" note to the human summary so the model knows the JSON view is partial.
  • tests/tools/test_computer_use.py: 9 regression tests covering schema exposure (2), default cap behavior on a 600-element tree (1), explicit override (1), below-cap backwards-compat (1), and invalid input fallback (4).

Credit @briandevans (PR #22891). Author already in AUTHOR_MAP.

Closes #22891.

briandevans and others added 3 commits May 21, 2026 18:06
…22865)

`computer_use(action='capture', mode='ax')` returned the full AX element
list verbatim in the JSON response. Dense Electron / Obsidian / JetBrains
UIs publish 500+ AX nodes (one reproduction in #22865 returned 597
elements against Obsidian), so a single capture could consume enough
context to trigger compression failures or render the session unusable.
The human-readable `_format_elements` summary is already capped at 40
lines, so the truncation gap was invisible to anyone reading the summary
output.

Add a `max_elements` argument to the tool schema, default 100, that
trims the AX `elements` array. When the cap fires, the response surfaces
`total_elements` and `truncated_elements` and appends a "raise
max_elements or pass app= to narrow" hint to the summary so the model
knows the JSON view is partial and can re-issue with a tighter scope.

Validation is centralized in `_coerce_max_elements`: missing /
non-integer / sub-1 inputs fall back to the default cap, so the
protection can never be silently disabled by a malformed tool-call
argument. The cap only affects AX-mode JSON; `mode='som'` and
`mode='vision'` keep returning a screenshot + image-aware summary
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Four findings from Copilot's review on PR #22891, all in the AX
elements-array cap added by 22fa1ed:

1. The truncation note ("response truncated to N of M elements") was
   appended unconditionally — including in the som/vision multimodal
   path, whose response carries a screenshot rather than an `elements`
   array. The note described a payload field that wasn't present.
   Moved the note into the AX-text branch where the array actually
   appears.

2. `_format_elements(cap.elements)` ran on the full untrimmed list with
   its own `max_lines=40` cap, so a caller passing `max_elements=10`
   would see summary lines referencing `#11..#40` even though the JSON
   `elements` array only held #1..#10. Format on `visible_elements`
   instead so the summary indices always exist in the response.

3. `_coerce_max_elements` enforced a lower bound but no upper bound,
   so `max_elements=10_000_000` silently disabled the safeguard and
   reintroduced the original context-blow-up. Added a hard cap
   (`_MAX_ALLOWED_MAX_ELEMENTS = 1000`) that clamps oversized values.

4. The schema string said "Default 100" but the property carried no
   `default` field, and claimed `max_elements` had no effect on som/
   vision while the image-missing fallback path can still return an
   elements array. Added `"default": 100`, `"maximum": 1000`, and
   clarified the fallback-path wording.
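
For reference, the schema property described in finding 4 might look roughly like the fragment below. Only the `minimum`, `maximum`, and `default` values are taken from this PR; the variable name, the surrounding layout, and the description text are illustrative assumptions.

```python
# Illustrative fragment only: the name MAX_ELEMENTS_PROPERTY and the
# description wording are assumptions, not copied from schema.py.
MAX_ELEMENTS_PROPERTY = {
    "type": "integer",
    "minimum": 1,
    "maximum": 1000,
    "default": 100,
    "description": (
        "Maximum number of AX elements returned in the JSON response. "
        "Default 100, upper bound 1000."
    ),
}
```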

Each finding gets a regression test:

- test_capture_ax_clamps_oversized_max_elements_to_hard_cap
- test_capture_ax_summary_indices_match_returned_elements
- test_capture_multimodal_summary_omits_truncation_note
- test_schema_max_elements_documents_default_and_upper_bound

Verified with `pytest tests/tools/test_computer_use.py` (53 passed,
including the 5 new cases). Confirmed each new test fails on the
pre-fix code path before applying the production change.
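
The index mismatch from finding 2 can be illustrated with a small sketch. `format_elements` here is a hypothetical stand-in for `_format_elements` (its line format and the element dict keys are assumptions), and `build_ax_summary` is an invented helper; the point is only that the summary is built from the already-trimmed list.

```python
def format_elements(elements, max_lines=40):
    """Hypothetical stand-in for _format_elements: one '#N role label' line each."""
    lines = [
        f"#{i} {el['role']} {el.get('label', '')}".rstrip()
        for i, el in enumerate(elements[:max_lines], start=1)
    ]
    if len(elements) > max_lines:
        lines.append(f"... ({len(elements) - max_lines} more)")
    return "\n".join(lines)


def build_ax_summary(elements, max_elements):
    """Format the summary from the trimmed list so every index exists in the JSON."""
    visible = elements[:max_elements]
    summary = format_elements(visible)
    if len(visible) < len(elements):
        summary += (
            f"\n(response truncated to {len(visible)} of {len(elements)} elements; "
            "raise max_elements or pass app= to narrow)"
        )
    return visible, summary
```

With `max_elements=10` on a 600-element list, the summary produced this way stops at `#10` and carries the truncation note, instead of listing `#11..#40` that the JSON array does not contain.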

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cherry-pick of #22891 (max_elements cap) reshuffled _capture_response
so summary was assigned inside both the multimodal and AX branches,
but #30126's aux-vision routing call (_route_capture_through_aux_vision)
fires BEFORE either branch and references the not-yet-bound name.

Compute summary once up-front, keep the AX-branch rebuild for the
truncation note.
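
The failure mode described in this commit is ordinary Python scoping: a name assigned anywhere in a function is local to it, so reading it before either branch has run raises `UnboundLocalError`. The sketch below reproduces that shape with invented function names and a toy summary string; it does not mirror the real `_capture_response` signature.

```python
def capture_response_buggy(elements, route_aux):
    if route_aux:
        # 'summary' is assigned later in this function, so Python treats it
        # as a local; reading it here raises UnboundLocalError.
        return {"routed": True, "summary": summary}
    if elements:
        summary = f"{len(elements)} elements"
    else:
        summary = "no elements"
    return {"routed": False, "summary": summary}


def capture_response_fixed(elements, route_aux):
    # Compute the summary once, before any branch can read it.
    summary = f"{len(elements)} elements" if elements else "no elements"
    if route_aux:
        return {"routed": True, "summary": summary}
    return {"routed": False, "summary": summary}
```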
@github-actions

🔎 Lint report: hermes/hermes-02d0efaa vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 9008 on HEAD, 9008 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 4762 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@daimon-nous daimon-nous Bot added labels on May 22, 2026: type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), comp/tools (Tool registry, model_tools, toolsets)
@daimon-nous

daimon-nous Bot commented May 22, 2026

Salvage of #22891 rebased onto current main (post-#30126 aux-vision routing). Fixes #22865.

@teknium1 teknium1 merged commit 0e2873a into main May 22, 2026
19 of 20 checks passed
@teknium1 teknium1 deleted the hermes/hermes-02d0efaa branch May 22, 2026 02:07
AhmetArif0 added a commit to AhmetArif0/hermes-agent that referenced this pull request May 22, 2026
_route_capture_through_aux_vision returned cap.elements verbatim, so
dense SOM captures (600+ AX nodes on Electron/Slack) routed via
auxiliary.vision still produced oversized tool results that could exhaust
session context — the same NousResearch#22865 shape that PR NousResearch#30145 fixed for the
AX-only path.

Fix: pass visible_elements (already capped by max_elements in
_capture_response) to _route_capture_through_aux_vision and use it in
the returned JSON. Add total_elements and truncated_elements fields for
parity with the AX path so the model knows the response is partial.

3 regression tests added: default cap (600→100), explicit override
(300→50), no truncated_elements field when under cap.
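
A rough sketch of the payload shape this follow-up commit describes, assuming the routing function receives the already-capped list together with the original element count. The helper name `route_capture_payload` and its signature are hypothetical, and `truncated_elements` is again assumed to be a count.

```python
def route_capture_payload(visible_elements, total_elements):
    """Hypothetical shape of the JSON returned by the aux-vision route."""
    payload = {"elements": visible_elements}
    dropped = total_elements - len(visible_elements)
    if dropped > 0:
        # Only surfaced when the cap actually trimmed something.
        payload["total_elements"] = total_elements
        payload["truncated_elements"] = dropped
    return payload
```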

Development

Successfully merging this pull request may close these issues.

computer_use capture can blow context on Electron/Obsidian AX trees