fix(tools): improve cua window targeting by davetist · Pull Request #24919 · NousResearch/hermes-agent

davetist · 2026-05-13T08:59:25Z

What does this PR do?

Fixes generic cua-driver window targeting issues in the computer_use backend:

- explicit capture(app=...) and focus_app(app=...) requests no longer silently fall back to an unrelated frontmost window
- app/window matching now ranks candidates by confidence across PID, bundle ID, app name, list_apps metadata, and window title
- app identity matches outrank title-only matches, so a document title from another app does not steal an app-targeted request
- on-screen and off-screen candidates are ranked together, so a strong off-screen app identity match beats a weak on-screen title match
- capture_after=true prefers capture_active() when the backend supports it, preserving the active target after actions
- same-process tied windows prefer titled content windows over titleless utility surfaces

The change is deliberately generic. It does not special-case any app or bundle ID.
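
The ranking rules above can be illustrated with a minimal sketch. It is not the code in cua_backend.py: the Window record, the score tiers, and the select_window helper are all hypothetical names and values chosen for illustration, and the tie-break here applies to any equal-confidence pair rather than only same-process windows.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical confidence tiers; the PR does not state the real values.
SCORE_PID = 100
SCORE_BUNDLE_ID = 90
SCORE_APP_NAME = 80
SCORE_TITLE = 10  # title-only matches rank below every app identity match


@dataclass
class Window:
    pid: int
    app_name: str
    bundle_id: str
    title: str
    on_screen: bool


def score(window: Window, query: str) -> int:
    """Return a match confidence for `query`, or 0 when nothing matches."""
    q = query.strip().casefold()
    if not q:
        return 0
    if q.isdecimal() and int(q) == window.pid:
        return SCORE_PID
    if q == window.bundle_id.casefold():
        return SCORE_BUNDLE_ID
    if q == window.app_name.casefold():
        return SCORE_APP_NAME
    if q in window.title.casefold():
        return SCORE_TITLE
    return 0


def select_window(windows: list[Window], query: str) -> Optional[Window]:
    """Pick the best-scoring window, or None instead of an unrelated fallback."""
    matches = [(score(w, query), w) for w in windows]
    matches = [(s, w) for s, w in matches if s > 0]
    if not matches:
        return None
    # On-screen and off-screen candidates share one ranking; a non-empty
    # title only breaks ties between equal-confidence matches.
    return max(matches, key=lambda sw: (sw[0], bool(sw[1].title.strip())))[1]
```

Because on_screen is not part of the sort key, an off-screen window whose app name matches still beats an on-screen window that only mentions the query in its title.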

Related Issue

Related to #24170.

Also searched related open PRs before opening this draft:

- #24181, fix(computer_use): correct type_text MCP tool name and implement drag action, covers type/drag cua-driver behavior
- #24242, fix(computer_use): preserve app context for capture_after; fix element label parsing (#24170 bugs 2 & 5), covers capture-after app context and label parsing
- #24324, fix(computer-use): surface app=… filter no-match instead of silently using frontmost (#24170 bug 1), covers no-match fallback behavior for capture(app=...)

This draft is narrower than a full issue rollup but broader than a single-app workaround: it focuses on generic app/window selection and active capture semantics.

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
🔒 Security fix
📝 Documentation update
✅ Tests (adding or improving test coverage)
♻️ Refactor (no behavior change)
🎯 New skill (bundled or hub)

Changes Made

tools/computer_use/cua_backend.py
- normalizes app/window identifiers conservatively
- merges list_apps metadata into window records by PID
- selects explicit app targets by ranked match confidence instead of falling back to the first window
- supports bundle ID, PID, app name, list-app name, and title matching without app-specific branches
- adds capture_active() to preserve the selected pid/window after actions
- prefers titled content windows over titleless utility windows for equal-confidence same-app matches
tools/computer_use/tool.py
- updates capture_after handling to call backend capture_active() when available
tests/tools/test_computer_use.py
- adds regression coverage for no unrelated fallback, decorated app names, bundle ID matching, off-screen app matching, app identity vs title ranking, titled-window preference, focus_app, and capture_after
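
As a rough illustration of the first two cua_backend.py items, the sketch below normalizes identifiers and merges list_apps metadata into window records by PID. The dictionary keys (pid, name, bundle_id, list_app_name) and both function names are assumptions made for this example; the actual record shapes returned by cua-driver may differ.

```python
def normalize(identifier: str) -> str:
    """Conservative normalization: trim, collapse whitespace, fold case."""
    return " ".join(identifier.split()).casefold()


def merge_app_metadata(windows: list[dict], apps: list[dict]) -> list[dict]:
    """Return copies of the window records enriched with app metadata by PID."""
    apps_by_pid = {app["pid"]: app for app in apps if "pid" in app}
    merged = []
    for window in windows:
        record = dict(window)  # copy, so the caller's records stay untouched
        app = apps_by_pid.get(window.get("pid"), {})
        # Fill only the fields the window record lacks; never overwrite.
        for src, dst in (("bundle_id", "bundle_id"), ("name", "list_app_name")):
            if app.get(src) and not record.get(dst):
                record[dst] = app[src]
        merged.append(record)
    return merged
```

Keeping the list-app name in its own field, rather than overwriting the window's app name, lets a matcher compare a query against both spellings.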

How to Test

Run targeted backend tests:

python -m compileall -q tools/computer_use/cua_backend.py tests/tools/test_computer_use.py
python -m pytest tests/tools/test_computer_use.py -q -o 'addopts='
git diff --check -- tools/computer_use/cua_backend.py tools/computer_use/tool.py tests/tools/test_computer_use.py

Run lint/static guardrails:

ruff check .
python scripts/check-windows-footguns.py --all

Manual macOS verification with cua-driver-backed computer_use:
- capture(app="<target app>") returns the requested app/window instead of a random frontmost app
- focus_app(app="<target app>") targets the requested app or returns a no-window-found error
- actions with capture_after=true capture the active target instead of reselecting an unrelated window
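
The capture_after behavior in the last item can be sketched as a capability check on the backend. The helper name capture_after_action and the fallback to capture(app=...) are assumptions for illustration; only capture(app=...) and capture_active() are named in this PR.

```python
from typing import Any, Optional


def capture_after_action(backend: Any, app: Optional[str] = None) -> Any:
    """Capture post-action state, preferring the backend's active target."""
    capture_active = getattr(backend, "capture_active", None)
    if callable(capture_active):
        # The backend remembers the pid/window it just acted on.
        return capture_active()
    # Assumed fallback for backends without capture_active():
    # reuse the original app filter rather than taking the frontmost window.
    return backend.capture(app=app)
```

Checking for the method at call time keeps backends that do not implement capture_active() working without changes.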

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this fix/feature (no unrelated commits)
I've run pytest tests/ -q and all tests pass
I've added tests for my changes (required for bug fixes, strongly encouraged for features)
I've tested on my platform: macOS 15.7.4

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — or N/A
I've updated cli-config.yaml.example if I added/changed config keys — or N/A
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

N/A.

Current CI Note

The PR test job is failing with unrelated suite-wide failures that also reproduce on current main's latest push CI (25782318256), including missing optional dependencies (botocore, faster_whisper, numpy) and unrelated DingTalk/Feishu/CLI failures. The touched computer_use tests pass locally, and the other PR checks are green.

Screenshots / Logs

Validation run:

python -m compileall -q tools/computer_use/cua_backend.py tests/tools/test_computer_use.py
python -m pytest tests/tools/test_computer_use.py -q -o 'addopts='
.....................................................                    [100%]
53 passed in 1.41s

git diff --check -- tools/computer_use/cua_backend.py tools/computer_use/tool.py tests/tools/test_computer_use.py
# passed

ruff check .
All checks passed!

python scripts/check-windows-footguns.py --all
✓ No Windows footguns found (421 file(s) scanned).

Note: I attempted a local full-suite run with python -m pytest tests/ -q --ignore=tests/integration --ignore=tests/e2e --tb=short -n auto, then with -n 4. Those local runs did not complete cleanly on this macOS checkout due to unrelated large-suite failures/hangs, including an initial Too many open files failure under -n auto. This draft keeps the full-suite checkbox unchecked pending CI or a cleaner local full-suite run.

Prevent explicit computer_use app requests from falling back to unrelated windows, preserve active-window captures after actions, and add regression coverage for generic cua-driver window matching.

fix(tools): improve cua window targeting

ac258bd

alt-glitch added the type/bug, P2, and comp/tools labels on May 13, 2026

davetist marked this pull request as ready for review May 13, 2026 09:18

alt-glitch mentioned this pull request May 14, 2026

fix: target explicit computer use app windows #25674

Open

hanzckernel mentioned this pull request May 18, 2026

fix(tools): harden cua app window targeting #28128

Open

alt-glitch mentioned this pull request May 20, 2026

fix(computer-use): target app windows across Spaces #29031

Open

davetist wants to merge 1 commit into NousResearch:main from davetist:fix/computer-use-window-targeting

2 participants
