fix(tools): improve cua window targeting #24919

Open
davetist wants to merge 1 commit into NousResearch:main from davetist:fix/computer-use-window-targeting

@davetist davetist commented May 13, 2026

What does this PR do?

Fixes generic cua-driver window targeting issues in the computer_use backend:

  • explicit capture(app=...) and focus_app(app=...) requests no longer silently fall back to an unrelated frontmost window
  • app/window matching now ranks candidates by confidence across PID, bundle ID, app name, list_apps metadata, and window title
  • app identity matches outrank title-only matches, so a document title from another app does not steal an app-targeted request
  • on-screen and off-screen candidates are ranked together, so a strong off-screen app identity match beats a weak on-screen title match
  • capture_after=true prefers capture_active() when the backend supports it, preserving the active target after actions
  • same-process tied windows prefer titled content windows over titleless utility surfaces

The change is deliberately generic. It does not special-case any app or bundle ID.
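
To make the ranking rules above concrete, here is a minimal sketch of confidence-ranked selection. It is not the code in `cua_backend.py`: the `Window` record, the tier constants, `match_confidence`, and `select_window` are hypothetical names, the `com.example.*` bundle IDs are placeholders, and using on-screen status as the last tie-breaker is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Window:
    """Hypothetical window record; the field names are assumptions."""
    pid: int
    app_name: str
    bundle_id: str
    title: str
    on_screen: bool


# Hypothetical confidence tiers: app identity outranks a title-only match.
NO_MATCH, TITLE_MATCH, APP_NAME_MATCH, BUNDLE_MATCH, PID_MATCH = 0, 1, 2, 3, 4


def match_confidence(query: str, win: Window) -> int:
    """Score how strongly a window matches the requested app string."""
    q = query.strip().casefold()
    if not q:
        return NO_MATCH
    if q.isdecimal() and int(q) == win.pid:
        return PID_MATCH
    if q == win.bundle_id.casefold():
        return BUNDLE_MATCH
    if q == win.app_name.casefold():
        return APP_NAME_MATCH
    if q in win.title.casefold():
        return TITLE_MATCH
    return NO_MATCH


def select_window(query: str, windows: list[Window]) -> Optional[Window]:
    """Return the best match, or None instead of an unrelated fallback."""
    scored = [(match_confidence(query, w), w) for w in windows]
    scored = [(score, w) for score, w in scored if score > NO_MATCH]
    if not scored:
        return None
    # On-screen and off-screen candidates compete in one pool. Ties on
    # confidence prefer titled windows, then (assumed) on-screen windows.
    best = max(
        scored,
        key=lambda sw: (sw[0], bool(sw[1].title.strip()), sw[1].on_screen),
    )
    return best[1]


windows = [
    Window(10, "Notes", "com.example.notes", "Safari tips", True),
    Window(20, "Safari", "com.example.safari", "", False),
    Window(20, "Safari", "com.example.safari", "Start Page", False),
]
chosen = select_window("Safari", windows)
print(chosen.title if chosen else "no window found")
```

In this sketch the off-screen, titled "Safari" window is chosen over the on-screen "Notes" window whose title merely contains the query, and a query that matches nothing yields `None` rather than the frontmost window.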

Related Issue

Related to #24170.

Also searched related open PRs before opening this draft:

This draft is narrower than a full issue rollup but broader than a single-app workaround: it focuses on generic app/window selection and active capture semantics.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • tools/computer_use/cua_backend.py
    • normalizes app/window identifiers conservatively
    • merges list_apps metadata into window records by PID
    • selects explicit app targets by ranked match confidence instead of falling back to the first window
    • supports bundle ID, PID, app name, list-app name, and title matching without app-specific branches
    • adds capture_active() to preserve the selected pid/window after actions
    • prefers titled content windows over titleless utility windows for equal-confidence same-app matches
  • tools/computer_use/tool.py
    • updates capture_after handling to call backend capture_active() when available
  • tests/tools/test_computer_use.py
    • adds regression coverage for no unrelated fallback, decorated app names, bundle ID matching, off-screen app matching, app identity vs title ranking, titled-window preference, focus_app, and capture_after
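
The normalization and metadata-merge steps listed for `cua_backend.py` could look roughly like the sketch below. This is an illustration only: `normalize`, `merge_app_metadata`, and the dictionary keys (`pid`, `name`, `bundle_id`, `list_app_*`) are hypothetical and may not match the cua-driver payloads or the PR's actual helpers.

```python
from typing import Any


def normalize(value: Any) -> str:
    """Conservative normalization: trim, collapse whitespace, casefold."""
    text = "" if value is None else str(value)
    return " ".join(text.split()).casefold()


def merge_app_metadata(
    windows: list[dict[str, Any]], apps: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Attach list_apps-style metadata to window records that share a PID."""
    apps_by_pid = {app["pid"]: app for app in apps if "pid" in app}
    merged = []
    for win in windows:
        record = dict(win)  # copy so the caller's records stay untouched
        app = apps_by_pid.get(win.get("pid"), {})
        for key in ("name", "bundle_id"):
            if app.get(key):
                # Stored under a separate key so window-level fields survive.
                record.setdefault(f"list_app_{key}", app[key])
        merged.append(record)
    return merged


windows = [
    {"pid": 7, "app_name": "Editor", "title": "notes.txt"},
    {"pid": 9, "title": "untitled"},
]
apps = [{"pid": 7, "name": "Editor Pro", "bundle_id": "com.example.editor"}]
for record in merge_app_metadata(windows, apps):
    print(record)
```

Keeping the merged values under their own keys means a matcher can compare a request against both the window's app name and the name reported by the app list, without one overwriting the other.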

How to Test

  1. Run targeted backend tests:

    python -m compileall -q tools/computer_use/cua_backend.py tests/tools/test_computer_use.py
    python -m pytest tests/tools/test_computer_use.py -q -o 'addopts='
    git diff --check -- tools/computer_use/cua_backend.py tools/computer_use/tool.py tests/tools/test_computer_use.py
  2. Run lint/static guardrails:

    ruff check .
    python scripts/check-windows-footguns.py --all
  3. Manual macOS verification with cua-driver-backed computer_use:

    • capture(app="<target app>") returns the requested app/window instead of a random frontmost app
    • focus_app(app="<target app>") targets the requested app or returns a no-window-found error
    • actions with capture_after=true capture the active target instead of reselecting an unrelated window
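
The `capture_after` behavior checked above can be sketched as a small dispatch helper. Only `capture(app=...)` and `capture_active()` come from this PR; `capture_after_action` and the two demo backend classes are hypothetical stand-ins, not the code in `tool.py`.

```python
from typing import Any, Optional


def capture_after_action(backend: Any, app: Optional[str] = None) -> Any:
    """Prefer capture_active() when the backend provides it."""
    capture_active = getattr(backend, "capture_active", None)
    if callable(capture_active):
        # Keeps the pid/window the action was sent to.
        return capture_active()
    # Backends without capture_active() fall back to a regular capture.
    return backend.capture(app=app)


class LegacyBackend:
    """Stand-in backend that only supports capture()."""

    def capture(self, app: Optional[str] = None) -> dict:
        return {"source": "capture", "app": app}


class ActiveAwareBackend(LegacyBackend):
    """Stand-in backend that also supports capture_active()."""

    def capture_active(self) -> dict:
        return {"source": "capture_active"}


print(capture_after_action(LegacyBackend(), app="Example"))
print(capture_after_action(ActiveAwareBackend(), app="Example"))
```

Checking for the method at call time keeps backends that lack `capture_active()` working unchanged, which fits the "when available" wording in the change list.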

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS 15.7.4

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

N/A.

Current CI Note

The PR test job is failing with unrelated suite-wide failures that also reproduce on current main's latest push CI (25782318256), including missing optional dependencies (botocore, faster_whisper, numpy) and unrelated DingTalk/Feishu/CLI failures. The touched computer_use tests pass locally, and the other PR checks are green.

Screenshots / Logs

Validation run:

    python -m compileall -q tools/computer_use/cua_backend.py tests/tools/test_computer_use.py
    python -m pytest tests/tools/test_computer_use.py -q -o 'addopts='
    .....................................................                    [100%]
    53 passed in 1.41s

    git diff --check -- tools/computer_use/cua_backend.py tools/computer_use/tool.py tests/tools/test_computer_use.py
    # passed

    ruff check .
    All checks passed!

    python scripts/check-windows-footguns.py --all
    ✓ No Windows footguns found (421 file(s) scanned).

Note: I attempted a local full-suite run with python -m pytest tests/ -q --ignore=tests/integration --ignore=tests/e2e --tb=short -n auto, then with -n 4. Neither run completed cleanly on this macOS checkout because of unrelated large-suite failures and hangs, including an initial Too many open files failure under -n auto. This draft keeps the full-suite checkbox unchecked pending CI or a cleaner local full-suite run.

Prevent explicit computer_use app requests from falling back to unrelated windows, preserve active-window captures after actions, and add regression coverage for generic cua-driver window matching.
@alt-glitch added labels on May 13, 2026: type/bug (Something isn't working), P2 (Medium, degraded but workaround exists), comp/tools (Tool registry, model_tools, toolsets)
@davetist davetist marked this pull request as ready for review May 13, 2026 09:18