Skip to content

cua-driver: overlay z-ordering, background focus fixes, hermes form-fill tests#1375

Merged
ddupont808 merged 2 commits into
mainfrom
cua-driver/overlay-z-order-and-focus-fixes
Apr 24, 2026
Merged

cua-driver: overlay z-ordering, background focus fixes, hermes form-fill tests#1375
ddupont808 merged 2 commits into
mainfrom
cua-driver/overlay-z-order-and-focus-fixes

Conversation

@ddupont808

@ddupont808 ddupont808 commented Apr 24, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Overlay z-ordering: Changed agent-cursor overlay from NSWindowLevel.floating to .normal so it is sandwiched just above the target window ([target, overlay, fg-windows]). Foreground windows above the target correctly occlude the cursor — background interaction is visually clear.
  • Continuous 30 fps repin: Replaced the sparse one-shot defensive-repin schedule [60, 180, 360, 600, 900, 1200ms] with a continuous ~30 fps loop. macOS raises background window z-levels asynchronously after AX actions; the old schedule allowed up to 300ms where the overlay sat below the target. At 33ms it snaps back within one frame.
  • Background Safari form-fill: 800ms post-AXPress delay for AXTextField/AXTextArea so WebKit DOM focus is established before type_text_chars fires; AXPopUpButton handling (list options, hint to use set_value); SetValueTool with AX-child-press + JS-injection fallback for WebKit <select> elements.
  • FocusMonitorApp: Added TrackingTextField that holds keyboard focus throughout tests; three-level focus-loss counters (app, window key, text field) written to /tmp/.
  • Structured content: list_windows and launch_app now return structuredContent alongside human-readable text.
  • Agent cursor focus rect: Cyan glowing rect drawn around the targeted AX element's screen bbox after each click.

New files

  • Tests/integration/fixtures/form_all_inputs.html — HTML form with all input types for hermes testing
  • Tests/integration/test_hermes_form_fill.py — 9 tests driving individual HTML inputs in backgrounded Safari via hermes CLI; asserts no focus stolen from FocusMonitorApp
  • Tests/integration/test_overlay_z_order.py — asserts overlay appears at layer=0 (not floating/layer 3) and is sandwiched between target and foreground windows

Test plan

  • CUA_DRIVER_BINARY=.build/CuaDriver.app/Contents/MacOS/cua-driver python3 -m unittest test_overlay_z_order -v — new z-order test passes
  • ANTHROPIC_API_KEY=<key> CUA_DRIVER_BINARY=.build/CuaDriver.app/Contents/MacOS/cua-driver python3 -m unittest test_hermes_form_fill -v — all 9 form-fill tests pass
  • No hardcoded API keys in source (requires ANTHROPIC_API_KEY env var at test runtime)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added visual focus indicator highlighting targeted interface elements with a cyan-bordered rectangle.
    • Enhanced popup menu handling with intelligent option selection based on element attributes.
  • Bug Fixes

    • Fixed overlay window z-ordering behavior for proper layering with other windows.
    • Improved text field activation with timing adjustments to prevent race conditions.
  • Tests

    • Added comprehensive form interaction test suite validating input handling.
    • Added z-order validation tests for overlay window positioning.

…d hermes form-fill tests

## Overlay z-ordering (NSWindowLevel.normal)
Changed the agent-cursor overlay from NSWindowLevel.floating (above all
normal windows) to NSWindowLevel.normal so it is sandwiched just above
the target window: [target, overlay, fg-windows]. Foreground windows
above the target now correctly occlude the cursor, making it visually
clear which window the agent is interacting with.

## Continuous repin loop (30 fps)
Replaced the sparse one-shot defensive-repin schedule [60, 180, 360,
600, 900, 1200ms] with a continuous ~30 fps loop. macOS sometimes
raises a background window's z-level asynchronously after an AX action;
the old schedule left a gap of up to 300ms where the overlay sat below
the target. At 33ms the overlay snaps back within one frame.

## Background Safari form-fill fixes
- ClickTool: 800ms post-AXPress delay for AXTextField/AXTextArea so
  WebKit establishes DOM focus before type_text_chars fires (email/text
  field race condition).
- ClickTool: detect AXPopUpButton clicks, list available options and
  hint to use set_value instead of click (native popup closes on bg).
- SetValueTool: two-strategy popup selection — AX child AXPress (native
  AppKit) with JavaScript injection fallback for Safari/WebKit selects
  that expose no AX children when the popup is closed.
- AXInput: added screenBoundingRect(), children(), stringAttribute()
  helpers used by ClickTool and SetValueTool.

## Agent cursor focus rect
AgentCursorRenderer/View/Cursor: draw a cyan glowing rounded rect around
the targeted AX element's screen bbox after each click so the user sees
which element the agent targeted.

## FocusMonitorApp enhancements
Added TrackingTextField (NSTextField subclass) that holds keyboard focus
throughout the test run and tracks field-level resignFirstResponder
events. Three-level focus-loss counters (app, window key, text field)
written to /tmp/focus_monitor_{losses,key_losses,field_losses}.txt.

## Structured content for list_windows / launch_app
Both tools now return structuredContent alongside their human-readable
text so integration tests can access window/app data without parsing.

## New integration tests
- test_hermes_form_fill.py: 9 tests driving individual HTML input types
  (text, password, email, number, textarea, checkbox, select, radio,
  submit) in a backgrounded Safari window via the hermes CLI, asserting
  no focus is stolen from FocusMonitorApp.
- test_overlay_z_order.py: asserts overlay appears at layer 0 (not
  floating/layer 3) and is sandwiched between the target and any
  foreground windows above it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@vercel

vercel Bot commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
docs Ready Ready Preview, Comment Apr 24, 2026 8:26pm

Request Review

@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 59a8b852-8a2c-4e00-8a1f-2e86be8e0f78

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The PR modernizes the cursor overlay system by switching from defensive repin tasks to continuous ~30 fps repinning and changing the overlay window level from .floating to .normal. It introduces focus rectangle visualization for targeted UI elements and enhances interaction tools with focus-aware behavior for form controls. Integration tests validate form filling and overlay z-ordering.

Changes

Cohort / File(s) Summary
Overlay System Repin & Window Level
AgentCursor.swift, AgentCursorOverlayWindow.swift
Replaces scheduleDefensiveRepin with startContinuousRepin() for continuous ~30 fps pinning loop; changes overlay window level from .floating to .normal with order(.above, relativeTo:) positioning semantics.
Focus Rect Rendering
AgentCursorRenderer.swift, AgentCursorView.swift
Exposes new public focusRect: CGRect? property on renderer; implements focus highlight rendering in view with rounded rectangle fill, cyan border, and glow stroke.
AX Input Utilities
AXInput.swift
Adds three new public utility methods: screenBoundingRect(of:), children(of:), and stringAttribute(_:of:) for safe element attribute access.
Tool Enhancements
ClickTool.swift, SetValueTool.swift, LaunchAppTool.swift, ListWindowsTool.swift
ClickTool adds 800ms delay post-click for text inputs and focus rect display; SetValueTool implements popup option selection via AX children or Safari JavaScript fallback; LaunchAppTool and ListWindowsTool add structured response payloads alongside text content.
Focus Monitor Test Helper
FocusMonitorApp.swift
Tracks app resign-active, window key-loss, and text field first-responder loss events separately; updates shared status label and persists counts to separate /tmp files.
Integration Test Fixtures & Tests
form_all_inputs.html, test_hermes_form_fill.py, test_overlay_z_order.py
Adds comprehensive HTML form test fixture with all input types; introduces Hermes-driven form fill integration tests validating field interactions without focus loss; adds z-ordering test ensuring overlay sits above target window but below pre-existing foreground windows.

Sequence Diagram(s)

sequenceDiagram
    participant ClickTool as ClickTool
    participant AXInput as AXInput
    participant AgentCursor as AgentCursor
    participant Renderer as AgentCursorRenderer
    participant View as AgentCursorView
    participant Screen as On-Screen Display

    ClickTool->>AXInput: screenBoundingRect(of: element)
    AXInput-->>ClickTool: CGRect (screen space)
    ClickTool->>AgentCursor: showFocusRect(rect)
    AgentCursor->>Renderer: focusRect = rect
    Renderer-->>View: focusRect property updated
    View->>View: drawFocusHighlight()
    Note over View: Rounded rect + cyan border + glow
    View->>Screen: Render focus highlight overlay
    Screen-->>View: ✓ Visible on screen
Loading
sequenceDiagram
    participant SetValueTool as SetValueTool
    participant AXInput as AXInput
    participant App as Target App
    participant Safari as Safari (fallback)

    SetValueTool->>SetValueTool: Detect AXPopUpButton role
    alt AX Children Available
        SetValueTool->>AXInput: children(of: popupElement)
        AXInput-->>SetValueTool: [AXUIElement]
        loop Each Child
            SetValueTool->>SetValueTool: Compare AXTitle/AXValue to target value
        end
        SetValueTool->>App: AXPress on matched option
        App-->>SetValueTool: ✓ Option selected
    else No AX Children & Safari
        SetValueTool->>Safari: osascript: inject & execute JavaScript
        Note over Safari: DOM search + dispatch change event
        Safari-->>SetValueTool: ✓ select option changed
    else Error
        SetValueTool-->>SetValueTool: Return error with available titles
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 Whiskers twitch with glee—
Continuous repin loops now flow,
Focus rects in cyan glow,
Popups bend to our will,
Forms filled with skill and flair!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the three main changes: overlay z-ordering, background focus fixes, and hermes form-fill tests—all of which are substantively addressed in the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch cua-driver/overlay-z-order-and-focus-fixes

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursorOverlayWindow.swift (1)

3-11: ⚠️ Potential issue | 🟡 Minor

Class-level doc comment is stale after the level change.

The summary still states the overlay is "installed at .statusBar + 1", but the window level is now .normal (line 45). Please update this paragraph so the top-of-file contract matches the new sandwiched-in-regular-stack behavior described at lines 39-45.

✏️ Suggested wording
 /// Transparent, click-through, borderless window used to host the agent
 /// cursor overlay. Shares the classic "floating HUD" recipe: no window
 /// chrome, no shadow, clear background, ignores mouse events so the user
-/// keeps using the machine normally, installed at `.statusBar + 1` so
-/// it stays above ordinary windows but below system menus and modals.
+/// keeps using the machine normally. Installed at `.normal` level and
+/// continuously re-pinned one slot above the target window by
+/// `AgentCursor.pinAbove(pid:)`, so foreground windows above the target
+/// correctly occlude the cursor (ordering: [target, overlay, fg-windows]).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursorOverlayWindow.swift`
around lines 3 - 11, Update the class-level doc for AgentCursorOverlayWindow to
remove the outdated claim that the overlay is "installed at `.statusBar + 1`"
and instead describe the current behavior: the window uses `.normal` level and
is intentionally sandwiched in the regular window stack (above ordinary windows
but below system menus/modals) while remaining borderless, transparent,
click-through, and non-key/non-main; edit the opening summary sentence to
reflect this new window level and stacking contract so the top-of-file comment
matches the implementation in AgentCursorOverlayWindow.
🧹 Nitpick comments (3)
libs/cua-driver/Sources/CuaDriverServer/Tools/ClickTool.swift (1)

272-289: Consider narrowing the 800 ms text-field delay to web views.

The rationale documents a WebKit DOM-focus race, but the guard only checks role == "AXTextField" || role == "AXTextArea", so every element-indexed click on a native Cocoa text field (TextEdit, Notes, Mail compose, Finder rename, etc.) now adds ~0.8 s of mandatory latency per click even though those flows don't need it. Options:

  • Gate on the owning app's bundle id (e.g. only delay for Safari / WebKit-hosted content), or
  • Expose a post_press_delay_ms argument / config knob so interactive native-app clicks don't pay the WebKit tax, or
  • Poll AXFocusedUIElement == element with a shorter timeout instead of sleeping a fixed duration.

Not blocking — the current constant is empirically justified — but it's worth a follow-up once there's a signal path to tell web-content fields apart from native ones.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/cua-driver/Sources/CuaDriverServer/Tools/ClickTool.swift` around lines
272 - 289, The unconditional 800 ms sleep after AXPress for any element whose
AXInput.describe(element) role is "AXTextField" or "AXTextArea" is adding
latency for native Cocoa text fields; narrow the condition in ClickTool.swift by
either (1) gating the delay on the owning app (check the element's owning app
bundle id returned by AXInput.describe/owner info and only apply the sleep for
WebKit/Safari bundle ids), (2) adding a configurable post_press_delay_ms
parameter (expose a param on the ClickTool/press API and use that value instead
of the hardcoded 800 ms), or (3) replace the fixed Task.sleep(for:
.milliseconds(800)) with a short polling loop that repeatedly reads
AXFocusedUIElement and returns as soon as AXFocusedUIElement == element (with a
small max timeout), and update the condition around axAction == "AXPress" and
target.role to incorporate the chosen gate.
libs/cua-driver/Tests/integration/test_overlay_z_order.py (1)

142-144: Stale comment and over-long sleep after switching to continuous repin.

The comment still references the old [60, 180, 360, 600, 900, 1200]ms defensive-repin schedule that this PR replaces with the ~33ms continuous loop (see startContinuousRepin in AgentCursor.swift). Under the new loop the overlay is back in place within one frame (~33ms), so the time.sleep(1.5) is both inaccurately justified and ~40× longer than needed — inflates the test suite wall-clock for no coverage benefit.

🧹 Suggested cleanup
-        # Let the defensive-repin ticks fully settle (last tick fires at ~1200ms
-        # after the second pinAbove inside click, which returns after animations).
-        time.sleep(1.5)
+        # Continuous repin loop fires every ~33 ms; one frame plus slack
+        # is plenty for the overlay to land in its final z-position.
+        time.sleep(0.3)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/cua-driver/Tests/integration/test_overlay_z_order.py` around lines 142 -
144, The test contains a stale comment referencing the old defensive-repin
schedule and uses an excessively long sleep (time.sleep(1.5)); update the
comment to reflect the new continuous repin loop (startContinuousRepin in
AgentCursor.swift) which restores overlay within one frame (~33ms) and reduce
the sleep to a small margin (e.g., ~0.05s) sufficient for one frame; locate the
sleep in test_overlay_z_order.py and replace the comment text and the 1.5s sleep
value accordingly.
libs/cua-driver/Tests/integration/test_hermes_form_fill.py (1)

70-72: Dead helper: _build_focus_app is never called.

setUpClass has its own inline build-if-stale check at lines 157-160 using a different policy (rebuilds when the .swift is newer than the binary), so _build_focus_app is never invoked and its "build only when missing" logic doesn't match what the suite actually uses. Either delete it, or refactor setUpClass to call it (and move the mtime check into the helper) so there's a single source of truth shared with test_overlay_z_order.py's _ensure_focus_app_built.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@libs/cua-driver/Tests/integration/test_hermes_form_fill.py` around lines 70 -
72, The helper function _build_focus_app is unused and its "build only when
missing" behavior conflicts with the inline mtime logic in setUpClass; refactor
by moving the mtime check/build logic into _build_focus_app (so it rebuilds when
the .swift source is newer OR builds when missing), then replace the inline
build-if-stale block in setUpClass with a call to _build_focus_app, and update
test_overlay_z_order.py to call the same
_ensure_focus_app_built/_build_focus_app helper so there is a single source of
truth for building the focus app.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursor.swift`:
- Around line 442-449: The hide() method fails to clear the lingering focus
rectangle; call the private clearFocusRect() from hide() so
AgentCursorRenderer.shared.focusRect is cleared when the cursor auto-hides (or
remove clearFocusRect() if you intentionally don't need it). Update the hide()
implementation (method name: hide) to invoke clearFocusRect() before returning,
ensuring the singleton AgentCursorRenderer.shared.focusRect is reset and
preventing the old rect from flashing on the next show().

In `@libs/cua-driver/Sources/CuaDriverServer/Tools/SetValueTool.swift`:
- Around line 239-254: proc.waitUntilExit() can block indefinitely if osascript
stalls; modify the block in SetValueTool (where proc, proc.executableURL,
proc.arguments, outPipe, and errorResult are used) to enforce a bounded timeout:
start the process as you already do, then run a timeout mechanism (e.g., Task
race with Task.sleep or a DispatchWorkItem on a background queue) that waits up
to a fixed duration (e.g., 10s) and, on timeout, calls proc.terminate() (and
proc.kill() if needed) and returns an errorResult indicating timeout; if the
process exits normally before the deadline, proceed to read from outPipe and
return the existing result.
- Around line 217-238: The current value escaping in SetValueTool (variables
vLow, js, and appleScript) is incomplete and allows double quotes and
backslashes to break the AppleScript/JS string; update the sanitizer to first
escape backslashes then single quotes for the JS single-quoted literal (vLow),
and also escape backslashes and double quotes when embedding into the outer
AppleScript double-quoted string (or switch to passing the value via osascript
stdin/argument to avoid double-escaping altogether); additionally, when
launching osascript where proc.waitUntilExit() is used, add a timeout guard and
handle timeout-by-killing the process to avoid indefinite hangs.

In `@libs/cua-driver/Tests/FocusMonitorApp/FocusMonitorApp.swift`:
- Around line 3-15: The file header currently documents two focus-loss kinds but
the implementation also tracks a third: the text-field first-responder loss via
TrackingTextField which increments fieldLossCount and writes to
/tmp/focus_monitor_field_losses.txt; update the top docstring to list this third
signal (mention TrackingTextField, fieldLossCount, and the field losses file) so
readers/tests understand all three tracked events.

---

Outside diff comments:
In `@libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursorOverlayWindow.swift`:
- Around line 3-11: Update the class-level doc for AgentCursorOverlayWindow to
remove the outdated claim that the overlay is "installed at `.statusBar + 1`"
and instead describe the current behavior: the window uses `.normal` level and
is intentionally sandwiched in the regular window stack (above ordinary windows
but below system menus/modals) while remaining borderless, transparent,
click-through, and non-key/non-main; edit the opening summary sentence to
reflect this new window level and stacking contract so the top-of-file comment
matches the implementation in AgentCursorOverlayWindow.

---

Nitpick comments:
In `@libs/cua-driver/Sources/CuaDriverServer/Tools/ClickTool.swift`:
- Around line 272-289: The unconditional 800 ms sleep after AXPress for any
element whose AXInput.describe(element) role is "AXTextField" or "AXTextArea" is
adding latency for native Cocoa text fields; narrow the condition in
ClickTool.swift by either (1) gating the delay on the owning app (check the
element's owning app bundle id returned by AXInput.describe/owner info and only
apply the sleep for WebKit/Safari bundle ids), (2) adding a configurable
post_press_delay_ms parameter (expose a param on the ClickTool/press API and use
that value instead of the hardcoded 800 ms), or (3) replace the fixed
Task.sleep(for: .milliseconds(800)) with a short polling loop that repeatedly
reads AXFocusedUIElement and returns as soon as AXFocusedUIElement == element
(with a small max timeout), and update the condition around axAction ==
"AXPress" and target.role to incorporate the chosen gate.

In `@libs/cua-driver/Tests/integration/test_hermes_form_fill.py`:
- Around line 70-72: The helper function _build_focus_app is unused and its
"build only when missing" behavior conflicts with the inline mtime logic in
setUpClass; refactor by moving the mtime check/build logic into _build_focus_app
(so it rebuilds when the .swift source is newer OR builds when missing), then
replace the inline build-if-stale block in setUpClass with a call to
_build_focus_app, and update test_overlay_z_order.py to call the same
_ensure_focus_app_built/_build_focus_app helper so there is a single source of
truth for building the focus app.

In `@libs/cua-driver/Tests/integration/test_overlay_z_order.py`:
- Around line 142-144: The test contains a stale comment referencing the old
defensive-repin schedule and uses an excessively long sleep (time.sleep(1.5));
update the comment to reflect the new continuous repin loop
(startContinuousRepin in AgentCursor.swift) which restores overlay within one
frame (~33ms) and reduce the sleep to a small margin (e.g., ~0.05s) sufficient
for one frame; locate the sleep in test_overlay_z_order.py and replace the
comment text and the 1.5s sleep value accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eaaf5ea9-04cb-4f95-ab08-7b680fdc53a9

📥 Commits

Reviewing files that changed from the base of the PR and between deb93ed and 758f750.

📒 Files selected for processing (13)
  • libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursor.swift
  • libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursorOverlayWindow.swift
  • libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursorRenderer.swift
  • libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursorView.swift
  • libs/cua-driver/Sources/CuaDriverCore/Input/AXInput.swift
  • libs/cua-driver/Sources/CuaDriverServer/Tools/ClickTool.swift
  • libs/cua-driver/Sources/CuaDriverServer/Tools/LaunchAppTool.swift
  • libs/cua-driver/Sources/CuaDriverServer/Tools/ListWindowsTool.swift
  • libs/cua-driver/Sources/CuaDriverServer/Tools/SetValueTool.swift
  • libs/cua-driver/Tests/FocusMonitorApp/FocusMonitorApp.swift
  • libs/cua-driver/Tests/integration/fixtures/form_all_inputs.html
  • libs/cua-driver/Tests/integration/test_hermes_form_fill.py
  • libs/cua-driver/Tests/integration/test_overlay_z_order.py

Comment thread libs/cua-driver/Sources/CuaDriverCore/Cursor/AgentCursor.swift
Comment thread libs/cua-driver/Sources/CuaDriverServer/Tools/SetValueTool.swift Outdated
Comment thread libs/cua-driver/Sources/CuaDriverServer/Tools/SetValueTool.swift
Comment thread libs/cua-driver/Tests/FocusMonitorApp/FocusMonitorApp.swift Outdated
Comment thread libs/cua-driver/Tests/integration/test_hermes_form_fill.py Outdated
- AgentCursor: call clearFocusRect() in hide() so the focus rect does
  not linger after the cursor auto-hides (the helper had a docstring
  explicitly saying it should be called there but was never wired up)

- SetValueTool: fix value escaping in JS/AppleScript injection path.
  Previously only single-quotes were escaped, leaving double-quotes and
  backslashes able to break the outer AppleScript string or inject
  arbitrary commands. Switch to percent-encoding with unreserved chars
  only and decode via decodeURIComponent() in the JS, eliminating the
  double-escape problem entirely. Also add a 10-second polling timeout
  around proc.waitUntilExit() so a stuck Safari/osascript does not block
  the MCP tool handler indefinitely.

- FocusMonitorApp: update doc-comment from "two kinds" to "three kinds"
  of focus loss to match the implementation (app-level, window key
  status, and text-field first-responder, the third being the newest
  addition tracked via TrackingTextField).

- test_hermes_form_fill: change module-level os.environ["ANTHROPIC_API_KEY"]
  to os.environ.get(..., "") so test discovery / lint / collect-only
  passes do not raise KeyError. setUpClass now skips the suite cleanly
  when the key is absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant