perf(meeting-detector): gate the depth-25 AX walk on recent audio #3957

Merged
louis030195 merged 2 commits into main from claude/zealous-knuth-814952
Jun 10, 2026
@louis030195

Collaborator

What & why

The meeting detector runs scan_process — a depth-25 accessibility-tree walk (every node a synchronous cross-process AX IPC) — on each running meeting-capable app every 5–10s, gated only by screen-lock and app-presence, never by audio.

The worst case is the common case: with no meeting there's no signal to early-break on, so it walks every window to full depth, forever, finding nothing. For an always-open browser with a huge web AX tree that's the dominant steady-state non-model CPU cost (it showed up as a top continuous non-model consumer in live sampling).

screenpipe already computes per-chunk audio activity (RMS-thresholded) and calls MeetingDetector::on_audio_activity — but that hook was a no-op, and is_in_audio_session() was a circular wrapper around the v2 flag the detector itself sets. This wires the existing-but-discarded signal into a real recency stamp and gates the idle scan cadence on it.

Inspired by fastrepl/anarlog (hyprnote), which keys meeting detection off audio-device activity rather than UI walking.

The change (idle scan cadence)

Before — every 10s, always (apps open, no call):
0s    10s   20s   30s   40s   50s   60s
●─────●─────●─────●─────●─────●─────●     ~6 depth-25 AX walks / min

After — 30s when silent; instant wake on audio:
0s                  30s                 60s
●───────────────────●───────────────────●   ~2 walks / min
                ⚡ audio onset → immediate scan (no detection latency)

| State | Meeting apps | Recent audio | Scan interval | vs before |
| --- | --- | --- | --- | --- |
| Idle | open | yes | 10s | unchanged |
| Idle | open | no (silent) | 30s | ~3× fewer walks |
| Confirming / Active / Ending | open | any | 5s | unchanged (never slowed) |
| any | | audio quiet→active edge | wake now | detect ≤ before |
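
The cadence table can be read as a single pure function. The sketch below is a minimal illustration under stated assumptions, not the code merged in this PR: only `IDLE_QUIET_SCAN_INTERVAL` and `apps_present_scan_interval` are names taken from the description, while the `State` enum, the other two constants, the `Option<bool>` parameter, and `walks_per_minute` are hypothetical.

```rust
use std::time::Duration;

/// Detector states named in the PR description (enum shape is an assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Idle,
    Confirming,
    Active,
    Ending,
}

// Interval values come from the table; the first two constant names are hypothetical.
pub const TRACKING_SCAN_INTERVAL: Duration = Duration::from_secs(5);
pub const IDLE_SCAN_INTERVAL: Duration = Duration::from_secs(10);
pub const IDLE_QUIET_SCAN_INTERVAL: Duration = Duration::from_secs(30);

/// Pure gate: how long to wait before the next AX scan while meeting apps are open.
/// `recent_audio` is `None` when no detector is wired in, which keeps the old cadence.
pub fn apps_present_scan_interval(state: State, recent_audio: Option<bool>) -> Duration {
    match (state, recent_audio) {
        // Idle and silent: the only case that is slowed down.
        (State::Idle, Some(false)) => IDLE_QUIET_SCAN_INTERVAL,
        // Idle with recent audio, or no audio signal available: unchanged.
        (State::Idle, _) => IDLE_SCAN_INTERVAL,
        // A tracked meeting is never slowed.
        _ => TRACKING_SCAN_INTERVAL,
    }
}

/// Steady-state walks per minute for a given interval (whole seconds only).
pub fn walks_per_minute(interval: Duration) -> u64 {
    60 / interval.as_secs().max(1)
}
```

With these values, `walks_per_minute` gives 6 for the 10s idle rate and 2 for the 30s quiet rate, which is where the "~3× fewer walks" figure comes from.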

Why detection accuracy cannot regress

  • The state machine (advance_state) is untouched → all existing detector tests pass unchanged.
  • A real meeting produces audio → the onset trigger scans immediately → detection is at least as fast as today.
  • The only behavior delta: an app open with zero audio and no call is scanned every 30s instead of 10s. The 30s safety-net scan still catches even a fully-silent meeting; the instant anyone speaks, onset fires.
  • With no detector (tests / detector disabled) the cadence is byte-identical to before.
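
The "onset wakes the loop" argument can be illustrated with a wait that returns on whichever comes first, the timer or the audio edge. The PR describes a `select!` arm on an audio-onset `Notify`; the sketch below swaps in a standard-library channel with `recv_timeout` so it stays dependency-free. `Wake` and `wait_for_next_scan` are hypothetical names, not identifiers from the change.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::time::Duration;

/// Why the scan loop woke up (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum Wake {
    /// A quiet->active audio edge arrived: scan immediately.
    AudioOnset,
    /// The scan interval elapsed with no onset: safety-net scan.
    Timer,
    /// The audio side went away: stop the loop.
    Shutdown,
}

/// Block until an audio-onset message arrives or `interval` elapses,
/// whichever happens first.
pub fn wait_for_next_scan(onset: &Receiver<()>, interval: Duration) -> Wake {
    match onset.recv_timeout(interval) {
        Ok(()) => Wake::AudioOnset,
        Err(RecvTimeoutError::Timeout) => Wake::Timer,
        Err(RecvTimeoutError::Disconnected) => Wake::Shutdown,
    }
}
```

Because a pending onset ends the wait early, lengthening the quiet interval to 30s only delays scans for which no audio edge arrives, which is the case the safety-net scan covers.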

Tests

  • screenpipe-audio: 6/6 (4 new recency tests, incl. one pinning that the signal is independent of the v2 flag, so no circular dep).
  • screenpipe-engine: 102 passed / 0 failed / 2 pre-existing ignored (4 new gate tests + the existing advance_state suite, all green).
  • Standalone logic proof (gate decisions + recency + rising-edge arithmetic): 11/11.

Validation status

  • ✅ Compiles clean (both crates), no warnings; all unit tests green.
  • ⏳ End-to-end before/after CPU on a running build needs a full app build (CI/release). Methodology: count AX-walk frequency + meeting-detector thread CPU over a fixed idle window (apps open, muted) — expect ~3× fewer idle walks with no missed detection.

Files

  • crates/screenpipe-audio/src/meeting_detector.rs — real on_audio_activity recency stamp + audio_onset Notify + audio_active_within() / wait_for_audio_onset().
  • crates/screenpipe-engine/src/meeting_detector.rs — IDLE_QUIET_SCAN_INTERVAL + pure apps_present_scan_interval() + audio-onset select! arm.
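
A recency stamp with rising-edge detection, as listed for the audio crate, could look roughly like the following. This is a hedged sketch rather than the merged implementation: the method names `on_audio_activity` and `audio_active_within` come from the file list, but their signatures, the `AudioRecency` type, the caller-supplied millisecond clock, and the window argument are all assumptions made so the example is deterministic. The `Notify`-based `wait_for_audio_onset()` is left out, and a single writer is assumed.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Sentinel meaning "no audio activity has been seen yet".
const NEVER: u64 = u64::MAX;

/// Hypothetical holder for the last time audio activity was observed.
pub struct AudioRecency {
    last_active_ms: AtomicU64,
}

impl AudioRecency {
    pub fn new() -> Self {
        Self { last_active_ms: AtomicU64::new(NEVER) }
    }

    /// Record one audio chunk. Returns true only on a quiet->active rising
    /// edge, i.e. when this chunk is active and nothing was active within
    /// `window` before it. Assumes a single caller (no concurrent writers).
    pub fn on_audio_activity(&self, has_activity: bool, now_ms: u64, window: Duration) -> bool {
        if !has_activity {
            return false;
        }
        let was_active = self.audio_active_within(window, now_ms);
        self.last_active_ms.store(now_ms, Ordering::Relaxed);
        !was_active
    }

    /// True if activity was recorded no more than `window` before `now_ms`.
    pub fn audio_active_within(&self, window: Duration, now_ms: u64) -> bool {
        let last = self.last_active_ms.load(Ordering::Relaxed);
        last != NEVER && now_ms.saturating_sub(last) <= window.as_millis() as u64
    }
}
```

Keeping the stamp separate from any detector state is what lets a test pin that the signal does not depend on the v2 flag.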

🤖 Generated with Claude Code


A rendered before/after diagram is available (HTML→PNG) — gist hosting rejected the binary, so the ASCII timeline above stands in. Happy to attach the image inline if you'd prefer it.

Louis Beaumont and others added 2 commits June 9, 2026 17:30
The meeting detector ran scan_process (a depth-25 accessibility-tree walk,
every node a synchronous cross-process AX IPC) on each running meeting-capable
app every 5-10s, gated only by screen-lock and app-presence, never by audio.
The worst case is the common case: with no meeting there is no signal to
early-break on, so it walks every window to full depth, forever, finding
nothing. For an always-open browser with a huge web AX tree that is the
dominant steady-state non-model CPU cost.

screenpipe already computes per-chunk audio activity (RMS-thresholded) and
called MeetingDetector::on_audio_activity, but that hook was a no-op and
is_in_audio_session() was a circular wrapper around the v2 flag the detector
itself sets. This wires the existing signal into a real recency stamp and
gates the idle scan cadence on it:

- Idle + meeting apps open + recent audio -> fast idle rate (10s, unchanged)
- Idle + meeting apps open + silent       -> quiet rate (30s), ~3x fewer walks
- audio onset (quiet->active) wakes the loop immediately, so a call that just
  started is scanned at once and detection latency does NOT regress
- Confirming/Active/Ending: unchanged; a tracked meeting is never slowed

The state machine (advance_state) is untouched, so all existing detector tests
pass unchanged. With no detector (tests / detector disabled) the cadence is
byte-identical to the prior behaviour.

Inspired by fastrepl/anarlog (hyprnote), which keys meeting detection off
audio-device activity rather than UI walking.

Tests: +4 engine gate tests, +4 audio recency tests (incl. one pinning that
the signal is independent of the v2 flag). audio 6/6 pass; engine 102 pass /
0 fail / 2 pre-existing ignored.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@louis030195

Collaborator Author

Pushed a format-only follow-up (9fa6176) for the Code Quality failure. Local validation: `cargo fmt --all -- --check`, `cargo test -p screenpipe-audio meeting_detector --lib`, and `cargo test -p screenpipe-engine meeting_detector --lib` all pass. Fresh CI is running.

@github-actions

Contributor

Meeting-detection eval

Source: crates/screenpipe-meeting-eval/evals/scenarios/ · replays scripted scan traces through prod advance_state.

| scenario | meetings | final | flap (controls/audio) | end latency (s) | status |
| --- | --- | --- | --- | --- | --- |
| arc_meet_toolbar_autohide | 1 | Idle | 8 / 0 | 305 | xfail: 1 miss |
| browser_tab_switch_with_audio | 1 | Idle | 0 / 24 | 305 | ok |
| confirming_drops_no_meeting | 0 | Idle | 0 / 0 | n/a | ok |
| native_zoom_minimized_with_audio | 1 | Idle | 0 / 24 | 35 | ok |
| zoom_native_clean_call | 1 | Idle | 0 / 0 | 35 | ok |

flap = Ending → Active oscillations inside one meeting. controls-flap = controls reappeared; audio-flap = output audio kept it alive. High controls-flap = brittle scan; high audio-flap = legitimate but watch for drift. See crates/screenpipe-meeting-eval/evals/README.md for the methodology and per-scenario rationale.
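
As a rough illustration of the flap definition above, Ending → Active oscillations can be counted over a recorded state trace. This is a sketch under assumptions, not the eval crate's code: the `State` enum and `flap_count` function are hypothetical, and the controls/audio split is not modelled.

```rust
/// Detector states (enum shape is an assumption for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Idle,
    Confirming,
    Active,
    Ending,
}

/// Count Ending -> Active transitions between consecutive trace entries.
pub fn flap_count(trace: &[State]) -> usize {
    trace
        .windows(2)
        .filter(|w| w[0] == State::Ending && w[1] == State::Active)
        .count()
}
```

A trace that goes Active, Ending, Active, Ending, Active before finally returning to Idle counts as two flaps.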

Assertion details

[xfail] arc_meet_toolbar_autohide:
  flap_count: max 3 got 8

@louis030195 louis030195 merged commit 33b5e18 into main Jun 10, 2026
21 of 22 checks passed