Acoustic loopback duplicates transcripts during meetings on speakers #3228

## Summary

When the user is in a voice/video meeting (Google Meet, Zoom, etc.) and using **speakers instead of headphones**, every utterance gets transcribed **twice** — once from the mic, once from the System Audio loopback — and shows up as two near-duplicate lines in the transcript UI, often with different speaker labels.

## Repro

1. Join a Google Meet (or any voice call) with the laptop's built-in speakers active (no headphones).
2. Speak.
3. Open the meeting transcript view.

You'll see each utterance rendered twice at the same timestamp, e.g.:

| time | label | text |
|---|---|---|
| 07:35:07 AM | `louis@screenpi.pe` (🔊) | "the chat I had when we were going through so current screen pipe is healthy this morning was healthy it did have a couple of…" |
| 07:35:07 AM | `You` (🎤) | "So recurrent screen pipe is healthy. This morning was healthy. It did have a couple of drops, one of which I put in there…" |

Same speech, two transcriptions, slight wording differences from two separate Whisper/Parakeet passes.

## Root cause — chained

Three things happen simultaneously:

1. **Mic** (`MacBook Pro Microphone (input)`) captures the user's voice directly → its own `audio_chunks` row + transcription.
2. The conferencing app plays the **meeting audio mix** (which includes the user's own voice fed back through the call) **through the speakers**.
3. **System Audio loopback** (`System Audio (output)`) captures that → a *separate* `audio_chunks` row + transcription of the same speech, this time after a pass through the Meet codec + room acoustics.

Two distinct `audio_chunks.file_path` files, two distinct `audio_transcriptions` rows.

### Verified in DB (live install, 2026-05-04 14:35:49Z)

```
chunk_id=24796  is_input=1  device=MacBook Pro Microphone (input)   speaker_id=2   name=louis@screenpi.pe
chunk_id=24795  is_input=0  device=System Audio (output)            speaker_id=12  name=louis@screenpi.pe
```

### Secondary effect — duplicate speaker entities with the same name

The calendar-speaker-id rules in `crates/screenpipe-engine/src/calendar_speaker_id.rs` then name **both** unnamed speakers `louis@screenpi.pe`:

- **Rule 1** ("input device → user") names the mic-side speaker `louis@screenpi.pe` → ends up on speaker_id=2.
- **Rule 2** ("1:1 meeting + 1 unnamed output → other attendee") fires on the loopback-side speaker; if both calendar attendees resolve to the user (self-meeting or both accounts being Louis), "other attendee" comes back as Louis again → speaker_id=12.

The embedding manager keeps them as **separate clusters**: the mic timbre and the Meet-encoded, reflected timbre land on opposite sides of the 0.70 cosine threshold. The naming layer does not check for a name collision before writing, so the net result is two named speaker rows for the same person.
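
To make the clustering step concrete, here is a minimal Rust sketch of a threshold test on two embeddings. It assumes the 0.70 threshold applies to cosine *similarity* (the text above does not say whether it is similarity or distance). `cosine_similarity`, `same_cluster` and `CLUSTER_THRESHOLD` are hypothetical names, not the embedding manager's actual API.

```rust
/// Assumed threshold: two embeddings at or above this cosine similarity
/// are treated as the same speaker.
const CLUSTER_THRESHOLD: f32 = 0.70;

/// Cosine similarity of two equal-length vectors.
/// Returns 0.0 for mismatched lengths or zero-norm input.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// True when the two embeddings would fall into the same cluster.
fn same_cluster(a: &[f32], b: &[f32]) -> bool {
    cosine_similarity(a, b) >= CLUSTER_THRESHOLD
}
```

Under this assumption, a mic embedding and a loopback embedding of the same voice stay apart whenever the codec and room shift the vector enough to push similarity below 0.70, which matches the two-cluster outcome described above.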

### Tertiary effect — UI shows mismatched icons

`apps/screenpipe-app-tauri/components/rewind/timeline/audio-transcript.tsx`:

```ts
const name = speakerName || (item.audio.is_input ? "me" : "speaker");
```

The meeting popover overrides the displayed name on `is_input=true` rows, so the mic-side row shows "You" even though the DB has `name=louis@screenpi.pe` on it. The output-side row shows the literal name. Two icons (🔊 / 🎤) for the same speaker = user confusion.

## Impact

- **User confusion**: looks like there's a phantom second "Louis" in the meeting.
- **Doubled transcription cost** on every speakers-mode meeting.
- **Polluted speaker DB**: each loopback session creates new duplicate-name speaker entities (current count: 2 entities both named `louis@screenpi.pe` plus 1 unnamed cluster, 3000+ embeddings between them, all the same person).
- **Per-chunk health/metrics noise**: doubled DB writes, doubled audio file output, doubled disk usage during long meetings.

## Fix plan (in priority order)

### (1) Capture-time fix — skip System Audio loopback during meetings-on-speakers

When `meeting_detector` reports an active meeting AND macOS audio output route ≠ headphones, suspend `System Audio (output)` capture for the duration of the meeting. The loopback only adds noise (it just echoes the user + the remote audio that the mic also picks up via the speakers).

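Detection hook: Core Audio. Read the default output device (`kAudioHardwarePropertyDefaultOutputDevice`), then use `kAudioDevicePropertyTransportType` and `kAudioDevicePropertyDataSource` to tell built-in speakers from headphones/AirPods/USB. Note that `AVAudioSessionRouteDescription.outputs` is an iOS/Mac Catalyst API and, as far as I can tell, is not usable from a native macOS process, so it is not an option here.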
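
The gating rule itself is small. A minimal sketch, assuming the route has already been classified by the detection hook; `OutputRoute` and `should_capture_system_audio` are hypothetical names, and no Core Audio call is shown:

```rust
/// Where audio output is currently routed (hypothetical classification
/// produced by the detection hook).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputRoute {
    /// Open-air playback that the mic can pick up.
    BuiltInSpeakers,
    /// Wired headphones, AirPods, USB headsets.
    Headphones,
    /// The route could not be determined.
    Unknown,
}

/// Returns false when System Audio loopback capture should be suspended:
/// a meeting is active and the output route is anything other than headphones.
fn should_capture_system_audio(meeting_active: bool, route: OutputRoute) -> bool {
    !(meeting_active && route != OutputRoute::Headphones)
}
```

Treating `Unknown` like speakers follows the "route ≠ headphones" wording above. A more conservative policy would keep capturing when the route cannot be determined, at the cost of letting duplicates through in that case.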

### (2) Naming-time fix — speaker-name collision check

In `calendar_speaker_id.rs`, before writing `name = X` to a speaker, query for any existing speaker with that exact name. If found:
- If the embeddings are close → merge into the existing one.
- If the embeddings are far → still merge, but flag for manual review (the user might have multiple voices/devices that legitimately diverged).

The DB must never contain two speakers with the same exact name.
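
A minimal sketch of that decision, assuming speakers are already loaded in memory and that a similarity score against the existing speaker is available through a callback. `Speaker`, `NameAction` and `resolve_name` are hypothetical names, not the current contents of `calendar_speaker_id.rs`:

```rust
/// Simplified speaker row (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Speaker {
    id: i64,
    name: Option<String>,
}

/// What the naming layer should do instead of blindly writing the name.
#[derive(Debug, PartialEq)]
enum NameAction {
    /// No other speaker has this exact name: safe to write it.
    Assign,
    /// Another speaker already has this name: merge into it.
    /// `needs_review` is set when the embeddings are far apart.
    MergeInto { survivor_id: i64, needs_review: bool },
}

/// Decide how to apply `new_name` to speaker `target_id`.
/// `similarity_to(id)` returns the embedding similarity between the target
/// and the existing speaker `id` (assumed: higher means closer).
fn resolve_name(
    target_id: i64,
    new_name: &str,
    existing: &[Speaker],
    similarity_to: impl Fn(i64) -> f32,
    close_threshold: f32,
) -> NameAction {
    let collision = existing
        .iter()
        .find(|s| s.id != target_id && s.name.as_deref() == Some(new_name));
    match collision {
        None => NameAction::Assign,
        Some(s) => NameAction::MergeInto {
            survivor_id: s.id,
            needs_review: similarity_to(s.id) < close_threshold,
        },
    }
}
```

Both branches of the collision case merge; the threshold only controls whether the merge is flagged for review, as described in the two bullets above.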

### (3) Cross-device transcription dedup (band-aid)

Cheap last-mile guard at `replace_audio_transcription` time: if a row exists with `is_input != self.is_input`, the same speaker name, a timestamp within ±5s, and normalized Levenshtein text similarity > 0.8, drop the new one. This treats the symptom, not the cause, but it defends the UI even when (1) misses (e.g. the user explicitly enables system audio capture).
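
A minimal sketch of this guard, assuming similarity is defined as `1 - distance / max_len` over lowercased text with punctuation stripped. `Transcript`, `similarity` and `is_loopback_duplicate` are hypothetical names; the real `replace_audio_transcription` signature is not shown here.

```rust
/// Simplified transcription row (hypothetical shape).
#[derive(Debug, Clone, Copy)]
struct Transcript<'a> {
    is_input: bool,
    speaker: &'a str,
    timestamp_secs: i64,
    text: &'a str,
}

/// Lowercase and drop punctuation so the two engine passes compare fairly.
fn normalize(text: &str) -> Vec<char> {
    text.chars()
        .filter(|c| c.is_alphanumeric() || c.is_whitespace())
        .flat_map(|c| c.to_lowercase())
        .collect()
}

/// Classic two-row Levenshtein edit distance.
fn levenshtein(a: &[char], b: &[char]) -> usize {
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let insert = cur[j] + 1;
            let delete = prev[j + 1] + 1;
            cur.push(substitute.min(insert).min(delete));
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Normalized similarity in [0, 1]; 1.0 means identical after normalization.
fn similarity(a: &str, b: &str) -> f64 {
    let (a, b) = (normalize(a), normalize(b));
    let longest = a.len().max(b.len());
    if longest == 0 {
        return 1.0;
    }
    1.0 - levenshtein(&a, &b) as f64 / longest as f64
}

/// True when `new` looks like the other device's copy of `existing`.
fn is_loopback_duplicate(new: &Transcript<'_>, existing: &Transcript<'_>) -> bool {
    new.is_input != existing.is_input
        && new.speaker == existing.speaker
        && (new.timestamp_secs - existing.timestamp_secs).abs() <= 5
        && similarity(new.text, existing.text) > 0.8
}
```

Whole-string similarity is sensitive to chunk alignment: in the Repro table, one row starts several words earlier than the other, so the 0.8 threshold may need tuning or a windowed comparison before this guard catches cases like that one.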

### (4) One-time DB cleanup migration

After (1)+(2) ship, sweep existing duplicate-named speakers and reassign their `audio_transcriptions.speaker_id` to a single survivor. Without (1) first, the duplicates would just regenerate.
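
A minimal in-memory sketch of the sweep, assuming the survivor is the lowest `id` among speakers sharing an exact name (the survivor rule is not specified above). `SpeakerRow`, `plan_merges` and `reassign` are hypothetical names; a real migration would run the equivalent updates against the database.

```rust
use std::collections::HashMap;

/// Simplified `speakers` row (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct SpeakerRow {
    id: i64,
    name: Option<String>,
}

/// Map each duplicate speaker id to the survivor id for its name.
/// Unnamed speakers are never merged.
fn plan_merges(speakers: &[SpeakerRow]) -> HashMap<i64, i64> {
    // First pass: lowest id per exact name becomes the survivor (assumption).
    let mut survivor_by_name: HashMap<&str, i64> = HashMap::new();
    for s in speakers {
        if let Some(name) = s.name.as_deref() {
            survivor_by_name
                .entry(name)
                .and_modify(|id| *id = (*id).min(s.id))
                .or_insert(s.id);
        }
    }
    // Second pass: every other speaker with that name maps to the survivor.
    speakers
        .iter()
        .filter_map(|s| {
            let survivor = *survivor_by_name.get(s.name.as_deref()?)?;
            (survivor != s.id).then_some((s.id, survivor))
        })
        .collect()
}

/// Rewrite `speaker_id` values in place according to the merge plan.
fn reassign(speaker_ids: &mut [Option<i64>], merges: &HashMap<i64, i64>) {
    for slot in speaker_ids.iter_mut() {
        if let Some(id) = slot {
            if let Some(survivor) = merges.get(&*id) {
                *id = *survivor;
            }
        }
    }
}
```

With the two `louis@screenpi.pe` speakers from the DB check (ids 2 and 12), this plan maps 12 to 2 and leaves unnamed clusters untouched.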

## Out of scope

- Real-time AEC (acoustic echo cancellation) at capture: covered partially by the OS, but doing it ourselves is a much bigger lift than (1) and offers little additional value once (1) is in place.

## Acceptance

- A meeting on speakers no longer produces two transcript rows per utterance.
- DB never has two `speakers` rows with identical `name`.
- Existing duplicate `louis@screenpi.pe` (or any other duplicated email) speakers from past sessions are merged after upgrade.
