fix(meetings): never drop live transcript segments on a coverage-window miss #3918
…ow miss
From a report that a meeting "recorded both sides for ~5 min then stopped surfacing the transcript." The audio was captured and transcribed the whole call (live finals for both devices, batch transcriptions, and the live->durable mirror all ran in the logs). The gap is on the surfacing side: how live meeting finals get copied into audio_transcriptions, which the post-call meeting view reads.
Two robustness bugs in the live-coverage path:
1. mirror_live_meeting_to_audio_transcriptions silently DROPPED any live segment whose nearest same-device chunk fell outside the +/-coverage_window. The live provider finalizes a turn seconds after the audio (drifting captured_at past the chunk timestamp); long chunks or capture gaps do the same. Dropped segments never reached audio_transcriptions, so they vanished from every post-call surface (meeting notes, timeline, search). Now fall back to the nearest SAME-device chunk regardless of window rather than dropping: losing the text is worse than a small playback offset, and the row keeps the segment's real timestamp so search/timeline stay correct. Device attribution stays strict (never a different device's chunk).
2. mark_chunks_covered_by_live matched the device name case-SENSITIVELY while its sibling mirror matched case-INSENSITIVELY (and the mirror comment claimed they agreed). A casing difference left meeting chunks pending, re-transcribed by batch and inconsistent with the mirror. Aligned to case-insensitive.
Regression tests: a far same-device chunk is used instead of dropping the segment; case-different device names still match coverage.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Diarization eval results
Source:
DER, VAD FA, VAD FN, boundary err: lower is better. Continuity: higher is better, 1.0 = same hyp cluster across all silence gaps. Composed workday rows and
Pipeline replay matrix
Source: generated
The no-secret CI matrix runs local diarization under Parakeet/Whisper engine labels across live/background and mic/system device profiles. Real Deepgram/screenpipe-cloud smoke can be run locally with
Transcription quality
Source: LibriSpeech test-clean (CC-BY-4.0) · per-model utterance cap · normalized lowercased word-level Levenshtein
WER + CER on read-aloud speech. Per-model utterance caps keep wall time bounded: tiny/parakeet at 50, the heavier large-v3-turbo-quantized at 20. See README for normalization rules.
The bug
A meeting can capture and transcribe both sides of a call, yet the transcript stops surfacing partway through (looks like "recording stopped" while detection stays green). The data is in the DB on the live path, but the post-call meeting view reads audio_transcriptions, and a live segment only lands there if mirror_live_meeting_to_audio_transcriptions copies it in.
That mirror silently dropped any live segment whose nearest same-device audio chunk fell outside a fixed ±coverage_window (15s).
A second, related inconsistency: mark_chunks_covered_by_live matched the device name case-sensitively (instr(file_path, device_name)) while its sibling mirror matched case-insensitively (and the mirror's comment claimed they agreed). A casing difference between the chunk file path and the stored device name left meeting chunks pending, re-transcribed by the batch reconciler and inconsistent with the mirror.
The fix
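To make the change in chunk selection concrete, here is a minimal Rust sketch that contrasts the old strict-window behavior with a nearest-same-device fallback. It is an illustration only: the real logic lives in the database layer, and `Chunk`, `pick_chunk_strict`, and `pick_chunk_with_fallback` are hypothetical names that do not appear in this PR.

```rust
// Hypothetical stand-in for an audio chunk row; not the real screenpipe schema.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    id: i64,
    file_path: String,
    timestamp_secs: i64,
}

// Device attribution: the chunk's file path must contain the device name
// (compared case-insensitively in this sketch).
fn same_device(chunk: &Chunk, device_name: &str) -> bool {
    chunk
        .file_path
        .to_lowercase()
        .contains(&device_name.to_lowercase())
}

// Nearest chunk in time among chunks of the SAME device only.
fn nearest_same_device<'a>(
    chunks: &'a [Chunk],
    device_name: &str,
    captured_at: i64,
) -> Option<&'a Chunk> {
    chunks
        .iter()
        .filter(|c| same_device(c, device_name))
        .min_by_key(|c| (c.timestamp_secs - captured_at).abs())
}

// Old behavior (as described above): a window miss drops the segment.
fn pick_chunk_strict<'a>(
    chunks: &'a [Chunk],
    device_name: &str,
    captured_at: i64,
    window_secs: i64,
) -> Option<&'a Chunk> {
    nearest_same_device(chunks, device_name, captured_at)
        .filter(|c| (c.timestamp_secs - captured_at).abs() <= window_secs)
}

// New behavior (sketch): keep the nearest same-device chunk even when it is
// outside the window. Returns None only if the device has no chunk at all.
fn pick_chunk_with_fallback<'a>(
    chunks: &'a [Chunk],
    device_name: &str,
    captured_at: i64,
) -> Option<&'a Chunk> {
    nearest_same_device(chunks, device_name, captured_at)
}
```

In this sketch a segment captured 60 seconds after its device's only chunk is dropped by `pick_chunk_strict` with a 15-second window, while `pick_chunk_with_fallback` still returns that chunk, and never a closer chunk from another device.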
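1. The mirror now falls back to the nearest same-device chunk regardless of the window instead of dropping the segment: losing the text is worse than a small playback offset. The row keeps the segment's real timestamp, so search/timeline stay correct. Device attribution stays strict: it never matches a different device's chunk.
2. Aligned mark_chunks_covered_by_live's device-name match to be case-insensitive, matching the mirror.
Tests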
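As a rough illustration of what the case-different device-name regression case guards against, the sketch below uses plain Rust string functions as a stand-in for the SQL match. The function names and the example path are assumptions made for this example; the actual query and its lowercasing rules are not reproduced here.

```rust
// Stand-in for a case-sensitive substring match such as instr(file_path, device_name).
fn matches_case_sensitive(file_path: &str, device_name: &str) -> bool {
    file_path.contains(device_name)
}

// Stand-in for the aligned behavior: lowercase both sides before comparing.
fn matches_case_insensitive(file_path: &str, device_name: &str) -> bool {
    file_path
        .to_lowercase()
        .contains(&device_name.to_lowercase())
}
```

With a chunk path containing "MacBook Pro Microphone (input)" and a stored device name of "macbook pro microphone (input)", the case-sensitive check misses and the chunk would stay pending, while the case-insensitive check matches.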
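Two regression tests added: a far same-device chunk is used instead of dropping the segment, and case-different device names still match coverage. The full coverage/mirror/dedup suite passes (screenpipe-db, no hardware).
Scope / honest notes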
🤖 Generated with Claude Code