feat(#134): configurable voice prompts on assistant pause #135
Merged
Closes #134
Summary
Configurable Stop hook that speaks the assistant's question aloud (Jarvis-from-Iron-Man style) when it pauses for user input. The initial phase is macOS-only via the bundled `say` command, with no voice input: adopters reply via keyboard.
Default OFF. This is a purely additive change for upstream: no existing fork sees a behaviour change until it explicitly flips `voice_prompts.enabled` to `true` in its `.claude/project-config.json`.
Decision rationale in AgDR-0009-voice-prompts-on-pause.
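As a rough illustration of the opt-in described above, the snippet below writes a minimal override file. It is a sketch under assumptions: the key names (`enabled`, `trigger`, `max_chars`) come from this PR, but the scratch directory is hypothetical, and the other keys (`voice`, `rate_wpm`) are left out so they keep whatever defaults the project ships.

```bash
# Sketch only: writes to a scratch directory, not to a real project checkout.
# In a real fork the file would live at .claude/project-config.json.
config_dir="${TMPDIR:-/tmp}/voice-prompts-demo/.claude"
mkdir -p "$config_dir"

# Opt in to voice prompts; unspecified keys are assumed to fall back to
# .claude/project-config.defaults.json.
cat > "$config_dir/project-config.json" <<'JSON'
{
  "voice_prompts": {
    "enabled": true,
    "trigger": "questions-only",
    "max_chars": 200
  }
}
JSON
```

Setting `enabled` back to `false`, or removing the block, should restore the silent default.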
What ships
- `.claude/hooks/voice-prompt-on-pause.sh`: trigger heuristic (`questions-only` by default: last paragraph ends with `?` or matches recognised "Approved?" / "Reply with X" / "(a)/(b)/(c)" / "which path" patterns), markdown stripping (backticks, bold/italic, link syntax, bullets, table pipes), sentence-boundary truncation to `max_chars` (default 200), fire-and-forget `say` invocation.
- `.claude/hooks/tests/test_voice_prompt_on_pause.sh`: tests for the hook (following the `tests/test_<name>.sh` convention from `test_warn_stale_review_markers.sh`). All passing locally.
- `.claude/project-config.defaults.json`: `voice_prompts` block (enabled, voice, max_chars, rate_wpm, trigger). All default-OFF / safe values.
- `.claude/settings.json`: `Stop` hook entry wired with the standard ops-root resolver wrapper that the rest of the hooks use.
- `docs/agdr/AgDR-0009-voice-prompts-on-pause.md`: decision record: `say` wins for the initial phase; future-phase backlog.
- `docs/project-config.md`
Six files, +665 / -1 LOC.
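The text pipeline listed for the hook script (markdown stripping, sentence-boundary truncation, fire-and-forget `say`) could look roughly like the sketch below. The function names, the regexes, and the fallback voice name `Daniel` are assumptions made for illustration; the actual hook may differ in all of them.

```bash
# Hypothetical helpers; names and regexes are illustrative, not the hook's own.

# Strip common markdown so no literal symbols are read aloud:
# leading bullets, [text](url) links, then backticks, asterisks, table pipes.
strip_markdown() {
  printf '%s\n' "$1" | sed -E \
    -e 's/^[[:space:]]*[-*+][[:space:]]+//' \
    -e 's/\[([^]]*)\]\([^)]*\)/\1/g' \
    -e 's/[`*|]//g'
}

# Keep whole sentences (ending in . ! or ?) until the next one would push
# the result past max chars. $1 = text, $2 = max chars (default 200).
truncate_sentences() {
  printf '%s\n' "$1" | tr '\n' ' ' | awk -v max="${2:-200}" '
    {
      out = ""; rest = $0
      while (rest != "") {
        if (match(rest, /^[^.!?]*[.!?]+[ \t]*/)) n = RLENGTH
        else n = length(rest)            # trailing text with no terminator
        if (length(out) + n > max) break
        out = out substr(rest, 1, n)
        rest = substr(rest, n + 1)
      }
      if (out == "") out = substr($0, 1, max)  # one over-long sentence: hard cut
      sub(/[ \t]+$/, "", out)
      print out
    }'
}

# Speak the text. Silent no-op when `say` is missing (Linux/Windows for now).
# $1 = text, $2 = voice name (assumed default: Daniel).
speak() {
  command -v say >/dev/null 2>&1 || return 0
  if [ "${VOICE_PROMPTS_SYNC:-0}" = "1" ]; then
    say -v "${2:-Daniel}" "$1"                    # synchronous, for tests
  else
    say -v "${2:-Daniel}" "$1" >/dev/null 2>&1 &  # fire-and-forget
  fi
}
```

A caller would chain them, for example `speak "$(truncate_sentences "$(strip_markdown "$msg")" 200)"`. Truncating after stripping keeps the character budget for words that are actually spoken.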
Why a Stop hook, not something more elaborate
The ApexYard interaction pattern in long sessions has lots of discrete pause points where the assistant cannot proceed without explicit input — per-PR merge approvals, design-review (a/b/c) choices, "which path do you want", tool-result confirmations. These pauses are silent text in the terminal. If the user has stepped away from the keyboard, the conversation stalls with zero attentional signal.
A Stop hook fires exactly at those pause points. The trigger heuristic only speaks when the message looks like a request for input — so tool-result reports, summary messages, and progress updates stay silent.
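To make the trigger heuristic concrete, here is a minimal sketch of a questions-only check. The function name and the pattern list are assumptions (an illustrative subset of the patterns named in this PR), not the hook's actual regex.

```bash
# Hypothetical questions-only check: succeeds (exit 0) when the message
# looks like a request for input, fails otherwise.
looks_like_question() {
  # Last paragraph = the final blank-line-separated block of the message.
  lq_last="$(printf '%s\n' "$1" | awk 'BEGIN { RS = "" } { p = $0 } END { print p }')"
  # Drop trailing whitespace and markdown emphasis before testing the ending.
  lq_last="$(printf '%s\n' "$lq_last" | sed -e 's/[[:space:]*_`]*$//')"
  case "$lq_last" in
    *\?) return 0 ;;   # ends with a question mark
  esac
  # Otherwise fall back to a small set of known prompt phrases.
  printf '%s\n' "$lq_last" | grep -Eiq 'approved\?|reply with|\(a\)|which path|proceed\?'
}
```

With this shape, a progress report such as "All 9 tests pass." stays silent, while a message whose last paragraph is "Approved?" or offers an "(a) ... (b) ..." menu would be spoken. Erring toward silence matches the conservative stance described above.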
Real example from the session that produced this PR: the assistant said "Reply `approve 354` (or `merge 354`) to ship — then I'll roll on to #243". The hook trigger fired (the last paragraph contains `Reply with` and ends with an em-dash, but the heuristic also matches the apostrophe-paragraph). With `enabled: true`, the user would have heard "Reply approve 354 or merge 354 to ship — then I'll roll on to two four three" in Daniel's voice.
Why default OFF
Upstream-friendly default. Adopters who pull this commit see no behaviour change until they opt in. The hook is a sub-millisecond fast-path no-op when disabled (a single `config_get_or` call before `exit 0`).
Why macOS-only initial phase
The user's framing ("Jarvis from Iron Man") implies a high-quality British male voice. macOS bundles `Daniel (Premium)`, which is the closest free voice. ElevenLabs / OpenAI TTS would produce closer fidelity, but they carry a recurring per-character cost, require API-key management, and send assistant text off the local machine. That is an AgDR-worthy decision in its own right (Phase 3, separately).
Linux/Windows fall through silently when `say` isn't on PATH. Phase 2 adds OS detection and `espeak` / `Add-Type SpeechSynthesizer` paths: same trigger model, same config schema, just a platform-layer addition.
Testing
- `bash .claude/hooks/tests/test_voice_prompt_on_pause.sh`: 9/9 pass, including:
  - `Approved?` pattern → say invoked
  - `(a)/(b)/(c)` menu → say invoked
  - `say` not on PATH → exit 0, no crash
- `bash -n` syntax-check on both the hook and the test file
- `jq .` validates `project-config.defaults.json` and `settings.json`
- Enabled via a `.claude/project-config.json` override and confirmed "heard it" on the in-session smoke test
Risks
- `say` isn't on PATH: cross-platform users see exactly the disabled-state behaviour. No errors, no spam.
- False positives: a message ending in `?` that does not need a reply would get read aloud. Conservative regex prefers false-negatives. Adopters can disable the hook entirely or switch to `trigger: "always"` for debugging.
- `say` will speak through the active output device. Acceptable side-effect for v1; an env-var override could disable per-session in Phase 2.
Glossary
- Stop hook: receives `{ session_id, transcript_path, ... }` JSON on stdin. Used here to detect pause-for-input moments and speak the question aloud.
- Trigger heuristic (`questions-only`): fires when the last paragraph ends with `?` (after stripping trailing whitespace + markdown emphasis), OR matches one of `Approved?` / `Reply with` / `Confirm` / `(a)/(b)/(c)` / `which path` / `proceed?`. Conservative; prefers false-negatives.
- `Daniel (Premium)`: macOS voice; installed voices can be listed with `say -v "?"`.
- `VOICE_PROMPTS_SYNC=1`: runs `say` synchronously instead of fire-and-forget. Tests need this because the orphaned-bg-process reparenting interacts badly with the test runner's subshell wrapper. Production invocations always run async.
- Sentence-boundary truncation: keeps whole sentences (split on `. ! ?`) until the next sentence would push past `max_chars`. Avoids cutting mid-word; reads cleanly.
Refs #134