fix(stt): lazy-install faster-whisper on first voice message (#29782) #29815
ethumanity wants to merge 1 commit into
…earch#29782)

The Docker image installs Python deps via `uv` and intentionally doesn't ship `pip` inside the venv. When a user sent a voice message without faster-whisper pre-installed, the gateway emitted advice telling them to run `pip install faster-whisper`, which fails with `bash: pip: command not found` exactly as the issue reports.

The lazy_deps machinery for `stt.faster_whisper` already existed (uv -> pip -> ensurepip ladder) but `tools/transcription_tools.py` never called `lazy_deps.ensure("stt.faster_whisper")`. The presence check was a frozen-at-import-time `_HAS_FASTER_WHISPER = find_spec(...)`, so STT silently degraded to "none" in every environment that didn't ship the [voice] extra.

Fix:

1. tools/transcription_tools.py: introduce `_ensure_faster_whisper()` that mirrors `tools/tts_tool.py::_import_elevenlabs`. It short-circuits when the flag is already True, calls `lazy_deps.ensure` when missing, flips the module flag on success, and logs a clear warning on failure (mentions `uv pip install`, not bare `pip`). The helper skips the install attempt under `PYTEST_CURRENT_TEST` so existing tests that patch `_HAS_FASTER_WHISPER = False` to exercise the unavailable path don't trigger real network installs. This established pattern is also used in hermes_cli/auth.py.
2. tools/transcription_tools.py: wire `_ensure_faster_whisper()` into `_get_provider()`:
   - explicit provider=local: always attempt (user asked for it)
   - explicit provider=local_command with command missing: attempt (fallback to local is desired)
   - auto-detect last-resort: attempt only when no cloud provider (openai/groq/xai) credential is available, so we don't surprise users who explicitly configured cloud STT.
3. gateway/run.py: rewrite the "no STT provider" message so it's actionable. It now (a) explains that auto-install should have worked and lists likely causes (lazy installs disabled, no network, build failure), (b) gives the correct manual command (`uv pip install`, not `pip install`), and (c) points to cloud STT as an alternative.
4. tests/tools/test_transcription_lazy_install.py: 13 new regression tests covering the helper itself + every _get_provider integration point that should/should-not invoke it.

All 131 existing transcription tests still pass; 151 STT-adjacent tests (lazy_deps, voice_cli_integration, gateway STT config, hermes_cli doctor) still green.

Fixes NousResearch#29782
Closing — lazy-install for faster-whisper has already landed on main via a different shape than this PR proposes.
Behavior is equivalent to your proposal: first voice message triggers the install if […]

The improved error message you added in […]

Thanks for chasing this down, appreciate the work.
Summary
Fixes #29782 ([Bug]: I can not install voice for Telegram in Docker).

The Docker image installs Python deps via `uv` and intentionally doesn't ship `pip` inside the venv. When a user sent a voice message without `faster-whisper` pre-installed, the gateway emitted advice telling them to run `pip install faster-whisper`, which fails with `bash: pip: command not found` exactly as the issue reports.

Root Cause
`tools/lazy_deps.py` already has a registered `stt.faster_whisper` entry with a uv → pip → ensurepip install ladder that would handle this automatically. But `tools/transcription_tools.py` never called `lazy_deps.ensure("stt.faster_whisper")`. The presence check was a frozen-at-import-time `_HAS_FASTER_WHISPER = find_spec("faster_whisper")` (line 73), so STT silently degraded to provider `"none"` in any environment that didn't pre-install the `[voice]` extra (Docker, fresh source clones, anyone who installed `[all]` only).

Compare `tools/tts_tool.py::_import_elevenlabs()` (lines 93–112), which lazy-installs via `ensure("tts.elevenlabs", prompt=False)` on first use. That's the pattern STT now mirrors.

What changed
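The change boils down to moving from a presence check that runs once at import to one that can be re-evaluated on first use. The sketch below shows why the frozen form never notices a later install. It uses a hypothetical module name and a temp-directory "install" in place of the real package, so nothing here is taken from the repository.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical module name standing in for faster_whisper.
MODULE = "demo_optional_dep_sketch"

# Frozen check: evaluated once when this file is imported and never again.
FROZEN_HAS_DEP = importlib.util.find_spec(MODULE) is not None


def has_dep_now() -> bool:
    """Re-run the lookup on each call so a later install becomes visible."""
    importlib.invalidate_caches()  # pick up files created after startup
    return importlib.util.find_spec(MODULE) is not None


def simulate_install() -> str:
    """Stand-in for a real install: drop a module file onto sys.path."""
    site_dir = Path(tempfile.mkdtemp())
    (site_dir / f"{MODULE}.py").write_text("VERSION = '0.0'\n")
    sys.path.insert(0, str(site_dir))
    return str(site_dir)
```

After `simulate_install()`, `has_dep_now()` reports the module while `FROZEN_HAS_DEP` keeps its import-time value, which is the situation a helper has to correct by flipping the flag after a successful install.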
tools/transcription_tools.py

- Adds a `_ensure_faster_whisper(*, prompt: bool = False) -> bool` helper:
  - Short-circuits when `_HAS_FASTER_WHISPER` is already `True`.
  - Calls `lazy_deps.ensure("stt.faster_whisper")` when missing.
  - Flips `_HAS_FASTER_WHISPER` to `True` on a successful install so later checks pass.
  - On failure, logs a warning that recommends `uv pip install faster-whisper sounddevice numpy` (not bare `pip`).
  - Skips the install attempt under `PYTEST_CURRENT_TEST` so existing tests that patch the flag to `False` to exercise the unavailable path don't trigger real network installs. Same pattern as `hermes_cli/auth.py`.
- Wires `_ensure_faster_whisper()` into `_get_provider()`:
  - `provider: local` → always attempt (user asked for it).
  - `provider: local_command` with command missing → attempt (fallback to local is desired).
  - Auto-detect last resort → attempt only when no cloud provider (openai/groq/xai) credential is available.

gateway/run.py

The user-facing "no STT provider" message (the one in the issue) is rewritten to be accurate even when lazy-install has been disabled or failed:

- Explains that auto-install should have worked and lists likely causes (`security.allow_lazy_installs: false`, no network, build failure).
- Gives the correct manual command, `uv pip install faster-whisper sounddevice numpy` (the published Docker image doesn't ship `pip`; this is now explicit).
- Points to cloud STT (`stt.provider: openai/groq/xai`) as an alternative.

tests/tools/test_transcription_lazy_install.py (new)

13 regression tests covering:

- `_ensure_faster_whisper` short-circuits when present, skips under pytest, calls `lazy_deps.ensure` when missing, flips the flag on success, swallows install failures without crashing, and copes with `lazy_deps` itself being unimportable.
- `_get_provider` invokes the helper for `provider=local` and `provider=local_command`, does NOT invoke it for `provider=openai`, does NOT invoke it in auto-detect when a cloud key is present, and DOES invoke it as the auto-detect last-resort.

Test plan
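A test in the spirit of the `_get_provider` cases listed above might look like the sketch below. The `pick_provider` function, its parameters, and the environment variable names are assumptions made for illustration. The real `_get_provider()` is not shown in this description and may order its checks differently.

```python
from typing import Callable, Mapping, Optional
from unittest import mock

# Assumed credential variable names; the real configuration keys may differ.
CLOUD_KEYS = {
    "openai": "OPENAI_API_KEY",
    "groq": "GROQ_API_KEY",
    "xai": "XAI_API_KEY",
}


def pick_provider(
    configured: Optional[str],
    env: Mapping[str, str],
    ensure_local: Callable[[], bool],
    local_command_ok: bool = False,
) -> str:
    """Hypothetical stand-in for _get_provider(); returns a provider name."""
    if configured == "local":
        return "local" if ensure_local() else "none"
    if configured == "local_command":
        if local_command_ok:
            return "local_command"
        return "local" if ensure_local() else "none"
    if configured in CLOUD_KEYS:
        # Explicit cloud choice: never touch the local installer.
        return configured if env.get(CLOUD_KEYS[configured]) else "none"
    # Auto-detect: use a cloud key if one exists, install local as last resort.
    for name, var in CLOUD_KEYS.items():
        if env.get(var):
            return name
    return "local" if ensure_local() else "none"


def test_cloud_key_skips_install() -> None:
    ensure = mock.Mock(return_value=True)
    assert pick_provider(None, {"GROQ_API_KEY": "k"}, ensure) == "groq"
    ensure.assert_not_called()


def test_auto_detect_last_resort_installs() -> None:
    ensure = mock.Mock(return_value=True)
    assert pick_provider(None, {}, ensure) == "local"
    ensure.assert_called_once_with()
```

Passing the installer in as a callable lets each test assert on whether it was invoked, which is the should/should-not distinction the regression tests are described as covering.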
All 124 pre-existing transcription tests still pass, so the change is backward-compatible (the module-level `_HAS_FASTER_WHISPER` flag is preserved as a patch target).

Blast radius

- The install attempt only runs when `_HAS_FASTER_WHISPER == False`, which only fires in environments missing the `[voice]` extra. Existing installs with faster-whisper pre-installed are unaffected.
- The install attempt is gated by `security.allow_lazy_installs` (default `true`).

Closes #29782