
fix(stt): lazy-install faster-whisper on first voice message (#29782) #29815

Closed
ethumanity wants to merge 1 commit into NousResearch:main from ethumanity:fix/stt-lazy-install-docker


@ethumanity


Summary

Fixes #29782 ([Bug]: I can not install voice for Telegram in Docker).

The Docker image installs Python deps via uv and intentionally doesn't ship pip inside the venv. When a user sent a voice message without faster-whisper pre-installed, the gateway emitted advice telling them to run pip install faster-whisper — which fails with bash: pip: command not found exactly as the issue reports.

Root Cause

tools/lazy_deps.py already has a registered stt.faster_whisper entry with a uv → pip → ensurepip install ladder that would handle this automatically. But tools/transcription_tools.py never called lazy_deps.ensure("stt.faster_whisper") — the presence check was a frozen-at-import-time _HAS_FASTER_WHISPER = find_spec("faster_whisper") (line 73), so STT silently degraded to provider "none" in any environment that didn't pre-install the [voice] extra (Docker, fresh source clones, anyone who installed [all] only).
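The stale-flag problem described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `demo_lazy_dep_pkg` is a hypothetical module name standing in for `faster_whisper`, and `simulate_install` fakes an install by dropping an empty module onto `sys.path`.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical module name standing in for faster_whisper.
MODULE = "demo_lazy_dep_pkg"

# Evaluated exactly once at import time, like the module-level flag in the PR.
FROZEN_FLAG = importlib.util.find_spec(MODULE) is not None


def is_available_now(name: str) -> bool:
    """Re-probe the import system instead of trusting the frozen flag."""
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


def simulate_install(name: str) -> str:
    """Fake an install by writing an empty module into a new sys.path entry."""
    target = tempfile.mkdtemp()
    Path(target, f"{name}.py").write_text("VALUE = 1\n")
    sys.path.insert(0, target)
    return target


before = is_available_now(MODULE)
simulate_install(MODULE)
after = is_available_now(MODULE)

# The frozen flag never notices the install; the live probe does.
print(FROZEN_FLAG, before, after)  # prints: False False True
```

This is why a successful lazy install has to either re-probe or explicitly flip the flag: nothing else will update a value that was computed when the module was first imported.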

Compare tools/tts_tool.py::_import_elevenlabs() (lines 93–112), which lazy-installs via ensure("tts.elevenlabs", prompt=False) on first use. That's the pattern STT now mirrors.
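The install ladder itself is described here only as uv → pip → ensurepip. A rough sketch of how such a ladder could be assembled is shown below; `install_ladder` and its parameters are hypothetical names for illustration, and the real `tools/lazy_deps.py` may order or build its commands differently. The sketch only constructs the command lines and does not run them.

```python
import shutil
import sys
from importlib.util import find_spec
from typing import List, Optional, Sequence


def install_ladder(
    packages: Sequence[str],
    *,
    uv_path: Optional[str],
    pip_available: bool,
    python: str = sys.executable,
) -> List[List[str]]:
    """Return the commands to try in order until one of them succeeds."""
    rungs: List[List[str]] = []
    if uv_path:
        # uv can install into a venv that has no pip of its own.
        rungs.append([uv_path, "pip", "install", "--python", python, *packages])
    if not pip_available:
        # Bootstrap pip from the standard library before falling back to it.
        rungs.append([python, "-m", "ensurepip", "--upgrade"])
    rungs.append([python, "-m", "pip", "install", *packages])
    return rungs


PACKAGES = ["faster-whisper", "sounddevice", "numpy"]

# A Docker-like environment: uv present, pip absent from the venv.
docker_like = install_ladder(
    PACKAGES,
    uv_path="/usr/local/bin/uv",
    pip_available=False,
    python="/opt/venv/bin/python",
)

# A plain environment: no uv, pip already importable.
pip_only = install_ladder(
    PACKAGES, uv_path=None, pip_available=True, python="/opt/venv/bin/python"
)

# Probing the current interpreter instead of passing fixed values.
live = install_ladder(
    PACKAGES,
    uv_path=shutil.which("uv"),
    pip_available=find_spec("pip") is not None,
)

for command in docker_like:
    print(" ".join(command))
```

In the Docker-like case the first rung is the `uv pip install` command, which is the one that works without `pip` in the venv.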

What changed

tools/transcription_tools.py

  • New _ensure_faster_whisper(*, prompt: bool = False) -> bool helper:
    • Short-circuits when _HAS_FASTER_WHISPER is already True.
    • Calls lazy_deps.ensure("stt.faster_whisper") when missing.
    • Flips _HAS_FASTER_WHISPER to True on a successful install so later checks pass.
    • Logs a clear warning on failure mentioning uv pip install faster-whisper sounddevice numpy (not bare pip).
    • Skips the install under PYTEST_CURRENT_TEST so existing tests that patch the flag to False to exercise the unavailable path don't trigger real network installs. Same pattern as hermes_cli/auth.py.
  • Wired _ensure_faster_whisper() into _get_provider():
    • Explicit provider: local → always attempt (user asked for it).
    • Explicit provider: local_command with command missing → attempt (fallback to local is desired).
    • Auto-detect last-resort → attempt only when no cloud provider (openai/groq/xai) credential is available, so we don't surprise users who configured cloud STT.
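The helper's behaviour, as listed above, can be sketched roughly as follows. This is an illustration rather than the code in the diff: the `ensure` and `env` parameters are injected here only to keep the example self-contained (the real helper takes just `prompt` and calls `lazy_deps.ensure` directly), and it assumes `ensure` raises on failure.

```python
import logging
import os
from typing import Callable, Mapping

logger = logging.getLogger("stt_sketch")

# Stand-in for the module-level flag computed at import time.
_HAS_FASTER_WHISPER = False


def _ensure_faster_whisper(
    ensure: Callable[..., object],
    *,
    prompt: bool = False,
    env: Mapping[str, str] = os.environ,
) -> bool:
    """Return True when faster-whisper is usable, installing it if needed."""
    global _HAS_FASTER_WHISPER
    if _HAS_FASTER_WHISPER:
        return True  # already present: no install attempt
    if "PYTEST_CURRENT_TEST" in env:
        return False  # never hit the network from inside a test run
    try:
        # Assumption: ensure() raises when the install cannot be completed.
        ensure("stt.faster_whisper", prompt=prompt)
    except Exception as exc:
        logger.warning(
            "faster-whisper auto-install failed (%s); install manually with: "
            "uv pip install faster-whisper sounddevice numpy",
            exc,
        )
        return False
    _HAS_FASTER_WHISPER = True  # flip the flag so later checks pass
    return True


installed = []


def fake_ensure(feature: str, *, prompt: bool = False) -> None:
    """Test double that records the requested feature instead of installing."""
    installed.append(feature)


def failing_ensure(feature: str, *, prompt: bool = False) -> None:
    """Test double that simulates a failed install."""
    raise RuntimeError("no network")


skipped = _ensure_faster_whisper(fake_ensure, env={"PYTEST_CURRENT_TEST": "t"})
failed = _ensure_faster_whisper(failing_ensure, env={})
succeeded = _ensure_faster_whisper(fake_ensure, env={})
cached = _ensure_faster_whisper(failing_ensure, env={})  # short-circuits

print(skipped, failed, succeeded, cached, installed)
# prints: False False True True ['stt.faster_whisper']
```

The last call shows why flipping the flag matters: once an install has succeeded, the helper returns immediately and never calls `ensure` again.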

gateway/run.py

The user-facing "no STT provider" message (the one in the issue) is rewritten to be accurate even when lazy-install has been disabled or failed:

  • Explains that auto-install should have worked and enumerates likely causes (security.allow_lazy_installs: false, no network, build failure).
  • Gives the correct manual command: uv pip install faster-whisper sounddevice numpy (the published Docker image doesn't ship pip, this is now explicit).
  • Points to cloud STT (stt.provider: openai / groq / xai) as an alternative.

tests/tools/test_transcription_lazy_install.py (new)

13 regression tests covering:

  • _ensure_faster_whisper short-circuits when present, skips under pytest, calls lazy_deps.ensure when missing, flips the flag on success, swallows install failures without crashing, and copes with lazy_deps itself being unimportable.
  • _get_provider invokes the helper for provider=local and provider=local_command, does NOT invoke it for provider=openai, does NOT invoke it in auto-detect when a cloud key is present, and DOES invoke it as the auto-detect last-resort.
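The `_get_provider` cases these tests pin down amount to a small decision table. The sketch below models only when the install helper is invoked; `resolve_provider`, `Recorder`, and the parameter names are hypothetical, and the treatment of an explicit cloud provider without a credential and the ordering inside auto-detect are assumptions, not taken from the diff.

```python
from typing import AbstractSet, Callable, Optional

CLOUD_PROVIDERS = ("openai", "groq", "xai")


def resolve_provider(
    configured: Optional[str],
    *,
    cloud_keys: AbstractSet[str],
    local_command_available: bool,
    ensure_local: Callable[[], bool],
) -> str:
    """Pick an STT provider, calling ensure_local() only where the PR says to."""
    if configured == "local":
        # The user asked for local STT, so always attempt the install.
        return "local" if ensure_local() else "none"
    if configured == "local_command":
        if local_command_available:
            return "local_command"
        # Command missing: falling back to local is the desired behaviour.
        return "local" if ensure_local() else "none"
    if configured in CLOUD_PROVIDERS:
        # Assumption: an explicit cloud provider never triggers an install.
        return configured if configured in cloud_keys else "none"
    # Auto-detect: prefer a configured cloud credential, install as last resort.
    for name in CLOUD_PROVIDERS:
        if name in cloud_keys:
            return name
    return "local" if ensure_local() else "none"


class Recorder:
    """Callable stand-in for the install helper that counts its invocations."""

    def __init__(self, result: bool) -> None:
        self.result = result
        self.calls = 0

    def __call__(self) -> bool:
        self.calls += 1
        return self.result


explicit_local = Recorder(True)
cloud_explicit = Recorder(True)
auto_with_key = Recorder(True)
auto_last_resort = Recorder(True)

r_local = resolve_provider("local", cloud_keys=set(),
                           local_command_available=False, ensure_local=explicit_local)
r_openai = resolve_provider("openai", cloud_keys={"openai"},
                            local_command_available=False, ensure_local=cloud_explicit)
r_auto_key = resolve_provider(None, cloud_keys={"groq"},
                              local_command_available=False, ensure_local=auto_with_key)
r_auto_none = resolve_provider(None, cloud_keys=set(),
                               local_command_available=False, ensure_local=auto_last_resort)

print(r_local, r_openai, r_auto_key, r_auto_none)  # prints: local openai groq local
```

Passing the helper in as a callable is what makes the "does NOT invoke" cases easy to assert: the recorder's call count stays at zero whenever a cloud credential settles the choice.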

Test plan

$ .venv/bin/python -m pytest tests/tools/test_transcription_lazy_install.py
======================== 13 passed in 0.15s ========================

$ .venv/bin/python -m pytest tests/tools/test_transcription_tools.py \
    tests/tools/test_transcription.py \
    tests/tools/test_transcription_dotenv_fallback.py
=================== 124 passed, 7 skipped in 1.21s ===================

$ .venv/bin/python -m pytest tests/tools/test_lazy_deps.py \
    tests/tools/test_voice_cli_integration.py \
    tests/gateway/test_stt_config.py \
    tests/hermes_cli/test_doctor.py
====================== 212 passed in ~6s =======================

All 124 pre-existing transcription tests still pass — backward-compatible (the module-level _HAS_FASTER_WHISPER flag is preserved as a patch target).

Blast radius

  • Behaviour change is gated on _HAS_FASTER_WHISPER == False, which only fires in environments missing the [voice] extra. Existing installs with faster-whisper pre-installed are unaffected.
  • Install attempts respect security.allow_lazy_installs (default true).
  • No new package pins, no Dockerfile change, no PyPI dependency added.
  • Manual remediation path is documented in the user-facing error message.

Closes #29782

…earch#29782)

The Docker image installs Python deps via `uv` and intentionally
doesn't ship `pip` inside the venv. When a user sent a voice message
without faster-whisper pre-installed, the gateway emitted advice telling
them to run `pip install faster-whisper` — which fails with
`bash: pip: command not found` exactly as the issue reports.

The lazy_deps machinery for `stt.faster_whisper` already existed
(uv -> pip -> ensurepip ladder) but `tools/transcription_tools.py`
never called `lazy_deps.ensure("stt.faster_whisper")` — the presence
check was a frozen-at-import-time `_HAS_FASTER_WHISPER = find_spec(...)`,
so STT silently degraded to "none" in every environment that didn't
ship the [voice] extra.

Fix:

1. tools/transcription_tools.py: introduce `_ensure_faster_whisper()`
   that mirrors `tools/tts_tool.py::_import_elevenlabs` — it short-
   circuits when the flag is already True, calls `lazy_deps.ensure`
   when missing, flips the module flag on success, and logs a clear
   warning on failure (mentions `uv pip install`, not bare `pip`).

   The helper skips the install attempt under `PYTEST_CURRENT_TEST`
   so existing tests that patch `_HAS_FASTER_WHISPER = False` to
   exercise the unavailable path don't trigger real network installs.
   This established pattern is also used in hermes_cli/auth.py.

2. tools/transcription_tools.py: wire `_ensure_faster_whisper()` into
   `_get_provider()`:
     - explicit provider=local: always attempt (user asked for it)
     - explicit provider=local_command with command missing: attempt
       (fallback to local is desired)
     - auto-detect last-resort: attempt only when no cloud provider
       (openai/groq/xai) credential is available, so we don't surprise
       users who explicitly configured cloud STT.

3. gateway/run.py: rewrite the "no STT provider" message so it's
   actionable. It now (a) explains that auto-install should have worked
   and lists likely causes (lazy installs disabled, no network, build
   failure), (b) gives the correct manual command (`uv pip install`,
   not `pip install`), and (c) points to cloud STT as an alternative.

4. tests/tools/test_transcription_lazy_install.py: 13 new regression
   tests covering the helper itself + every _get_provider integration
   point that should/should-not invoke it.

The 131 existing transcription tests still run clean (124 passed, 7 skipped); 151 STT-adjacent tests
(lazy_deps, voice_cli_integration, gateway STT config, hermes_cli doctor)
still green.

Fixes NousResearch#29782
@alt-glitch added the type/bug (Something isn't working), P2 (Medium: degraded but workaround exists), tool/tts (Text-to-speech and transcription), comp/gateway (Gateway runner, session dispatch, delivery), and area/docker (Docker image, Compose, packaging) labels on May 21, 2026
@benbarclay

Collaborator

Closing — lazy-install for faster-whisper has already landed on main via a different shape than this PR proposes.

tools/transcription_tools.py::_try_lazy_install_stt() (around line 200) wraps tools.lazy_deps.ensure("stt.faster_whisper") and is wired into the three call sites this PR targets:

  • _get_provider (lines 755 and 825)
  • _transcribe_local (line 1096)

Behavior is equivalent to your proposal: first voice message triggers the install if faster-whisper isn't importable, then proceeds. Your implementation has some nice touches (PYTEST_CURRENT_TEST short-circuit, module-flag flip after success) that didn't make it into the upstream version — happy to review a focused follow-up PR if you want to port those.

The improved error message you added in gateway/run.py is also valuable but is a separate concern and partially overlaps with #29800 (which adjusts the install-hint strings). If you'd like to keep that piece, a small targeted PR on gateway/run.py would be straightforward to land.

Thanks for chasing this down — appreciate the work.

@benbarclay closed this on May 28, 2026