Bug: Selecting Qwen3-TTS 0.6B still downloads and uses the 1.7B model #96

@CelebrityPunks

Description

When selecting Qwen3-TTS 0.6B from the model dropdown and generating speech, the app ignores the selection and downloads/loads the 1.7B model instead. There is also no way to cancel the unwanted download from the UI.

Steps to reproduce

  1. Install Voicebox v0.1.12
  2. Download only the Qwen TTS 0.6B model from Model Management
  3. Select Qwen3-TTS 0.6B in the generation dropdown
  4. Click generate
  5. The app starts downloading Qwen TTS 1.7B (3.6 GB) instead of using the already-downloaded 0.6B

Root cause

In backend/main.py, the /generate endpoint calls create_voice_prompt_for_profile() (line 546) before calling load_model_async(model_size) (line 585).

create_voice_prompt() in pytorch_backend.py (line 235) calls load_model_async(None). A None argument falls back to self.model_size, which the constructor sets to "1.7B" by default, so the 1.7B model is loaded (and downloaded if missing) before the user's selection is applied.
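The ordering problem can be reproduced in isolation. The following is a minimal sketch, not the actual Voicebox code: `SketchBackend`, `generate_buggy`, and `load_log` are hypothetical names, and the sketch assumes that load_model_async() resolves None to the backend's current model_size.

```python
import asyncio


class SketchBackend:
    """Hypothetical stand-in for the TTS backend (not the real Voicebox class)."""

    def __init__(self, model_size="1.7B"):
        # Constructor default, as described in the issue.
        self.model_size = model_size
        # Records every size that would be loaded or downloaded.
        self.load_log = []

    async def load_model_async(self, model_size=None):
        # Assumption: None falls back to the backend's current model_size.
        size = model_size or self.model_size
        self.model_size = size
        self.load_log.append(size)

    async def create_voice_prompt(self):
        # Loads with None, so the default size wins if nothing was loaded yet.
        await self.load_model_async(None)
        return "voice-prompt"


async def generate_buggy(backend, model_size):
    # Order reported in the issue: voice prompt first, requested model second.
    prompt = await backend.create_voice_prompt()
    await backend.load_model_async(model_size)
    return prompt


backend = SketchBackend()
asyncio.run(generate_buggy(backend, "0.6B"))
print(backend.load_log)
```

In this sketch the first entry in `load_log` is the constructor default rather than the requested size, which mirrors the unwanted 1.7B download.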

Fix

PR #95 reorders the operations so the user's requested model is loaded before the voice prompt is created.
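The effect of the reordering can be sketched as follows. This is an illustration under stated assumptions, not the contents of the PR: `SketchBackend`, `generate_fixed`, `loaded`, and `load_log` are hypothetical, and the sketch assumes that loading a size updates the backend's model_size so a later load_model_async(None) resolves to the size already loaded.

```python
import asyncio


class SketchBackend:
    """Hypothetical stand-in for the TTS backend (not the real Voicebox class)."""

    def __init__(self, model_size="1.7B"):
        self.model_size = model_size
        self.loaded = None   # size currently in memory, if any
        self.load_log = []   # sizes that would actually be loaded or downloaded

    async def load_model_async(self, model_size=None):
        # Assumption: None falls back to the backend's current model_size.
        size = model_size or self.model_size
        if size == self.loaded:
            return  # already loaded, nothing to fetch
        self.model_size = size
        self.loaded = size
        self.load_log.append(size)

    async def create_voice_prompt(self):
        await self.load_model_async(None)
        return "voice-prompt"


async def generate_fixed(backend, model_size):
    # Load the requested model first so the None fallback resolves to it.
    await backend.load_model_async(model_size)
    return await backend.create_voice_prompt()


backend = SketchBackend()
asyncio.run(generate_fixed(backend, "0.6B"))
print(backend.load_log)
```

A sturdier alternative would be to pass the requested size into the voice-prompt call explicitly instead of relying on call order, but that would change function signatures and is outside what the issue describes.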

Environment

  • Voicebox v0.1.12
  • Windows 11
  • NVIDIA GeForce GTX 1660 SUPER
