Commit 3187344

add model loading status, effects preset dropdown, clean up UI
Backend:
- Generation service reports 'loading_model' status only when the model is not yet in memory, then 'generating' once inference starts
- Migrate hf_offline_patch.py from print() to the logging module
- Update ADDING_TTS_ENGINES.md for post-refactor file paths

Frontend:
- HistoryTable shows 'Loading model...' vs 'Generating...' based on the current step
- FloatingGenerateBox: replace the instruct toggle + inline effects editor with an effects preset dropdown (third dropdown after language and engine)
- Instruct UI removed for now (form field preserved for future models)
- Remove focus ring from Select component globally
1 parent 8efcc95 commit 3187344

File tree

8 files changed: +157 −240 lines

app/src/components/Generation/FloatingGenerateBox.tsx (90 additions, 165 deletions)

Large diffs are not rendered by default.

app/src/components/History/HistoryTable.tsx (6 additions, 3 deletions)

```diff
@@ -394,7 +394,8 @@ export function HistoryTable() {
         >
           {history.map((gen) => {
             const isCurrentlyPlaying = currentAudioId === gen.id && isPlaying;
-            const isGenerating = gen.status === 'generating';
+            const isInProgress = gen.status === 'loading_model' || gen.status === 'generating';
+            const isGenerating = isInProgress;
             const isFailed = gen.status === 'failed';
             const isPlayable = !isGenerating && !isFailed;
             const hasVersions = gen.versions && gen.versions.length > 1;
@@ -472,8 +473,10 @@ export function HistoryTable() {
               ) : null}
             </div>
             <div className="text-xs text-muted-foreground">
-              {isGenerating ? (
-                <span className="text-accent">Generating...</span>
+              {isInProgress ? (
+                <span className="text-accent">
+                  {gen.status === 'loading_model' ? 'Loading model...' : 'Generating...'}
+                </span>
               ) : (
                 formatDate(gen.created_at)
               )}
```

app/src/components/ui/select.tsx (1 addition, 1 deletion)

```diff
@@ -16,7 +16,7 @@ const SelectTrigger = React.forwardRef<
   <SelectPrimitive.Trigger
     ref={ref}
     className={cn(
-      'flex h-10 w-full items-center justify-between rounded-md border border-input bg-background px-3 py-2 text-sm ring-offset-background placeholder:text-muted-foreground focus:outline-none focus:ring-2 focus:ring-ring focus:ring-offset-2 disabled:cursor-not-allowed disabled:opacity-50 [&>span]:line-clamp-1',
+      'flex h-10 w-full items-center justify-between rounded-md border border-input bg-background px-3 py-2 text-sm ring-offset-background placeholder:text-muted-foreground focus:outline-none disabled:cursor-not-allowed disabled:opacity-50 [&>span]:line-clamp-1',
      className,
    )}
    {...props}
```

app/src/lib/api/types.ts (1 addition, 1 deletion)

```diff
@@ -73,7 +73,7 @@ export interface GenerationResponse {
   instruct?: string;
   engine?: string;
   model_size?: string;
-  status: 'generating' | 'completed' | 'failed';
+  status: 'loading_model' | 'generating' | 'completed' | 'failed';
   error?: string;
   is_favorited?: boolean;
   created_at: string;
```

app/src/lib/hooks/useGenerationProgress.ts (1 addition, 1 deletion)

```diff
@@ -8,7 +8,7 @@ import { useServerStore } from '@/stores/serverStore';

 interface GenerationStatusEvent {
   id: string;
-  status: 'generating' | 'completed' | 'failed' | 'not_found';
+  status: 'loading_model' | 'generating' | 'completed' | 'failed' | 'not_found';
   duration?: number;
   error?: string;
 }
```

backend/services/generation.py (6 additions, 5 deletions)

```diff
@@ -55,20 +55,21 @@ async def run_generation(
     bg_db = next(get_db())

     try:
-        # --- Load model --------------------------------------------------
-        await load_engine_model(engine, model_size)
-
         tts_model = get_tts_backend_for_engine(engine)

-        # --- Build voice prompt ------------------------------------------
+        if not tts_model.is_loaded():
+            await history.update_generation_status(generation_id, "loading_model", bg_db)
+
+        await load_engine_model(engine, model_size)
+
         voice_prompt = await profiles.create_voice_prompt_for_profile(
             profile_id,
             bg_db,
             use_cache=True,
             engine=engine,
         )

-        # --- Inference ---------------------------------------------------
+        await history.update_generation_status(generation_id, "generating", bg_db)
         trim_fn = trim_tts_output if engine_needs_trim(engine) else None

         gen_kwargs: dict = dict(
```
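The reordering above is what makes the new status accurate: 'loading_model' is reported only on a cold start, and 'generating' only once the voice prompt is ready. A minimal sketch of the resulting lifecycle, with synchronous stand-ins for the async service (`run_generation_sketch` and its parameters are illustrative, not the real function):

```python
def run_generation_sketch(model_loaded: bool) -> list[str]:
    """Return the sequence of statuses a generation passes through."""
    statuses = []
    if not model_loaded:
        # Only report 'loading_model' when the model is not yet in memory;
        # load_engine_model(...) would run here in the real service.
        statuses.append("loading_model")
    # Voice-prompt preparation happens between these two updates.
    statuses.append("generating")
    # ... inference, then the terminal status ...
    statuses.append("completed")
    return statuses

# Cold start: the UI shows 'Loading model...' first.
print(run_generation_sketch(model_loaded=False))
# Warm start: the model is already resident, so that step is skipped.
print(run_generation_sketch(model_loaded=True))
```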

backend/utils/hf_offline_patch.py (37 additions, 49 deletions)

```diff
@@ -1,100 +1,88 @@
-"""
-Monkey patch for huggingface_hub to force offline mode with cached models.
-This prevents mlx_audio from making network requests when models are already downloaded.
+"""Monkey-patch huggingface_hub to force offline mode with cached models.
+
+Prevents mlx_audio from making network requests when models are already
+downloaded. Must be imported BEFORE mlx_audio.
 """

+import logging
 import os
 from pathlib import Path
 from typing import Optional, Union

+logger = logging.getLogger(__name__)
+

 def patch_huggingface_hub_offline():
-    """
-    Monkey-patch huggingface_hub to force offline mode.
-    This must be called BEFORE importing mlx_audio.
-    """
+    """Monkey-patch huggingface_hub to force offline mode."""
     try:
-        import huggingface_hub
+        import huggingface_hub  # noqa: F401 -- need the package loaded
         from huggingface_hub import constants as hf_constants
         from huggingface_hub.file_download import _try_to_load_from_cache
-
-        # Store original function
+
         original_try_load = _try_to_load_from_cache
-
+
         def _patched_try_to_load_from_cache(
             repo_id: str,
             filename: str,
             cache_dir: Union[str, Path, None] = None,
             revision: Optional[str] = None,
             repo_type: Optional[str] = None,
         ):
-            """
-            Patched version that forces offline mode.
-            Returns None if not cached (instead of making network request).
-            """
-            # Always use the original function, but we're already in HF_HUB_OFFLINE mode
             result = original_try_load(
                 repo_id=repo_id,
                 filename=filename,
                 cache_dir=cache_dir,
                 revision=revision,
                 repo_type=repo_type,
             )
-
+
             if result is None:
-                # File not in cache - log this for debugging
                 cache_path = Path(hf_constants.HF_HUB_CACHE) / f"models--{repo_id.replace('/', '--')}"
-                print(f"[HF_PATCH] File not cached: {repo_id}/{filename}")
-                print(f"[HF_PATCH] Expected at: {cache_path}")
+                logger.debug("file not cached: %s/%s (expected at %s)", repo_id, filename, cache_path)
             else:
-                print(f"[HF_PATCH] Cache hit: {repo_id}/{filename}")
-
+                logger.debug("cache hit: %s/%s", repo_id, filename)
+
             return result
-
-        # Replace the function
+
         import huggingface_hub.file_download as fd
+
         fd._try_to_load_from_cache = _patched_try_to_load_from_cache
-
-        print("[HF_PATCH] huggingface_hub patched for offline mode")
-
+        logger.debug("huggingface_hub patched for offline mode")
+
     except ImportError:
-        print("[HF_PATCH] huggingface_hub not found, skipping patch")
-    except Exception as e:
-        print(f"[HF_PATCH] Error patching huggingface_hub: {e}")
+        logger.debug("huggingface_hub not available, skipping offline patch")
+    except Exception:
+        logger.exception("failed to patch huggingface_hub for offline mode")


 def ensure_original_qwen_config_cached():
+    """Symlink the original Qwen repo cache to the MLX community version.
+
+    mlx_audio may try to fetch config from the original Qwen repo. If only
+    the MLX community variant is cached, create a symlink so the cache lookup
+    succeeds without a network request.
     """
-    The MLX community model is based on the original Qwen model.
-    mlx_audio may try to fetch config from the original repo.
-    We need to ensure that config is available in the cache.
-    """
-    from huggingface_hub import constants as hf_constants
-
-    # Original Qwen model that mlx_audio might reference
+    try:
+        from huggingface_hub import constants as hf_constants
+    except ImportError:
+        return
+
     original_repo = "Qwen/Qwen3-TTS-12Hz-1.7B-Base"
     mlx_repo = "mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16"
-
+
     cache_dir = Path(hf_constants.HF_HUB_CACHE)
-
     original_path = cache_dir / f"models--{original_repo.replace('/', '--')}"
     mlx_path = cache_dir / f"models--{mlx_repo.replace('/', '--')}"
-
-    # If original repo cache doesn't exist but MLX does, create a symlink or copy config
+
     if not original_path.exists() and mlx_path.exists():
-        print(f"[HF_PATCH] Original repo not cached, but MLX version is")
-        print(f"[HF_PATCH] Creating symlink from {original_repo} -> {mlx_repo}")
-
         try:
-            # Create a symlink so the cache lookup succeeds
             original_path.parent.mkdir(parents=True, exist_ok=True)
             original_path.symlink_to(mlx_path, target_is_directory=True)
-            print(f"[HF_PATCH] Symlink created successfully")
-        except Exception as e:
-            print(f"[HF_PATCH] Could not create symlink: {e}")
+            logger.info("created cache symlink: %s -> %s", original_repo, mlx_repo)
+        except Exception:
+            logger.warning("could not create cache symlink for %s", original_repo, exc_info=True)


-# Auto-apply patch when module is imported
 if os.environ.get("VOICEBOX_OFFLINE_PATCH", "1") != "0":
     patch_huggingface_hub_offline()
     ensure_original_qwen_config_cached()
```
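Since the patch now logs via the standard logging module at DEBUG level, its messages are silent unless the application opts in. A sketch of surfacing them, assuming the logger name follows the module path `backend.utils.hf_offline_patch` (an assumption; `logging.getLogger(__name__)` derives the name from wherever the module actually lives):

```python
import logging

# Keep the global default quiet, but show DEBUG output for the offline
# patch only. The logger name below is assumed from the module path.
logging.basicConfig(level=logging.WARNING)
logging.getLogger("backend.utils.hf_offline_patch").setLevel(logging.DEBUG)
```

Per-logger levels like this avoid drowning the console in DEBUG chatter from unrelated libraries while still exposing the cache hit/miss trace.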

docs/plans/ADDING_TTS_ENGINES.md (15 additions, 15 deletions)

````diff
@@ -6,7 +6,9 @@ Guide for adding new TTS model backends. Based on the implementation of LuxTTS (

 ## Overview

-Adding an engine touches ~12 files across 4 layers (down from ~19 after the model config registry refactor). The backend protocol work is straightforward — the real time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
+Adding an engine touches ~10 files across 4 layers. The backend protocol work is straightforward — the real time sink is dependency hell, upstream library bugs, and PyInstaller bundling.
+
+The backend is split into layers: `routes/` (thin HTTP handlers), `services/` (business logic), `backends/` (engine implementations), and `utils/` (shared utilities). New engines only need to touch `backends/` and `models.py` on the backend side — the route and service layers use a model config registry that handles dispatch automatically.

 ---

@@ -119,26 +121,24 @@ In `backend/models.py`:

 ---

-## Phase 2: API Integration (`main.py`)
+## Phase 2: Route and Service Integration

-With the model config registry, `main.py` has **zero per-engine dispatch points**. All endpoints use registry helpers like `get_model_config()`, `load_engine_model()`, `engine_needs_trim()`, `check_model_loaded()`, etc.
+With the model config registry, the route and service layers have **zero per-engine dispatch points**. All endpoints use registry helpers like `get_model_config()`, `load_engine_model()`, `engine_needs_trim()`, `check_model_loaded()`, etc.

-**You don't need to touch `main.py` at all** unless your engine needs custom behavior in the generate endpoint (e.g. a new post-processing step beyond `trim_tts_output`).
+**You don't need to touch any route or service files** unless your engine needs custom behavior in the generate pipeline (e.g. a new post-processing step beyond `trim_tts_output`).

 ### 2.1 What the registry handles automatically

-| Endpoint | Registry function used |
-|----------|----------------------|
-| `POST /generate` | `load_engine_model(engine, size)` + `engine_needs_trim(engine)` |
-| `POST /generate/stream` | `ensure_model_cached_or_raise(engine, size)` + `load_engine_model()` |
-| `GET /models/status` | `get_all_model_configs()` + `check_model_loaded(config)` |
-| `POST /models/download` | `get_model_config(name)` + `get_model_load_func(config)` |
-| `POST /models/{name}/unload` | `get_model_config(name)` + `unload_model_by_config(config)` |
-| `DELETE /models/{name}` | `get_model_config(name)` + `unload_model_by_config(config)` |
+| Route file | Registry function used |
+|------------|----------------------|
+| `routes/generations.py` | `load_engine_model(engine, size)` + `engine_needs_trim(engine)` |
+| `routes/models.py` | `get_all_model_configs()` + `check_model_loaded(config)` |
+| `routes/models.py` | `get_model_config(name)` + `get_model_load_func(config)` |
+| `services/generation.py` | `get_tts_backend_for_engine()` + `ensure_model_cached_or_raise()` |

 ### 2.2 Post-processing

-If your model produces trailing silence or hallucinated audio, set `needs_trim=True` on your `ModelConfig`. The generate endpoint checks `engine_needs_trim(engine)` and applies `trim_tts_output()` automatically.
+If your model produces trailing silence or hallucinated audio, set `needs_trim=True` on your `ModelConfig`. The generation service checks `engine_needs_trim(engine)` and applies `trim_tts_output()` automatically.

 ---

@@ -321,7 +321,7 @@ Used by both Chatterbox backends. LuxTTS works fine on MPS.

 To get download progress bars in the UI, wrap model loading with `HFProgressTracker`:
 ```python
-from backend.utils.hf_progress import HFProgressTracker
+from ..utils.hf_progress import HFProgressTracker
 tracker = HFProgressTracker(model_name, progress_manager)
 with tracker.patch_download():
     model = ModelClass.from_pretrained(repo_id)
@@ -339,7 +339,7 @@ The tracker monkey-patches tqdm to intercept HuggingFace's internal progress bar
 - [ ] `backend/requirements.txt` — dependencies added (check for `--no-deps` needs)
 - [ ] `justfile` — `--no-deps` install step if needed

-### API (`backend/main.py`)
+### Routes and services

 No changes needed — the model config registry handles all dispatch automatically.

 ### Frontend
````
