
fix: Chatterbox float64 dtype mismatch + model unload button#264

Merged
jamiepine merged 4 commits into main from fix/chatterbox-float64-dtype on Mar 13, 2026

Conversation

jamiepine (Owner) commented Mar 13, 2026

Summary

Fixes `expected m1 and m2 to have the same dtype, but got: float != double` when generating with Chatterbox (both Multilingual and Turbo), and adds a model unload button so users can free memory without restarting the backend.

Root cause

librosa.load() returns float64 numpy arrays. The upstream chatterbox package converts these to torch tensors via torch.from_numpy() without casting to float32, creating dtype mismatches in two places:

  1. S3Tokenizer.log_mel_spectrogram — float64 STFT magnitudes hit float32 _mel_filters in a matmul
  2. VoiceEncoder.forward — float64 mel spectrograms hit float32 LSTM weights
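The failure is reproducible in isolation (a standalone sketch; the shapes are arbitrary and the float32 matrix merely stands in for the tokenizer's `_mel_filters` buffer):

```python
import numpy as np
import torch

# float64 numpy audio; the PR traces these arrays back to librosa.load()
# inside the upstream chatterbox package.
audio = np.zeros(201, dtype=np.float64)
wav = torch.from_numpy(audio)      # from_numpy preserves the dtype
assert wav.dtype == torch.float64

mel_filters = torch.randn(80, 201, dtype=torch.float32)

try:
    mel_filters @ wav              # double meets float -> RuntimeError
except RuntimeError as err:
    print(f"dtype mismatch: {err}")

out = mel_filters @ wav.float()    # the fix: cast to float32 first
assert out.dtype == torch.float32
```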

Fix

Monkey-patch both entry points after model load to cast inputs to float32:

  • S3Tokenizer.log_mel_spectrogram — cast audio.float() before the STFT
  • VoiceEncoder.forward — cast mels.float() before the LSTM

Applied to both Chatterbox Multilingual and Chatterbox Turbo backends.
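In sketch form, the patch pattern looks like this (attribute paths such as `model.s3gen.tokenizer` and `model.ve` are taken from the review thread; the helper name and wrapper internals are illustrative, not the PR's exact code):

```python
import types
import torch

def patch_float32_inputs(model):
    """Wrap the two upstream entry points so float64 inputs are cast to float32."""
    tok = model.s3gen.tokenizer
    if not hasattr(tok, "_orig_log_mel"):          # idempotent: patch once
        tok._orig_log_mel = tok.log_mel_spectrogram

        def _f32_log_mel(self, audio, *args, **kwargs):
            if torch.is_tensor(audio):
                audio = audio.float()              # cast before the STFT
            return self._orig_log_mel(audio, *args, **kwargs)

        tok.log_mel_spectrogram = types.MethodType(_f32_log_mel, tok)

    ve = model.ve
    if not hasattr(ve, "_orig_ve_forward"):
        ve._orig_ve_forward = ve.forward

        def _f32_forward(self, mels, *args, **kwargs):
            return self._orig_ve_forward(mels.float(), *args, **kwargs)

        ve.forward = types.MethodType(_f32_forward, ve)
    return model
```

Binding with `types.MethodType` keeps the wrappers per-instance, so unpatched instances of the same classes are unaffected.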

Model unload feature

  • POST /models/{model_name}/unload — unloads any model from memory without deleting from disk
  • New "Unload" button in model detail dialog (with Unplug icon)
  • Delete button remains disabled while model is loaded
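A minimal sketch of how such a name-based unload can be wired (stand-in classes only; the real `unload_model_by_name` in backend/main.py validates names and dispatches across all engine types):

```python
import gc

# Stand-in backend; the real ones live in backend/backends/ and hold
# large torch models rather than a plain object().
class Backend:
    def __init__(self):
        self.model = object()

    def is_loaded(self) -> bool:
        return self.model is not None

    def unload_model(self) -> None:
        self.model = None
        gc.collect()  # encourage the interpreter to release the memory

# Hypothetical registry keyed by engine name.
BACKENDS = {"chatterbox": Backend(), "chatterbox_turbo": Backend()}

def unload_model_by_name(name: str) -> dict:
    backend = BACKENDS.get(name)
    if backend is None:
        raise KeyError(f"Unknown model: {name}")
    if not backend.is_loaded():
        return {"message": f"Model {name} is not loaded"}
    backend.unload_model()
    return {"message": f"Model {name} unloaded"}
```

Note the no-op branch still returns a success payload, which is why the frontend should echo the returned message rather than assume the unload happened.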

Summary by CodeRabbit

  • New Features

    • Per-model "Unload" action: API endpoint, client method and UI Unload button with loading state and toasts; Delete remains disabled until unload completes.
    • Model publishing now only occurs after post-load adjustments succeed, reducing partial/failed model states.
  • Bug Fixes

    • Normalize audio/mel input dtypes to float32 before processing to improve compatibility and audio reliability.
    • Improved loaded-model checks to more accurately detect and manage model sizes during unload/delete.

coderabbitai bot commented Mar 13, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 08a239f1-26ea-480e-b576-5cf1503dbe0c

📥 Commits

Reviewing files that changed from the base of the PR and between bfd7b81 and 2f535a7.

📒 Files selected for processing (3)
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py
  • backend/main.py

📝 Walkthrough


Adds an unload-by-name API and UI flow, moves chatterbox model assignment to a local variable and applies post-load monkey-patches (S3Tokenizer.log_mel_spectrogram, VoiceEncoder.forward) before publishing the model to avoid half-initialized state.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Chatterbox backends: backend/backends/chatterbox_backend.py, backend/backends/chatterbox_turbo_backend.py | Load model into a local model variable, apply pre/post-load monkey-patches (ensure CPU map_location, cast audio/mel inputs to float32 via wrapped S3Tokenizer.log_mel_spectrogram and VoiceEncoder.forward), then assign self.model only after patches succeed. |
| API server: backend/main.py | Add POST /models/{model_name}/unload and helper unload_model_by_name; validate model names, dispatch unload to specific backends (tts, luxtts, chatterbox, chatterbox_turbo, whisper), and update checks to use _current_model_size when present. |
| Frontend — model management UI: app/src/components/ServerSettings/ModelManagement.tsx | Add unloadMutation and an Unload button (Unplug icon) with loading state; show Unload + Delete buttons together and disable Delete while model is loaded; minor formatSize tweak for large sizes. |
| Frontend — API client: app/src/lib/api/client.ts | Add unloadModel(modelName: string): Promise<{ message: string }> to call the new server endpoint. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User (UI)
    participant Frontend as Frontend (ModelManagement)
    participant Api as ApiClient / HTTP
    participant Server as Backend API (/models/{name}/unload)
    participant Loader as Model Backend (e.g., chatterbox_turbo)
    participant Model as In-memory Model

    User->>Frontend: Click "Unload"
    Frontend->>Api: POST /models/{name}/unload
    Api->>Server: HTTP request
    Server->>Loader: dispatch unload for model_name
    Loader->>Model: perform unload / release resources
    Model-->>Loader: success
    Loader-->>Server: unloaded
    Server-->>Api: 200 { message }
    Api-->>Frontend: response
    Frontend->>Frontend: invalidate model status queries / update UI
```
```mermaid
sequenceDiagram
    participant Loader as Model Loader
    participant ModelLib as Chatterbox Library
    participant S3Tok as S3Tokenizer
    participant VEnc as VoiceEncoder

    Loader->>ModelLib: load model instance (into local `model`)
    ModelLib-->>Loader: model instance
    Loader->>S3Tok: wrap log_mel_spectrogram -> cast audio to float32
    Loader->>VEnc: wrap forward -> cast mel inputs to float32
    Loader-->>ModelLib: bind patched methods to local `model`
    Loader->>Loader: assign `self.model = model` (publish) and complete load
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through loaders, soft and light,

I nudged the mels to float just right,
I wrapped the calls and held my breath,
Then gently let the models rest,
A tiny hop — all systems bright.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the primary changes: fixing a Chatterbox float64 dtype mismatch and adding a model unload button. Both are substantial features covered in the changeset. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


coderabbitai bot left a comment

🧹 Nitpick comments (1)
backend/backends/chatterbox_backend.py (1)

186-194: Consider extracting this monkey-patch into a shared helper.

This same block appears in both Chatterbox backends; centralizing it would reduce drift and keep behavior consistent across model variants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/chatterbox_backend.py` around lines 186 - 194, Extract the
repeated monkey-patch into a shared helper function (e.g.,
patch_ve_forward_to_float) and replace the inline block in Chatterbox backends
with a call to that helper; the helper should accept the model.ve instance,
capture the original forward (using the same getattr/hasattr check you currently
use to set _orig_ve_forward), define the wrapper (currently _f32_forward) that
calls the original with mels.float(), and attach it back with types.MethodType
so the behavior is identical across model variants.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ae0b9a43-c070-47fa-82a4-bd2f62d18052

📥 Commits

Reviewing files that changed from the base of the PR and between 1d32170 and 2ec8ac1.

📒 Files selected for processing (2)
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py

The previous approach of patching librosa.load didn't work because
melspectrogram itself performs float64 math (numpy dot, signal.lfilter)
regardless of input dtype. The actual mismatch happens when pack()
creates a float64 tensor from the mel arrays and passes it into the
float32 LSTM weights in VoiceEncoder.forward().

Fix by monkey-patching VoiceEncoder.forward() to call mels.float()
before the LSTM, ensuring the input always matches the model dtype.
jamiepine force-pushed the fix/chatterbox-float64-dtype branch from 2ec8ac1 to 7abdd7c on March 13, 2026 at 11:44
- POST /models/{model_name}/unload — unloads a specific model from
  memory without deleting from disk, supports all engine types
- Frontend: Unload button in model detail dialog when model is loaded
- Delete button remains disabled while loaded (unload first)
jamiepine force-pushed the fix/chatterbox-float64-dtype branch from 7abdd7c to cac80f6 on March 13, 2026 at 11:50
coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app/src/components/ServerSettings/ModelManagement.tsx`:
- Around line 308-315: The onSuccess handler for the unload call currently
ignores the API response and always shows "Model unloaded"; change the signature
in the onSuccess callback to use the real response (replace _data with e.g.
data) and use data.message (or branch on it) when calling toast so the toast
reflects the API's actual success/no-op message; keep the existing
queryClient.invalidateQueries/refetchQueries calls (they can remain as-is) and
ensure you still pass modelName where needed for context in the toast if
desired.

In `@backend/backends/chatterbox_backend.py`:
- Around line 181-195: The patching of VoiceEncoder.forward should happen on a
local model variable before assigning to self.model so a failure doesn't leave
self.model partially initialized; in load_model() create a local variable (e.g.,
model), retrieve model.ve as in the diff (_ve), capture the original forward
(_orig_ve_forward), bind the f32 wrapper (_f32_forward) with types.MethodType to
model.ve, and only after successful patching assign self.model = model (leaving
existing logic that checks self.model at Lines 97/100 unchanged). Ensure you
reference and modify model.ve.forward (not self.model) and preserve the existing
_ve, _orig_ve_forward, and _f32_forward naming/behavior when moving the code.

In `@backend/main.py`:
- Around line 1526-1539: The unload path for Chatterbox calls
backend.unload_model() directly while background threads in
chatterbox_backend.py (_generate_sync and _create_prompt_sync) may still access
self.model, causing None derefs; modify the Chatterbox backend (referencing
class methods _generate_sync, _create_prompt_sync and unload_model) to serialize
unloads against in-flight ops by adding either an asyncio.Lock or an atomic
active-ops counter on the backend instance: increment the counter or acquire the
lock at the start of generation entry points, decrement/release when finished
(including error paths), and make unload_model wait for zero active ops or
acquire the same lock before clearing self.model; then update the API call sites
that use get_tts_backend_for_engine(...).unload_model() to rely on the backend’s
guarded unload (no changes to the API surface), ensuring unload is blocked until
current generations complete.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 416b06e1-816c-41c5-b2dc-2d428fd0965d

📥 Commits

Reviewing files that changed from the base of the PR and between 2ec8ac1 and 7abdd7c.

📒 Files selected for processing (5)
  • app/src/components/ServerSettings/ModelManagement.tsx
  • app/src/lib/api/client.ts
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py
  • backend/main.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/backends/chatterbox_turbo_backend.py

Comment on lines +308 to +315

```ts
onSuccess: async (_data, modelName) => {
  toast({
    title: 'Model unloaded',
    description: `${modelName} has been unloaded from memory.`,
  });
  await queryClient.invalidateQueries({ queryKey: ['modelStatus'], refetchType: 'all' });
  await queryClient.refetchQueries({ queryKey: ['modelStatus'] });
},
```

⚠️ Potential issue | 🟡 Minor

Use the API response message for unload toasts.

The new unload endpoint returns a normal success payload for stale/no-op unloads, but this handler ignores _data.message, so the UI still says the unload succeeded even when nothing changed. Toast the returned message (or branch on it) so the modal stays truthful.

Suggested fix

```diff
-    onSuccess: async (_data, modelName) => {
+    onSuccess: async (data, modelName) => {
       toast({
-        title: 'Model unloaded',
-        description: `${modelName} has been unloaded from memory.`,
+        title: data.message.includes('is not loaded') ? 'Model already unloaded' : 'Model unloaded',
+        description: data.message,
       });
```

Comment on lines +1526 to +1539

```python
elif model_type == "chatterbox":
    from .backends import get_tts_backend_for_engine
    backend = get_tts_backend_for_engine("chatterbox")
    if backend.is_loaded():
        backend.unload_model()
    else:
        return {"message": f"Model {model_name} is not loaded"}
elif model_type == "chatterbox_turbo":
    from .backends import get_tts_backend_for_engine
    backend = get_tts_backend_for_engine("chatterbox_turbo")
    if backend.is_loaded():
        backend.unload_model()
    else:
        return {"message": f"Model {model_name} is not loaded"}
```

⚠️ Potential issue | 🟠 Major

Serialize Chatterbox unloads against active generation.

These branches call backend.unload_model() directly even though backend/backends/chatterbox_backend.py, Lines 310-339 still uses self.model from a worker thread during generation. If the user clicks Unload while a Chatterbox request is in flight, self.model can become None mid-run and the request falls over with a 500. Please gate unload behind a per-backend active-ops counter or unload lock before exposing it from the API.

Based on learnings, unloading self.model while _generate_sync or _create_prompt_sync threads hold a direct reference requires an asyncio.Lock or active-ops counter to serialize model swap against in-flight inference.
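The suggested guard can be sketched as follows (an illustrative threading-based version, since `_generate_sync` runs on worker threads; the real backend might equally use an `asyncio.Lock` at the async boundary, and the inference body here is a placeholder):

```python
import threading

class GuardedBackend:
    def __init__(self):
        self.model = object()          # stand-in for the loaded model
        self._active_ops = 0
        self._cond = threading.Condition()

    def _generate_sync(self):
        with self._cond:
            if self.model is None:
                raise RuntimeError("model not loaded")
            self._active_ops += 1      # register the in-flight op
        try:
            return "audio"             # placeholder for real inference
        finally:
            with self._cond:
                self._active_ops -= 1  # always released, even on error
                self._cond.notify_all()

    def unload_model(self):
        with self._cond:
            # Block until no generation holds a reference to self.model.
            self._cond.wait_for(lambda: self._active_ops == 0)
            self.model = None
```

With this shape, the API call sites can keep calling `unload_model()` unchanged; the serialization lives entirely inside the backend.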


coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
app/src/components/ServerSettings/ModelManagement.tsx (1)

308-312: ⚠️ Potential issue | 🟡 Minor

Use the unload API response message in the toast.

Line 308 ignores the returned payload, so the toast can claim success even when backend responds with “is not loaded”.

Suggested fix

```diff
-    onSuccess: async (_data, modelName) => {
+    onSuccess: async (data, modelName) => {
       toast({
-        title: 'Model unloaded',
-        description: `${modelName} has been unloaded from memory.`,
+        title: data.message.includes('is not loaded') ? 'Model already unloaded' : 'Model unloaded',
+        description: data.message || `${modelName} unload request completed.`,
       });
       await queryClient.invalidateQueries({ queryKey: ['modelStatus'], refetchType: 'all' });
       await queryClient.refetchQueries({ queryKey: ['modelStatus'] });
     },
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app/src/components/ServerSettings/ModelManagement.tsx` around lines 308 -
312, The onSuccess handler in ModelManagement.tsx currently ignores the API
response and always shows "Model unloaded" using the modelName param; change it
to read the response payload (the _data argument) from the unload API and use
its message (e.g., _data.message) for the toast title/description so the UI
reflects responses like "is not loaded"; keep a fallback to the modelName if the
message is missing and ensure the same toast call (toast({...})) uses the
response-derived text instead of assuming success.
🧹 Nitpick comments (1)
backend/main.py (1)

1493-1505: Consider centralizing model-name metadata.

The model_types map here duplicates model metadata that is also maintained in other model-management routes, which is easy to drift over time. Extracting one shared constant/helper would reduce mismatch risk.
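A sketch of the centralization the nitpick asks for (entry names and sizes here are hypothetical placeholders, not the project's real registry):

```python
# Hypothetical shared registry: model_name -> (engine_type, size_bytes).
# In the real codebase this would live in one module and be imported by
# every model-management route instead of re-declaring local dicts.
MODEL_TYPES: dict[str, tuple[str, int]] = {
    "chatterbox-multilingual": ("chatterbox", 4_000_000_000),
    "chatterbox-turbo": ("chatterbox_turbo", 2_000_000_000),
}

def get_model_metadata(model_name: str) -> tuple[str, int]:
    try:
        return MODEL_TYPES[model_name]
    except KeyError:
        raise ValueError(f"Unknown model: {model_name}") from None
```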

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/main.py` around lines 1493 - 1505, The model_types dict is duplicated
across model-management code; extract it into a single shared constant or helper
(e.g., MODEL_TYPES or get_model_metadata()) and replace local occurrences of
model_types in functions/routes (referencing model_types, model_name lookups) to
import and use that centralized symbol; ensure the new module exports the same
mapping shape (model_name -> (model_type, model_size)) and update any callers
that expect this data so they use the centralized constant/helper to avoid
drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/main.py`:
- Around line 1513-1518: The unload branch uses tts_model.model_size which is
inconsistent with the rest of the file's authoritative loaded-size signal;
change the comparison to use the module's current loaded-size field (e.g.
tts._current_model_size or the getter used by the status/health path) so the
check becomes tts.get_tts_model().is_loaded() and compare against
tts._current_model_size, then call tts.unload_tts_model() when they match;
update any references in the same block to use the same symbol used by
status/health (tts._current_model_size or its getter) to avoid misreporting Qwen
as not loaded.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 485c65d3-d13a-4e66-ae25-04bc4ec5f34e

📥 Commits

Reviewing files that changed from the base of the PR and between 7abdd7c and 2cbd56f.

📒 Files selected for processing (5)
  • app/src/components/ServerSettings/ModelManagement.tsx
  • app/src/lib/api/client.ts
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py
  • backend/main.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/backends/chatterbox_backend.py
  • app/src/lib/api/client.ts

The actual dtype mismatch was in S3Tokenizer.log_mel_spectrogram, not
VoiceEncoder.forward. librosa.load returns float64 numpy, which
torch.from_numpy preserves as double. The STFT output (double) then
hits _mel_filters (float32) in a matmul at s3tokenizer.py:163.

Now patching both entry points after model load:
1. S3Tokenizer.log_mel_spectrogram — cast audio to float32 before STFT
2. VoiceEncoder.forward — cast mels to float32 before LSTM

Remove debug traceback logging (no longer needed).
jamiepine force-pushed the fix/chatterbox-float64-dtype branch from 2cbd56f to bfd7b81 on March 13, 2026 at 12:04
jamiepine changed the title from "fix: Chatterbox float64 dtype mismatch in VoiceEncoder" to "fix: Chatterbox float64 dtype mismatch + model unload button" on Mar 13, 2026
coderabbitai bot left a comment

♻️ Duplicate comments (2)
backend/backends/chatterbox_turbo_backend.py (1)

194-212: ⚠️ Potential issue | 🟠 Major

Keep the loaded Turbo model local until the patching succeeds.

This has the same failure mode as the multilingual backend: self.model is assigned at Line 167/Line 173, then the new patch block can still fail on Line 194 through Line 212. If that happens, retries short-circuit at Line 97 and Line 100, and the cold-download flow may already be marked complete at Line 178 and Line 179. Please patch a local model, then publish it to self.model and complete the task state afterward.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/chatterbox_turbo_backend.py` around lines 194 - 212, The
patching block mutates self.model in-place and can raise after self.model is
already set; instead, load the model into a local variable (e.g., local_model),
apply all patches to local_model (patching _tokzr.log_mel_spectrogram via
_orig_log_mel/_f32_log_mel and _ve.forward via
_orig_ve_forward/_f32_ve_forward), and only after all patches succeed assign
local_model to self.model and update whatever "task state" completion code
currently runs after model assignment; this ensures retries and cold-download
markers (currently guarded around self.model assignment) are not short-circuited
by a failed patch.
backend/backends/chatterbox_backend.py (1)

188-206: ⚠️ Potential issue | 🟠 Major

Patch a local model instance before assigning self.model.

These bindings run after self.model has already been published, so any exception from Line 188 through Line 206 leaves the backend stuck with a partially initialized model. The next load_model() call then returns early at Line 97 and Line 100, and the cold-download path may already have been marked complete at Line 178 and Line 179.

🛠️ Proposed fix

```diff
-                            self.model = ChatterboxMultilingualTTS.from_pretrained(
+                            model = ChatterboxMultilingualTTS.from_pretrained(
                                 device=device,
                             )
@@
-                    self.model = ChatterboxMultilingualTTS.from_pretrained(
+                    model = ChatterboxMultilingualTTS.from_pretrained(
                         device=device,
                     )
@@
-            t3_tfmr = self.model.t3.tfmr
+            t3_tfmr = model.t3.tfmr
@@
-            if not is_cached:
-                progress_manager.mark_complete(model_name)
-                task_manager.complete_download(model_name)
-
             # Patch float64 → float32 dtype mismatches in upstream chatterbox.
@@
-            _tokzr = self.model.s3gen.tokenizer
+            _tokzr = model.s3gen.tokenizer
@@
-            _ve = self.model.ve
+            _ve = model.ve
@@
+            self.model = model
+            if not is_cached:
+                progress_manager.mark_complete(model_name)
+                task_manager.complete_download(model_name)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/chatterbox_backend.py` around lines 188 - 206, The patching
of tokenizer and VoiceEncoder is being done directly on self.model which can
leave self.model partially initialized if an exception occurs; instead, load the
model into a local variable (e.g., model = <loaded object>), apply the patches
to model.s3gen.tokenizer.log_mel_spectrogram and model.ve.forward (use the
existing _f32_log_mel and _f32_ve_forward patterns) and only assign self.model =
model after all patches succeed; this ensures load_model’s early-return checks
and cold-download flags are not left pointing at a partially-initialized object.
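The pattern can be boiled down to a runnable sketch (the loader and patch are stand-ins for `ChatterboxMultilingualTTS.from_pretrained` and the dtype patches; only the publish-last shape is the point):

```python
class Backend:
    def __init__(self):
        self.model = None

    def load_model(self, patch):
        model = object()    # stand-in for from_pretrained(...)
        patch(model)        # may raise; self.model is still untouched
        self.model = model  # publish only after patching succeeded
        return self.model

def failing_patch(model):
    raise RuntimeError("patch failed")

backend = Backend()
try:
    backend.load_model(failing_patch)
except RuntimeError:
    pass
assert backend.model is None   # no half-initialized model left behind

backend.load_model(lambda model: None)
assert backend.model is not None
```

Because `self.model` is only assigned on the last line of the happy path, a failed patch leaves the backend in its pre-load state and a retry can proceed normally.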
🧹 Nitpick comments (1)
backend/backends/chatterbox_turbo_backend.py (1)

191-212: Extract the dtype monkey-patch into one helper.

This block is now duplicated almost verbatim with backend/backends/chatterbox_backend.py. A shared helper that accepts the loaded model and applies the dtype shims would reduce drift the next time upstream Chatterbox changes these method names or signatures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/backends/chatterbox_turbo_backend.py` around lines 191 - 212, Extract
the duplicated dtype monkey-patch into a single helper function (e.g.,
apply_dtype_monkeypatch) and call it from both chatterbox_turbo_backend and
chatterbox_backend after the model is loaded; the helper should accept the
loaded model, patch model.s3gen.tokenizer.log_mel_spectrogram to coerce audio to
float (preserving the original via _orig_log_mel) and patch model.ve.forward to
call the original with mels.float() (preserving _orig_ve_forward), using
types.MethodType to bind the wrappers, and keep the original symbol names
(S3Tokenizer / s3gen.tokenizer, log_mel_spectrogram, VoiceEncoder / ve, forward)
so future upstream changes are localized to that helper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c178db28-21ff-4cdd-86bb-2041ce15e2c0

📥 Commits

Reviewing files that changed from the base of the PR and between 2cbd56f and bfd7b81.

📒 Files selected for processing (2)
  • backend/backends/chatterbox_backend.py
  • backend/backends/chatterbox_turbo_backend.py

…sed state

Apply local-var-then-assign pattern to chatterbox_backend.py (multilingual)
to match the turbo backend. Also use _current_model_size fallback in
unload, delete, and status endpoints for consistent Qwen model size checks.
@jamiepine jamiepine merged commit 0e9f5db into main Mar 13, 2026
1 check was pending