fix: Chatterbox float64 dtype mismatch + model unload button #264
Conversation

Caution: Review failed. The pull request is closed.
📝 Walkthrough

Adds an unload-by-name API and UI flow, moves the chatterbox model assignment to a local variable, and applies post-load monkey-patches (S3Tokenizer.log_mel_spectrogram, VoiceEncoder.forward) before publishing the model to avoid half-initialized state.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User as User (UI)
    participant Frontend as Frontend (ModelManagement)
    participant Api as ApiClient / HTTP
    participant Server as Backend API (/models/{name}/unload)
    participant Loader as Model Backend (e.g., chatterbox_turbo)
    participant Model as In-memory Model
    User->>Frontend: Click "Unload"
    Frontend->>Api: POST /models/{name}/unload
    Api->>Server: HTTP request
    Server->>Loader: dispatch unload for model_name
    Loader->>Model: perform unload / release resources
    Model-->>Loader: success
    Loader-->>Server: unloaded
    Server-->>Api: 200 { message }
    Api-->>Frontend: response
    Frontend->>Frontend: invalidate model status queries / update UI
```

```mermaid
sequenceDiagram
    participant Loader as Model Loader
    participant ModelLib as Chatterbox Library
    participant S3Tok as S3Tokenizer
    participant VEnc as VoiceEncoder
    Loader->>ModelLib: load model instance (into local `model`)
    ModelLib-->>Loader: model instance
    Loader->>S3Tok: wrap log_mel_spectrogram -> cast audio to float32
    Loader->>VEnc: wrap forward -> cast mel inputs to float32
    Loader-->>ModelLib: bind patched methods to local `model`
    Loader->>Loader: assign `self.model = model` (publish) and complete load
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🧹 Nitpick comments (1)
backend/backends/chatterbox_backend.py (1)
186-194: Consider extracting this monkey-patch into a shared helper. This same block appears in both Chatterbox backends; centralizing it would reduce drift and keep behavior consistent across model variants.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/chatterbox_backend.py` around lines 186 - 194, Extract the repeated monkey-patch into a shared helper function (e.g., patch_ve_forward_to_float) and replace the inline block in Chatterbox backends with a call to that helper; the helper should accept the model.ve instance, capture the original forward (using the same getattr/hasattr check you currently use to set _orig_ve_forward), define the wrapper (currently _f32_forward) that calls the original with mels.float(), and attach it back with types.MethodType so the behavior is identical across model variants.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@backend/backends/chatterbox_backend.py`:
- Around line 186-194: Extract the repeated monkey-patch into a shared helper
function (e.g., patch_ve_forward_to_float) and replace the inline block in
Chatterbox backends with a call to that helper; the helper should accept the
model.ve instance, capture the original forward (using the same getattr/hasattr
check you currently use to set _orig_ve_forward), define the wrapper (currently
_f32_forward) that calls the original with mels.float(), and attach it back with
types.MethodType so the behavior is identical across model variants.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ae0b9a43-c070-47fa-82a4-bd2f62d18052
📒 Files selected for processing (2)
- backend/backends/chatterbox_backend.py
- backend/backends/chatterbox_turbo_backend.py
The previous approach of patching librosa.load didn't work because melspectrogram itself performs float64 math (numpy dot, signal.lfilter) regardless of input dtype. The actual mismatch happens when pack() creates a float64 tensor from the mel arrays and passes it into the float32 LSTM weights in VoiceEncoder.forward(). Fix by monkey-patching VoiceEncoder.forward() to call mels.float() before the LSTM, ensuring the input always matches the model dtype.
2ec8ac1 to 7abdd7c (Compare)
- POST /models/{model_name}/unload — unloads a specific model from
memory without deleting from disk, supports all engine types
- Frontend: Unload button in model detail dialog when model is loaded
- Delete button remains disabled while loaded (unload first)
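The branch logic of the unload endpoint can be sketched in plain Python. `DummyBackend` and the `BACKENDS` registry below are hypothetical stand-ins for the project's real backend lookup; only the decision shape (no-op unload returns a normal message rather than an error) follows the PR:

```python
class DummyBackend:
    """Stand-in for a TTS backend; the real ones live in backend/backends/."""
    def __init__(self):
        self._model = object()
    def is_loaded(self):
        return self._model is not None
    def unload_model(self):
        self._model = None  # real code would also release GPU memory

BACKENDS = {"chatterbox": DummyBackend()}  # hypothetical registry

def unload_model(model_name: str) -> dict:
    # A stale unload is a 200 no-op with a message, not an error --
    # which is why the frontend toast should echo the message field.
    backend = BACKENDS.get(model_name)
    if backend is None or not backend.is_loaded():
        return {"message": f"Model {model_name} is not loaded"}
    backend.unload_model()
    return {"message": f"Model {model_name} unloaded"}
```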
7abdd7c to cac80f6 (Compare)
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/components/ServerSettings/ModelManagement.tsx`:
- Around line 308-315: The onSuccess handler for the unload call currently
ignores the API response and always shows "Model unloaded"; change the signature
in the onSuccess callback to use the real response (replace _data with e.g.
data) and use data.message (or branch on it) when calling toast so the toast
reflects the API's actual success/no-op message; keep the existing
queryClient.invalidateQueries/refetchQueries calls (they can remain as-is) and
ensure you still pass modelName where needed for context in the toast if
desired.
In `@backend/backends/chatterbox_backend.py`:
- Around line 181-195: The patching of VoiceEncoder.forward should happen on a
local model variable before assigning to self.model so a failure doesn't leave
self.model partially initialized; in load_model() create a local variable (e.g.,
model), retrieve model.ve as in the diff (_ve), capture the original forward
(_orig_ve_forward), bind the f32 wrapper (_f32_forward) with types.MethodType to
model.ve, and only after successful patching assign self.model = model (leaving
existing logic that checks self.model at Lines 97/100 unchanged). Ensure you
reference and modify model.ve.forward (not self.model) and preserve the existing
_ve, _orig_ve_forward, and _f32_forward naming/behavior when moving the code.
In `@backend/main.py`:
- Around line 1526-1539: The unload path for Chatterbox calls
backend.unload_model() directly while background threads in
chatterbox_backend.py (_generate_sync and _create_prompt_sync) may still access
self.model, causing None derefs; modify the Chatterbox backend (referencing
class methods _generate_sync, _create_prompt_sync and unload_model) to serialize
unloads against in-flight ops by adding either an asyncio.Lock or an atomic
active-ops counter on the backend instance: increment the counter or acquire the
lock at the start of generation entry points, decrement/release when finished
(including error paths), and make unload_model wait for zero active ops or
acquire the same lock before clearing self.model; then update the API call sites
that use get_tts_backend_for_engine(...).unload_model() to rely on the backend’s
guarded unload (no changes to the API surface), ensuring unload is blocked until
current generations complete.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 416b06e1-816c-41c5-b2dc-2d428fd0965d
📒 Files selected for processing (5)
- app/src/components/ServerSettings/ModelManagement.tsx
- app/src/lib/api/client.ts
- backend/backends/chatterbox_backend.py
- backend/backends/chatterbox_turbo_backend.py
- backend/main.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/backends/chatterbox_turbo_backend.py
```tsx
onSuccess: async (_data, modelName) => {
  toast({
    title: 'Model unloaded',
    description: `${modelName} has been unloaded from memory.`,
  });
  await queryClient.invalidateQueries({ queryKey: ['modelStatus'], refetchType: 'all' });
  await queryClient.refetchQueries({ queryKey: ['modelStatus'] });
},
```
Use the API response message for unload toasts.
The new unload endpoint returns a normal success payload for stale/no-op unloads, but this handler ignores _data.message, so the UI still says the unload succeeded even when nothing changed. Toast the returned message (or branch on it) so the modal stays truthful.
Suggested fix

```diff
- onSuccess: async (_data, modelName) => {
+ onSuccess: async (data, modelName) => {
    toast({
-     title: 'Model unloaded',
-     description: `${modelName} has been unloaded from memory.`,
+     title: data.message.includes('is not loaded') ? 'Model already unloaded' : 'Model unloaded',
+     description: data.message,
    });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/components/ServerSettings/ModelManagement.tsx` around lines 308 -
315, The onSuccess handler for the unload call currently ignores the API
response and always shows "Model unloaded"; change the signature in the
onSuccess callback to use the real response (replace _data with e.g. data) and
use data.message (or branch on it) when calling toast so the toast reflects the
API's actual success/no-op message; keep the existing
queryClient.invalidateQueries/refetchQueries calls (they can remain as-is) and
ensure you still pass modelName where needed for context in the toast if
desired.
```python
elif model_type == "chatterbox":
    from .backends import get_tts_backend_for_engine
    backend = get_tts_backend_for_engine("chatterbox")
    if backend.is_loaded():
        backend.unload_model()
    else:
        return {"message": f"Model {model_name} is not loaded"}
elif model_type == "chatterbox_turbo":
    from .backends import get_tts_backend_for_engine
    backend = get_tts_backend_for_engine("chatterbox_turbo")
    if backend.is_loaded():
        backend.unload_model()
    else:
        return {"message": f"Model {model_name} is not loaded"}
```
Serialize Chatterbox unloads against active generation.
These branches call backend.unload_model() directly even though backend/backends/chatterbox_backend.py, Lines 310-339 still uses self.model from a worker thread during generation. If the user clicks Unload while a Chatterbox request is in flight, self.model can become None mid-run and the request falls over with a 500. Please gate unload behind a per-backend active-ops counter or unload lock before exposing it from the API.
Based on learnings, unloading self.model while _generate_sync or _create_prompt_sync threads hold a direct reference requires an asyncio.Lock or active-ops counter to serialize model swap against in-flight inference.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/main.py` around lines 1526 - 1539, The unload path for Chatterbox
calls backend.unload_model() directly while background threads in
chatterbox_backend.py (_generate_sync and _create_prompt_sync) may still access
self.model, causing None derefs; modify the Chatterbox backend (referencing
class methods _generate_sync, _create_prompt_sync and unload_model) to serialize
unloads against in-flight ops by adding either an asyncio.Lock or an atomic
active-ops counter on the backend instance: increment the counter or acquire the
lock at the start of generation entry points, decrement/release when finished
(including error paths), and make unload_model wait for zero active ops or
acquire the same lock before clearing self.model; then update the API call sites
that use get_tts_backend_for_engine(...).unload_model() to rely on the backend’s
guarded unload (no changes to the API surface), ensuring unload is blocked until
current generations complete.
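The guarded-unload idea from this comment can be sketched with a condition variable and an active-ops counter. Class and method names mirror the Chatterbox backend, but the bodies are placeholders, not the project's actual code:

```python
import threading

class GuardedBackend:
    """Sketch: serialize unload_model against in-flight generation."""
    def __init__(self):
        self.model = object()
        self._cond = threading.Condition()
        self._active_ops = 0

    def _generate_sync(self):
        # Entry point for generation: register the op under the lock so
        # unload cannot clear self.model between the check and the work.
        with self._cond:
            if self.model is None:
                raise RuntimeError("model not loaded")
            self._active_ops += 1
        try:
            return id(self.model)  # placeholder for real inference
        finally:
            with self._cond:
                self._active_ops -= 1
                self._cond.notify_all()

    def unload_model(self):
        with self._cond:
            # Block until every in-flight generation has finished.
            while self._active_ops > 0:
                self._cond.wait()
            self.model = None
```

An asyncio.Lock would work the same way for the async entry points; the counter variant has the advantage of allowing concurrent generations while still blocking unload.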
Actionable comments posted: 1
♻️ Duplicate comments (1)
app/src/components/ServerSettings/ModelManagement.tsx (1)
308-312: ⚠️ Potential issue | 🟡 Minor: Use the unload API response message in the toast.

Line 308 ignores the returned payload, so the toast can claim success even when the backend responds with "is not loaded".

Suggested fix

```diff
- onSuccess: async (_data, modelName) => {
+ onSuccess: async (data, modelName) => {
    toast({
-     title: 'Model unloaded',
-     description: `${modelName} has been unloaded from memory.`,
+     title: data.message.includes('is not loaded') ? 'Model already unloaded' : 'Model unloaded',
+     description: data.message || `${modelName} unload request completed.`,
    });
    await queryClient.invalidateQueries({ queryKey: ['modelStatus'], refetchType: 'all' });
    await queryClient.refetchQueries({ queryKey: ['modelStatus'] });
  },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/components/ServerSettings/ModelManagement.tsx` around lines 308 - 312, The onSuccess handler in ModelManagement.tsx currently ignores the API response and always shows "Model unloaded" using the modelName param; change it to read the response payload (the _data argument) from the unload API and use its message (e.g., _data.message) for the toast title/description so the UI reflects responses like "is not loaded"; keep a fallback to the modelName if the message is missing and ensure the same toast call (toast({...})) uses the response-derived text instead of assuming success.
🧹 Nitpick comments (1)
backend/main.py (1)
1493-1505: Consider centralizing model-name metadata. The `model_types` map here duplicates model metadata that is also maintained in other model-management routes, which makes drift easy over time. Extracting one shared constant/helper would reduce mismatch risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/main.py` around lines 1493 - 1505, The model_types dict is duplicated across model-management code; extract it into a single shared constant or helper (e.g., MODEL_TYPES or get_model_metadata()) and replace local occurrences of model_types in functions/routes (referencing model_types, model_name lookups) to import and use that centralized symbol; ensure the new module exports the same mapping shape (model_name -> (model_type, model_size)) and update any callers that expect this data so they use the centralized constant/helper to avoid drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/main.py`:
- Around line 1513-1518: The unload branch uses tts_model.model_size which is
inconsistent with the rest of the file's authoritative loaded-size signal;
change the comparison to use the module's current loaded-size field (e.g.
tts._current_model_size or the getter used by the status/health path) so the
check becomes tts.get_tts_model().is_loaded() and compare against
tts._current_model_size, then call tts.unload_tts_model() when they match;
update any references in the same block to use the same symbol used by
status/health (tts._current_model_size or its getter) to avoid misreporting Qwen
as not loaded.
---
Duplicate comments:
In `@app/src/components/ServerSettings/ModelManagement.tsx`:
- Around line 308-312: The onSuccess handler in ModelManagement.tsx currently
ignores the API response and always shows "Model unloaded" using the modelName
param; change it to read the response payload (the _data argument) from the
unload API and use its message (e.g., _data.message) for the toast
title/description so the UI reflects responses like "is not loaded"; keep a
fallback to the modelName if the message is missing and ensure the same toast
call (toast({...})) uses the response-derived text instead of assuming success.
---
Nitpick comments:
In `@backend/main.py`:
- Around line 1493-1505: The model_types dict is duplicated across
model-management code; extract it into a single shared constant or helper (e.g.,
MODEL_TYPES or get_model_metadata()) and replace local occurrences of
model_types in functions/routes (referencing model_types, model_name lookups) to
import and use that centralized symbol; ensure the new module exports the same
mapping shape (model_name -> (model_type, model_size)) and update any callers
that expect this data so they use the centralized constant/helper to avoid
drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 485c65d3-d13a-4e66-ae25-04bc4ec5f34e
📒 Files selected for processing (5)
- app/src/components/ServerSettings/ModelManagement.tsx
- app/src/lib/api/client.ts
- backend/backends/chatterbox_backend.py
- backend/backends/chatterbox_turbo_backend.py
- backend/main.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/backends/chatterbox_backend.py
- app/src/lib/api/client.ts
The actual dtype mismatch was in S3Tokenizer.log_mel_spectrogram, not VoiceEncoder.forward. librosa.load returns float64 numpy, which torch.from_numpy preserves as double. The STFT output (double) then hits _mel_filters (float32) in a matmul at s3tokenizer.py:163. Now patching both entry points after model load:

1. S3Tokenizer.log_mel_spectrogram — cast audio to float32 before STFT
2. VoiceEncoder.forward — cast mels to float32 before LSTM

Remove debug traceback logging (no longer needed).
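Both entry points can be shimmed by one shared helper, as the review nitpick suggests. This sketch assumes the attribute paths used in the PR (`model.s3gen.tokenizer` and `model.ve`); everything else is generic wrapping logic, not the actual backend code:

```python
import types

def apply_f32_patches(model):
    """Cast float64 inputs to float32 at both chatterbox entry points.
    Idempotent: each original method is captured exactly once."""
    tok = model.s3gen.tokenizer
    if not hasattr(tok, "_orig_log_mel"):
        tok._orig_log_mel = tok.log_mel_spectrogram

        def _f32_log_mel(self, audio, *args, **kwargs):
            # Cast audio before the STFT so the mel-filter matmul stays float32.
            return self._orig_log_mel(audio.float(), *args, **kwargs)

        tok.log_mel_spectrogram = types.MethodType(_f32_log_mel, tok)

    ve = model.ve
    if not hasattr(ve, "_orig_ve_forward"):
        ve._orig_ve_forward = ve.forward

        def _f32_forward(self, mels):
            # Cast mels before the LSTM so weights and input dtypes match.
            return self._orig_ve_forward(mels.float())

        ve.forward = types.MethodType(_f32_forward, ve)
```

Calling this once from each backend's load path would keep the two copies of the shim from drifting apart.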
2cbd56f to bfd7b81 (Compare)
♻️ Duplicate comments (2)
backend/backends/chatterbox_turbo_backend.py (1)
194-212: ⚠️ Potential issue | 🟠 Major: Keep the loaded Turbo model local until the patching succeeds.

This has the same failure mode as the multilingual backend: self.model is assigned at Line 167/Line 173, then the new patch block can still fail on Line 194 through Line 212. If that happens, retries short-circuit at Line 97 and Line 100, and the cold-download flow may already be marked complete at Line 178 and Line 179. Please patch a local model, then publish it to self.model and complete the task state afterward.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/chatterbox_turbo_backend.py` around lines 194 - 212, The patching block mutates self.model in-place and can raise after self.model is already set; instead, load the model into a local variable (e.g., local_model), apply all patches to local_model (patching _tokzr.log_mel_spectrogram via _orig_log_mel/_f32_log_mel and _ve.forward via _orig_ve_forward/_f32_ve_forward), and only after all patches succeed assign local_model to self.model and update whatever "task state" completion code currently runs after model assignment; this ensures retries and cold-download markers (currently guarded around self.model assignment) are not short-circuited by a failed patch.

backend/backends/chatterbox_backend.py (1)
188-206: ⚠️ Potential issue | 🟠 Major: Patch a local model instance before assigning self.model.

These bindings run after self.model has already been published, so any exception from Line 188 through Line 206 leaves the backend stuck with a partially initialized model. The next load_model() call then returns early at Line 97 and Line 100, and the cold-download path may already have been marked complete at Line 178 and Line 179.

🛠️ Proposed fix

```diff
- self.model = ChatterboxMultilingualTTS.from_pretrained(
+ model = ChatterboxMultilingualTTS.from_pretrained(
      device=device,
  )
@@
- self.model = ChatterboxMultilingualTTS.from_pretrained(
+ model = ChatterboxMultilingualTTS.from_pretrained(
      device=device,
  )
@@
- t3_tfmr = self.model.t3.tfmr
+ t3_tfmr = model.t3.tfmr
@@
- if not is_cached:
-     progress_manager.mark_complete(model_name)
-     task_manager.complete_download(model_name)
  # Patch float64 → float32 dtype mismatches in upstream chatterbox.
@@
- _tokzr = self.model.s3gen.tokenizer
+ _tokzr = model.s3gen.tokenizer
@@
- _ve = self.model.ve
+ _ve = model.ve
@@
+ self.model = model
+ if not is_cached:
+     progress_manager.mark_complete(model_name)
+     task_manager.complete_download(model_name)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/chatterbox_backend.py` around lines 188 - 206, The patching of tokenizer and VoiceEncoder is being done directly on self.model which can leave self.model partially initialized if an exception occurs; instead, load the model into a local variable (e.g., model = <loaded object>), apply the patches to model.s3gen.tokenizer.log_mel_spectrogram and model.ve.forward (use the existing _f32_log_mel and _f32_ve_forward patterns) and only assign self.model = model after all patches succeed; this ensures load_model’s early-return checks and cold-download flags are not left pointing at a partially-initialized object.
🧹 Nitpick comments (1)
backend/backends/chatterbox_turbo_backend.py (1)
191-212: Extract the dtype monkey-patch into one helper. This block is now duplicated almost verbatim with backend/backends/chatterbox_backend.py. A shared helper that accepts the loaded model and applies the dtype shims would reduce drift the next time upstream Chatterbox changes these method names or signatures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/backends/chatterbox_turbo_backend.py` around lines 191 - 212, Extract the duplicated dtype monkey-patch into a single helper function (e.g., apply_dtype_monkeypatch) and call it from both chatterbox_turbo_backend and chatterbox_backend after the model is loaded; the helper should accept the loaded model, patch model.s3gen.tokenizer.log_mel_spectrogram to coerce audio to float (preserving the original via _orig_log_mel) and patch model.ve.forward to call the original with mels.float() (preserving _orig_ve_forward), using types.MethodType to bind the wrappers, and keep the original symbol names (S3Tokenizer / s3gen.tokenizer, log_mel_spectrogram, VoiceEncoder / ve, forward) so future upstream changes are localized to that helper.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@backend/backends/chatterbox_backend.py`:
- Around line 188-206: The patching of tokenizer and VoiceEncoder is being done
directly on self.model which can leave self.model partially initialized if an
exception occurs; instead, load the model into a local variable (e.g., model =
<loaded object>), apply the patches to model.s3gen.tokenizer.log_mel_spectrogram
and model.ve.forward (use the existing _f32_log_mel and _f32_ve_forward
patterns) and only assign self.model = model after all patches succeed; this
ensures load_model’s early-return checks and cold-download flags are not left
pointing at a partially-initialized object.
In `@backend/backends/chatterbox_turbo_backend.py`:
- Around line 194-212: The patching block mutates self.model in-place and can
raise after self.model is already set; instead, load the model into a local
variable (e.g., local_model), apply all patches to local_model (patching
_tokzr.log_mel_spectrogram via _orig_log_mel/_f32_log_mel and _ve.forward via
_orig_ve_forward/_f32_ve_forward), and only after all patches succeed assign
local_model to self.model and update whatever "task state" completion code
currently runs after model assignment; this ensures retries and cold-download
markers (currently guarded around self.model assignment) are not short-circuited
by a failed patch.
---
Nitpick comments:
In `@backend/backends/chatterbox_turbo_backend.py`:
- Around line 191-212: Extract the duplicated dtype monkey-patch into a single
helper function (e.g., apply_dtype_monkeypatch) and call it from both
chatterbox_turbo_backend and chatterbox_backend after the model is loaded; the
helper should accept the loaded model, patch
model.s3gen.tokenizer.log_mel_spectrogram to coerce audio to float (preserving
the original via _orig_log_mel) and patch model.ve.forward to call the original
with mels.float() (preserving _orig_ve_forward), using types.MethodType to bind
the wrappers, and keep the original symbol names (S3Tokenizer / s3gen.tokenizer,
log_mel_spectrogram, VoiceEncoder / ve, forward) so future upstream changes are
localized to that helper.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c178db28-21ff-4cdd-86bb-2041ce15e2c0
📒 Files selected for processing (2)
- backend/backends/chatterbox_backend.py
- backend/backends/chatterbox_turbo_backend.py
…sed state Apply local-var-then-assign pattern to chatterbox_backend.py (multilingual) to match the turbo backend. Also use _current_model_size fallback in unload, delete, and status endpoints for consistent Qwen model size checks.
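The local-var-then-assign pattern described in this commit can be reduced to a small sketch. The class and method names below are generic stand-ins, not the actual backend; the point is that self.model is published only after every post-load step has succeeded:

```python
class Backend:
    """Sketch of the publish-last loading pattern."""
    def __init__(self):
        self.model = None

    def load_model(self):
        if self.model is not None:
            return  # already loaded; retries short-circuit here
        model = self._from_pretrained()  # stand-in for the real loader
        self._apply_patches(model)       # may raise
        # Publish only after patching succeeded, so a failure above
        # leaves self.model as None and the next call retries cleanly.
        self.model = model

    def _from_pretrained(self):
        return object()

    def _apply_patches(self, model):
        pass
```

If _apply_patches raises, the early-return check at the top still sees None on the next call, so a retry re-runs the full load instead of returning a half-initialized model.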
Summary
Fixes `expected m1 and m2 to have the same dtype, but got: float != double` when generating with Chatterbox (both multilingual and turbo), and adds a model unload button so users can free memory without restarting the backend.

Root cause

librosa.load() returns float64 numpy arrays. The upstream chatterbox package converts these to torch tensors via torch.from_numpy() without casting to float32, creating dtype mismatches in two places: the STFT output hitting the float32 _mel_filters in a matmul inside S3Tokenizer.log_mel_spectrogram, and the float64 mel tensor hitting the float32 LSTM weights in VoiceEncoder.forward.

Fix
Monkey-patch both entry points after model load to cast inputs to float32:

- S3Tokenizer.log_mel_spectrogram — audio.float() before STFT
- VoiceEncoder.forward — mels.float() before LSTM

Applied to both Chatterbox Multilingual and Chatterbox Turbo backends.

Model unload feature

POST /models/{model_name}/unload — unloads any model from memory without deleting from disk

Summary by CodeRabbit
New Features
Bug Fixes