Conversation
Use refetchQueries instead of invalidateQueries for more reliable history refresh. Add history refetch to SSE onerror handler so dropped connections don't leave the list stale. Reset page to 0 in HistoryTable when a pending generation completes. Closes #231
Fixes hash mismatch when pip resolves Qwen3-TTS transitive deps. Closes #286
Wrap SSE generators with BrokenPipeError/ConnectionResetError handling so client disconnects during generation status polling, download progress, or audio streaming don't produce unhandled Errno 32 errors. Closes #248
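The pattern this commit describes can be shown framework-free. The decorator and generator below are illustrative sketches, not the project's actual code; in a real SSE endpoint the `BrokenPipeError`/`ConnectionResetError` (Errno 32) surfaces at the generator's yield point when the server writes to a socket the client has closed:

```python
# Sketch: wrap an SSE generator so client disconnects end the stream
# quietly instead of raising an unhandled Errno 32.
def safe_sse(generator_func):
    def wrapper(*args, **kwargs):
        try:
            yield from generator_func(*args, **kwargs)
        except (BrokenPipeError, ConnectionResetError):
            # Client went away mid-stream; stop cleanly without raising.
            return
    return wrapper

@safe_sse
def status_events(steps):
    # Illustrative event stream in SSE wire format.
    for step in steps:
        yield f"data: {step}\n\n"
```

If the wrapped generator (or the write it drives) raises either disconnect error, iteration simply stops, so the surrounding framework never logs an unhandled exception.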
cu121 only ships kernels up to SM 9.0 (Ada Lovelace). RTX 50-series (Blackwell, SM 12.0) and RTX 6000 Pro need cu126 which includes SM 12.0 support while remaining backward compatible with older GPUs. Closes #289
Move audio validation and saving to thread pool so librosa/ffmpeg decoding doesn't block the async event loop. Combine validate + load into a single pass to avoid decoding the file twice. Add 50 MB upload limit and chunked reads to prevent unbounded memory allocation. Closes #278
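The chunked-read-with-limit idea in this commit can be sketched without the web framework. `read_limited` and its constants are hypothetical stand-ins for the route's actual helpers, and a real handler would map the error to an HTTP 413 response rather than raising `ValueError`:

```python
import io

MAX_UPLOAD_BYTES = 50 * 1024 * 1024  # 50 MB cap, as described above
CHUNK_SIZE = 1024 * 1024             # read 1 MB at a time

def read_limited(stream, limit=MAX_UPLOAD_BYTES, chunk_size=CHUNK_SIZE):
    """Read a stream in fixed-size chunks, rejecting oversized uploads.

    Keeps at most `limit` bytes in memory and fails as soon as the
    running total exceeds the cap, instead of buffering the whole body.
    """
    total = 0
    parts = []
    while chunk := stream.read(chunk_size):
        total += len(chunk)
        if total > limit:
            raise ValueError(f"File too large (max {limit // (1024 * 1024)} MB)")
        parts.append(chunk)
    return b"".join(parts)
```

Checking the running total inside the loop is what makes the allocation bounded: a client sending an endless body is cut off after the first chunk that crosses the limit.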
Caution: Review failed. The pull request is closed.

ℹ️ Recent review info: Configuration used: defaults · Review profile: CHILL · Plan: Pro
⛔ Files ignored due to path filters (1)
📒 Files selected for processing (2)
📝 Walkthrough

This PR wires an optional Whisper model-size selection end-to-end (frontend → API → backends), expands supported transcription languages, adds chunked profile uploads with size limits, improves SSE/client-disconnect handling, upgrades CUDA wheel targets to 12.6, and adds a pip upgrade step in the backend Docker build.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Frontend as Frontend
    participant API as Backend API
    participant Backend as STT Backend
    participant ModelStore as Model Loader/Cache
    User->>Frontend: Upload audio + select model
    Frontend->>API: transcribeAudio(file, language, model)
    activate API
    API->>ModelStore: validate model against WHISPER_HF_REPOS
    alt Model already loaded
        API->>Backend: transcribe(audio_path, language, model_size)
    else Model not loaded
        API-->>Frontend: 202 Accepted (model download in background)
        API->>ModelStore: trigger load_model_async(model_size)
    end
    Backend->>ModelStore: ensure model available
    Backend->>Backend: run transcription using model_size
    Backend-->>API: transcription result
    API-->>Frontend: transcription result
    deactivate API
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed, ❌ 1 failed (inconclusive)
Actionable comments posted: 1
Caution: Some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/routes/profiles.py (1)
121-147: ⚠️ Potential issue | 🟡 Minor

Temp file leak if an exception occurs during chunked upload.

If an exception is raised inside the `with` block (e.g., `IOError` on `tmp.write()` or a network error on `file.read()`) before the explicit size-limit check, the temp file won't be cleaned up, because the `try/finally` block only starts after the `with` block exits.

🛡️ Proposed fix: wrap the entire operation in a single try/finally

```diff
-        with tempfile.NamedTemporaryFile(suffix=file_suffix, delete=False) as tmp:
+        tmp_path = None
+        try:
+            with tempfile.NamedTemporaryFile(suffix=file_suffix, delete=False) as tmp:
+                tmp_path = tmp.name
                total_size = 0
                while chunk := await file.read(SAMPLE_UPLOAD_CHUNK_SIZE):
                    total_size += len(chunk)
                    if total_size > SAMPLE_MAX_FILE_SIZE:
-                        Path(tmp.name).unlink(missing_ok=True)
                        raise HTTPException(
                            status_code=413,
                            detail=f"File too large (max {SAMPLE_MAX_FILE_SIZE // (1024 * 1024)} MB)",
                        )
                    tmp.write(chunk)
-        tmp_path = tmp.name
-        try:
-            sample = await profiles.add_profile_sample(
+            sample = await profiles.add_profile_sample(
                profile_id,
                tmp_path,
                reference_text,
                db,
-        )
-        return sample
+            )
+            return sample
        except ValueError as e:
            raise HTTPException(status_code=400, detail=str(e))
        except Exception as e:
            raise HTTPException(status_code=500, detail=f"Failed to process audio file: {str(e)}")
        finally:
-            Path(tmp_path).unlink(missing_ok=True)
+            if tmp_path:
+                Path(tmp_path).unlink(missing_ok=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/profiles.py` around lines 121 - 147, the temp file can leak if an exception occurs inside the `with tempfile.NamedTemporaryFile` block because the current try/finally starts after the with; wrap the whole read/write and processing in a single try/finally so cleanup always runs: declare `tmp_path = None` before the try, move the `with tempfile.NamedTemporaryFile(...) as tmp` and the chunked read loop (using SAMPLE_UPLOAD_CHUNK_SIZE and SAMPLE_MAX_FILE_SIZE) and the call to `profiles.add_profile_sample(...)` into the try, catch specific exceptions (ValueError) and general exceptions as currently done, and in the finally call `Path(tmp_path).unlink(missing_ok=True)` only if tmp_path is set, to ensure the temp file is always removed even if `file.read()` or `tmp.write()` raises.

app/src/lib/hooks/useGenerationProgress.ts (1)
16-20: ⚠️ Potential issue | 🟡 Minor

Docstring still references `invalidateQueries` but the implementation now uses `refetchQueries`.

The comment on line 18 says "invalidates the history query" but the code now calls `refetchQueries`. Update the docstring to reflect the new behavior.

📝 Suggested fix

```diff
 /**
  * Subscribes to SSE for all pending generations. When a generation completes,
- * invalidates the history query, removes it from pending, and auto-plays
+ * refetches the history query, removes it from pending, and auto-plays
  * if the player is idle.
  */
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/lib/hooks/useGenerationProgress.ts` around lines 16 - 20, Update the docstring in useGenerationProgress.ts to reflect that the hook now calls refetchQueries rather than invalidating the cache: change the description that currently says "invalidates the history query" to state that it "refetches the history query" (or similar), and ensure the comment mentions refetching the history query and removing pending entries and auto-playing when the player is idle to match the behavior implemented in useGenerationProgress.
🧹 Nitpick comments (5)
backend/services/profiles.py (2)
120-121: Consider moving the `asyncio` import to module level.

While local imports can reduce startup overhead, `asyncio` is a standard library module with negligible import cost. Moving it to the top of the file with the other imports would follow conventional Python import style.

✨ Optional: Move asyncio to top-level imports

```diff
 from typing import List, Optional
 from datetime import datetime
+import asyncio
 import uuid
 import shutil
```

Then remove the local import:
```diff
-    import asyncio
-
     profile = db.query(DBVoiceProfile).filter_by(id=profile_id).first()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/profiles.py` around lines 120 - 121, Move the local "import asyncio" out of the function and add a top-level "import asyncio" with the other module imports; then delete the local "import asyncio" statement (the one shown in the diff) so all uses in this module rely on the single module-level asyncio import.
25-25: Remove unused `load_audio` import from line 25.

The `load_audio` function is no longer called in this file after the refactor to use `validate_and_load_reference_audio`. Remove it from the import statement.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/services/profiles.py` at line 25, the import statement currently brings in `load_audio`, which is unused after switching to `validate_and_load_reference_audio`; update the import on the line that references validate_reference_audio, validate_and_load_reference_audio, load_audio, save_audio by removing `load_audio` so only the used symbols (validate_reference_audio, validate_and_load_reference_audio, save_audio) are imported.

backend/utils/audio.py (1)
226-256: Good refactor for single-pass validation and loading.

This eliminates redundant audio decoding by combining validation and loading into one function. The 4-tuple return type is consistently applied across all code paths.

One minor suggestion from static analysis: prefer the f-string conversion flag over an explicit `str()` call.

✨ Optional: Use explicit conversion flag

```diff
     except Exception as e:
-        return False, f"Error validating audio: {str(e)}", None, None
+        return False, f"Error validating audio: {e!s}", None, None
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/utils/audio.py` around lines 226 - 256, in validate_and_load_reference_audio, update the exception formatting to use the f-string conversion flag instead of calling str() directly; replace the current return in the except block that builds the error message using str(e) with an f-string that uses the conversion flag (e.g., {e!s}) so the message becomes f"Error validating audio: {e!s}" while leaving the rest of the return tuple unchanged.

backend/routes/profiles.py (1)
123-131: Consider offloading blocking file writes to a thread pool.

The PR objective mentions moving audio processing to a thread pool to avoid blocking the async event loop. While `file.read()` is async, `tmp.write(chunk)` is a synchronous blocking call. For a 50 MB upload with 1 MB chunks, this results in 50 blocking write operations on the event loop.

♻️ Proposed fix using run_in_executor

```diff
+import asyncio
+from functools import partial
+
 # In the upload handler:
-                    tmp.write(chunk)
+                    loop = asyncio.get_running_loop()
+                    await loop.run_in_executor(None, tmp.write, chunk)
```

Alternatively, consider using `aiofiles` for fully async file I/O if it's already a project dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/profiles.py` around lines 123 - 131, the sync tmp.write(chunk) inside the upload loop blocks the async event loop; change it to run the write in the threadpool (use loop.run_in_executor or asyncio.get_running_loop().run_in_executor) so writes are executed off the event loop while keeping the async file.read(...) loop and the total_size checks intact (refer to SAMPLE_UPLOAD_CHUNK_SIZE, SAMPLE_MAX_FILE_SIZE, file.read, tmp.write and Path(tmp.name).unlink). Alternatively, replace the temp file usage with aiofiles to perform async writes if the project already depends on aiofiles; ensure the error/cleanup path (unlink on oversized) still runs correctly.

backend/routes/transcription.py (1)
48-49: Consider exposing a public method for cache checking.

Line 49 accesses the private method `_is_model_cached`. While this works, accessing private methods directly couples this route to implementation details of the whisper model class. Consider adding a public `is_model_cached(model_size)` method to the whisper model interface.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/routes/transcription.py` around lines 48 - 49, The route currently inspects a private member by calling whisper_model._is_model_cached(model_size); instead add and use a public method like whisper_model.is_model_cached(model_size) on the whisper model class and update the route to call that instead of the private method, keeping the existing logic that checks whisper_model.is_loaded() and whisper_model.model_size; implement the new public is_model_cached(model_size) method to wrap the existing private check so callers no longer rely on _is_model_cached.
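The delegation this nitpick asks for is simple enough to sketch. The class body below is hypothetical, showing only the public-wrapper pattern around a private check, not the project's real model-loading or caching logic:

```python
# Hypothetical sketch: expose a public is_model_cached() that delegates to
# the private check, so routes never touch underscore-prefixed internals.
class WhisperModel:
    def __init__(self) -> None:
        # Stand-in for the real on-disk model cache lookup.
        self._cache = {"base"}

    def _is_model_cached(self, model_size: str) -> bool:
        # Private implementation detail; may change freely.
        return model_size in self._cache

    def is_model_cached(self, model_size: str) -> bool:
        """Public API wrapping the private check, as the review suggests."""
        return self._is_model_cached(model_size)
```

The route would then call `whisper_model.is_model_cached(model_size)`, and the private `_is_model_cached` could be renamed or refactored without touching callers.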
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/release.yml:
- Around line 192-195: The torchaudio install in the "Install PyTorch with CUDA
12.6" workflow step omits --force-reinstall, causing potential pip cache
inconsistency with torch; update the pip install command for torchaudio (the
line starting with "pip install torchaudio --index-url
https://download.pytorch.org/whl/cu126") to include --force-reinstall so both
packages are installed deterministically from the cu126 index.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 43986149-0c4d-4e66-9d61-c336266fb382
📒 Files selected for processing (17)
.github/workflows/release.yml
Dockerfile
app/src/components/History/HistoryTable.tsx
app/src/lib/api/client.ts
app/src/lib/api/types.ts
app/src/lib/hooks/useGenerationProgress.ts
app/src/lib/hooks/useTranscription.ts
backend/backends/__init__.py
backend/backends/mlx_backend.py
backend/backends/pytorch_backend.py
backend/models.py
backend/routes/generations.py
backend/routes/profiles.py
backend/routes/transcription.py
backend/services/profiles.py
backend/utils/audio.py
backend/utils/progress.py
Summary
Fixes 7 bugs from the issue tracker in one pass.
- Added optional `model` and expanded `language` params to the `/transcribe` endpoint
- Fixed `[object Object]` error messages by normalizing error detail objects
- History refresh fixes (`refetchQueries`, handle SSE onerror, reset page)

Notes
- Switched `invalidateQueries` to `refetchQueries` for more reliable cache busting after generation completion

Closes #233, #290, #231, #286, #248, #289, #278