fix(whisper): cgroup-aware threads + Vulkan ICD manifest fallback#348
Conversation
…ulkan Closes #346: replace `cpus().length` with `availableParallelism()` for the transcription `maxThreads`. libuv (Node 22+) reads cgroup v1/v2 CFS quotas and sched_getaffinity, so worker thread allocation now scales to the container's actual CPU budget instead of the host's logical-core count. Closes #347: probe Vulkan ICD presence directly when the `vulkaninfo` CLI binary is missing. Slim Linux container images often ship the Vulkan runtime (libvulkan + ICD JSONs) without the `vulkan-tools` package, so `hasVulkanBackend()` previously fell back to "no Vulkan" and silently disabled GPU transcription. Add a manifest-enumeration fallback that walks the standard ICD search paths (`/usr/share/vulkan/icd.d`, `/usr/local/share/vulkan/icd.d`, `/etc/vulkan/icd.d`) and honors `VK_DRIVER_FILES` / `VK_ICD_FILENAMES` overrides, matching how the Vulkan loader itself locates ICDs. `vulkaninfo` stays the primary probe (most authoritative, also reports loader/instance health); the manifest enumeration is the secondary signal so legacy hosts with broken loader libs still get the same verdict they did before.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdjusts Whisper local service to use container-aware parallelism for transcription threads, adds guarded Vulkan initialization with CPU retry, and improves Vulkan backend detection by falling back to ICD manifest enumeration when vulkaninfo is unavailable. ChangesWhisper Local Transcription Configuration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/services/whisper-local.ts`:
- Around line 294-325: The fallback in hasVulkanBackend currently treats mere
presence of an ICD JSON (found by hasVulkanIcdManifest) as sufficient to mark
vulkanBackendAvailable and later force backend='vulkan' in
getContext/initWhisper; change this so hasVulkanBackend only returns true after
a minimal runtime validation (e.g., attempt to create a Vulkan context or load
the Vulkan loader via a safe probe) and otherwise leaves vulkanBackendAvailable
false so initWhisper/getContext can choose a CPU backend; update references to
vulkanBackendAvailable, hasVulkanIcdManifest, hasVulkanBackend, getContext, and
initWhisper to ensure the manifest check is only a heuristic and not
authoritative, and ensure initWhisper will fall back to CPU if the Vulkan
runtime probe fails.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e9959911-6050-48da-bde7-32ec52f8b622
📒 Files selected for processing (1)
server/services/whisper-local.ts
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2a8563d57
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
…probe Both reviewers correctly flagged that the manifest-enumeration fallback is a heuristic, not a guarantee — a stale ICD JSON pointing at a non-loadable driver library would make hasVulkanBackend() return true but initWhisper throw at context init, defeating the safety property of the whole probe. Add defense in depth: if backend='vulkan' init fails, retry once with useGpu=false / backend=undefined, poisoning the vulkanBackendAvailable cache so the next getContext() goes straight to CPU. The probes stay as best-effort signals to avoid the throw in the common case; this catch is the actual safety net that breaks the brittle-probe cycle.
|
Pushed b22fccc — addresses both reviewers' converged concern. You're both right: manifest enumeration is a heuristic, not a load-time guarantee. A stale ICD JSON pointing at a missing/non-loadable driver library would make `hasVulkanBackend()` return true and `initWhisper(..., 'vulkan')` throw — exactly the failure mode the probe was supposed to prevent. The fix follows CR's suggested shape: if the Vulkan init throws, retry once with `useGpu=false` and `backend=undefined`, and poison `vulkanBackendAvailable` so subsequent inits skip the probe entirely. The probes (CLI + manifest) stay as best-effort signals to avoid the throw in the common case; this catch is the actual safety net. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b22fccc2a2
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@server/services/whisper-local.ts`:
- Around line 90-96: The fallback catch currently sets vulkanBackendAvailable =
false but leaves the detectGpu()/getSystemInfo() cache reporting hasGpu: true;
update the catch in initWhisper so when you poison vulkanBackendAvailable you
also clear/override the GPU detection cache—e.g., call the existing detectGpu
cache invalidator or set the cached system info's hasGpu to false (ensure to
reference detectGpu and getSystemInfo by name), so future init attempts see no
GPU and run CPU paths.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9fd9509a-7677-4f7f-adbe-76346f1cd214
📒 Files selected for processing (1)
server/services/whisper-local.ts
CR and Codex both flagged the inconsistency: when initWhisper's Vulkan path fails and we fall back to CPU, vulkanBackendAvailable gets poisoned but the separate gpuDetected cache (populated earlier by /api/transcribe/ config or any getSystemInfo() call) keeps reporting hasGpu: true. The UI then keeps suppressing the CPU-slow hint even though transcription has been forced onto CPU. Set gpuDetected = null alongside the Vulkan poison so the next detectGpu() call recomputes from the now-correct vulkanBackendAvailable value.
|
Pushed 64d1821 — both reviewers correctly flagged the cache inconsistency. Setting `gpuDetected = null` next to the Vulkan poison so `detectGpu()` / `getSystemInfo()` recompute against the now-correct `vulkanBackendAvailable` state. UI hints about CPU-slow transcription will then surface accurately on hosts where the Vulkan fallback fired. @coderabbitai full review |
|
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 64d1821368
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Codex P2 — the loader treats VK_ADD_DRIVER_FILES as additive to the default search paths, but the manifest probe only honored the *replace* variants (VK_DRIVER_FILES / VK_ICD_FILENAMES). Slim Linux images that provide drivers exclusively via VK_ADD_DRIVER_FILES were falling through to the default-dirs check, finding nothing, and disabling the Vulkan backend even though the loader itself could initialize the driver. Now: VK_DRIVER_FILES (or legacy VK_ICD_FILENAMES) replaces, and VK_ADD_DRIVER_FILES adds — matching the loader's spec.
|
Pushed ee995e8 — addresses the `VK_ADD_DRIVER_FILES` gap. The probe now matches the Vulkan loader's spec: `VK_DRIVER_FILES` (or legacy `VK_ICD_FILENAMES`) replaces the default search, and `VK_ADD_DRIVER_FILES` is additive on top of it. @coderabbitai full review |
|
Your implementation mirrors this: ✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ee995e86ee
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
Codex P2 — the Vulkan loader splits VK_DRIVER_FILES / VK_ICD_FILENAMES / VK_ADD_DRIVER_FILES on the OS path-list delimiter (`:` on POSIX, `;` on Windows). Hardcoding `:` mangled Windows entries like `C:\drivers\nvidia.json` into two invalid fragments. Use Node's `path.delimiter` so the probe stays correct on every platform we ship to.
|
Pushed ae36d3a — using `node:path`'s `delimiter` so Windows drive-letter ICD paths (`C:\drivers\nvidia.json`) aren't split apart. Same loader semantic on every platform now. @coderabbitai full review |
|
✅ Actions performedFull review triggered. |
Summary
Two small fixes to
server/services/whisper-local.ts, both filed today by CodeRabbit during the #319 review:Closes #346 — cgroup-aware thread count
Swap
cpus().lengthforos.availableParallelism()when computing the transcriptionmaxThreads. libuv's implementation (Node 22+) reads cgroup v1/v2 CFS quotas andsched_getaffinity, so worker thread allocation now scales to the container's actual CPU budget instead of the host's logical-core count.Closes #347 — Vulkan ICD manifest probe fallback
Add a manifest-enumeration fallback to
hasVulkanBackend(). The previous code disabled the Vulkan backend whenevervulkaninfo --summaryfailed, which silently skipped GPU transcription on slim container images that ship the Vulkan runtime (libvulkan.so.1+ ICD JSONs) without thevulkan-tools(vulkaninfo) package.The new flow:
vulkaninfo --summary(most authoritative — also reports loader/instance health)VK_DRIVER_FILES/VK_ICD_FILENAMESoverrides, matching how the Vulkan loader itself locates ICDsStandard search paths:
/usr/share/vulkan/icd.d/usr/local/share/vulkan/icd.d/etc/vulkan/icd.dVerification
Out of scope
Neither change touches the macOS/Windows code paths. `cpus().length` still feeds `getSystemInfo().cpuCount` (host info, distinct from "what we can use").
Summary by CodeRabbit