Skip to content

fix(whisper): cgroup-aware threads + Vulkan ICD manifest fallback#348

Merged
daggerhashimoto merged 6 commits into
masterfrom
fix/whisper-vulkan-icd-cgroup-threads
May 22, 2026
Merged

fix(whisper): cgroup-aware threads + Vulkan ICD manifest fallback#348
daggerhashimoto merged 6 commits into
masterfrom
fix/whisper-vulkan-icd-cgroup-threads

Conversation

@kimbotai1337

@kimbotai1337 kimbotai1337 commented May 22, 2026

Copy link
Copy Markdown
Collaborator

Summary

Two small fixes to server/services/whisper-local.ts, both filed today by CodeRabbit during the #319 review:

Closes #346 — cgroup-aware thread count

Swap cpus().length for os.availableParallelism() when computing the transcription maxThreads. libuv's implementation (Node 22+) reads cgroup v1/v2 CFS quotas and sched_getaffinity, so worker thread allocation now scales to the container's actual CPU budget instead of the host's logical-core count.

Closes #347 — Vulkan ICD manifest probe fallback

Add a manifest-enumeration fallback to hasVulkanBackend(). The previous code disabled the Vulkan backend whenever vulkaninfo --summary failed, which silently skipped GPU transcription on slim container images that ship the Vulkan runtime (libvulkan.so.1 + ICD JSONs) without the vulkan-tools (vulkaninfo) package.

The new flow:

  1. Primary: vulkaninfo --summary (most authoritative — also reports loader/instance health)
  2. Fallback: enumerate ICD manifests at the standard loader search paths and honor VK_DRIVER_FILES / VK_ICD_FILENAMES overrides, matching how the Vulkan loader itself locates ICDs

Standard search paths:

  • /usr/share/vulkan/icd.d
  • /usr/local/share/vulkan/icd.d
  • /etc/vulkan/icd.d

Verification

  • `npm run build:server` — clean
  • No new test file added; `whisper-local.ts` has no test harness yet, and both changes are runtime-detection paths whose behavior depends on the host's filesystem layout (hard to mock cleanly for a single helper)

Out of scope

Neither change touches the macOS/Windows code paths. `cpus().length` still feeds `getSystemInfo().cpuCount` (host info, distinct from "what we can use").

Summary by CodeRabbit

  • Bug Fixes
    • Improved GPU detection by checking standard Vulkan manifest locations and honoring environment overrides.
    • Prefer system tooling for Vulkan availability checks, with a manifest-based fallback when tools are unavailable.
    • If GPU backend initialization fails, retry once on CPU and mark the GPU backend unavailable to avoid repeated probes.
    • Adjusted transcription threading to respect container/cgroup CPU quotas.

Review Change Stack

…ulkan

Closes #346: replace `cpus().length` with `availableParallelism()` for the
transcription `maxThreads`. libuv (Node 22+) reads cgroup v1/v2 CFS quotas
and sched_getaffinity, so worker thread allocation now scales to the
container's actual CPU budget instead of the host's logical-core count.

Closes #347: probe Vulkan ICD presence directly when the `vulkaninfo` CLI
binary is missing. Slim Linux container images often ship the Vulkan
runtime (libvulkan + ICD JSONs) without the `vulkan-tools` package, so
`hasVulkanBackend()` previously fell back to "no Vulkan" and silently
disabled GPU transcription. Add a manifest-enumeration fallback that
walks the standard ICD search paths (`/usr/share/vulkan/icd.d`,
`/usr/local/share/vulkan/icd.d`, `/etc/vulkan/icd.d`) and honors
`VK_DRIVER_FILES` / `VK_ICD_FILENAMES` overrides, matching how the Vulkan
loader itself locates ICDs.

`vulkaninfo` stays the primary probe (most authoritative, also reports
loader/instance health); the manifest enumeration is the secondary
signal so legacy hosts with broken loader libs still get the same
verdict they did before.
@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

Warning

Rate limit exceeded

@kimbotai1337 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 35 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f6b12c45-0333-4040-a9f2-bd677d485c9b

📥 Commits

Reviewing files that changed from the base of the PR and between 14bd7b1 and ae36d3a.

📒 Files selected for processing (1)
  • server/services/whisper-local.ts
📝 Walkthrough

Walkthrough

Adjusts Whisper local service to use container-aware parallelism for transcription threads, adds guarded Vulkan initialization with CPU retry, and improves Vulkan backend detection by falling back to ICD manifest enumeration when vulkaninfo is unavailable.

Changes

Whisper Local Transcription Configuration

Layer / File(s) Summary
Import container-aware parallelism utility
server/services/whisper-local.ts
Node.js os module imports are expanded to include availableParallelism and fs directory reading for ICD discovery.
Vulkan GPU detection with ICD manifest fallback
server/services/whisper-local.ts
hasVulkanBackend() detection is enhanced with ICD manifest directory enumeration and support for loader override environment variables (VK_DRIVER_FILES, VK_ICD_FILENAMES, VK_ADD_DRIVER_FILES) as a fallback when vulkaninfo --summary is unavailable.
Guarded WhisperContext initialization with CPU fallback
server/services/whisper-local.ts
initWhisper is wrapped to catch Vulkan init failures, retry once on CPU (useGpu:false), and mark Vulkan as unavailable for future probes by setting vulkanBackendAvailable=false and clearing gpuDetected.
Container-aware transcription thread limiting
server/services/whisper-local.ts
Transcription options now use availableParallelism() instead of cpus().length for maxThreads to respect Linux container CPU quotas and cgroup resource limits.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

  • daggerhashimoto/openclaw-nerve#319: Related changes to server/services/whisper-local.ts for Vulkan probing and transcription threading; this PR further hardens detection and switches to container-aware parallelism.

Poem

🐰 I nibbled bytes where GPUs hide,
Probed the manifests, then looked inside.
If Vulkan trips, CPU takes the lead,
Threads count what's allowed, not what we read.
A little hop — resilient speed.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the two main changes: cgroup-aware thread count and Vulkan ICD manifest fallback, matching the primary objectives.
Description check ✅ Passed The description covers all required template sections: What/Why/How are detailed, Type of Change is marked, and Checklist is completed with build verification noted.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/whisper-vulkan-icd-cgroup-threads

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/services/whisper-local.ts`:
- Around line 294-325: The fallback in hasVulkanBackend currently treats mere
presence of an ICD JSON (found by hasVulkanIcdManifest) as sufficient to mark
vulkanBackendAvailable and later force backend='vulkan' in
getContext/initWhisper; change this so hasVulkanBackend only returns true after
a minimal runtime validation (e.g., attempt to create a Vulkan context or load
the Vulkan loader via a safe probe) and otherwise leaves vulkanBackendAvailable
false so initWhisper/getContext can choose a CPU backend; update references to
vulkanBackendAvailable, hasVulkanIcdManifest, hasVulkanBackend, getContext, and
initWhisper to ensure the manifest check is only a heuristic and not
authoritative, and ensure initWhisper will fall back to CPU if the Vulkan
runtime probe fails.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e9959911-6050-48da-bde7-32ec52f8b622

📥 Commits

Reviewing files that changed from the base of the PR and between c685b0c and a2a8563.

📒 Files selected for processing (1)
  • server/services/whisper-local.ts

Comment thread server/services/whisper-local.ts

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a2a8563d57

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/services/whisper-local.ts
…probe

Both reviewers correctly flagged that the manifest-enumeration fallback is a
heuristic, not a guarantee — a stale ICD JSON pointing at a non-loadable
driver library would make hasVulkanBackend() return true but initWhisper
throw at context init, defeating the safety property of the whole probe.

Add defense in depth: if backend='vulkan' init fails, retry once with
useGpu=false / backend=undefined, poisoning the vulkanBackendAvailable
cache so the next getContext() goes straight to CPU. The probes stay as
best-effort signals to avoid the throw in the common case; this catch is
the actual safety net that breaks the brittle-probe cycle.
@kimbotai1337

Copy link
Copy Markdown
Collaborator Author

Pushed b22fccc — addresses both reviewers' converged concern.

You're both right: manifest enumeration is a heuristic, not a load-time guarantee. A stale ICD JSON pointing at a missing/non-loadable driver library would make `hasVulkanBackend()` return true and `initWhisper(..., 'vulkan')` throw — exactly the failure mode the probe was supposed to prevent.

The fix follows CR's suggested shape: if the Vulkan init throws, retry once with `useGpu=false` and `backend=undefined`, and poison `vulkanBackendAvailable` so subsequent inits skip the probe entirely. The probes (CLI + manifest) stay as best-effort signals to avoid the throw in the common case; this catch is the actual safety net.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b22fccc2a2

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/services/whisper-local.ts
@kimbotai1337

Copy link
Copy Markdown
Collaborator Author

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown
✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@server/services/whisper-local.ts`:
- Around line 90-96: The fallback catch currently sets vulkanBackendAvailable =
false but leaves the detectGpu()/getSystemInfo() cache reporting hasGpu: true;
update the catch in initWhisper so when you poison vulkanBackendAvailable you
also clear/override the GPU detection cache—e.g., call the existing detectGpu
cache invalidator or set the cached system info's hasGpu to false (ensure to
reference detectGpu and getSystemInfo by name), so future init attempts see no
GPU and run CPU paths.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9fd9509a-7677-4f7f-adbe-76346f1cd214

📥 Commits

Reviewing files that changed from the base of the PR and between c685b0c and b22fccc.

📒 Files selected for processing (1)
  • server/services/whisper-local.ts

Comment thread server/services/whisper-local.ts
CR and Codex both flagged the inconsistency: when initWhisper's Vulkan
path fails and we fall back to CPU, vulkanBackendAvailable gets poisoned
but the separate gpuDetected cache (populated earlier by /api/transcribe/
config or any getSystemInfo() call) keeps reporting hasGpu: true. The UI
then keeps suppressing the CPU-slow hint even though transcription has
been forced onto CPU.

Set gpuDetected = null alongside the Vulkan poison so the next detectGpu()
call recomputes from the now-correct vulkanBackendAvailable value.
@kimbotai1337

Copy link
Copy Markdown
Collaborator Author

Pushed 64d1821 — both reviewers correctly flagged the cache inconsistency. Setting `gpuDetected = null` next to the Vulkan poison so `detectGpu()` / `getSystemInfo()` recompute against the now-correct `vulkanBackendAvailable` state. UI hints about CPU-slow transcription will then surface accurately on hosts where the Vulkan fallback fired.

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

@kimbotai1337 The gpuDetected = null reset alongside the Vulkan-poison is the right move — without it, detectGpu() and getSystemInfo() would keep serving a stale true even after vulkanBackendAvailable was poisoned, so the UI would never surface the CPU-slow hint on affected hosts. Resetting both cache fields atomically in the same catch block is the correct pattern here. Triggering a full review now.

✅ Actions performed

Full review triggered.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 64d1821368

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/services/whisper-local.ts Outdated
Codex P2 — the loader treats VK_ADD_DRIVER_FILES as additive to the
default search paths, but the manifest probe only honored the *replace*
variants (VK_DRIVER_FILES / VK_ICD_FILENAMES). Slim Linux images that
provide drivers exclusively via VK_ADD_DRIVER_FILES were falling through
to the default-dirs check, finding nothing, and disabling the Vulkan
backend even though the loader itself could initialize the driver.

Now: VK_DRIVER_FILES (or legacy VK_ICD_FILENAMES) replaces, and
VK_ADD_DRIVER_FILES adds — matching the loader's spec.
@kimbotai1337

Copy link
Copy Markdown
Collaborator Author

Pushed ee995e8 — addresses the `VK_ADD_DRIVER_FILES` gap. The probe now matches the Vulkan loader's spec: `VK_DRIVER_FILES` (or legacy `VK_ICD_FILENAMES`) replaces the default search, and `VK_ADD_DRIVER_FILES` is additive on top of it.

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

@kimbotai1337 The VK_ADD_DRIVER_FILES handling looks correct per the Vulkan loader spec. To confirm the semantics the code now implements:

  • VK_DRIVER_FILES contains a list of full path listings to driver JSON Manifest files and/or paths to folders; it has replaced the older deprecated VK_ICD_FILENAMES, though the older variable continues to work. Both replace the default search paths entirely.
  • VK_ADD_DRIVER_FILES is additive — the loader's "known driver manifests" already account for default search paths plus what VK_ADD_DRIVER_FILES contributes.

Your implementation mirrors this: VK_DRIVER_FILES/VK_ICD_FILENAMES short-circuits to override-only paths, while VK_ADD_DRIVER_FILES supplements the default /usr/share/vulkan/icd.d, /usr/local/share/vulkan/icd.d, and /etc/vulkan/icd.d enumeration. That's a faithful match to loader behavior. Triggering the full review now.

✅ Actions performed

Full review triggered.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ee995e86ee

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread server/services/whisper-local.ts Outdated
Codex P2 — the Vulkan loader splits VK_DRIVER_FILES / VK_ICD_FILENAMES /
VK_ADD_DRIVER_FILES on the OS path-list delimiter (`:` on POSIX, `;` on
Windows). Hardcoding `:` mangled Windows entries like
`C:\drivers\nvidia.json` into two invalid fragments. Use Node's
`path.delimiter` so the probe stays correct on every platform we ship to.
@kimbotai1337

Copy link
Copy Markdown
Collaborator Author

Pushed ae36d3a — using `node:path`'s `delimiter` so Windows drive-letter ICD paths (`C:\drivers\nvidia.json`) aren't split apart. Same loader semantic on every platform now.

@coderabbitai full review

@coderabbitai

coderabbitai Bot commented May 22, 2026

Copy link
Copy Markdown

@kimbotai1337 Using path.delimiter is the right call — ":" as a hard-coded separator would slice C:\drivers\nvidia.json at the drive-letter colon on Windows, turning it into ["C", "\\drivers\\nvidia.json"]. Switching to path.delimiter (; on Windows, : on POSIX) gives you correct splitting on both platforms with zero behavioral change on Linux/macOS. Triggering the full review now.

✅ Actions performed

Full review triggered.

@daggerhashimoto daggerhashimoto merged commit c6aca1f into master May 22, 2026
3 checks passed
@daggerhashimoto daggerhashimoto deleted the fix/whisper-vulkan-icd-cgroup-threads branch May 22, 2026 01:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(whisper): probe Vulkan ICD directly when vulkaninfo CLI is absent fix(whisper): make maxThreads cgroup-aware instead of using cpus().length

2 participants