Background
In PR #319 (fix/whisper-vulkan-linux), transcribeOpts was updated to set maxThreads: cpus().length so Whisper utilises all available logical CPU cores on the host.
Problem
cpus().length (Node's os.cpus()) reflects the host's logical CPU count and is unaware of Linux cgroup CPU quotas. In containerised environments (Docker, Kubernetes, LXC, etc.) the container may be limited to a fractional number of CPUs via cpu.cfs_quota_us / cpu.cfs_period_us (cgroups v1) or cpu.max (cgroups v2), yet cpus().length will still return the full host count. Spawning more threads than the container is allotted leads to CPU throttling and can degrade transcription throughput.
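To make the quota arithmetic concrete, here is a minimal sketch of how the contents of a cgroups v2 cpu.max file map to a CPU limit in cores. The helper name cpuLimitFromCpuMax is hypothetical and not part of the codebase; the sketch assumes the file holds "<quota> <period>" in microseconds, or "max <period>" when no limit is set.

```typescript
// Hypothetical helper (not from the PR): converts the text of a cgroups v2
// cpu.max file into a CPU limit in cores. Returns null when no usable limit
// is present (quota is "max", or the contents are malformed).
function cpuLimitFromCpuMax(raw: string): number | null {
  const [quota, period] = raw.trim().split(/\s+/);
  if (quota === 'max') return null; // unlimited
  const q = Number(quota);
  const p = Number(period);
  // Reject empty, non-numeric, zero, or negative values.
  if (!Number.isFinite(q) || !Number.isFinite(p) || q <= 0 || p <= 0) return null;
  return q / p;
}

console.log(cpuLimitFromCpuMax('50000 100000'));  // 0.5 (half a core)
console.log(cpuLimitFromCpuMax('200000 100000')); // 2
console.log(cpuLimitFromCpuMax('max 100000'));    // null (no limit)
```

A container limited to half a core would therefore still see the full host count from cpus().length, even though it can only run the equivalent of 0.5 cores per scheduling period.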
Proposed fix
Read the effective CPU limit from the cgroup interface files at runtime and clamp maxThreads to Math.max(1, Math.floor(cgroupQuota)), where cgroupQuota is the quota divided by the period. Fall back to cpus().length if the cgroup files are absent (bare-metal, macOS, Windows) or if the quota is unlimited (-1 on cgroups v1, max on cgroups v2).
Rough sketch:
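import * as fs from 'node:fs';
import { cpus } from 'node:os';

function effectiveCpuCount(): number {
  // cgroups v2: cpu.max holds "<quota> <period>", or "max <period>" when unlimited
  try {
    const raw = fs.readFileSync('/sys/fs/cgroup/cpu.max', 'utf8').trim();
    const [quota, period] = raw.split(/\s+/);
    const limit = Number(quota) / Number(period);
    // Skip "max" and malformed values (NaN, Infinity, zero or negative)
    if (quota !== 'max' && Number.isFinite(limit) && limit > 0) return Math.max(1, Math.floor(limit));
  } catch { /* not v2 */ }
  // cgroups v1: the quota is -1 when unlimited
  try {
    const quota = Number(fs.readFileSync('/sys/fs/cgroup/cpu/cpu.cfs_quota_us', 'utf8'));
    const period = Number(fs.readFileSync('/sys/fs/cgroup/cpu/cpu.cfs_period_us', 'utf8'));
    if (quota > 0 && period > 0) return Math.max(1, Math.floor(quota / period));
  } catch { /* not v1 */ }
  // No limit found: bare-metal, macOS, Windows, or an unlimited quota
  return cpus().length;
}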
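The clamping step on its own can be illustrated with a small pure function, which is easier to unit-test than code that reads /sys directly. This is only a sketch: resolveMaxThreads is a hypothetical name, the extra cap at the host CPU count is an added assumption that the proposal does not spell out, and the options object shows only the maxThreads field mentioned above.

```typescript
// Hypothetical helper (not from the PR): picks a thread count from the host's
// logical CPU count and an optional cgroup limit in cores. A null limit means
// no quota was detected.
function resolveMaxThreads(hostCpus: number, cgroupLimit: number | null): number {
  const host = Math.max(1, hostCpus);
  if (cgroupLimit === null) return host;
  // Round fractional limits down, never go below one thread, and never
  // exceed the number of CPUs the host actually has (assumed extra cap).
  return Math.min(host, Math.max(1, Math.floor(cgroupLimit)));
}

// Stand-in values: 8 plays the role of cpus().length, 1.5 of a detected quota.
const hostCpus = 8;
const transcribeOpts = { maxThreads: resolveMaxThreads(hostCpus, 1.5) };
console.log(transcribeOpts.maxThreads); // 1
```

Rounding down matches the Math.floor in the proposal. Whether a limit of 1.5 cores should instead round up to two threads is a tuning question this sketch does not try to answer.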
Context
The hasVulkanBackend() and useGpu fixes shipped in PR #319; this is the only remaining item.