Best-in-class speech recognition for Node.js on Apple Silicon
Transcribe audio in 99 languages. Run 100% offline on your Mac.
OpenAI's Whisper is the gold standard for speech recognition accuracy. This package brings it to Node.js – powered by Apple's Neural Engine for fast, private, local transcription.
🎯 Accuracy first. Whisper large-v3-turbo delivers state-of-the-art transcription quality – better than any cloud API, right on your Mac.
🌍 99 languages. From Afrikaans to Zulu. Handles accents, dialects, and background noise.
🔒 100% private. Your audio never leaves your device. No API keys. No cloud. No subscription.
⚡ Fast enough. 14x real-time on M1 Ultra – transcribe 1 hour of audio in under 5 minutes.
Running Whisper without hardware acceleration is painfully slow. Here's how the alternatives compare:
| Approach | Speed | Drawbacks |
|---|---|---|
| OpenAI Whisper (Python) | ~2x real-time | Slow, needs Python |
| whisper.cpp (CPU) | ~4x real-time | No acceleration |
| faster-whisper | ~6x real-time | Needs NVIDIA GPU |
| Cloud APIs | ~1x + latency | Costs $$$, privacy concerns |
| whisper-coreml | 14x real-time | macOS only ✓ |
The Neural Engine in every Apple Silicon Mac is a dedicated ML accelerator that usually sits idle. This package puts it to work.
Need even more speed? Our sister project parakeet-coreml trades language coverage for 40x real-time performance.
| | whisper-coreml | parakeet-coreml |
|---|---|---|
| Best for | Accuracy, rare languages | Maximum speed |
| Speed | 14x real-time | 40x real-time |
| Languages | 99 | 25 European |
- 🎯 99 Languages – Full OpenAI Whisper multilingual support
- 🚀 14x real-time – 1 hour of audio in ~4.5 minutes (M1 Ultra)
- 🍎 Neural Engine – Runs on Apple's dedicated ML chip via CoreML
- 🔒 Fully Offline – No internet required after setup
- 📦 Zero Dependencies – No Python, no subprocess, no hassle
- 📝 Timestamps – Segment-level timing for subtitles
- ⬇️ One Command Setup – npx whisper-coreml download
# Install
npm install whisper-coreml
# Download the model (~3GB, one-time)
npx whisper-coreml download
Requirements: macOS 14+ (Sonoma), Apple Silicon (M1/M2/M3/M4), Node.js 20+
Measured on M1 Ultra:
5 min audio → 22 seconds → 14x real-time
1 hour audio → 4.5 minutes
Run npx whisper-coreml benchmark (requires a cloned repo) to test on your machine.
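The real-time factor is simply audio duration divided by processing time. The sketch below reproduces the arithmetic behind the numbers above; `realTimeFactor` and `estimateProcessingSeconds` are hypothetical helper names used for illustration and are not part of the package.

```typescript
// Hypothetical helpers (not part of whisper-coreml) illustrating the arithmetic.

/** Real-time factor: seconds of audio processed per second of wall-clock time. */
function realTimeFactor(audioSeconds: number, processingSeconds: number): number {
  if (processingSeconds <= 0) throw new RangeError("processingSeconds must be positive")
  return audioSeconds / processingSeconds
}

/** Estimated processing time for a clip at a given real-time factor. */
function estimateProcessingSeconds(audioSeconds: number, factor: number): number {
  if (factor <= 0) throw new RangeError("factor must be positive")
  return audioSeconds / factor
}

// 5 minutes of audio processed in 22 seconds (the M1 Ultra measurement)
const factor = realTimeFactor(5 * 60, 22)
console.log(factor.toFixed(1)) // "13.6", quoted as 14x after rounding

// One hour of audio at that measured factor
const oneHour = estimateProcessingSeconds(60 * 60, factor)
console.log((oneHour / 60).toFixed(1)) // "4.4" minutes
```

The same two functions can be applied to your own benchmark output to estimate how long a long recording will take on your machine.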
import { WhisperAsrEngine, getModelPath } from "whisper-coreml"
const engine = new WhisperAsrEngine({
modelPath: getModelPath()
})
await engine.initialize()
// Transcribe audio (16kHz, mono, Float32Array)
const result = await engine.transcribe(audioSamples, 16000)
console.log(result.text)
// "Hello, this is a test transcription."
console.log(`Language: ${result.language}`)
console.log(`Processed in ${result.durationMs}ms`)
// Segments include timestamps
for (const seg of result.segments) {
console.log(`[${seg.startMs}ms - ${seg.endMs}ms] ${seg.text}`)
}
engine.cleanup()
| Property | Requirement |
|---|---|
| Sample Rate | 16,000 Hz (16 kHz) |
| Channels | Mono (single channel) |
| Format | Float32Array with values between -1.0 and 1.0 |
| Duration | Any length (auto-chunked internally) |
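
If your audio arrives as 16-bit integer PCM or as interleaved stereo, it has to be converted to this format first. The following is a minimal sketch; `int16ToFloat32` and `stereoToMono` are hypothetical helpers, not part of the package, and they do not resample, so the input is assumed to be at 16 kHz already.

```typescript
// Hypothetical helper (not part of whisper-coreml): convert 16-bit signed PCM
// samples to Float32 values in the range [-1.0, 1.0).
function int16ToFloat32(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length)
  for (let i = 0; i < pcm.length; i++) {
    // Dividing by 32768 maps -32768 to -1.0 and 32767 to just under 1.0.
    out[i] = pcm[i] / 32768
  }
  return out
}

// Hypothetical helper: average interleaved stereo frames (L, R, L, R, ...)
// into a single mono channel. A trailing unpaired sample is dropped.
function stereoToMono(interleaved: Float32Array): Float32Array {
  const frames = Math.floor(interleaved.length / 2)
  const mono = new Float32Array(frames)
  for (let i = 0; i < frames; i++) {
    mono[i] = (interleaved[2 * i] + interleaved[2 * i + 1]) / 2
  }
  return mono
}
```

For example, `stereoToMono(int16ToFloat32(stereoPcm))` would yield a mono Float32Array in the expected value range, where `stereoPcm` stands for your own interleaved Int16Array.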
Example with ffmpeg:
ffmpeg -i input.mp3 -ar 16000 -ac 1 -f f32le output.pcm
Then load the raw PCM file:
import { readFileSync } from "fs"
const buffer = readFileSync("output.pcm")
const samples = new Float32Array(buffer.buffer, buffer.byteOffset, buffer.length / 4)
# Download the model (~1.5GB)
npx whisper-coreml download
# Check status
npx whisper-coreml status
# Run benchmark (requires cloned repo)
npx whisper-coreml benchmark
# Get model directory path
npx whisper-coreml path
The main class for speech recognition.
new WhisperAsrEngine(options: WhisperAsrOptions)
| Option | Type | Default | Description |
|---|---|---|---|
| `modelPath` | `string` | required | Path to ggml model file |
| `language` | `string` | `"auto"` | Language code or `"auto"` to detect |
| `threads` | `number` | `0` | CPU threads (0 = auto) |
| Method | Description |
|---|---|
| `initialize()` | Load model (async) |
| `transcribe(samples, rate)` | Transcribe audio |
| `isReady()` | Check if engine is initialized |
| `cleanup()` | Release native resources |
| `getVersion()` | Get version information |
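
Because cleanup() releases native resources, it can be worth guaranteeing that call even when transcription throws. The sketch below wraps the initialize/cleanup pair in a try/finally; `withEngine` and `EngineLike` are hypothetical names, and the code assumes that initialize() returns a promise and that cleanup() is synchronous, as the usage example suggests.

```typescript
// Minimal shape assumed for this sketch; WhisperAsrEngine is expected to satisfy it.
interface EngineLike {
  initialize(): Promise<void>
  cleanup(): void
}

// Hypothetical helper (not part of whisper-coreml): initialize the engine,
// run the callback, and always call cleanup() afterwards.
async function withEngine<E extends EngineLike, T>(
  engine: E,
  fn: (engine: E) => Promise<T>
): Promise<T> {
  // If initialize() itself throws, nothing was loaded, so cleanup() is skipped.
  await engine.initialize()
  try {
    return await fn(engine)
  } finally {
    engine.cleanup()
  }
}
```

With the real engine this might be used as `await withEngine(engine, (e) => e.transcribe(audioSamples, 16000))`.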
interface TranscriptionResult {
text: string // Full transcription
language: string // Detected language (ISO code)
durationMs: number // Processing time in milliseconds
segments: TranscriptionSegment[]
}
interface TranscriptionSegment {
startMs: number // Segment start in milliseconds
endMs: number // Segment end in milliseconds
text: string // Transcription for this segment
confidence: number // Confidence score (0-1)
}
| Function | Description |
|---|---|
| `isAvailable()` | Check if running on supported platform |
| `getDefaultModelDir()` | Get default model cache path |
| `getModelPath()` | Get path to the model file |
| `isModelDownloaded()` | Check if model is downloaded |
| `downloadModel()` | Download the model |
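
The segment timestamps in a transcription result map directly onto subtitle formats. As an illustration, here is a minimal sketch that turns segments into SubRip (SRT) text; `segmentsToSrt`, `toSrtTimestamp`, and `SegmentLike` are hypothetical names, not part of the package, and the interface only mirrors the startMs, endMs, and text fields of TranscriptionSegment.

```typescript
// Minimal shape needed here; mirrors the documented TranscriptionSegment fields.
interface SegmentLike {
  startMs: number
  endMs: number
  text: string
}

// Format milliseconds as an SRT timestamp: HH:MM:SS,mmm
function toSrtTimestamp(ms: number): string {
  const total = Math.max(0, Math.round(ms))
  const pad = (n: number, width: number) => String(n).padStart(width, "0")
  const hours = Math.floor(total / 3600000)
  const minutes = Math.floor((total % 3600000) / 60000)
  const seconds = Math.floor((total % 60000) / 1000)
  const millis = total % 1000
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`
}

// Hypothetical helper (not part of whisper-coreml): build SRT text from segments.
// Each cue is: index, time range, text, followed by a blank line between cues.
function segmentsToSrt(segments: SegmentLike[]): string {
  return segments
    .map((seg, i) =>
      `${i + 1}\n${toSrtTimestamp(seg.startMs)} --> ${toSrtTimestamp(seg.endMs)}\n${seg.text.trim()}\n`
    )
    .join("\n")
}
```

Passing `result.segments` from a transcription to `segmentsToSrt` and writing the returned string to a `.srt` file should give a subtitle file most players accept.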
┌─────────────────────────────────────────────────────────┐
│ Your Node.js App │
├─────────────────────────────────────────────────────────┤
│ whisper-coreml API │ TypeScript
├─────────────────────────────────────────────────────────┤
│ Native Addon │ N-API + C++
│ (whisper_engine) │
├─────────────────────────────────────────────────────────┤
│ whisper.cpp │ C++
├─────────────────────────────────────────────────────────┤
│ CoreML │ Apple Framework
├─────────────────────────────────────────────────────────┤
│ Apple Neural Engine │ Dedicated ML Silicon
└─────────────────────────────────────────────────────────┘
- Maximum accuracy – When other solutions aren't good enough
- Rare languages – 99 languages, far beyond English/European
- Accented speech – Whisper handles accents and dialects well
- Noisy audio – Robust to background noise and music
Contributions are welcome! Please read our Contributing Guide for details.
MIT – see LICENSE for details.
- whisper.cpp by Georgi Gerganov
- OpenAI Whisper by OpenAI
Open Source at Sebastian Software
Copyright © 2026 Sebastian Software GmbH