Whisper

A macOS menu bar app for on-device speech-to-text. Hold a hotkey, speak, release — transcribed text is pasted into the active application automatically.

All processing runs locally using Qwen3 ASR models via Apple's MLX framework. No audio leaves your machine.

Features

Push-to-talk hotkey — configurable custom key combinations (including left/right modifiers) with global detection via a CGEvent tap
Multiple model options — Qwen3 ASR 0.6B (8-bit), 1.7B (8-bit), and 1.7B (4-bit) with on-demand downloading and per-model cache management
Smart paste — transcribed text is written to the pasteboard, Cmd+V is simulated via Accessibility, and the original clipboard contents are restored afterward; a space is prepended when the cursor follows non-whitespace
Visual feedback — animated floating overlay with a MeshGradient whose speed responds to real-time audio level
Menu bar UI — model selector with download/delete controls, permission status indicators, inline hotkey capture, and run-on-startup toggle
Privacy-first — fully offline inference, no network calls after model download
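
The leading-space rule from Smart paste can be sketched as a small pure function. This is a minimal illustration, not the app's actual code: `textToPaste` and its `precedingCharacter` parameter are hypothetical names, and the sketch assumes the character before the cursor has already been read through Accessibility.

```swift
/// Returns `text`, prefixed with a space when the character just before the
/// cursor is non-whitespace. `precedingCharacter` is nil when nothing precedes
/// the cursor or the context could not be read (hypothetical convention).
func textToPaste(_ text: String, precedingCharacter: Character?) -> String {
    guard let previous = precedingCharacter, !previous.isWhitespace else {
        // Start of a field, after whitespace, or unknown context: paste as-is.
        return text
    }
    return " " + text
}
```

Treating an unreadable context as "no space" keeps the behavior conservative in applications that do not expose cursor context.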

Requirements

macOS (Apple Silicon recommended for MLX performance)
Xcode (for building from source)
Microphone permission
Accessibility permission (for simulating paste keystrokes and detecting cursor context)

Installation

Open whisper.xcodeproj in Xcode. Go to the whisper target, then Signing & Capabilities, enable Automatically manage signing, and select your Team (Personal Team works for local use).

Build a Release app bundle:

```
xcodebuild -project whisper.xcodeproj -scheme whisper -configuration Release -derivedDataPath build clean build
```

Copy the built .app into /Applications:

```
cp -R "build/Build/Products/Release/whisper.app" /Applications/
```

Launch from /Applications (not from DerivedData):
```
open /Applications/whisper.app
```
Grant Microphone and Accessibility permissions when prompted.

Why /Applications matters — the Run on Startup toggle uses SMAppService.mainApp, which works most reliably when the app is installed in /Applications and properly signed.
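
A Run on Startup toggle built on `SMAppService.mainApp` might look roughly like the sketch below. `SMAppService.mainApp`, `status`, `register()`, and `unregister()` are real ServiceManagement APIs (macOS 13 and later); `StartupAction`, `startupAction`, and `setRunOnStartup` are hypothetical names introduced here, and the app's actual wiring may differ.

```swift
#if os(macOS)
import ServiceManagement
#endif

/// What the toggle should do, given the desired and the current state.
enum StartupAction: Equatable { case register, unregister, noChange }

/// Pure decision helper (hypothetical): only act when the state differs.
func startupAction(desired: Bool, currentlyEnabled: Bool) -> StartupAction {
    guard desired != currentlyEnabled else { return .noChange }
    return desired ? .register : .unregister
}

#if os(macOS)
/// Registers or unregisters the main app as a login item.
/// Errors from ServiceManagement are passed on to the caller.
@available(macOS 13.0, *)
func setRunOnStartup(_ desired: Bool) throws {
    let service = SMAppService.mainApp
    let enabled = service.status == .enabled
    switch startupAction(desired: desired, currentlyEnabled: enabled) {
    case .register:   try service.register()
    case .unregister: try service.unregister()
    case .noChange:   break
    }
}
#endif
```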

If macOS blocks launch — right-click the app and choose Open, or remove quarantine:
```
xattr -dr com.apple.quarantine /Applications/whisper.app
```

How It Works

A global CGEvent tap listens for the configured key combination (left/right modifier aware).
On key-down, AVAudioEngine begins capturing microphone input at the native sample rate.
On key-up, recording stops. Audio is resampled to 16 kHz and passed to the Qwen3 ASR model running on-device via MLX.
The transcribed text is placed on the pasteboard, a Cmd+V keystroke is simulated through the Accessibility API, and the original pasteboard contents are restored.
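
The steps above can be read as transitions between the states listed for AppState.swift (idle, recording, transcribing, pasting, error). The sketch below is an assumption about how such a state machine could be written; `AppPhase`, `AppEvent`, and `nextPhase` are hypothetical names, not the app's actual types.

```swift
/// States named in the Architecture section; the error payload is assumed.
enum AppPhase: Equatable {
    case idle, recording, transcribing, pasting
    case error(String)
}

/// Hypothetical events that drive the flow described above.
enum AppEvent {
    case hotkeyDown        // configured key combination pressed
    case hotkeyUp          // key combination released
    case transcriptReady   // model returned text
    case pasteFinished     // Cmd+V sent and pasteboard restored
    case failed(String)    // any step reported an error
}

/// Pure transition function: unexpected events leave the phase unchanged.
func nextPhase(from phase: AppPhase, on event: AppEvent) -> AppPhase {
    switch (phase, event) {
    case (.idle, .hotkeyDown), (.error, .hotkeyDown):
        return .recording
    case (.recording, .hotkeyUp):
        return .transcribing
    case (.transcribing, .transcriptReady):
        return .pasting
    case (.pasting, .pasteFinished):
        return .idle
    case (_, .failed(let message)):
        return .error(message)
    default:
        return phase
    }
}
```

Keeping the transition function pure makes it easy to unit test separately from the event tap, audio engine, and model.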

Architecture

```
whisperApp.swift          App entry point, hotkey wiring, lifecycle
AppState.swift            Observable state machine (idle/recording/transcribing/pasting/error)

Services/
  TranscriptionService    Actor-isolated ML inference, model download & cache
  AudioRecorder           AVAudioEngine capture, RMS level, 16 kHz resampling
  PasteController         Pasteboard snapshot/restore, Cmd+V simulation

Views/
  MenuBarView             Dropdown menu (models, permissions, settings)
  RecordingOverlay        Animated MeshGradient circle
  OverlayManager          Overlay lifecycle
  OverlayPanel            Non-activating transparent NSPanel

Models/
  STTModelDefinition      Model registry (name, HuggingFace repo, quantization)

Hotkey/
  HotkeyDefinitions       CGEvent tap, custom key combos, UserDefaults persistence + legacy migration
```
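
As an illustration of the RMS level that AudioRecorder reports and RecordingOverlay reacts to, here is a minimal sketch. It assumes samples are floats in the range -1...1; `rmsLevel`, `animationSpeed`, and the speed range are hypothetical and chosen only for illustration, so the app's actual mapping may differ.

```swift
/// Root-mean-square level of one buffer of samples (assumed range -1...1).
func rmsLevel(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// Hypothetical linear mapping from level to an animation speed multiplier.
/// Levels outside 0...1 are clamped before mapping.
func animationSpeed(forLevel level: Float,
                    minSpeed: Float = 0.5,
                    maxSpeed: Float = 3.0) -> Float {
    let clamped = min(max(level, 0), 1)
    return minSpeed + (maxSpeed - minSpeed) * clamped
}
```

In practice the level would be computed per audio buffer and smoothed before it drives the overlay, so the animation does not jitter between buffers.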

Dependencies

| Package | Requirement | Products used | Purpose |
| --- | --- | --- | --- |
| mlx-audio-swift | `revision: cc3b3880be05caf908970729e15ec209d018f06d` | `MLXAudioSTT`, `MLXAudioCore` | On-device speech-to-text and audio ML pipeline |

License

See LICENSE for details.
