Voice-to-text transcription that pastes automatically at your cursor.
Starling is a lightweight macOS menu bar app that lets you dictate text anywhere with a global hotkey. Speak naturally, and when you stop, the text appears instantly at your cursor—no context switching required.
- Global Hotkey — Press ⌃⌥⌘J (customizable) to start/stop recording
- Voice Activity Detection — Automatically stops when you finish speaking
- Local Transcription — Uses Parakeet v3 Core ML (via FluidAudio) running on your Mac's Neural Engine
- Smart Paste — Automatically pastes transcribed text at your cursor without stealing focus
- Privacy First — No audio leaves your Mac; models cache locally in ~/Library/Caches/
brew tap Ryandonofrio3/starling
brew install starling
- Download the latest .app from Releases
- Drag Starling.app to /Applications
- Open the app — it runs in your menu bar (look for the bird 🦜)
After download, macOS will warn you that the app is not from the App Store. Go to System Settings → Privacy & Security, scroll down to Starling, and click "Allow".
The app will guide you through a quick onboarding:
- Microphone Access — Grant permission to record audio
- Accessibility Access — Required to simulate paste (⌘V) and detect cursor position
- Model Download — First run downloads the ~2.5 GB Parakeet v3 Core ML model (one-time, requires internet)
- Press ⌃⌥⌘J (or your custom hotkey) to start recording
- Speak — the bird glows while listening
- Stop naturally — VAD detects when you finish, or press the hotkey again to stop manually
- Text pastes automatically at your cursor (or copies to clipboard)
Access preferences from the menu bar bird icon:
- Hotkey — Customize the global shortcut (default: ⌃⌥⌘J)
- Trailing Silence Duration — Adjust VAD sensitivity (how long to wait after you stop speaking)
- Clipboard Retention — Keep transcription on clipboard after auto-paste (off by default)
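To get a feel for what the trailing silence setting controls, here is a toy model of the idea. It is not Starling's or FluidAudio's actual VAD logic: the function name `stop_frame` and the frame-based threshold are assumptions made for illustration. Each argument after the threshold is one audio frame, marked 1 for speech or 0 for silence.

```shell
# Illustrative sketch only, not the app's implementation.
# Usage: stop_frame THRESHOLD FLAG...
# Prints the 1-based frame at which recording would stop, or "none" if the
# required run of silence never occurs after speech has started.
stop_frame() {
  threshold="$1"; shift
  silent=0   # consecutive silent frames since the last speech frame
  heard=0    # set to 1 once any speech has been detected
  i=0
  for flag in "$@"; do
    i=$((i + 1))
    if [ "$flag" -eq 1 ]; then
      heard=1
      silent=0                      # speech resets the silence counter
    elif [ "$heard" -eq 1 ]; then
      silent=$((silent + 1))
      if [ "$silent" -ge "$threshold" ]; then
        echo "$i"
        return 0
      fi
    fi
  done
  echo "none"
}

# A two-frame pause (frames 4-5) is shorter than the threshold of 3,
# so recording continues until the longer pause that follows.
stop_frame 3 0 1 1 0 0 1 0 0 0 1
```

With a threshold of 2 instead of 3, the same input would stop at the first pause. That is the trade-off behind the setting: a shorter duration ends recording sooner but is more likely to cut you off mid-sentence.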
- Local Processing — All transcription happens on your Mac via Core ML + Neural Engine
- No Network Calls — After initial model download, app works fully offline
- No Audio Storage — Audio buffers are processed in memory and immediately discarded
- Secure Input Respect — Password fields trigger copy-only mode (no keystrokes simulated)
- Model Cache — FluidAudio stores models in ~/Library/Caches/FluidAudio/ (~2.5 GB)
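To confirm what the model cache actually occupies on disk, a small helper along these lines should work. The cache path is the one listed above; the function name `cache_size` is only illustrative.

```shell
# Print the size of a directory in human-readable form,
# or a short note if it does not exist yet.
cache_size() {
  dir="$1"
  if [ -d "$dir" ]; then
    # du prints "<size><TAB><path>"; keep only the size column.
    du -sh "$dir" | cut -f1
  else
    echo "not downloaded"
  fi
}

cache_size "$HOME/Library/Caches/FluidAudio"
```

Before the first run has finished downloading the model, this reports "not downloaded".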
- macOS 14.0 (Sonoma) or later
- Apple Silicon (M1/M2/M3) or Intel Mac with Neural Engine support
- ~2.5 GB free disk space for model cache
- Permissions — Microphone + Accessibility access
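The version and disk space requirements can be checked from a terminal before installing. The following is a rough preflight sketch, not something shipped with Starling: the helper names are made up for this example, and the 2.5 GB figure is converted to kilobytes (2.5 × 1024 × 1024 = 2621440). It relies on `sw_vers` (macOS only, skipped elsewhere) and on POSIX `df -Pk`.

```shell
# Hypothetical helpers for illustration; not part of Starling.
major_of() { echo "${1%%.*}"; }
minor_of() {
  case "$1" in
    *.*) rest="${1#*.}"; echo "${rest%%.*}" ;;
    *)   echo 0 ;;   # "14" is treated as "14.0"
  esac
}

# Succeeds if dotted version $1 >= $2 (compares major, then minor).
version_at_least() {
  a_major=$(major_of "$1"); b_major=$(major_of "$2")
  [ "$a_major" -gt "$b_major" ] && return 0
  [ "$a_major" -lt "$b_major" ] && return 1
  [ "$(minor_of "$1")" -ge "$(minor_of "$2")" ]
}

# Succeeds if the filesystem holding directory $1 has at least $2 KB free.
has_free_kb() {
  avail=$(df -Pk "$1" | awk 'NR==2 {print $4}')
  [ "$avail" -ge "$2" ]
}

if command -v sw_vers >/dev/null 2>&1; then
  os_version=$(sw_vers -productVersion)
  if version_at_least "$os_version" 14.0; then
    echo "macOS $os_version: OK"
  else
    echo "macOS $os_version: too old (need 14.0 or later)"
  fi
fi

if has_free_kb "$HOME" 2621440; then
  echo "Disk space: OK"
else
  echo "Disk space: less than 2.5 GB free"
fi
```

Microphone and Accessibility permissions are granted through the onboarding flow and System Settings, so they are not covered here.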
- Xcode 15.0+
- macOS 14.0+ SDK
- Swift 6.0 toolchain
# Clone the repository
git clone https://github.com/Ryandonofrio3/starling.git
cd starling/Starling
# Open in Xcode
open Starling.xcodeproj
# Select the Starling scheme and build (⌘B)
# Run with ⌘R or archive for distribution
Or build from the command line:
cd Starling
xcodebuild -scheme Starling -configuration Release build
- Check Accessibility — System Settings → Privacy & Security → Accessibility → enable Starling
- Restart Required — After granting Accessibility, restart the app
- First launch only — Model download (~2.5 GB) requires a stable internet connection
- Check progress — Bird shows download status
- Clear cache — If download fails, quit app and run:
rm -rf ~/Library/Caches/FluidAudio/
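If you would rather not rely on remembering to quit the app first, the same cleanup can be wrapped in a guard. This is a sketch under assumptions: the function name is invented for this example, and it assumes the running process is named `Starling`, which you may want to verify in Activity Monitor.

```shell
# Remove a cache directory, but refuse while the named process is running.
# Usage: clear_model_cache CACHE_DIR PROCESS_NAME
clear_model_cache() {
  cache_dir="$1"
  proc_name="$2"   # assumed process name; verify before relying on it
  if pgrep -x "$proc_name" >/dev/null 2>&1; then
    echo "$proc_name is still running; quit it first" >&2
    return 1
  fi
  # ${var:?} aborts if the variable is empty, so an unset argument
  # can never turn this into a delete of the wrong path.
  rm -rf "${cache_dir:?}"
}
```

Call it as `clear_model_cache "$HOME/Library/Caches/FluidAudio" Starling`. The next launch then downloads the model again.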
- Conflict detection — Another app may be using the same hotkey
- Change hotkey — Open Preferences and set a different combination
- Check permissions — Ensure Accessibility is granted
- Microphone check — Test your mic in System Settings → Sound
- Reduce background noise — VAD is sensitive to ambient sound
- Adjust trailing silence — Increase duration in Preferences if getting cut off
- English-first — Parakeet v3 supports 25 languages but is optimized for English
- No streaming — Transcription happens after you stop speaking (no live text yet)
- Large model — 2.5 GB cache requirement (no smaller English-only variant available)
- Offline-only — No cloud sync or multi-device support
- Streaming transcription (pending FluidAudio partial results support)
- Custom model management (clear cache, view size, switch versions)
- Press-and-hold mode (alternative to toggle)
- Launch at login option
- Multi-language selection UI
- Developer diagnostics panel (ANE usage, cache stats)
Contributions are welcome! Please open an issue or PR for:
- Bug fixes
- Feature requests
- Documentation improvements
- Performance optimizations
See CONTRIBUTING.md for guidelines.
MIT License — see LICENSE for details.
- FluidAudio — Swift wrapper for Parakeet ASR models
- NVIDIA Parakeet — Underlying speech recognition model
- Cursor Rules Contributors — Project structure and workflow inspired by community best practices
Have feedback or questions? Open an issue on GitHub.