Whisp - An Oh My Zsh plugin for WhisperX

Whisp is an Oh My Zsh plugin that adds idempotency, convenience features, and speaker diarization to the WhisperX CLI tool. It helps you efficiently transcribe audio files without duplicating work.

Features

Idempotent Processing: Skip files that already have transcriptions unless explicitly forced
Speaker Diarization: Identify who is speaking with --diarize (powered by pyannote.audio)
Batch Processing: Transcribe multiple files with a single command
Extension Filtering: Process files of specific audio types
Model Selection: Easily switch between WhisperX models
Recursive Searching: Optionally find audio files in subdirectories
Output Control: View WhisperX's real-time output or suppress it
Resource Management: Limit thread usage to prevent system slowdown

Dependencies

Oh My Zsh
WhisperX CLI tool properly installed and available in your PATH
For diarization: A HuggingFace API token with access to pyannote models
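
Before installing the plugin, it can help to confirm that the WhisperX command is actually reachable from your shell. The snippet below is a minimal sketch of such a check; the helper name `check_dep` is hypothetical and not part of Whisp, and it assumes the WhisperX CLI is installed under the command name `whisperx`.

```shell
# Hypothetical helper (not part of Whisp): report whether a command is on PATH.
check_dep() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumes the WhisperX CLI is exposed as "whisperx".
check_dep whisperx || echo "Install WhisperX and make sure it is on your PATH."
```

If the check reports `missing: whisperx`, fix the WhisperX installation first; the plugin only wraps the CLI and cannot transcribe without it.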

Installation

Manual Installation

Clone this repository:

git clone https://github.com/yourusername/whisp.git ${ZSH_CUSTOM:-~/.oh-my-zsh/custom}/plugins/whisp

Add the plugin to your .zshrc file:
```
plugins=(... whisp)
```
Reload your shell:
```
source ~/.zshrc
```

Usage

Basic Commands

# Transcribe all supported audio files in the current directory
whisp

# Transcribe a specific file
whisp file.mp3

# Transcribe all files with a specific extension
whisp mp3

# Transcribe files with any of multiple extensions
whisp mp3 m4a wav

# Transcribe multiple specific files
whisp file1.mp3 file2.m4a

Options

# Choose which WhisperX model to use (default is turbo)
whisp --model tiny
whisp --model base
whisp --model small
whisp --model medium
whisp --model large
whisp --model turbo

# Force transcription even if a transcription already exists
whisp --force

# Specify language for transcription
whisp --language en

# Search for audio files in subdirectories
whisp --subdir

# Run silently (suppress WhisperX output)
whisp --silent

# Limit threads used (reduces system load)
whisp --cores 2

# Set compute type (default: float32, also: float16, int8)
whisp --compute-type float32

# Combine options
whisp mp3 --model medium --force --subdir --cores 4
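
If you find yourself typing the same combination of options repeatedly, one possible approach is to wrap it in a small shell function in your .zshrc. The sketch below is only an illustration: the name `whisp_quiet` is hypothetical, and the preset uses only the `--model`, `--cores`, and `--silent` options described above.

```shell
# Hypothetical preset (not shipped with Whisp): small model, two threads, no output.
# Any extra arguments (files, extensions, more options) are passed straight through.
whisp_quiet() {
  whisp --model small --cores 2 --silent "$@"
}
```

With this in place, `whisp_quiet mp3 --subdir` would behave like `whisp --model small --cores 2 --silent mp3 --subdir`.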

Diarization

Speaker diarization identifies who is speaking and when. To use it:

1. Create a HuggingFace account
2. Accept the pyannote model agreements:
   - pyannote/segmentation-3.0
   - pyannote/speaker-diarization-3.1
3. Create an access token at HuggingFace Settings
4. Either set HF_TOKEN in your environment or pass --hf-token

# Transcribe with speaker identification
whisp --diarize meeting.mp3

# Pass HuggingFace token directly
whisp --diarize --hf-token hf_abc123 meeting.mp3

# Bound the expected number of speakers (minimum and maximum)
whisp --diarize --min-speakers 2 --max-speakers 4 call.mp3
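
Since diarization depends on a token being available, a quick check before a long run can save a failed attempt. The following is a minimal sketch, assuming the token is supplied through the `HF_TOKEN` environment variable mentioned above; the helper name `hf_token_present` is hypothetical and not part of Whisp.

```shell
# Hypothetical helper (not part of Whisp): succeed only if HF_TOKEN is non-empty.
hf_token_present() {
  [ -n "${HF_TOKEN:-}" ]
}

if hf_token_present; then
  echo "HF_TOKEN is set; --diarize can use it."
else
  echo "HF_TOKEN is not set; export it or pass --hf-token."
fi
```

Exporting the variable from your .zshrc keeps the token out of your shell history, which passing it on the command line with `--hf-token` does not.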

Idempotency Behavior

Single File Mode: If a transcription exists, prompts you before creating a new one
Batch Mode: Automatically skips files with existing transcriptions
Force Mode: Creates uniquely named transcriptions without overwriting existing ones
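
The plugin's own detection logic is not reproduced here, but the idea behind skipping and force mode can be sketched roughly as follows. This sketch assumes the transcription is written next to the audio file as `<name>.txt` and that force mode appends a numeric suffix; both the file naming and the helper names `has_transcript` and `unique_transcript_name` are assumptions for illustration, not Whisp's actual behavior.

```shell
# Hypothetical helper: succeed if "<name>.txt" exists next to the audio file.
has_transcript() {
  [ -f "${1%.*}.txt" ]
}

# Hypothetical helper: print an output name that does not clobber existing files,
# trying talk.txt first, then talk_1.txt, talk_2.txt, and so on.
# Variables are global here because POSIX sh has no "local".
unique_transcript_name() {
  base="${1%.*}"
  candidate="$base.txt"
  n=1
  while [ -e "$candidate" ]; do
    candidate="${base}_$n.txt"
    n=$((n + 1))
  done
  echo "$candidate"
}
```

In batch mode a wrapper would skip any file for which `has_transcript` succeeds; in force mode it would write to the name printed by `unique_transcript_name` instead.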

Supported Audio Formats

mp3
mp4
m4a
wav
flac
aac
ogg
wma
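
If you want to pre-filter files in your own scripts before handing them to whisp, the list above translates into a simple extension check. This is a sketch only: the helper name `is_supported_audio` is hypothetical, and matching extensions case-insensitively is an assumption that may not mirror how the plugin itself matches files.

```shell
# Hypothetical helper (not part of Whisp): succeed if the file name ends in
# one of the extensions listed above.
is_supported_audio() {
  # Lowercase the extension so "TALK.MP3" also matches (an assumption).
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|mp4|m4a|wav|flac|aac|ogg|wma) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_audio interview.mp3 && whisp interview.mp3` only invokes whisp when the extension is on the list.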

Examples

Transcribe all MP3 files in the current directory using the medium model

whisp mp3 --model medium

Transcribe all audio files, including those in subdirectories

whisp --subdir

Force retranscription of a specific file

whisp interview.mp3 --force

Process multiple file types silently

whisp mp3 wav --silent

Transcribe a meeting with speaker diarization

whisp --diarize --min-speakers 2 meeting.mp3

Support

This has only been tested on macOS Sequoia 15. YMMV.

License

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
