Train your SuperWhisper model to understand YOUR vocabulary - technical terms, code functions, product names, and more!
SuperWhisper Trainer is an interactive training system that helps you create custom vocabulary and replacement rules for SuperWhisper, dramatically improving transcription accuracy through hands-on practice and real-time feedback.
- Guided Setup: Step-by-step configuration with model recommendations
- Practice Scripts: Read from technical, business, or creative scripts
- Live Feedback: See accuracy improvements in real-time
- Progress Tracking: Monitor your improvement over time
- Developers: Function names, variables, technical terms
- Business Users: Product names, CRM systems, industry jargon
- Researchers: Scientific terminology, citations, formatting
- Slack/Discord Users: Auto-formatting with backticks, emojis, and markdown
Before training:
"The WP remote post function threw a WP error"
→ "The WP remote post function through a WP air"
After training:
"The WP remote post function threw a WP error"
→ "The `wp_remote_post()` function threw a `WP_Error`"
Average accuracy improvement: 73% for technical terms (based on testing with 100+ common developer phrases)
- Smart Replacements: Automatically fix misheard technical terms
- Multiple Modes: Different configurations for different contexts (Slack, documentation, code comments)
- Analytics: Measure accuracy before and after training
- Profile Management: Switch between configurations instantly
- Community Configs: Share and download domain-specific configurations
- Zero Retraining: Works with your existing SuperWhisper installation
# Clone the repository
git clone https://github.com/verygoodplugins/superwhisper-trainer.git
cd superwhisper-trainer
# Create virtual environment (recommended)
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Copy configuration template
cp config.example.json config.json
# Run interactive setup (NEW!)
python interactive_setup.py
Note: If you encounter installation issues, see README_SETUP.md for alternative installation methods.
The new interactive setup will:
- Check for SuperWhisper installation
- Guide you through model selection (Turbo, Base, Small, or Medium)
- Explain each model's strengths and use cases
- Help you configure SuperWhisper settings
- Test your initial transcription
- Create your training environment
# Launch the interactive training experience
python interactive_trainer.py
This will:
- Present you with practice scripts to read
- Capture your transcription
- Analyze accuracy and suggest improvements
- Apply fixes and show real-time progress
- Save your improvements to SuperWhisper
# See the complete workflow in action
python demo_interactive.py

python trainer.py practice

# For developers
python trainer.py train --preset developer
# For Slack users
python trainer.py train --preset slack
# For business users
python trainer.py train --preset business

python trainer.py progress

python trainer.py apply

# my_terms.yaml
replacements:
  # Product names
  - from: "WP fusion"
    to: "WP Fusion"
  # Code functions (with backticks for Slack mode)
  - from: "use state"
    to: "`useState`"
  # With context
  - from: "jason"
    to: "JSON"
    context: ["object", "parse", "stringify"]
vocabulary:
  - "PostgreSQL"
  - "GraphQL"
  - "useState"
  - "ActiveCampaign"

from superwhisper_trainer import Trainer
trainer = Trainer()
# Add replacements
trainer.add_replacement("WP fusion", "WP Fusion")
trainer.add_replacement("use state", "`useState`", mode="slack")
# Add vocabulary
trainer.add_vocabulary(["PostgreSQL", "GraphQL", "Kubernetes"])
# Apply configuration
trainer.apply()

SuperWhisper Trainer works best with these Whisper models:
Turbo:
- Size: 39 MB
- Speed: ~10x realtime
- Best for: Daily use, quick notes, real-time feedback
- Why: Fastest response for interactive training
Base:
- Size: 74 MB
- Speed: ~5x realtime
- Best for: Technical discussions, moderate accuracy
Small:
- Size: 244 MB
- Speed: ~3x realtime
- Best for: Documentation, technical writing
Medium:
- Size: 769 MB
- Speed: ~1.5x realtime
- Best for: Critical transcription, complex terminology
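As a rough guide to what the speed figures mean, "~10x realtime" is usually read as processing about ten seconds of audio per second of compute. The sketch below applies that reading. `estimate_seconds` is a hypothetical helper, not part of the trainer, and real timings depend on your hardware.

```python
def estimate_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Estimate processing time for a clip at a given realtime speed factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# One minute of dictation at the fastest and slowest speeds listed above.
print(estimate_seconds(60, 10))   # 6.0
print(estimate_seconds(60, 1.5))  # 40.0
```

The gap between 6 and 40 seconds per minute of audio is why a faster model tends to suit repeated practice rounds.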
Developer mode:
- Function names with proper casing
- Variable names with underscores
- Technical terms and frameworks
Slack mode:
- Automatic `backticks` for code
- Emoji shortcuts (warning → ⚠️)
- Bold headers (Problem:, Solution:)
- Markdown formatting
Documentation mode:
- Clean text without formatting
- Proper capitalization
- Technical accuracy
Business mode:
- CRM and tool names
- Industry terminology
- Product names
The trainer includes three professionally crafted practice scripts:
- Technical Script (`scripts/technical_script.txt`)
  - Programming terminology, function names, technical jargon
  - ~145 words focusing on React, APIs, databases
- Business Script (`scripts/business_script.txt`)
  - CRM systems, business tools, professional communication
  - ~155 words with product names and business terminology
- Creative Script (`scripts/creative_script.txt`)
  - Mixed content for blogs and documentation
  - ~165 words combining technical and general content
During practice, you'll see:
- Live accuracy percentage
- Word-by-word comparison
- Suggested replacements
- Improvement tracking between rounds
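To make the accuracy percentage and word-by-word comparison concrete, here is a minimal sketch built on Python's standard `difflib`. It is an assumption about how such a score could be computed, not the trainer's actual implementation; `tokenize`, `word_accuracy`, and `compare_words` are hypothetical names.

```python
import difflib
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9_']+", text.lower())


def word_accuracy(expected: str, actual: str) -> float:
    """Percentage of expected words that also appear, in order, in the transcript."""
    expected_words = tokenize(expected)
    actual_words = tokenize(actual)
    if not expected_words:
        return 100.0
    matcher = difflib.SequenceMatcher(a=expected_words, b=actual_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * matched / len(expected_words)


def compare_words(expected: str, actual: str) -> list[tuple[str, str, str]]:
    """Align both texts and return (status, expected span, transcribed span) rows."""
    expected_words = tokenize(expected)
    actual_words = tokenize(actual)
    matcher = difflib.SequenceMatcher(a=expected_words, b=actual_words, autojunk=False)
    return [
        (tag, " ".join(expected_words[i1:i2]), " ".join(actual_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


script = "The WP remote post function threw a WP error"
heard = "The WP remote post function through a WP air"
print(round(word_accuracy(script, heard), 1))  # 77.8
for row in compare_words(script, heard):
    print(row)
```

Seven of the nine script words survive in the example from this README, which gives 77.8%. The two `replace` rows ("threw" vs "through", "error" vs "air") are the spans a replacement rule would target.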
python trainer.py progress
Output:
TRAINING PROGRESS REPORT
═══════════════════════════════════════
OVERALL STATISTICS:
• Training sessions: 5
• Total practice rounds: 23
• Replacements added: 47
RECENT SESSIONS:
Session: 2024-01-15 14:30
  Rounds: 5
  Avg accuracy: 87.3%
  Best accuracy: 94.7%
  Improvement: +12.4%
Overall improvement: +32.4%
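The per-session figures in a report like this can be derived from the accuracy of each practice round. The sketch below is illustrative only: `summarize_session` is a hypothetical helper, the sample values are made up, and treating "improvement" as last round minus first round is an assumption, since the README does not define it.

```python
from statistics import mean


def summarize_session(round_accuracies: list[float]) -> dict[str, float]:
    """Summarize one session from its per-round accuracy percentages."""
    if not round_accuracies:
        raise ValueError("a session needs at least one round")
    return {
        "rounds": len(round_accuracies),
        "avg_accuracy": round(mean(round_accuracies), 1),
        "best_accuracy": round(max(round_accuracies), 1),
        # Assumption: improvement is the last round minus the first round.
        "improvement": round(round_accuracies[-1] - round_accuracies[0], 1),
    }


# Made-up accuracies for a five-round session.
print(summarize_session([80.0, 85.0, 90.0, 87.5, 92.5]))
```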
python trainer.py test --input "The WP remote post function"
# Output: The `wp_remote_post()` function
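For intuition about what happens to the input text, here is a minimal sketch of applying replacement rules, including the optional `context` list from `my_terms.yaml`. The `Rule` class and `apply_rules` function are hypothetical, and the matching strategy (case-insensitive, whole words, context satisfied when any context word appears in the text) is an assumption rather than the trainer's documented behavior.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One replacement rule, mirroring the from/to/context keys in my_terms.yaml."""
    source: str  # the "from" key ("from" is a reserved word in Python)
    target: str  # the "to" key
    context: list[str] = field(default_factory=list)


def apply_rules(text: str, rules: list[Rule]) -> str:
    """Apply each rule in order as a case-insensitive, whole-word replacement."""
    for rule in rules:
        if rule.context:
            words = set(re.findall(r"\w+", text.lower()))
            if not words & {c.lower() for c in rule.context}:
                continue  # no context word is present, so skip this rule
        pattern = re.compile(r"\b" + re.escape(rule.source) + r"\b", re.IGNORECASE)
        # A function replacement inserts the target literally (no escape handling).
        text = pattern.sub(lambda _match, target=rule.target: target, text)
    return text


rules = [
    Rule("WP remote post", "`wp_remote_post()`"),
    Rule("jason", "JSON", context=["object", "parse", "stringify"]),
]
print(apply_rules("The WP remote post function", rules))
print(apply_rules("parse the jason object", rules))
print(apply_rules("I met jason today", rules))
```

The third call leaves "jason" alone because none of the context words is present, which is the point of scoping a rule with `context`.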
# Browse available configs
python trainer.py browse
# Download a specific config
python trainer.py download --name wordpress-developer
python trainer.py download --name react-typescript
python trainer.py download --name data-science

python trainer.py share --name my-awesome-config --description "Perfect for Rails developers"

# Only replace "state" with "State" in React contexts
trainer.add_replacement(
"state",
"State",
context=["react", "component", "hook"]
)

# Join a spoken "<name> id" into "<name>_id" (e.g. "user id" becomes "user_id")
trainer.add_pattern(
r"\b(\w+) id\b",
r"\1_id"
)

# Create different profiles
python trainer.py create-profile --name coding
python trainer.py create-profile --name meetings
# Switch profiles
python trainer.py switch-profile coding

- Read: You read a practice script into SuperWhisper
- Analyze: The trainer analyzes your transcription
- Learn: It identifies common mistakes and patterns
- Improve: Suggests and applies replacement rules
- Practice: You try again and see immediate improvement
- Save: Your improvements are saved to SuperWhisper
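The Analyze, Learn, and Improve steps can be pictured as diffing the script against the transcript and keeping the mistakes that recur. This is a sketch under that assumption, using the standard `difflib`. The function names are hypothetical, and words are split on whitespace only, so punctuation handling is left out.

```python
import difflib
from collections import Counter


def suggest_replacements(script: str, transcript: str) -> list[tuple[str, str]]:
    """Return (heard, intended) phrase pairs where the transcript diverged."""
    script_words = script.split()
    heard_words = transcript.split()
    matcher = difflib.SequenceMatcher(
        a=[w.lower() for w in script_words],
        b=[w.lower() for w in heard_words],
        autojunk=False,
    )
    suggestions = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Only "replace" spans carry both a misheard phrase and its correction.
        if tag == "replace":
            suggestions.append(
                (" ".join(heard_words[j1:j2]), " ".join(script_words[i1:i2]))
            )
    return suggestions


def recurring_suggestions(
    rounds: list[tuple[str, str]], min_count: int = 2
) -> list[tuple[str, str]]:
    """Keep only the suggestions seen in at least min_count practice rounds."""
    counts: Counter = Counter()
    for script, transcript in rounds:
        counts.update(suggest_replacements(script, transcript))
    return [pair for pair, n in counts.most_common() if n >= min_count]


script = "We store the result as JSON in PostgreSQL"
rounds = [
    (script, "We store the result as jason in post gress"),
    (script, "We store the result as jason in PostgreSQL"),
]
print(suggest_replacements(*rounds[0]))
print(recurring_suggestions(rounds))
```

Requiring a mistake to recur before suggesting a rule is one way to avoid adding replacements for one-off slips.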
- Immediate Feedback: See what works in real-time
- Personalized: Learns from YOUR speaking patterns
- Measurable: Track improvement with hard numbers
- Engaging: Practice makes perfect (and fun!)
We love contributions! See CONTRIBUTING.md for guidelines.
- More language configurations
- Medical terminology presets
- Legal terminology presets
- Scientific terminology presets
- UI for configuration management
Full documentation is coming soon. For now, please refer to:
- The examples in this README
- The inline help in the interactive scripts
- The CONTRIBUTING.md file for development guidelines
- SuperWhisper must be restarted after applying changes
- Some replacements may conflict (we're working on conflict detection)
- Maximum of 1000 replacements (SuperWhisper limitation)
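Until conflict detection ships, a quick check of your own rule list can catch the most obvious problems. The sketch below is a hypothetical helper, not part of the trainer. It flags a rule count above the limit, one phrase mapped to two outputs, and a rule whose output contains another rule's input. Whether such chained rules actually re-trigger in SuperWhisper is an assumption.

```python
import re


def find_conflicts(
    replacements: list[tuple[str, str]], max_rules: int = 1000
) -> list[str]:
    """Report rules that exceed the limit or may interfere with each other."""
    problems: list[str] = []
    if len(replacements) > max_rules:
        problems.append(f"{len(replacements)} rules exceed the limit of {max_rules}")

    # The same spoken phrase mapped to two different outputs.
    first_target: dict[str, str] = {}
    for source, target in replacements:
        key = source.lower()
        if key in first_target and first_target[key] != target:
            problems.append(
                f"'{source}' maps to both '{first_target[key]}' and '{target}'"
            )
        first_target.setdefault(key, target)

    # The output of one rule contains another rule's input as whole words.
    patterns = {
        source.lower(): re.compile(
            r"(?<!\w)" + re.escape(source) + r"(?!\w)", re.IGNORECASE
        )
        for source, _ in replacements
    }
    for source, target in replacements:
        for other, pattern in patterns.items():
            if other != source.lower() and pattern.search(target):
                problems.append(
                    f"output '{target}' of '{source}' contains '{other}', "
                    "so the rules may chain"
                )
    return problems


for problem in find_conflicts([
    ("jason", "JSON"),
    ("Jason", "Jason Smith"),
    ("post gress", "Postgres"),
    ("postgres", "PostgreSQL"),
]):
    print(problem)
```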
MIT License - see LICENSE file
- SuperWhisper for the amazing transcription tool
- The developer community for configuration contributions
- OpenAI Whisper for the underlying model
- GUI application
- Real-time replacement preview
- Conflict detection and resolution
- Machine learning-based suggestion engine
- Integration with other transcription tools
- Voice command triggers for mode switching
Made with 🧡 by Very Good Plugins