It analyzes subtitle files, integrates with ASBPlayer/MPV for direct video navigation, provides flexible ranking and filtering, and automatically hides words you already know using Anki.
Written in Rust 🦀
🚧 This project is under active development and may be buggy.
The macOS and Linux binaries have not been extensively tested.
- Download the latest release for your platform from Releases
- Connect to Anki - Anki Setup
- Connect to a Video Player - Either:
  - ASBPlayer: In ASBPlayer, MISC → Enable WebSocket Client
  - MPV: Start MPV with --input-ipc-server=/tmp/mpv-socket
That's it! Yomine will segment the text, rank terms by frequency, and show you vocabulary and expressions to learn.
- Vocabulary extraction from Japanese subtitle files (words and expressions)
- Frequency-based ranking to prioritize terms
- Anki integration to filter out words you already know
- Video player integration (ASBPlayer and MPV) for timestamp navigation
- Term analysis with readings, part-of-speech, and context sentences
- Multi-sentence browsing to see multiple example sentences per term
- Ignore list to hide unwanted terms from your mining results
- Comprehensibility scoring - Sentence difficulty estimation based on your Anki card intervals
- Advanced filtering - Filter vocabulary by part-of-speech and frequency ranges
- Dictionary weighting - Customize which frequency sources are prioritized
- Sorting and searching - Sort by frequency, chronological order, sentence count, or comprehension level; search for specific terms
- Multiple subtitle formats - Supports SRT, ASS, and SSA subtitle files
- Frequency Analyzer Tool - Generate your own frequency dictionaries.
- Here's one I generated from around 5000 files: Anilist Top 500
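Timestamp navigation depends on the cue times stored in the subtitle file. As a rough illustration only (this is not Yomine's parser, which is written in Rust), the sketch below converts an SRT timing line into start and end times in seconds. The function name `parse_cue_timing` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches an SRT cue timing line such as "00:01:02,500 --> 00:01:05,000".
CUE_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)

def parse_cue_timing(line: str) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds, or None if the line is not a timing line."""
    match = CUE_TIMING.search(line)
    if match is None:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
    start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
    end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
    return start, end

print(parse_cue_timing("00:01:02,500 --> 00:01:05,000"))  # (62.5, 65.0)
```

A start time in seconds is the kind of value a video player needs in order to jump to the sentence where a term appears.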
- Go to Releases
- Download the appropriate file for your system:
  - Windows: yomine-*-windows-x64.exe
  - macOS: yomine-*-macos-universal (Intel & Apple Silicon)
  - Linux: yomine-*-linux-x64
- Run the executable
Yomine uses frequency dictionaries to rank vocabulary by importance and improve text segmentation. It automatically downloads JPDB v2.2 Frequency Kana, but you can add as many dictionaries as you like, then toggle and weight them however you want.
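To give a feel for what weighting several dictionaries can mean, here is a minimal sketch that combines a term's ranks with a weighted average. This is an assumption made for illustration: the source does not state Yomine's actual formula, and `combined_rank` and the sample numbers are hypothetical.

```python
from typing import Dict, Optional

def combined_rank(ranks: Dict[str, int], weights: Dict[str, float]) -> Optional[float]:
    """Weighted average of a term's ranks across enabled dictionaries.

    ranks: dictionary name -> frequency rank of the term (lower = more common).
    weights: dictionary name -> weight; a weight of 0 disables the dictionary.
    Returns None when no enabled dictionary contains the term.
    """
    total = 0.0
    weight_sum = 0.0
    for name, rank in ranks.items():
        weight = weights.get(name, 1.0)  # unlisted dictionaries default to weight 1
        if weight <= 0:
            continue  # treat zero or negative weight as "toggled off"
        total += weight * rank
        weight_sum += weight
    return total / weight_sum if weight_sum > 0 else None

ranks = {"JPDB": 1200, "BCCWJ": 3000}   # made-up ranks for one term
weights = {"JPDB": 3.0, "BCCWJ": 1.0}   # JPDB counts three times as much
print(combined_rank(ranks, weights))    # (3*1200 + 1*3000) / 4 = 1650.0
```

With these weights the result sits closer to the JPDB rank, which is the effect you would expect from prioritizing one source over another.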
Adding Dictionaries:
- In Yomine, go to File → Load New Frequency Dictionaries
- Select zip files containing Yomitan-compatible frequency dictionaries
- Restart when prompted
Recommended Dictionaries:
- JPDB v2.2 Frequency Kana: Automagically downloaded and installed
- BCCWJ: Based on the Balanced Corpus of Contemporary Written Japanese
- CC100: List from Common Crawl data
More dictionaries: Marv's collection and Shoui's collection
Generate Your Own:
You can also generate your own custom frequency dictionaries directly inside Yomine using the built-in Frequency Analyzer tool.
Note: Always download frequency dictionaries from trusted sources to avoid corrupted or malicious files. If you can't find a specific dictionary, consider generating your own. You may want to ask around on the TMW Discord as well.
Yomine connects to Anki to filter out terms you already know.
Prerequisites:
- Install the AnkiConnect add-on in Anki
  - In Anki: Tools → Add-ons → Get Add-ons → Enter code 2055492159
  - Restart Anki
Configuration:
- In Yomine: Settings → Anki Settings
- Wait for connection to establish
- For each note type:
  - Select from dropdown
  - Choose Term Field (Japanese word/phrase)
  - Choose Reading Field (pronunciation)
  - Click "Add" to save mapping

Note: Yomine will try to guess the correct fields for you.
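If the connection does not establish, it can help to talk to AnkiConnect directly. The sketch below uses AnkiConnect's documented HTTP interface (JSON posted to its default address, API version 6) to list the fields of a note type. It is a troubleshooting aid, not a description of how Yomine queries Anki; the helper names and the "Basic" note type are assumptions.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # AnkiConnect's default address

def build_request(action: str, **params) -> dict:
    """Build an AnkiConnect request body (API version 6)."""
    request = {"action": action, "version": 6}
    if params:
        request["params"] = params
    return request

def invoke(action: str, **params):
    """Send a request to AnkiConnect and return its result.

    Requires Anki to be running with the AnkiConnect add-on installed.
    """
    body = json.dumps(build_request(action, **params)).encode("utf-8")
    req = urllib.request.Request(
        ANKI_CONNECT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=5) as response:
        reply = json.load(response)
    if reply.get("error") is not None:
        raise RuntimeError(reply["error"])
    return reply["result"]

# Example calls (uncomment with Anki open; "Basic" is a placeholder note type):
# print(invoke("version"))
# print(invoke("modelFieldNames", modelName="Basic"))
```

If `invoke("version")` fails with a connection error, Anki is probably not running or the add-on is not installed.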
Yomine uses WebSocket to communicate with ASBPlayer for timestamp navigation.
Default Setup:
- Yomine runs a WebSocket server on port 8766
- In ASBPlayer: MISC → Enable WebSocket Client
Changing the Port:
- In Yomine: Settings → WebSocket Settings
- Change the port to something else (8767, 8768, 1111, 5353, etc)
- Click "Save and Restart Server"
- In ASBPlayer: MISC → WebSocket Server URL → enter ws://localhost:YOUR_PORT
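Before settling on a new port, you may want to confirm that nothing else is already listening on it (AnkiConnect, for example, uses 8765 by default). The following is a small sketch with hypothetical helper names: one function checks whether a local TCP port accepts connections, and the other builds the URL to enter in ASBPlayer.

```python
import socket

def build_ws_url(port: int, host: str = "localhost") -> str:
    """Return the WebSocket URL to enter in ASBPlayer for a given port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"ws://{host}:{port}"

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something already accepts TCP connections on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds.
        return sock.connect_ex((host, port)) == 0

print(build_ws_url(8767))  # ws://localhost:8767
```

Run `port_in_use` while Yomine is closed; once Yomine's server is running on the port, the check will report it as in use.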
Yomine can also integrate directly with MPV player for timestamp navigation, providing an alternative to ASBPlayer.
Setup:
- Start MPV with IPC server enabled:
  mpv --input-ipc-server=/tmp/mpv-socket your-video-file.mkv
- Yomine will automatically detect when MPV is running and switch to MPV mode
- When MPV is detected, the WebSocket server will be automatically stopped
- When MPV is closed, Yomine will automatically restart the WebSocket server for ASBPlayer
Note: You can add input-ipc-server=/tmp/mpv-socket to your MPV configuration file to enable IPC by default.
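To check that the IPC socket works independently of Yomine, you can send MPV a command yourself. MPV's JSON IPC protocol accepts newline-terminated JSON objects with a `command` array. The sketch below builds a `seek` command and sends it over the Unix socket. The helper names are hypothetical, and the exact commands Yomine sends are not stated in this document.

```python
import json
import socket

def build_seek_command(seconds: float) -> bytes:
    """Encode an mpv JSON IPC command that seeks to an absolute time."""
    message = {"command": ["seek", seconds, "absolute"]}
    # Each IPC message must end with a newline.
    return (json.dumps(message) + "\n").encode("utf-8")

def send_seek(seconds: float, socket_path: str = "/tmp/mpv-socket") -> None:
    """Send the seek command over mpv's Unix socket (Linux/macOS only)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(build_seek_command(seconds))

print(build_seek_command(62.5))  # b'{"command": ["seek", 62.5, "absolute"]}\n'

# With MPV running as shown above, jump to 1 minute 2.5 seconds:
# send_seek(62.5)
```

On Windows, MPV exposes the IPC server as a named pipe rather than a Unix socket, so `send_seek` as written applies to Linux and macOS only.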
The ignore list lets you hide terms you don't want to see from your mining results.
Adding Terms to Ignore List:
- Right-click any term in the main vocabulary table
- Select "Add to Ignore List"
- The term will be hidden from future mining sessions
Managing the Ignore List:
- Go to Settings → Ignore List Settings
- View all ignored terms in the list
- Remove terms by clicking the red "x"
- Anki Integration Customization
- Prebuilt Binaries
- Multi-Sentence Browsing - View multiple example sentences per term
- Ignore List - Hide unwanted terms from mining results
- Comprehensibility Scoring - Sentence difficulty estimation based on Anki intervals
- Advanced Filtering - Filter by part-of-speech and frequency ranges
- Custom Frequency Lists: Generate dictionaries from your own content
- Improved Segmentation: Better text parsing and part-of-speech tagging
- More File Types: Support for eBooks, web pages, etc.
What is vocabulary mining?
It's the process of extracting unknown words and expressions from native content (videos, books, etc.) to create targeted study materials. This approach focuses on vocabulary that's relevant to content you want to understand, rather than studying random word lists.
How should I use this tool?
I prefer post-input mining: after watching a video or episode, I add it to a todo list. Then, whenever I have time, I can review the content and extract terms I want to add to my Anki mining deck. This helps me stay focused on enjoying the content while watching, knowing I can come back to mine vocabulary later.
Yomine?
The name comes from 読み ("yomi" for reading) + "mine" (as in mining vocabulary).
Prerequisites:
- Rust with Cargo
Steps:
git clone https://github.com/mcgrizzz/Yomine.git
cd Yomine
cargo build --release
cargo run --release

Yomine is licensed under MIT OR Apache-2.0
Author and maintainer: @mcgrizzz
Key Dependencies:
- Vibrato for text segmentation - MIT or Apache-2.0
- egui for user interface - MIT or Apache-2.0
- WanaKana Rust for Japanese text utilities - MIT
- jp-deinflector for Japanese deinflection
- Noto Sans/Serif JP fonts - SIL Open Font License
- Thanks to https://github.com/r-40021/noto-sans-jp for the converted font without intersection issues.
Happy Mining! ⛏️ 頑張りましょう!


