Pass HanziToAnki a file via the command line or the web interface to generate flashcards for Anki, Pleco, or Memrise.
You can generate the flashcards from any Chinese input - a news article you're studying, song lyrics, or even your exported WeChat logs!
HanziToAnki supports two modes: API Server (web interface) and CLI Tool (command-line). The same JAR/Gradle build supports both modes automatically.
API Server (web interface):
Run with JAR:
java -jar build/libs/HanziToAnki-1.0.0.jar
Run with Gradle:
./gradlew runApi
Then open http://localhost:8080/ in your browser to upload files and generate decks.
CLI Tool (command-line):
Run with JAR:
java -jar build/libs/HanziToAnki-1.0.0.jar input.txt
java -jar build/libs/HanziToAnki-1.0.0.jar input.txt -f ANKI -o output.anki
Run with Gradle:
./gradlew runCli -Pargs='input.txt'
./gradlew runCli -Pargs='input.txt -f ANKI -o output.anki'
Command-line options:
-w --word-list: Read from an input file containing a list of words, separated by line breaks. Without this flag, individual characters are extracted.
-s --single-characters: Extract only single characters from the file.
-hsk <hsk level>: Remove any words in HSK levels up to and including the given one.
-t --strategy <strategy>: Specify the word-finding strategy. Options:
  6 - ANSJ_SEGMENTATION (default): Intelligent word segmentation using ANSJ (Jieba-like)
  0 - TRI_BI_MONOGRAMS_USE_ALL_CHARS_BIGRAM_OVERLAP: 3-char, 2-char, 1-char combinations
  1 - TRI_BI_MONOGRAMS_USE_ALL_CHARS: Only 3-char combinations
  2 - BIGRAM_AND_MONOGRAM_ONLY_NO_OVERLAP: 2-char and 1-char, no overlap
  3 - BIGRAM_AND_MONOGRAM_ONLY_OVERLAP: 2-char and 1-char with overlap
  4 - SINGLE_CHAR_ONLY: Single characters only
  5 - ALL_COMBINATIONS: All possible n-gram combinations
-o <output filename>: Override the default output file name.
-f --format <output format>: Override the default output format (ANKI, PLECO, MEMRISE).
-c --char-type <char type>: Specify the type of character (TRAD, SIMP or SIMP_AND_TRAD).
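To make the difference between the "overlap" and "no overlap" bigram strategies concrete, here is a minimal sketch of how candidate bigrams could be extracted from a run of hanzi. It is an illustration only: the class and method names are hypothetical and are not taken from the HanziToAnki source, and the real strategies may treat punctuation, dictionary lookups, and monograms differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not HanziToAnki's actual implementation.
public class NgramSketch {

    /** Splits text into code points so hanzi outside the BMP stay intact. */
    public static List<String> chars(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }

    /** Overlapping bigrams: every adjacent pair of characters. */
    public static List<String> overlappingBigrams(String text) {
        List<String> cs = chars(text);
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < cs.size(); i++) {
            out.add(cs.get(i) + cs.get(i + 1));
        }
        return out;
    }

    /**
     * Non-overlapping bigrams: step two characters at a time.
     * A trailing odd character is kept as a single-character candidate.
     */
    public static List<String> nonOverlappingBigrams(String text) {
        List<String> cs = chars(text);
        List<String> out = new ArrayList<>();
        for (int i = 0; i < cs.size(); i += 2) {
            out.add(i + 1 < cs.size() ? cs.get(i) + cs.get(i + 1) : cs.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "我喜欢学习";
        System.out.println(overlappingBigrams(text));    // [我喜, 喜欢, 欢学, 学习]
        System.out.println(nonOverlappingBigrams(text)); // [我喜, 欢学, 习]
    }
}
```

In this example the overlapping variant finds 喜欢 and 学习, while the non-overlapping variant misses both because of where the two-character windows happen to fall. That trade-off (more candidates versus fewer cards) is presumably why several strategies are offered.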
Features:
- Both simplified and traditional hanzi can be provided
- English definitions provided by CC-CEDICT
- Pinyin has accented characters (nǐ hǎo instead of ni3 hao3)
- Pinyin is coloured by HTML markup
- Can ignore vocabulary below specified HSK level (reduces card count & saves time)
- Fully Open-Source, so you can contribute features, create issue tickets on GitHub, or even help fix those issues!
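The two pinyin features above (accented characters and HTML colouring) can be illustrated with a short sketch. It converts a numbered syllable such as hao3 into its accented form using the standard tone-mark placement rules (a or e takes the mark, otherwise the o in ou, otherwise the last vowel) and wraps it in a coloured span. The class name, the method names, and the colour palette are assumptions made for illustration; HanziToAnki's actual markup and colours may differ. The sketch also lowercases its input, which a real implementation would probably avoid for proper nouns.

```java
import java.util.Locale;

// Hypothetical illustration; not HanziToAnki's actual implementation.
public class PinyinSketch {
    private static final String VOWELS = "aeiouü";
    // Rows follow the order of VOWELS; columns are tones 1 to 4.
    private static final String[] MARKED = {"āáǎà", "ēéěè", "īíǐì", "ōóǒò", "ūúǔù", "ǖǘǚǜ"};
    // Assumed palette for tones 1 to 5 (5 = neutral tone).
    private static final String[] COLOURS = {"red", "orange", "green", "blue", "grey"};

    /** Returns the tone number 1-5, treating a missing digit as neutral (5). */
    public static int toneOf(String syllable) {
        if (syllable.isEmpty()) return 5;
        char last = syllable.charAt(syllable.length() - 1);
        return (last >= '1' && last <= '5') ? last - '0' : 5;
    }

    /** Index of the vowel that carries the tone mark, or -1 if there is none. */
    static int markPosition(String body) {
        int a = body.indexOf('a');
        if (a >= 0) return a;
        int e = body.indexOf('e');
        if (e >= 0) return e;
        int ou = body.indexOf("ou");
        if (ou >= 0) return ou;
        for (int i = body.length() - 1; i >= 0; i--) {
            if (VOWELS.indexOf(body.charAt(i)) >= 0) return i;
        }
        return -1;
    }

    /** Converts one numbered syllable (e.g. "hao3", "lu:4") to accented pinyin. */
    public static String accent(String syllable) {
        // "u:" is the CC-CEDICT spelling of ü; "v" is a common keyboard substitute.
        String s = syllable.toLowerCase(Locale.ROOT).replace("u:", "ü").replace('v', 'ü');
        if (s.isEmpty()) return s;
        char last = s.charAt(s.length() - 1);
        boolean hasDigit = last >= '1' && last <= '5';
        String body = hasDigit ? s.substring(0, s.length() - 1) : s;
        int tone = toneOf(s);
        int pos = markPosition(body);
        if (tone == 5 || pos < 0) return body; // neutral tone carries no mark
        char marked = MARKED[VOWELS.indexOf(body.charAt(pos))].charAt(tone - 1);
        return body.substring(0, pos) + marked + body.substring(pos + 1);
    }

    /** Wraps the accented syllable in a span coloured by tone. */
    public static String colourHtml(String syllable) {
        String colour = COLOURS[toneOf(syllable) - 1];
        return "<span style=\"color:" + colour + "\">" + accent(syllable) + "</span>";
    }

    public static void main(String[] args) {
        System.out.println(accent("ni3") + " " + accent("hao3")); // nǐ hǎo
        System.out.println(colourHtml("hao3"));
    }
}
```

Colouring by tone lets a learner see the tone pattern of a word at a glance, which is the usual motivation for this kind of markup on flashcards.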
Please feel free to make suggestions, open/comment on issues, or share code!
Feel free to fork, create branches, and raise PRs.
If you get an error about "Invalid source release", run echo $JAVA_HOME and check that it points to a JDK that is new enough for the project (Java 21). We recommend sdkman for setting up Java.
The project uses Java 21 and Gradle 8.9.
Build the dual-mode JAR (supports both API and CLI):
./gradlew build
The JAR will be at build/libs/HanziToAnki-1.0.0.jar
See Running HanziToAnki above for all running options.
Quick reference:
- API server:
  ./gradlew runApi
  or
  java -jar build/libs/HanziToAnki-1.0.0.jar
- CLI tool:
  ./gradlew runCli -Pargs='input.txt'
  or
  java -jar build/libs/HanziToAnki-1.0.0.jar input.txt
Run all tests:
./gradlew test
The project includes:
- 40+ unit tests for core functionality
- 15+ integration tests for output validation with modern Chinese fixtures
- Tests for all 7 word segmentation strategies
You can also run tests with IntelliJ or other IDEs.
This project uses a modified version of the CEDICT Chinese dictionary, which can be found here: https://cc-cedict.org/wiki/