This directory contains benchmarking scripts to compare Rustling (Rust + PyO3) against other Python packages with similar functionalities.
GitHub: https://github.com/jacksonllee/rustling/tree/main/benchmarks
```
benchmarks/
├── README.md
├── run_chat.py                   # CHAT parsing benchmark (Rustling vs pylangacq)
├── run_conllu.py                 # CoNLL-U parsing benchmark (Rustling vs conllu)
├── run_elan.py                   # ELAN parsing benchmark (Rustling vs pympi-ling)
├── run_textgrid.py               # TextGrid parsing benchmark (Rustling vs pympi-ling)
├── run_hmm.py                    # HMM benchmark (Rustling vs hmmlearn)
├── run_lm.py                     # Language model benchmark (Rustling vs NLTK)
├── run_wordseg.py                # Word segmentation benchmark (Rustling vs wordseg)
├── run_perceptron_pos_tagger.py  # POS tagger benchmark (Rustling vs NLTK PerceptronTagger)
├── update_readme.py              # Update benchmark tables in README files
└── common/
    ├── __init__.py
    └── data.py                   # Shared HKCanCor data loader
```
Most benchmarks use the HKCanCor corpus (~10K Cantonese sentences with POS tags), loaded via pycantonese. The shared data loader in common/data.py converts the corpus into the format each benchmark needs:
- Tagging: tagged sentences `[(word, tag), ...]` for training, untagged word lists for testing
- Word segmentation: word tuples for training, concatenated strings for testing
- HMM: word sequences (tags stripped) for unsupervised Baum-Welch EM training and Viterbi decoding
- Language models: word sequences (tags stripped)
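The conversions above can be sketched in a few lines, assuming tagged sentences are lists of `(word, tag)` tuples; the function names here are illustrative, not the actual `common/data.py` API:

```python
# Illustrative sketch of the data conversions; not the real common/data.py code.

def strip_tags(tagged_sents):
    """Tagged sentences -> plain word sequences (HMM and LM benchmarks)."""
    return [[word for word, _tag in sent] for sent in tagged_sents]

def wordseg_splits(tagged_sents):
    """Word tuples for training, concatenated strings for testing."""
    train = [tuple(word for word, _tag in sent) for sent in tagged_sents]
    test = ["".join(words) for words in train]
    return train, test

tagged = [[("香港", "NS"), ("人", "NG")], [("你", "R"), ("好", "A")]]
print(strip_tags(tagged))   # [['香港', '人'], ['你', '好']]
_, test = wordseg_splits(tagged)
print(test)                 # ['香港人', '你好']
```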
The CoNLL-U benchmark uses the UD_English-EWT treebank (English Universal Dependencies data), auto-downloaded to ~/.rustling/ud-english-ewt/.
The ELAN benchmark uses the CantoMap corpus (Cantonese conversation data with ELAN annotations), auto-downloaded to ~/.rustling/cantomap/.
The TextGrid benchmark uses TextGrid files generated from the CantoMap ELAN data via rustling.elan.ELAN.to_textgrid_files(), cached at ~/.rustling/cantomap_textgrid/.
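The download-and-cache behavior described above follows a common pattern, sketched below; the helper name and signature are hypothetical, not the benchmarks' actual download code:

```python
# Hypothetical sketch of the auto-download-and-cache pattern; illustrative only.
from pathlib import Path
from urllib.request import urlretrieve

def cached_download(url, subdir, filename, base=None):
    """Fetch url into <base>/<subdir>/<filename> once; reuse the cached copy.

    base defaults to ~/.rustling, matching the cache locations above.
    """
    base = Path(base) if base is not None else Path.home() / ".rustling"
    target = base / subdir / filename
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, target)
    return target
```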
```
# Build Rustling (from repo root)
uv run maturin develop --release

# Install benchmark dependencies
uv sync --group benchmarks
```

| Benchmark | Comparison Library |
|---|---|
| CHAT Parsing | pylangacq |
| CoNLL-U Parsing | conllu |
| ELAN Parsing | pympi-ling Eaf |
| TextGrid Parsing | pympi-ling TextGrid |
| HMM | hmmlearn CategoricalHMM |
| Word Segmentation | wordseg |
| POS Tagging | NLTK PerceptronTagger |
| Language Models | NLTK nltk.lm |
All benchmarks degrade gracefully if a comparison library is not installed.
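One common way to implement that graceful degradation is an optional-import helper along these lines; this is an illustrative pattern, not the scripts' actual code:

```python
# Illustrative optional-import pattern for skipping a missing comparison library.
import importlib

def optional_import(name):
    """Return the named module if importable, else None (skip that comparison)."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None

hmmlearn = optional_import("hmmlearn")
if hmmlearn is None:
    print("hmmlearn not installed; skipping the hmmlearn comparison")
```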
Benchmarked against Python implementations from NLTK, wordseg (v0.0.5), pylangacq (v0.19.1), hmmlearn (v0.3.3), pympi-ling (v1.70.2), and conllu (v6.0.0).
| Component | Task | Speedup | vs. |
|---|---|---|---|
| Language Models | Fit | 11x | NLTK |
| | Score | 2x | NLTK |
| | Generate | 86–107x | NLTK |
| Word Segmentation | LongestStringMatching | 9x | wordseg |
| POS Tagging | Training | 5x | NLTK |
| | Tagging | 17x | NLTK |
| HMM | Fit | 14x | hmmlearn |
| | Predict | 0.9x | hmmlearn |
| | Score | 5x | hmmlearn |
| CHAT Parsing | Reading from a ZIP archive | 30x | pylangacq |
| | Reading from strings | 35x | pylangacq |
| | Parsing utterances | 15x | pylangacq |
| | Parsing tokens | 8x | pylangacq |
| ELAN Parsing | Parse single file | 4x | pympi-ling |
| | Parse all files | 17x | pympi-ling |
| TextGrid Parsing | Parse single file | 3x | pympi-ling |
| | Parse all files | 8x | pympi-ling |
| CoNLL-U Parsing | Parse from strings | 15x | conllu |
| | Parse from files | 15x | conllu |
Each script supports `--quick` (fewer iterations), `--export FILE` (JSON output), and `--quiet`:
```
python benchmarks/run_chat.py
python benchmarks/run_conllu.py
python benchmarks/run_elan.py
python benchmarks/run_textgrid.py
python benchmarks/run_hmm.py
python benchmarks/run_wordseg.py
python benchmarks/run_perceptron_pos_tagger.py
python benchmarks/run_lm.py
```

After running benchmarks with `--export`, update the performance table in benchmarks/README.md:
```
python benchmarks/run_chat.py --export benchmarks/.results/chat.json
python benchmarks/run_conllu.py --export benchmarks/.results/conllu.json
python benchmarks/run_elan.py --export benchmarks/.results/elan.json
python benchmarks/run_textgrid.py --export benchmarks/.results/textgrid.json
python benchmarks/run_hmm.py --export benchmarks/.results/hmm.json
python benchmarks/run_wordseg.py --export benchmarks/.results/wordseg.json
python benchmarks/run_perceptron_pos_tagger.py --export benchmarks/.results/tagger.json
python benchmarks/run_lm.py --export benchmarks/.results/lm.json
python benchmarks/update_readme.py --from-json benchmarks/.results/
```

- Use `--release` when building Rustling for accurate benchmarks: `maturin develop --release`
- Close other applications to reduce noise
- Run multiple times to verify consistency
- Use `--quiet` with `--export` for machine-readable output only
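The shared flags could be wired up with `argparse` roughly as follows; this is a sketch, and the actual scripts' CLI code may differ:

```python
# Sketch of a benchmark script exposing the shared flags
# (--quick, --export FILE, --quiet); illustrative, not the real CLI code.
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run one benchmark.")
    parser.add_argument("--quick", action="store_true",
                        help="run fewer iterations")
    parser.add_argument("--export", metavar="FILE",
                        help="write results as JSON to FILE")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress human-readable output")
    return parser.parse_args(argv)

args = parse_args(["--quick", "--export", "out.json"])
print(args.quick, args.export, args.quiet)  # True out.json False
```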