# Benchmarks

This directory contains benchmarking scripts that compare Rustling (Rust + PyO3) against other Python packages with similar functionality.

GitHub: https://github.com/jacksonllee/rustling/tree/main/benchmarks

## Directory Structure

```text
benchmarks/
├── README.md
├── run_chat.py        # CHAT parsing benchmark (Rustling vs pylangacq)
├── run_conllu.py      # CoNLL-U parsing benchmark (Rustling vs conllu)
├── run_elan.py        # ELAN parsing benchmark (Rustling vs pympi-ling)
├── run_textgrid.py    # TextGrid parsing benchmark (Rustling vs pympi-ling)
├── run_hmm.py         # HMM benchmark (Rustling vs hmmlearn)
├── run_lm.py          # Language model benchmark (Rustling vs NLTK)
├── run_wordseg.py     # Word segmentation benchmark (Rustling vs wordseg)
├── run_perceptron_pos_tagger.py  # POS tagger benchmark (Rustling vs NLTK PerceptronTagger)
├── update_readme.py   # Update benchmark tables in README files
└── common/
    ├── __init__.py
    └── data.py        # Shared HKCanCor data loader
```

## Data Sources

Most benchmarks use the HKCanCor corpus (~10K Cantonese sentences with POS tags), loaded via pycantonese. The shared data loader in `common/data.py` converts the corpus into the format each benchmark needs:

- Tagging: tagged sentences `[(word, tag), ...]` for training, untagged word lists for testing
- Word segmentation: word tuples for training, concatenated strings for testing
- HMM: word sequences (tags stripped) for unsupervised Baum–Welch EM training and Viterbi decoding
- Language models: word sequences (tags stripped)
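The conversions above can be sketched roughly as follows. This is illustrative only: the function names are hypothetical and not the actual `common/data.py` API, and the input is assumed to be tagged sentences of `(word, tag)` pairs.

```python
# Illustrative conversions from tagged sentences [(word, tag), ...] into the
# shapes each benchmark consumes. Function names are hypothetical, not the
# real common/data.py API.

def tagging_splits(tagged_sents):
    """Tagging: keep (word, tag) pairs for training; untagged lists for test."""
    train = [list(sent) for sent in tagged_sents]
    test = [[word for word, _tag in sent] for sent in tagged_sents]
    return train, test

def wordseg_splits(tagged_sents):
    """Word segmentation: word tuples for training, concatenated strings for test."""
    train = [tuple(word for word, _tag in sent) for sent in tagged_sents]
    test = ["".join(words) for words in train]
    return train, test

def word_sequences(tagged_sents):
    """HMM and language-model benchmarks: word sequences with the tags stripped."""
    return [[word for word, _tag in sent] for sent in tagged_sents]

sents = [[("我", "PRON"), ("食", "VERB"), ("飯", "NOUN")]]
train, test = wordseg_splits(sents)
# train == [("我", "食", "飯")], test == ["我食飯"]
```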

The CoNLL-U benchmark uses the UD_English-EWT treebank (English Universal Dependencies data), auto-downloaded to `~/.rustling/ud-english-ewt/`.

The ELAN benchmark uses the CantoMap corpus (Cantonese conversation data with ELAN annotations), auto-downloaded to `~/.rustling/cantomap/`.

The TextGrid benchmark uses TextGrid files generated from the CantoMap ELAN data via `rustling.elan.ELAN.to_textgrid_files()`, cached at `~/.rustling/cantomap_textgrid/`.
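The download-once, cache-under-`~/.rustling/` pattern described above can be sketched as below. This is a minimal illustration, not the benchmark scripts' actual code; the `fetch` callable stands in for whatever download-and-extract step a given dataset needs.

```python
from pathlib import Path

# Sketch of the cache-on-first-use pattern: datasets land under
# ~/.rustling/<name>/ and the download step only runs when the
# directory is missing. Illustrative, not the real implementation.

def cached_dataset(name: str, fetch, root=None) -> Path:
    """Return the cache directory for `name`, populating it on first use."""
    root = root if root is not None else Path.home() / ".rustling"
    target = root / name
    if not target.exists():
        target.mkdir(parents=True)
        fetch(target)  # e.g. download and extract the corpus into `target`
    return target
```

On a second call with the same `name`, the existing directory is returned without re-downloading.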

## Prerequisites

```shell
# Build Rustling (from the repo root)
uv run maturin develop --release

# Install benchmark dependencies
uv sync --group benchmarks
```

## Comparison Libraries

| Benchmark | Comparison Library |
| --- | --- |
| CHAT Parsing | pylangacq |
| CoNLL-U Parsing | conllu |
| ELAN Parsing | pympi-ling `Eaf` |
| TextGrid Parsing | pympi-ling `TextGrid` |
| HMM | hmmlearn `CategoricalHMM` |
| Word Segmentation | wordseg |
| POS Tagging | NLTK `PerceptronTagger` |
| Language Models | NLTK `nltk.lm` |

All benchmarks degrade gracefully if a comparison library is not installed.
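A common way to get this graceful degradation is a guarded import, sketched below. This is illustrative; the actual scripts may structure the check differently.

```python
# If a comparison library is missing, skip its side of the benchmark
# instead of crashing. (Illustrative pattern, not the scripts' exact code.)
try:
    import hmmlearn.hmm as comparison_hmm
except ImportError:
    comparison_hmm = None

def run_comparison():
    if comparison_hmm is None:
        print("hmmlearn not installed; skipping comparison benchmark")
        return None
    # ... otherwise, run the hmmlearn side of the benchmark here ...
```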

## Results

Benchmarked against Python implementations from NLTK, wordseg (v0.0.5), pylangacq (v0.19.1), hmmlearn (v0.3.3), pympi-ling (v1.70.2), and conllu (v6.0.0).

| Component | Task | Speedup | vs. |
| --- | --- | --- | --- |
| Language Models | Fit | 11x | NLTK |
| | Score | 2x | NLTK |
| | Generate | 86–107x | NLTK |
| Word Segmentation | LongestStringMatching | 9x | wordseg |
| POS Tagging | Training | 5x | NLTK |
| | Tagging | 17x | NLTK |
| HMM | Fit | 14x | hmmlearn |
| | Predict | 0.9x | hmmlearn |
| | Score | 5x | hmmlearn |
| CHAT Parsing | Reading from a ZIP archive | 30x | pylangacq |
| | Reading from strings | 35x | pylangacq |
| | Parsing utterances | 15x | pylangacq |
| | Parsing tokens | 8x | pylangacq |
| ELAN Parsing | Parse single file | 4x | pympi-ling |
| | Parse all files | 17x | pympi-ling |
| TextGrid Parsing | Parse single file | 3x | pympi-ling |
| | Parse all files | 8x | pympi-ling |
| CoNLL-U Parsing | Parse from strings | 15x | conllu |
| | Parse from files | 15x | conllu |

## Running Benchmarks

Each script supports `--quick` (fewer iterations), `--export FILE` (JSON output), and `--quiet`:

```shell
python benchmarks/run_chat.py
python benchmarks/run_conllu.py
python benchmarks/run_elan.py
python benchmarks/run_textgrid.py
python benchmarks/run_hmm.py
python benchmarks/run_wordseg.py
python benchmarks/run_perceptron_pos_tagger.py
python benchmarks/run_lm.py
```
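The shared command-line interface can be sketched with `argparse` as below. This is an illustrative reconstruction of the three documented flags, not the scripts' actual parser, which may define more options.

```python
import argparse

# Sketch of the flags every benchmark script is documented to accept.
def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run a Rustling benchmark")
    parser.add_argument("--quick", action="store_true",
                        help="run fewer iterations for a fast sanity check")
    parser.add_argument("--export", metavar="FILE",
                        help="write results as JSON to FILE")
    parser.add_argument("--quiet", action="store_true",
                        help="suppress human-readable output")
    return parser

args = make_parser().parse_args(["--quick", "--export", "out.json"])
# args.quick is True, args.export == "out.json", args.quiet is False
```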

## Updating Benchmark Tables

After running the benchmarks with `--export`, update the performance table in `benchmarks/README.md`:

```shell
python benchmarks/run_chat.py --export benchmarks/.results/chat.json
python benchmarks/run_conllu.py --export benchmarks/.results/conllu.json
python benchmarks/run_elan.py --export benchmarks/.results/elan.json
python benchmarks/run_textgrid.py --export benchmarks/.results/textgrid.json
python benchmarks/run_hmm.py --export benchmarks/.results/hmm.json
python benchmarks/run_wordseg.py --export benchmarks/.results/wordseg.json
python benchmarks/run_perceptron_pos_tagger.py --export benchmarks/.results/tagger.json
python benchmarks/run_lm.py --export benchmarks/.results/lm.json

python benchmarks/update_readme.py --from-json benchmarks/.results/
```
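The JSON-to-table step can be sketched as below. The JSON schema here (`component`, `results`, `task`, `speedup`, `baseline`) is a hypothetical assumption for illustration; the real `update_readme.py` and its exported JSON layout may differ.

```python
import json
from pathlib import Path

# Illustrative sketch: read every exported JSON file in a results directory
# and emit one markdown table row per (component, task) result. The schema
# is assumed, not the actual export format.

def table_rows(results_dir: Path) -> list[str]:
    rows = []
    for path in sorted(results_dir.glob("*.json")):
        data = json.loads(path.read_text())
        for r in data["results"]:
            rows.append(
                f"| {data['component']} | {r['task']} "
                f"| {r['speedup']}x | {r['baseline']} |"
            )
    return rows
```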

## Tips

- Build Rustling with `--release` for accurate benchmarks: `maturin develop --release`
- Close other applications to reduce noise
- Run each benchmark multiple times to verify consistency
- Combine `--quiet` with `--export` for machine-readable output only