A comprehensive cross-language testing framework for ONNX models with support for Binary Classification (sentiment analysis), Multiclass Classification (topic classification), and Multiclass Sigmoid (emotion classification) across 8 programming languages.
Run tests for specific models and languages with custom text input:
- ✅ Flexible: Choose any combination of model type + language
- ✅ Custom Input: Test with your own text
- ✅ Detailed Output: Comprehensive performance analysis
- ✅ Manual Dispatch: Run on-demand with custom parameters
Run all 24 combinations automatically with standardized inputs:
- ✅ Complete Coverage: Tests 3 models × 8 languages = 24 combinations
- ✅ Standardized: Uses consistent test inputs for comparison
- ✅ Automated: Runs on push/PR + manual dispatch available
- ✅ Performance Comparison: Easy to compare across languages
- Task: Positive vs Negative sentiment detection
- Architecture: Sigmoid activation, TF-IDF preprocessing
- Input: `[1, 5000]` TF-IDF feature vector
- Output: Single probability score (0.0-1.0)
- Task: News topic categorization (Business, Health, Politics, Sports)
- Architecture: Softmax activation, token-based preprocessing
- Input: `[1, 30]` tokenized sequence
- Output: 4-class probability distribution
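The `[1, 30]` input shape implies a fixed-length token-id sequence. A hedged sketch of that preprocessing step; the vocabulary, unknown-token id, and padding id below are illustrative assumptions, not the framework's actual values:

```python
def tokenize(text, vocab, seq_len=30, unk_id=1, pad_id=0):
    # Map words to vocabulary ids, then pad/truncate to the fixed [1, 30] shape
    ids = [vocab.get(word, unk_id) for word in text.lower().split()]
    ids = ids[:seq_len] + [pad_id] * max(0, seq_len - len(ids))
    return [ids]  # shape [1, seq_len]

# Hypothetical toy vocabulary for illustration only
sample = tokenize("NBA Finals Celtics win", {"nba": 5, "win": 9})
```

Whatever the real vocabulary looks like, the result is always a `1 × 30` integer batch ready for the model's input tensor.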
- Task: Multi-label emotion detection (fear, happy, love, sadness)
- Architecture: Multi-label sigmoid, keyword-based detection
- Input: `[1, 5000]` feature vector (simplified approach)
- Output: Independent probabilities for each emotion
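The three model types above differ mainly in their final activation: one sigmoid probability, a softmax distribution, or independent per-label sigmoids. A minimal pure-stdlib sketch (the logit values are made up for illustration; they are not real model outputs):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]

# Binary: one logit -> one probability (0.0-1.0)
positive_prob = sigmoid(2.0)

# Multiclass: four logits -> distribution over Business/Health/Politics/Sports
topic_probs = softmax([0.1, 0.2, 3.0, 0.4])

# Multiclass sigmoid: an independent probability per emotion label
emotion_probs = {label: sigmoid(logit) for label, logit in
                 {"fear": 2.5, "happy": -1.0, "love": -2.0, "sadness": 0.3}.items()}
```

Note the key difference: `topic_probs` sums to 1.0 (exactly one topic wins), while the entries of `emotion_probs` are independent, so several emotions can exceed a 0.5 threshold at once.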
Every test run provides standardized output in this format:

```
🤖 ONNX [BINARY/MULTICLASS/MULTICLASS SIGMOID] CLASSIFIER - [LANGUAGE] IMPLEMENTATION
===============================================================================
🔄 Processing: [Test Text]

💻 SYSTEM INFORMATION:
   Platform: Linux/macOS/Windows
   Processor: CPU Name
   CPU Cores: X physical, Y logical
   Total Memory: N GB
   Runtime: Language Implementation Version

📊 [SENTIMENT/TOPIC/EMOTION] ANALYSIS RESULTS:
   🏆 Predicted [Sentiment/Topic/Emotion]: POSITIVE/NEGATIVE, POLITICS/TECH/etc., or fear/happy/love/sadness
   📈 Confidence: XX.XX% (0.XXXX)
   📝 Input Text: "Your test text here"

📊 PERFORMANCE SUMMARY:
   Total Processing Time: Tms
   ├─ Preprocessing: Xms (X%)
   ├─ Model Inference: Yms (Y%)
   └─ Postprocessing: Zms (Z%)

🚀 THROUGHPUT:
   Texts per second: TPS

💾 RESOURCE USAGE:
   Memory Start: MB
   Memory End: MB
   Memory Delta: +MB
   CPU Usage: avg% avg, peak% peak (N samples)

🎯 PERFORMANCE RATING: 🚀 EXCELLENT / ✅ GOOD / ⚠️ ACCEPTABLE / 🐌 SLOW
   (Tms total - Target: <100ms)
```
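The per-stage percentages in PERFORMANCE SUMMARY come from wall-clock timing around each stage. A hypothetical sketch of that instrumentation (the stage lambdas are stand-ins, not the framework's actual preprocessing, inference, or postprocessing code):

```python
import time

def timed(fn, *args):
    # Run fn and return (result, elapsed milliseconds)
    start = time.perf_counter()
    result = fn(*args)
    return result, (time.perf_counter() - start) * 1000.0

# Placeholder stage functions standing in for a real implementation
_, pre_ms = timed(lambda text: text.lower().split(), "Your test text here")
_, inf_ms = timed(lambda: sum(i * i for i in range(100_000)))
_, post_ms = timed(lambda: max(0.1, 0.9))

total_ms = pre_ms + inf_ms + post_ms
shares = {stage: 100.0 * ms / total_ms for stage, ms in
          [("preprocessing", pre_ms), ("inference", inf_ms), ("postprocessing", post_ms)]}
```

The three percentages always sum to 100%, which is what makes the breakdown comparable across the eight language implementations.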
- Binary Classifier: "Congratulations! You've won a free iPhone - click here to claim your prize now!" (sentiment analysis)
- Multiclass Classifier: "NBA Finals: Celtics Defeat Mavericks in Game 5 to Win Championship" (topic classification)
- Multiclass Sigmoid: "I'm terrified of what might happen" (emotion classification)
| Language | Binary Classifier | Multiclass Classifier | Multiclass Sigmoid | Status |
|---|---|---|---|---|
| Python | ✅ | ✅ | ✅ | Full Support |
| Java | ✅ | ✅ | ✅ | Full Support |
| C++ | ✅ | ✅ | ✅ | Full Support |
| C | ✅ | ✅ | ✅ | Full Support |
| Node.js | ✅ | ✅ | ✅ | Full Support |
| Rust | ✅ | ✅ | ✅ | Full Support |
| Dart/Flutter | ✅ | ✅ | ✅ | Full Support |
| Swift | ✅ | ✅ | ✅ | Full Support |
```bash
git clone https://github.com/your-org/whitelightning-test.git
cd whitelightning-test
```

Place your ONNX models in the appropriate directories:
```
tests/
├── binary_classifier/
│   ├── python/
│   │   ├── model.onnx      # Your binary classification model
│   │   ├── vocab.json      # Vocabulary file
│   │   └── scaler.json     # Preprocessing scaler
│   ├── java/
│   ├── cpp/
│   └── [other languages]/
├── multiclass_classifier/
│   ├── python/
│   │   ├── model.onnx      # Your multiclass model
│   │   ├── vocab.json      # Vocabulary file
│   │   └── scaler.json     # Preprocessing scaler
│   └── [other languages]/
└── multiclass_sigmoid/
    ├── python/
    │   ├── model.onnx      # Your multiclass sigmoid model
    │   ├── vocab.json      # Vocabulary file (if applicable)
    │   └── scaler.json     # Preprocessing scaler (if applicable)
    └── [other languages]/
```
Each model type has comprehensive documentation:
- 📖 `tests/binary_classifier/README.md` - Binary classification guide
- 📖 `tests/multiclass_classifier/README.md` - Multiclass classification guide
- 📖 `tests/multiclass_sigmoid/README.md` - Multiclass sigmoid guide
Each language implementation has its own README with specific setup instructions:
- 📖 `tests/[model_type]/python/README.md` - Python setup
- 📖 `tests/[model_type]/java/README.md` - Java setup
- 📖 `tests/[model_type]/cpp/README.md` - C++ setup
- 📖 `tests/[model_type]/c/README.md` - C setup
- 📖 `tests/[model_type]/nodejs/README.md` - Node.js setup
- 📖 `tests/[model_type]/rust/README.md` - Rust setup
- 📖 `tests/[model_type]/dart/README.md` - Dart/Flutter setup
- 📖 `tests/[model_type]/swift/README.md` - Swift/iOS setup
- 📖 `tests/[model_type]/javascript/README.md` - Client-side JavaScript/HTML setup
Navigate to any language directory and run the tests:
```bash
# Example: Test Python binary classifier
cd tests/binary_classifier/python
python test_onnx_model.py "Your custom text here"

# Example: Test Rust multiclass classifier
cd tests/multiclass_classifier/rust
cargo run --release -- "Your custom text here"

# Example: Test Node.js multiclass sigmoid
cd tests/multiclass_sigmoid/nodejs
node test_onnx_model.js "Your custom text here"
```

- Go to Actions → ONNX Model Tests
- Click Run workflow
- Select:
  - Model Type: `binary_classifier`, `multiclass_classifier`, or `multiclass_sigmoid`
  - Language: `python`, `java`, `cpp`, `c`, `nodejs`, `rust`, `dart`, or `swift`
  - Custom Text: Your test input (optional)
- Go to Actions → Comprehensive ONNX Tests
- Click Run workflow (uses standard test inputs)
- View results for all 24 language-model combinations
Each implementation expects these files:
- `model.onnx` - Your trained ONNX model
- `vocab.json` - Vocabulary mapping for text preprocessing (if applicable)
- `scaler.json` - Feature scaling parameters or label mappings
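A hedged sketch of how an implementation might consume `vocab.json` and `scaler.json` to build the binary classifier's TF-IDF input; the file schemas shown here (token-to-index map, per-feature mean/scale arrays) are assumptions for illustration, and the actual formats may differ:

```python
import json

# Assumed schemas, inlined so the sketch is self-contained.
# In practice these would be read from the files next to model.onnx.
vocab = json.loads('{"free": 0, "iphone": 1, "click": 2}')
scaler = json.loads('{"mean": [0.1, 0.05, 0.2], "scale": [1.0, 1.0, 1.0]}')

def vectorize(text, n_features=3):
    # Count vocabulary hits, then standardize with the scaler parameters
    counts = [0.0] * n_features
    for token in text.lower().split():
        idx = vocab.get(token)
        if idx is not None:
            counts[idx] += 1.0
    return [(c - m) / s for c, m, s in
            zip(counts, scaler["mean"], scaler["scale"])]

features = vectorize("Free iPhone click here")
```

In the real framework the vector would be length 5000 to match the model's `[1, 5000]` input; the three-feature version above only demonstrates the lookup-and-scale flow.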
Each language has its own dependencies listed in:
- Python: `requirements.txt`
- Java: `pom.xml` or `build.gradle`
- C++: `CMakeLists.txt` or `Makefile`
- C: `Makefile`
- Node.js: `package.json`
- Rust: `Cargo.toml`
- Dart: `pubspec.yaml`
Git Exit Code 128 Errors: These are caused by Swift Package Manager fetching dependencies in CI:
- ✅ Not a critical issue - Tests still run successfully
- 🔧 Fixed in latest workflow - Added git configuration and caching
- 📍 Swift-specific - Only affects Swift implementations
macOS Migration Warnings: GitHub Actions informational notices:
The macos-latest label will migrate to macOS 15 beginning August 4, 2025
- ✅ Not an error - Just an informational notice
- 🔧 Fixed - Updated to use `macos-14` explicitly
Swift Package Manager Problems: If you encounter git issues locally:

```bash
cd tests/binary_classifier/swift
swift package reset
swift package resolve
swift build
```

Missing Dependencies: Ensure all language runtimes are installed:
- Python 3.8+, Java 17+, Node.js 16+, Rust stable
- Flutter 3.16+, Swift 5.7+, GCC/Clang for C/C++
Performance Issues: For faster local testing:

```bash
# Test single language-model combination
cd tests/binary_classifier/python
python test_onnx_model.py "Your test text"

# Use release builds
cargo build --release                # Rust
swift build --configuration release  # Swift
```

The framework provides detailed performance metrics:
- ⏱️ Timing Analysis: Preprocessing, inference, and postprocessing times
- 💾 Memory Usage: Memory consumption tracking
- 🖥️ CPU Monitoring: Average and peak CPU usage
- 📈 Throughput: Texts processed per second
- 🏆 Performance Rating: Automatic classification based on speed
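The automatic rating maps total processing time onto the EXCELLENT/GOOD/ACCEPTABLE/SLOW scale around the documented <100ms target. A hypothetical sketch; the exact thresholds below are illustrative assumptions, only the 100ms cutoff comes from the output format above:

```python
def rating(total_ms):
    # Illustrative thresholds; only the <100ms target is documented
    if total_ms < 10.0:
        return "EXCELLENT"
    if total_ms < 50.0:
        return "GOOD"
    if total_ms < 100.0:
        return "ACCEPTABLE"
    return "SLOW"

def texts_per_second(total_ms):
    # Throughput is just the reciprocal of per-text latency
    return 1000.0 / total_ms
```

For example, Rust's 0.40ms binary-classifier run would rate EXCELLENT, while a ~544ms run would fall past the 100ms target into SLOW.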
Test Input: "Congratulations! You've won a free iPhone - click here to claim your prize now!"
Environment: GitHub Actions (Linux, 4 cores, 15.6GB RAM)
| Language | Total Time | Preprocessing | Inference | Memory Δ | CPU Usage | Throughput |
|---|---|---|---|---|---|---|
| Rust | 0.40ms | 0.01ms (2.8%) | 0.38ms (96.1%) | +0.00MB | 0.0% avg | 2,520/sec |
| Swift | 7.47ms | 0.33ms (4.4%) | 6.37ms (85.3%) | 5MB | 15% avg | 133.8/sec |
| Node.js | 28.89ms | 5.44ms (18.8%) | 22.89ms (79.2%) | +1.11MB | 100.0% peak | 34.6/sec |
| C++ | 43.54ms | 9.19ms (21.1%) | 34.28ms (78.7%) | +37.72MB | 0.0% avg | 23.0/sec |
| C | 87.21ms | 50.93ms (58.4%) | 0.31ms (0.4%) | +37.29MB | 0.0% avg | 11.5/sec |
| Dart | 159ms | 150ms (94.3%) | 8ms (5.0%) | 4MB | 20% avg | 6.3/sec |
| Java | 217.98ms | 183.48ms (84.2%) | 6.38ms (2.9%) | +22.00MB | 42.1% avg | 4.6/sec |
| Python | 332.33ms | 0.85ms (0.3%) | 0.59ms (0.2%) | +0.29MB | 15.0% avg | 3.0/sec |
Test Input: "NBA Finals: Celtics Defeat Mavericks in Game 5 to Win Championship"
Environment: GitHub Actions (Linux, 4 cores, 15.6GB RAM)
| Language | Total Time | Preprocessing | Inference | Memory Δ | CPU Usage | Throughput |
|---|---|---|---|---|---|---|
| Rust | 1.24ms | 0.01ms (0.6%) | 1.23ms (99.1%) | +0.00MB | 0.0% avg | 807/sec |
| Swift | 7.47ms | 0.33ms (4.4%) | 6.37ms (85.3%) | 5MB | 15% avg | 133.8/sec |
| Node.js | 24.40ms | 1.99ms (8.2%) | 21.65ms (88.7%) | +0.89MB | 100.0% peak | 41.0/sec |
| C | 32.54ms | 0.83ms (2.5%) | 1.50ms (4.6%) | +22.8MB | 0.0% avg | 30.7/sec |
| C++ | 32.84ms | 1.97ms (6.0%) | 30.76ms (93.7%) | +21.57MB | 0.0% avg | 30.4/sec |
| Dart | 114ms | 34ms (30%) | 68ms (60%) | 4MB | 20% avg | 8.8/sec |
| Java | 162.21ms | 120.09ms (74.0%) | 8.28ms (5.1%) | +12.00MB | 26.3% avg | 6.2/sec |
| Python | 510.01ms | 0.04ms (0.0%) | 1.92ms (0.4%) | +1.12MB | 3.0% avg | 2.0/sec |
Test Input: "I'm terrified of what might happen"
Environment: GitHub Actions (Linux, 4 cores, 15.6GB RAM)
| Language | Total Time | Processing | Performance Rating | Throughput |
|---|---|---|---|---|
| C++ | ~1ms | Keyword detection | 🚀 EXCELLENT | 1,000/sec |
| Rust | ~1ms | Keyword detection | 🚀 EXCELLENT | 1,000/sec |
| Swift | ~1ms | Keyword detection | 🚀 EXCELLENT | 1,000/sec |
| Python | ~15ms | Keyword detection | 🚀 EXCELLENT | 67/sec |
| Dart | ~15-25ms | Keyword detection | 🚀 EXCELLENT | 40-67/sec |
| Java | ~20ms | Keyword detection | ✅ GOOD | 50/sec |
| Node.js | ~25ms | Keyword detection | ✅ GOOD | 40/sec |
| C | ~544ms | Keyword detection | 🐌 SLOW | 1.8/sec |
- 🥇 Speed Champion: Rust - consistently fastest across all model types
- 🥈 Mobile Excellence: Swift - exceptional performance for iOS/mobile applications
- 🥉 Web Efficiency: Node.js - optimal for web applications with minimal memory usage
- 🏆 System Integration: C++ - excellent balance of speed and compatibility
- Binary Classifier: Rust achieves 0.40ms (2,520 texts/sec)
- Multiclass Classifier: Rust leads at 1.24ms (807 texts/sec)
- Multiclass Sigmoid: Multiple languages achieve ~1ms (simplified approach)
| Model Type | Best Language | Key Strength | Optimization Focus |
|---|---|---|---|
| Binary | Rust (0.40ms) | Ultra-fast inference | TF-IDF preprocessing |
| Multiclass | Rust (1.24ms) | Minimal overhead | Token processing |
| Sigmoid | C++/Rust/Swift (~1ms) | Keyword detection | Real-time emotion analysis |
- Add New Languages: Create an implementation in `tests/[model_type]/[language]/`
- Add New Model Types: Follow the existing structure for new classification tasks
- Improve Performance: Optimize existing implementations
- Add Features: Enhance testing capabilities
- Update Documentation: Keep model-specific and language-specific READMEs current
WhiteLightning distills massive, state-of-the-art language models into lightweight, hyper-efficient text classifiers. It's a command-line tool that lets you create specialized models that run anywhereβfrom the cloud to the edgeβusing the universal ONNX format for maximum compatibility.
Need comprehensive guides and documentation? Check out our WhiteLightning Site - this repository hosts the official website for WhiteLightning at https://whitelightning.ai, a cutting-edge LLM distillation tool with detailed documentation, tutorials, and implementation guides.
Looking for pre-trained models or want to share your own? Visit our WhiteLightning Model Library - a centralized repository for uploading, downloading, and managing trained machine learning models. Perfect for sharing community contributions and accessing ready-to-use classifiers.
This project is licensed under the MIT License - see the LICENSE file for details.
- GitHub Issues: Report bugs or request features
- Discussions: Ask questions or share improvements
- Wiki: Detailed documentation and guides
Happy testing! 🚀 Compare ONNX model performance across languages and find the best implementation for your use case.