Human-readable data serialization for the modern age
TOON (Token-Oriented Object Notation) is a compact, human-readable serialization format designed for passing structured data to Large Language Models with significantly reduced token usage. Created by Johann Schopplich, TOON combines the best aspects of JSON, YAML, and CSV:
- Token-efficient: Typically 30-60% fewer tokens than JSON
- Human-readable: Clean indentation-based syntax without excessive punctuation
- Machine-friendly: Unambiguous parsing with strict mode validation
- Tabular arrays: Automatic CSV-like rendering for uniform object arrays
- Git-friendly: Line-oriented format that plays well with version control diffs
This repository provides a complete Rust implementation following the official TOON v3.0 specification with zero-copy parsing, streaming serialization via serde, and a full-featured CLI.
```json
// JSON: verbose, hard to read for large datasets
{"users": [{"id": 1, "name": "Alice", "role": "admin"}, {"id": 2, "name": "Bob", "role": "user"}]}
```

```
// TOON: clean, structured, tabular when appropriate
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
```

TOON automatically detects when arrays contain uniform objects and renders them as inline tables, dramatically improving readability for such datasets while maintaining full structural fidelity.
- Zero-copy scanner and parser (borrowed slices) for fast decode
- Direct `serde::Deserializer` over the scanner (feature `de_direct`)
- Smart tabular arrays: CSV-like rows under a `[N]{fields}:` header for uniform object arrays
- Strict mode: optional validation for production-grade data integrity
- Streaming serialization: memory-efficient encoding of large datasets
- Full serde integration: serialize/deserialize any Rust type with `#[derive]`
- Key folding: collapse nested single-key objects into dotted paths (`a.b.c: value`)
- Path expansion: decode dotted keys back into nested structures
- DateTime support: native `chrono` integration (optional feature)
- Powerful CLI: standalone tool for JSON ↔ TOON conversion
- Spec conformant: full v3.0 test suite (345/345 tests passing)
- Configurable delimiters: comma (default), tab, or pipe
- Configurable indentation: custom indent size (default: 2 spaces)
- No-std support: works in embedded environments (with `alloc`)
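Key folding, for example, turns a chain of single-key objects into one dotted line (the key names here are purely illustrative). Without folding:

```
server:
  network:
    port: 8080
```

With key folding enabled, the same structure collapses to:

```
server.network.port: 8080
```

Path expansion is the inverse operation on decode, recovering the nested objects from the dotted key.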
Per the TOON specification, this implementation normalizes host values to the JSON data model before encoding:

- Non-finite floats (`NaN`, `Infinity`, `-Infinity`) are emitted as `null`.
- Canonical numbers always use decimal form: no exponent, no trailing zeros, and `-0` normalized to `0`.
- Round-trip guarantee: for any JSON-representable value `x`, `decode(encode(x)) == x`.
- Out-of-range numeric literals rely on Rust's `f64` semantics when decoding; overflow produces ±infinity, which the encoder subsequently normalizes to `null` unless the application handles it explicitly.
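The non-finite and `-0` rules can be sketched in plain Rust. This is a simplified illustration of the normalization described above, not the crate's actual encoder:

```rust
/// Sketch of TOON's number normalization: non-finite floats become
/// `null`, `-0` becomes `0`, and output is plain decimal with no
/// exponent or trailing zeros. (Illustrative only.)
fn normalize_number(x: f64) -> String {
    if !x.is_finite() {
        return "null".to_string(); // NaN, Infinity, -Infinity
    }
    // -0.0 == 0.0 in IEEE 754, so this rebinding drops the sign
    let x = if x == 0.0 { 0.0 } else { x };
    if x.fract() == 0.0 && x.abs() < 1e15 {
        format!("{}", x as i64) // integral values: no trailing ".0"
    } else {
        format!("{}", x) // Rust's Display avoids exponents and trailing zeros
    }
}

fn main() {
    assert_eq!(normalize_number(f64::NAN), "null");
    assert_eq!(normalize_number(f64::INFINITY), "null");
    assert_eq!(normalize_number(-0.0), "0");
    assert_eq!(normalize_number(2.5), "2.5");
}
```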
Feature flags:
- `de_direct`: enable the direct Deserializer (bypasses the intermediate JSON `Value`)
- `perf_memchr`: faster scanning/splitting via `memchr`
- `perf_smallvec`: reduce small allocations in hot paths
- `perf_lexical`: faster numeric parsing via `lexical-core`
- `chrono`: DateTime support
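A build that opts into the decode fast paths might combine these flags in `Cargo.toml` like so (an illustrative combination, not a required one):

```toml
[dependencies]
toon = { version = "0.1", features = ["de_direct", "perf_memchr", "perf_smallvec", "perf_lexical"] }
```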
Add to your `Cargo.toml`:

```toml
[dependencies]
toon = "0.1"

# Optional: DateTime serialization
toon = { version = "0.1", features = ["chrono"] }
```

Install the CLI:

```sh
cargo install toon-cli

# Or run from source
cargo install --path crates/toon-cli
```

Encode and decode JSON values:

```rust
use toon::Options;
use serde_json::json;

// Encode a JSON value to TOON
let data = json!({
    "name": "toon-rs",
    "version": "0.1.0",
    "features": ["fast", "safe", "expressive"]
});

let opts = Options::default();
let toon_str = toon::encode_to_string(&data, &opts)?;
println!("{}", toon_str);

// Decode back to JSON
let value: serde_json::Value = toon::decode_from_str(&toon_str, &opts)?;
assert_eq!(value, data);
```

Serialize and deserialize your own types with serde:

```rust
use serde::{Serialize, Deserialize};

#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Config {
    database: Database,
    servers: Vec<Server>,
}

#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Database {
    host: String,
    port: u16,
    ssl: bool,
}

#[derive(Debug, Serialize, Deserialize, PartialEq)]
struct Server {
    name: String,
    region: String,
    capacity: u32,
}

let config = Config {
    database: Database {
        host: "db.example.com".into(),
        port: 5432,
        ssl: true,
    },
    servers: vec![
        Server { name: "web-1".into(), region: "us-east".into(), capacity: 1000 },
        Server { name: "web-2".into(), region: "eu-west".into(), capacity: 1500 },
    ],
};

// Serialize with streaming (memory efficient)
let opts = Options::default();
let toon_output = toon::ser::to_string_streaming(&config, &opts)?;

// Deserialize back
let parsed: Config = toon::de::from_str(&toon_output, &opts)?;
assert_eq!(config, parsed);
```

Output:

```
database:
  host: db.example.com
  port: 5432
  ssl: true
servers[2]{name,region,capacity}:
  web-1,us-east,1000
  web-2,eu-west,1500
```
```sh
# Convert JSON to TOON
toon-cli data.json > data.toon

# Convert TOON to JSON
toon-cli --decode data.toon > data.json

# Use pipes
echo '{"hello": "world"}' | toon-cli | toon-cli --decode

# Custom delimiter for tables (comma is the default)
toon-cli --delimiter '|' data.json

# Strict mode validation
toon-cli --strict --decode data.toon
```

TOON includes WebAssembly bindings for use in browsers and Node.js.
Try the live demo on GitHub Pages →
Build locally:

```sh
# Install wasm-pack
cargo install wasm-pack

# Build the WASM module
./examples/web/build.sh

# Serve the example page
cd examples/web
python3 -m http.server 8000
```

Then open http://localhost:8000 to try the interactive JSON ↔ TOON converter.
JavaScript API:

```js
import init, { json_to_toon, toon_to_json } from './pkg/toon_wasm.js';

await init();

// Convert JSON to TOON
const toon = json_to_toon(
  '{"name": "Alice", "age": 30}',
  true,  // use pipe delimiter
  false  // strict mode
);

// Convert TOON to JSON
const json = toon_to_json(
  'name: Alice\nage: 30',
  false, // strict mode
  true   // pretty format
);
```

See examples/web/README.md for details.
Criterion decode benchmarks show large gains on typical datasets when the optional perf features and the direct deserializer are enabled. To run:

```sh
# Save a baseline
cargo bench --bench decode_bench -- --sample-size 200 --measurement-time 10 --warm-up-time 5 --save-baseline before

# Compare after enabling direct + perf features
cargo bench --bench decode_bench \
  --features "de_direct perf_memchr perf_smallvec" \
  -- --sample-size 200 --measurement-time 10 --warm-up-time 5 --baseline before
```

Highlights (representative):

- Small documents: ~1.5–2.0x faster decode
- 1k-row tabular arrays: ~2–3x faster decode

Notes:

- `perf_lexical` can further improve numeric-heavy workloads.
- Results vary by CPU and dataset.
We use cargo-fuzz (libFuzzer) to stress the decoder:

```sh
cargo install cargo-fuzz

# From the repo root
cargo fuzz run fuzz_decode_default -- -runs=0
cargo fuzz run fuzz_decode_strict -- -runs=0
```

Development setup:

```sh
# Clone the repository
git clone https://github.com/toon-format/toon
cd toon

# Initialize spec conformance fixtures
git submodule update --init --recursive

# Build everything
cargo build --workspace

# Run tests
cargo test --workspace

# Run conformance tests
TOON_CONFORMANCE=1 cargo test -p toon --tests

# Run the CLI
cargo run -p toon-cli -- --help

# Format and lint
cargo fmt --all
cargo clippy --workspace -- -D warnings
```

No-std (alloc-only) builds:

```sh
# Build (alloc-only)
cargo check -p toon --no-default-features --features "alloc,serde"

# Run alloc-only tests (JSON interop disabled)
cargo test -p toon --no-default-features --features "alloc,serde"
```

Conformance tests:

```sh
# Only the conformance test file
TOON_CONFORMANCE=1 cargo test -p toon --test spec_conformance

# Filter to one group
TOON_CONFORMANCE=1 cargo test -p toon --test spec_conformance decode_fixtures
TOON_CONFORMANCE=1 cargo test -p toon --test spec_conformance encode_fixtures
```

Note: Tests are skipped if `TOON_CONFORMANCE` is unset or the fixtures are missing under `spec/tests/fixtures`. Decode conformance runs with strict validation; encode comparison normalizes newlines.
```
toon-rs/
├── crates/
│   ├── toon/           # Core library
│   │   ├── encode/     # TOON encoder
│   │   ├── decode/     # TOON parser
│   │   ├── ser/        # serde::Serializer
│   │   └── de/         # serde::Deserializer
│   ├── toon-cli/       # Command-line tool
│   └── toon-wasm/      # WebAssembly bindings
├── examples/
│   └── web/            # Interactive web demo
├── spec/               # Official test fixtures (submodule)
└── README.md
```
This implementation follows the official TOON v3.0 specification with extensive conformance testing (345/345 tests passing). Key features:

- Line-oriented: each logical token on its own line (or tabular row)
- Indentation-based: configurable indentation (default: 2 spaces)
- Minimal quoting: strings only quoted when necessary (delimiters, special chars)
- Tabular arrays: compact `[N]{fields}:` header format for uniform object arrays
- Key folding: encode nested single-key paths as `a.b.c: value`
- Path expansion: decode dotted keys into nested objects (`expandPaths: safe`)
- Strict validation: optional mode catches malformed data and conflicts
For the complete format specification and conformance tests, see the toon-format/spec repository.
We welcome contributions! Here's how to get started:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes with tests
- Ensure tests pass (`cargo test --workspace`)
- Run conformance tests (`TOON_CONFORMANCE=1 cargo test -p toon --tests`)
- Format your code (`cargo fmt --all`)
- Lint with clippy (`cargo clippy --workspace -- -D warnings`)
- Commit with conventional format: `feat(encoder): add support for X`
- Push to your fork
- Open a Pull Request
- Performance optimizations and benchmarks
- Additional CLI features and utilities
- Documentation improvements and examples
- Property-based testing with proptest
- Fuzzing harness for parser robustness
- Language bindings (Python, JavaScript, etc.)
- Strengthen CI by caching `cargo-hack` installs or sharing a toolchain stage so the feature-matrix job stays fast
- Expand alloc/no-std smoke tests that don't rely on `serde_json`, to catch regressions early
- Add a "Feature matrix & testing" section that explains the default/perf/no-std bundles and which commands exercise them
- Expose lower-level APIs (around `LineWriter`) for users who want tabular control without going through `serde_json::Value`
- Add Criterion benchmarks for the streaming serializer path to quantify perf flags such as `perf_smallvec` and `de_direct`
- Core encoding/decoding
- Full serde integration
- Streaming serialization
- Tabular array detection
- Strict mode validation
- CLI tool
- Spec conformance tests
- Performance benchmarks
- Property-based tests
- No-std support (alloc-only)
- WASM bindings
- Language server protocol (LSP) for editors
This project is licensed under the MIT License - see the LICENSE file for details.
- TOON format created by Johann Schopplich
- Official specification maintained at toon-format/spec
- Reference implementation in TypeScript: toon-format/toon
- Repository structure inspired by serde's excellent design
- Built with Rust's powerful serialization ecosystem
- Official TOON TypeScript/JavaScript: reference implementation with npm packages
- TOON Specification: format specification v3.0 and conformance tests
- Other TOON implementations: community ports to Python, Go, .NET, Swift, and more