gotoon

v0.1.1
Published: Nov 13, 2025 License: MIT Imports: 8 Imported by: 11

README

GoTOON - Token-Oriented Object Notation for Go

GoTOON is a Go implementation of TOON (Token-Oriented Object Notation), a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage.

TOON excels at uniform complex objects – multiple fields per row, same structure across items. It achieves 30-60% token reduction compared to JSON while maintaining high LLM comprehension accuracy.

Why TOON?

LLM tokens cost money, and standard JSON is verbose and token-expensive:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON conveys the same information with fewer tokens:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Key Features

  • 💸 Token-efficient: typically 30–60% fewer tokens than JSON
  • 🤿 LLM-friendly guardrails: explicit lengths and field lists help models validate output
  • 🍱 Minimal syntax: removes redundant punctuation (braces, brackets, most quotes)
  • 📐 Indentation-based structure: replaces braces with whitespace for better readability
  • 🧺 Tabular arrays: declare keys once, then stream rows without repetition
  • 🛠️ Go-idiomatic API: clean, simple interface with functional options

Installation

go get github.com/alpkeskin/gotoon

Quick Start

package main

import (
    "fmt"
    "log"

    "github.com/alpkeskin/gotoon"
)

func main() {
    data := map[string]interface{}{
        "users": []map[string]interface{}{
            {"id": 1, "name": "Alice", "role": "admin"},
            {"id": 2, "name": "Bob", "role": "user"},
        },
    }

    encoded, err := gotoon.Encode(data)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println(encoded)
}

Output:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

API

Encode(input interface{}, opts ...EncodeOption) (string, error)

Converts any Go value to a TOON-formatted string.

Input normalization:

  • Primitives (bool, int, float, string) are encoded as-is
  • Structs are converted to maps using exported fields (respects json tags)
  • Slices and arrays remain as arrays
  • Maps with string keys remain as objects
  • time.Time is converted to RFC3339Nano format
  • NaN and Infinity become null
  • nil and functions become null

Example:

type User struct {
    ID   int    `json:"id"`
    Name string `json:"name"`
    Role string `json:"role"`
}

users := []User{
    {ID: 1, Name: "Alice", Role: "admin"},
    {ID: 2, Name: "Bob", Role: "user"},
}

encoded, _ := gotoon.Encode(map[string]interface{}{"users": users})
// Output:
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

Encoding Options

GoTOON supports functional options for customization:

WithIndent(n int)

Sets the number of spaces per indentation level (default: 2).

gotoon.Encode(data, gotoon.WithIndent(4))

WithDelimiter(d string)

Sets the delimiter for array values and tabular rows. Valid values: "," (comma, default), "\t" (tab), "|" (pipe).

// Using tab delimiter
gotoon.Encode(data, gotoon.WithDelimiter("\t"))
// Output:
// users[2	]{id	name	role}:
//   1	Alice	admin
//   2	Bob	user

WithLengthMarker()

Adds # prefix to array lengths for clarity (e.g., [#3] instead of [3]).

gotoon.Encode(data, gotoon.WithLengthMarker())
// Output:
// users[#2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

Combining Options

encoded, _ := gotoon.Encode(data,
    gotoon.WithIndent(4),
    gotoon.WithDelimiter("\t"),
    gotoon.WithLengthMarker(),
)

Format Overview

Objects

Simple objects with primitive values:

data := map[string]interface{}{
    "id":     123,
    "name":   "Ada",
    "active": true,
}
// Output:
// active: true
// id: 123
// name: Ada

Nested objects:

data := map[string]interface{}{
    "user": map[string]interface{}{
        "id":   123,
        "name": "Ada",
    },
}
// Output:
// user:
//   id: 123
//   name: Ada

Arrays

Primitive Arrays (Inline)

data := map[string]interface{}{
    "tags": []string{"admin", "ops", "dev"},
}
// Output:
// tags[3]: admin,ops,dev

Arrays of Objects (Tabular)

When all objects share the same primitive fields, TOON uses an efficient tabular format:

data := map[string]interface{}{
    "items": []map[string]interface{}{
        {"sku": "A1", "qty": 2, "price": 9.99},
        {"sku": "B2", "qty": 1, "price": 14.5},
    },
}
// Output:
// items[2]{price,qty,sku}:
//   9.99,2,A1
//   14.5,1,B2
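
To make the tabular condition concrete, here is a rough sketch of how such a check could look. It is not gotoon's implementation: isPrimitive, tabularFields, and tabularHeader are hypothetical names, and the set of types treated as primitive is an assumption made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// isPrimitive reports whether v is a scalar this sketch allows in a row.
// The accepted type set is an assumption, not taken from gotoon.
func isPrimitive(v interface{}) bool {
	switch v.(type) {
	case nil, bool, int, int64, float64, string:
		return true
	}
	return false
}

// tabularFields (hypothetical) returns the sorted field names and true
// when every row has exactly the same keys and only primitive values.
func tabularFields(rows []map[string]interface{}) ([]string, bool) {
	if len(rows) == 0 {
		return nil, false
	}
	fields := make([]string, 0, len(rows[0]))
	for k := range rows[0] {
		fields = append(fields, k)
	}
	sort.Strings(fields)
	for _, row := range rows {
		if len(row) != len(fields) {
			return nil, false
		}
		for _, k := range fields {
			v, ok := row[k]
			if !ok || !isPrimitive(v) {
				return nil, false
			}
		}
	}
	return fields, true
}

// tabularHeader (hypothetical) builds a "key[N]{fields}:" header line.
func tabularHeader(key string, rows []map[string]interface{}) (string, bool) {
	fields, ok := tabularFields(rows)
	if !ok {
		return "", false
	}
	return fmt.Sprintf("%s[%d]{%s}:", key, len(rows), strings.Join(fields, ",")), true
}

func main() {
	items := []map[string]interface{}{
		{"sku": "A1", "qty": 2, "price": 9.99},
		{"sku": "B2", "qty": 1, "price": 14.5},
	}
	if h, ok := tabularHeader("items", items); ok {
		fmt.Println(h) // items[2]{price,qty,sku}:
	}
}
```

When the check fails (different keys, or a nested value in any row), an encoder would fall back to the list format instead.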

Mixed and Non-Uniform Arrays

Arrays that don't meet tabular requirements use list format:

data := map[string]interface{}{
    "items": []interface{}{
        1,
        "text",
        map[string]interface{}{"key": "value"},
    },
}
// Output:
// items[3]:
//   - 1
//   - text
//   - key: value

Quoting Rules

TOON quotes strings only when necessary to maximize token efficiency:

  • Empty strings: ""
  • Contains delimiter, colon, quotes, or control chars: "hello, world"
  • Leading/trailing spaces: " padded "
  • Looks like boolean/number/null: "true", "42"
  • Unicode and emoji are safe unquoted: hello 👋 world
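
The rules above can be approximated with a small predicate. This is a sketch under stated assumptions rather than the library's quoting logic: needsQuoting is a hypothetical name, and using strconv.ParseFloat to detect number-like strings is a conservative choice that also flags strings such as "Infinity".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"unicode"
)

// needsQuoting is a hypothetical helper, not gotoon's implementation.
// It reports whether s would need quotes under the rules listed above,
// for the given active delimiter.
func needsQuoting(s, delimiter string) bool {
	if s == "" {
		return true // empty string
	}
	if strings.TrimSpace(s) != s {
		return true // leading or trailing whitespace
	}
	if s == "true" || s == "false" || s == "null" {
		return true // would be read back as a literal
	}
	if _, err := strconv.ParseFloat(s, 64); err == nil {
		return true // looks like a number (conservative check)
	}
	if strings.Contains(s, delimiter) || strings.ContainsAny(s, ":\"") {
		return true // delimiter, colon, or double quote
	}
	for _, r := range s {
		if unicode.IsControl(r) {
			return true // control character such as \n or \t
		}
	}
	return false
}

func main() {
	for _, s := range []string{"", "hello, world", " padded ", "true", "42", "hello 👋 world", "Alice"} {
		fmt.Printf("%q -> %v\n", s, needsQuoting(s, ","))
	}
}
```

Passing the delimiter as a parameter reflects that a value like a|b is safe with the comma delimiter but not with the pipe delimiter.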

Examples

See the examples/basic directory for more comprehensive examples including:

  • Simple objects
  • Tabular arrays
  • Nested structures
  • Primitive arrays
  • Using structs with JSON tags
  • Custom delimiters
  • Mixed arrays
  • Time values
  • E-commerce orders

Run the examples:

cd examples/basic
go run main.go

Testing

go test -v

Benchmarks

Based on the original TOON benchmarks using GPT-5's tokenizer:

  • GitHub Repositories (100 repos): 42.3% token reduction vs JSON
  • Daily Analytics (180 days): 58.9% token reduction vs JSON
  • E-Commerce Order: 35.4% token reduction vs JSON

Overall: 49.1% token reduction vs JSON across all benchmarks

Comparison with JSON

JSON (257 tokens):

{
  "order": {
    "id": "ORD-12345",
    "customer": {
      "name": "John Doe",
      "email": "john@example.com"
    },
    "items": [
      { "sku": "WIDGET-1", "quantity": 2, "price": 19.99 },
      { "sku": "GADGET-2", "quantity": 1, "price": 49.99 }
    ],
    "total": 89.97
  }
}

TOON (166 tokens - 35.4% reduction):

order:
  customer:
    email: john@example.com
    name: John Doe
  id: ORD-12345
  items[2]{price,quantity,sku}:
    19.99,2,WIDGET-1
    49.99,1,GADGET-2
  total: 89.97
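
The reduction figure follows directly from the two token counts. As a quick arithmetic check (reductionPercent is a name made up for this sketch):

```go
package main

import "fmt"

// reductionPercent returns how much smaller toonTokens is than
// jsonTokens, expressed as a percentage of jsonTokens.
func reductionPercent(jsonTokens, toonTokens int) float64 {
	return float64(jsonTokens-toonTokens) / float64(jsonTokens) * 100
}

func main() {
	// (257 - 166) / 257 = 91 / 257 ≈ 0.354
	fmt.Printf("%.1f%%\n", reductionPercent(257, 166)) // 35.4%
}
```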

Project Structure

gotoon/
├── go.mod              # Go module definition
├── README.md           # This file
├── toon.go             # Public API (Encode function)
├── types.go            # Options and type definitions
├── constants.go        # String constants and delimiters
├── normalize.go        # Value normalization and type guards
├── writer.go           # LineWriter implementation
├── primitives.go       # Primitive encoding and quoting
├── encoders.go         # Core encoding logic
├── toon_test.go        # Unit tests
└── examples/
    └── basic/
        └── main.go     # Example usage

Implementation Notes

  • Deterministic output: Map keys are sorted alphabetically for consistent encoding
  • Reflection-based normalization: Automatically converts structs, slices, and maps
  • Efficient string building: Uses strings.Builder for performance
  • Type-safe options: Functional options pattern for clean API
  • Comprehensive testing: Full test coverage with table-driven tests
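
Go randomizes map iteration order, which is why sorting keys matters for deterministic output. The sketch below shows the general technique with sort.Strings and strings.Builder; encodeFlat is a hypothetical helper that handles only flat maps and skips quoting, so it is an illustration rather than the library's encoder.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// encodeFlat is a hypothetical helper, not gotoon's encoder. It renders
// a flat map as "key: value" lines, sorted by key so the result is the
// same on every run. Quoting and nesting are deliberately left out.
func encodeFlat(obj map[string]interface{}) string {
	keys := make([]string, 0, len(obj))
	for k := range obj {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for i, k := range keys {
		if i > 0 {
			b.WriteByte('\n')
		}
		fmt.Fprintf(&b, "%s: %v", k, obj[k])
	}
	return b.String()
}

func main() {
	fmt.Println(encodeFlat(map[string]interface{}{
		"id":     123,
		"name":   "Ada",
		"active": true,
	}))
	// active: true
	// id: 123
	// name: Ada
}
```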

Using TOON with LLMs

TOON works best when you show the format instead of describing it. The structure is self-documenting – models parse it naturally once they see the pattern.

Sending TOON to LLMs (Input)

Wrap your encoded data in a fenced code block:

```toon
users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,user
```

Generating TOON from LLMs (Output)

For output, be more explicit:

Data is in TOON format (2-space indent, arrays show length and fields).

Task: Return only users with role "user" as TOON. Use the same header.
Set [N] to match the row count. Output only the code block.
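
Because the header declares both the row count and the field list, a model's reply can be sanity-checked before use. The API documented on this page only exposes Encode, so the validator below is a hypothetical sketch: checkTable and headerPattern are invented names, and it naively assumes comma-delimited rows with no quoted values.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// headerPattern matches a tabular header such as "users[2]{id,name,role}:".
var headerPattern = regexp.MustCompile(`^(\w+)\[#?(\d+)\]\{([^}]*)\}:$`)

// checkTable is a hypothetical validator, not part of gotoon. It checks
// that the declared length matches the number of rows and that every
// row has as many values as the header has fields.
func checkTable(block string) error {
	lines := strings.Split(strings.TrimSpace(block), "\n")
	m := headerPattern.FindStringSubmatch(strings.TrimSpace(lines[0]))
	if m == nil {
		return fmt.Errorf("not a tabular header: %q", lines[0])
	}
	want, err := strconv.Atoi(m[2])
	if err != nil {
		return err
	}
	fields := strings.Split(m[3], ",")
	rows := lines[1:]
	if len(rows) != want {
		return fmt.Errorf("header declares %d rows, found %d", want, len(rows))
	}
	for i, row := range rows {
		// Naive split: a quoted value containing a comma would be miscounted.
		if n := len(strings.Split(strings.TrimSpace(row), ",")); n != len(fields) {
			return fmt.Errorf("row %d has %d values, want %d", i+1, n, len(fields))
		}
	}
	return nil
}

func main() {
	reply := "users[2]{id,name,role}:\n  2,Bob,user\n  3,Charlie,user"
	fmt.Println(checkTable(reply)) // <nil>
}
```

A mismatch between [N] and the actual row count is a cheap signal that the model truncated or padded its output.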

Credits

GoTOON is a Go port of the original TOON format created by Johann Schopplich.

Documentation

Overview

Package gotoon provides encoding for Token-Oriented Object Notation (TOON), a compact, human-readable format designed for passing structured data to Large Language Models with significantly reduced token usage.

TOON is optimized for uniform complex objects and provides 30-60% token reduction compared to JSON while maintaining high LLM comprehension accuracy.

Example usage:

data := map[string]interface{}{
	"users": []map[string]interface{}{
		{"id": 1, "name": "Alice", "role": "admin"},
		{"id": 2, "name": "Bob", "role": "user"},
	},
}

encoded, err := gotoon.Encode(data)
if err != nil {
	log.Fatal(err)
}
fmt.Println(encoded)
// Output:
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

Constants

const (
	ListItemMarker = "-"
	ListItemPrefix = "- "
)

List markers

const (
	Comma = ","
	Colon = ":"
	Space = " "
	Pipe  = "|"
	Tab   = "\t"
)

Structural characters

const (
	OpenBracket  = "["
	CloseBracket = "]"
	OpenBrace    = "{"
	CloseBrace   = "}"
)

Brackets and braces

const (
	NullLiteral  = "null"
	TrueLiteral  = "true"
	FalseLiteral = "false"
)

Literals

const (
	Backslash      = "\\"
	DoubleQuote    = "\""
	Newline        = "\n"
	CarriageReturn = "\r"
)

Escape characters

const (
	DelimiterComma = ","
	DelimiterTab   = "\t"
	DelimiterPipe  = "|"
)

Delimiters

const DefaultDelimiter = DelimiterComma

DefaultDelimiter is the default delimiter for arrays and tabular data

Variables

This section is empty.

Functions

func Encode

func Encode(input interface{}, opts ...EncodeOption) (string, error)

Encode converts any Go value to a TOON-formatted string.

The input value is normalized to a JSON-compatible representation:

  • Primitives (bool, int, float, string) are encoded as-is
  • Structs are converted to maps using exported fields (respects json tags)
  • Slices and arrays remain as arrays
  • Maps with string keys remain as objects
  • time.Time is converted to RFC3339Nano format
  • NaN and Infinity become null
  • nil and functions become null

Options can be provided to customize the encoding:

  • WithIndent(n): Set indentation size (default: 2 spaces)
  • WithDelimiter(d): Set delimiter for arrays ("," | "\t" | "|", default: ",")
  • WithLengthMarker(): Add "#" prefix to array lengths (e.g., [#3])

Example with options:

encoded, err := gotoon.Encode(data,
	gotoon.WithIndent(4),
	gotoon.WithDelimiter("\t"),
	gotoon.WithLengthMarker(),
)

Types

type EncodeOption

type EncodeOption func(*EncodeOptions)

EncodeOption is a function that modifies EncodeOptions

func WithDelimiter

func WithDelimiter(d string) EncodeOption

WithDelimiter sets the delimiter for array values and tabular rows

func WithIndent

func WithIndent(n int) EncodeOption

WithIndent sets the number of spaces per indentation level

func WithLengthMarker

func WithLengthMarker() EncodeOption

WithLengthMarker enables the length marker prefix for arrays

type EncodeOptions

type EncodeOptions struct {
	// Indent is the number of spaces per indentation level (default: 2)
	Indent int

	// Delimiter is the delimiter to use for array values and tabular rows
	// Valid values: "," (comma), "\t" (tab), "|" (pipe)
	// Default: ","
	Delimiter string

	// LengthMarker when true adds "#" prefix to array lengths (e.g., [#3] instead of [3])
	// Default: false
	LengthMarker bool
}

EncodeOptions represents the options for encoding values to TOON format
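
For readers unfamiliar with the functional options pattern, the following standalone sketch shows how option functions of this shape are typically applied over defaults. The lowercase names (options, option, withIndent, resolve, and so on) are stand-ins invented for this example; how gotoon resolves its options internally is not shown in this documentation.

```go
package main

import "fmt"

// options mirrors the documented EncodeOptions fields.
type options struct {
	Indent       int
	Delimiter    string
	LengthMarker bool
}

// option mirrors the documented shape: func(*EncodeOptions).
type option func(*options)

func withIndent(n int) option       { return func(o *options) { o.Indent = n } }
func withDelimiter(d string) option { return func(o *options) { o.Delimiter = d } }
func withLengthMarker() option      { return func(o *options) { o.LengthMarker = true } }

// resolve (hypothetical) starts from the documented defaults and
// applies each option in order, so later options win.
func resolve(opts ...option) options {
	o := options{Indent: 2, Delimiter: ",", LengthMarker: false}
	for _, apply := range opts {
		apply(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", resolve())
	fmt.Printf("%+v\n", resolve(withIndent(4), withDelimiter("|"), withLengthMarker()))
}
```

The appeal of this design is that callers only mention the settings they want to change, and new options can be added later without breaking existing call sites.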

type LineWriter

type LineWriter struct {
	// contains filtered or unexported fields
}

LineWriter manages indented line output for TOON format

func NewLineWriter

func NewLineWriter(indentSize int) *LineWriter

NewLineWriter creates a new LineWriter with the specified indentation size

func (*LineWriter) Push

func (w *LineWriter) Push(depth int, content string)

Push adds a new line with the specified depth and content

func (*LineWriter) String

func (w *LineWriter) String() string

String returns the accumulated lines joined with newlines
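
The three documented methods suggest a simple accumulate-then-join design. Since LineWriter's fields are unexported, the sketch below is a guess at a compatible shape, not the actual implementation; lineWriter and newLineWriter are lowercase stand-ins, and it assumes non-negative indentSize and depth.

```go
package main

import (
	"fmt"
	"strings"
)

// lineWriter is a hypothetical stand-in for the documented LineWriter.
type lineWriter struct {
	indent string   // one indentation level, e.g. two spaces
	lines  []string // accumulated, already-indented lines
}

// newLineWriter assumes indentSize >= 0 (strings.Repeat panics otherwise).
func newLineWriter(indentSize int) *lineWriter {
	return &lineWriter{indent: strings.Repeat(" ", indentSize)}
}

// Push appends content indented by depth levels. Assumes depth >= 0.
func (w *lineWriter) Push(depth int, content string) {
	w.lines = append(w.lines, strings.Repeat(w.indent, depth)+content)
}

// String joins the accumulated lines with newlines.
func (w *lineWriter) String() string {
	return strings.Join(w.lines, "\n")
}

func main() {
	w := newLineWriter(2)
	w.Push(0, "user:")
	w.Push(1, "id: 123")
	w.Push(1, "name: Ada")
	fmt.Println(w.String())
	// user:
	//   id: 123
	//   name: Ada
}
```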

Directories

examples/basic (command)