Kotlin/JVM implementation of GCF -- the most token-efficient wire format for LLMs. A drop-in alternative to JSON and TOON for any structured data.
79% fewer input tokens than JSON. 63% fewer output tokens. 90.7% average comprehension accuracy across 10 models and 3 providers (four models hit 100%). 1,300+ LLM evaluations. Zero training.
Docs: gcformat.com · Playground · GCF vs TOON
Add the JitPack repository, then the dependency. For Gradle (Kotlin DSL):
repositories {
maven("https://jitpack.io")
}
dependencies {
implementation("com.github.blackwell-systems:gcf-kotlin:v0.5.0")
}

For Gradle (Groovy DSL):
repositories {
maven { url 'https://jitpack.io' }
}
dependencies {
implementation 'com.github.blackwell-systems:gcf-kotlin:v0.5.0'
}

For Maven:
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
</repository>
</repositories>
<dependency>
<groupId>com.github.blackwell-systems</groupId>
<artifactId>gcf-kotlin</artifactId>
<version>v0.5.0</version>
</dependency>

Don't want to change code? Use the MCP proxy for zero-code adoption.
import com.blackwellsystems.gcf.*
val output = encodeGeneric(mapOf(
"employees" to listOf(
mapOf("id" to 1, "name" to "Alice", "department" to "Engineering", "salary" to 95000),
mapOf("id" to 2, "name" to "Bob", "department" to "Sales", "salary" to 72000),
)
))

Output:
## employees [2]{department,id,name,salary}
Engineering|1|Alice|95000
Sales|2|Bob|72000
val payload = Payload(
tool = "context_for_task", tokenBudget = 5000, tokensUsed = 1847,
symbols = listOf(
Symbol(qualifiedName = "pkg.Auth", kind = "function", score = 0.78, provenance = "lsp", distance = 0),
Symbol(qualifiedName = "pkg.Server", kind = "function", score = 0.54, provenance = "lsp", distance = 1),
),
edges = listOf(Edge(source = "pkg.Server", target = "pkg.Auth", edgeType = "calls"))
)
val output = encode(payload)

Output:
GCF tool=context_for_task budget=5000 tokens=1847 symbols=2 edges=1
## targets
@0 fn pkg.Auth 0.78 lsp
## related
@1 fn pkg.Server 0.54 lsp
## edges [1]
@0<@1 calls
val p = decode(input)
println("${p.tool} ${p.symbols.size} symbols ${p.edges.size} edges")

Throws DecodeException on invalid input.
Track transmitted symbols across multiple tool responses. Previously-sent symbols become bare references instead of full declarations:
val session = Session()
val out1 = encodeWithSession(payload1, session) // full declarations
val out2 = encodeWithSession(payload2, session) // reused symbols as "@N # previously transmitted"

By the 5th call in a session: 92.7% token savings vs JSON.
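The deduplication idea can be illustrated with a small standalone sketch. This is not the library's `Session` implementation: `SessionSketch` and its `line` helper are hypothetical names, and the assumption that `@N` ids stay stable across calls is inferred only from the `@N # previously transmitted` form shown above.

```kotlin
// Illustrative sketch only: NOT the gcf-kotlin Session implementation.
// First sighting of a symbol yields a full declaration; later sightings
// collapse to a bare "@N" reference.
class SessionSketch {
    private val ids = LinkedHashMap<String, Int>() // qualified name -> assigned @N

    // Hypothetical helper: `declaration` is the full line body, e.g. "fn pkg.Auth 0.78 lsp".
    @Synchronized
    fun line(qualifiedName: String, declaration: String): String {
        val known = ids[qualifiedName]
        if (known != null) return "@$known # previously transmitted"
        val id = ids.size // assumption: ids are handed out in arrival order
        ids[qualifiedName] = id
        return "@$id $declaration"
    }
}

fun main() {
    val session = SessionSketch()
    println(session.line("pkg.Auth", "fn pkg.Auth 0.78 lsp"))     // full declaration
    println(session.line("pkg.Server", "fn pkg.Server 0.54 lsp")) // full declaration
    println(session.line("pkg.Auth", "fn pkg.Auth 0.78 lsp"))     // bare reference
}
```

The savings grow with session length because every repeated symbol costs only a short reference line instead of a full declaration.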
Write GCF output incrementally as symbols and edges arrive. Zero buffering, O(1) memory per row:
val enc = StreamEncoder(writer, "context_for_task", StreamOptions(tokenBudget = 5000))
enc.writeSymbol(Symbol(qualifiedName = "pkg.Auth", kind = "function", score = 0.95, provenance = "lsp", distance = 0))
enc.writeEdge(Edge(source = "pkg.Server", target = "pkg.Auth", edgeType = "calls"))
enc.close() // emits ## _summary trailer

Output uses [?] deferred counts and a ## _summary trailer. The standard decode() handles streaming output with no changes. Thread-safe via @Synchronized.
When the consumer already has a prior context pack, send only what changed:
val delta = DeltaPayload(
tool = "context_for_task",
baseRoot = "aaa111",
newRoot = "bbb222",
removed = listOf(Symbol(qualifiedName = "pkg.OldFunc", kind = "function")),
added = listOf(Symbol(qualifiedName = "pkg.NewFunc", kind = "function", score = 0.85, provenance = "rwr")),
deltaTokens = 30,
fullTokens = 200
)
val output = encodeDelta(delta)

81.2% savings on re-queries where the pack changed slightly.
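How the `added` and `removed` lists are derived is left to the caller. A minimal sketch of one way to compute them, assuming symbols are identified by qualified name, follows; `SymbolDiff` and `diffSymbols` are hypothetical names and are not part of gcf-kotlin.

```kotlin
// Illustrative sketch only: decides which symbols belong in a delta.
// SymbolDiff and diffSymbols are hypothetical, not part of gcf-kotlin.
data class SymbolDiff(val added: Set<String>, val removed: Set<String>)

// Compare the previous pack's qualified names with the current ones.
fun diffSymbols(previous: Set<String>, current: Set<String>): SymbolDiff =
    SymbolDiff(added = current - previous, removed = previous - current)

fun main() {
    val before = setOf("pkg.Auth", "pkg.Server", "pkg.OldFunc")
    val after = setOf("pkg.Auth", "pkg.Server", "pkg.NewFunc")
    val diff = diffSymbols(before, after)
    println("added=${diff.added} removed=${diff.removed}")
    // prints: added=[pkg.NewFunc] removed=[pkg.OldFunc]
}
```

Unchanged symbols (`pkg.Auth`, `pkg.Server`) appear in neither set, which is where the savings on slightly changed packs come from.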
Encode any value (not just graph payloads) into GCF tabular format:
val data = mapOf(
"employees" to listOf(
mapOf("id" to 1, "name" to "Alice", "department" to "Engineering", "salary" to 95000),
mapOf("id" to 2, "name" to "Bob", "department" to "Sales", "salary" to 72000),
)
)
val output = encodeGeneric(data)

Output:
## employees [2]{department,id,name,salary}
Engineering|1|Alice|95000
Sales|2|Bob|72000
Works on maps, lists, and primitives. Arrays of uniform maps get tabular rows. Nested maps use ## key section headers.
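To make the tabular layout concrete, here is a standalone sketch that reproduces the sample output above for a list of uniform maps. It is not the `encodeGeneric` implementation: `tabularSketch` is a hypothetical name, alphabetical column order is inferred from the sample only, and escaping of `|` inside values is not handled.

```kotlin
// Illustrative sketch only: NOT the gcf-kotlin implementation.
// Renders a list of uniform maps as a GCF-style table.
fun tabularSketch(name: String, rows: List<Map<String, Any?>>): String {
    // Assumption: columns are the sorted keys of the first row, as the sample suggests.
    val columns = rows.firstOrNull()?.keys?.sorted() ?: emptyList()
    require(rows.all { it.keys == columns.toSet() }) { "rows must share the same keys" }

    // Header carries the section name, row count, and column list once.
    val header = "## $name [${rows.size}]{${columns.joinToString(",")}}"
    // Each row carries values only, in column order, separated by '|'.
    val body = rows.map { row -> columns.joinToString("|") { row[it].toString() } }
    return (listOf(header) + body).joinToString("\n")
}

fun main() {
    val employees = listOf(
        mapOf("id" to 1, "name" to "Alice", "department" to "Engineering", "salary" to 95000),
        mapOf("id" to 2, "name" to "Bob", "department" to "Sales", "salary" to 72000),
    )
    println(tabularSketch("employees", employees))
}
```

Stating the keys once in the header rather than once per row is what removes most of the repeated tokens relative to JSON.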
| Function | Description |
|---|---|
| encode(payload: Payload): String | Encode a graph payload to GCF text |
| encodeGeneric(data: Any?): String | Encode any value to GCF tabular format |
| decode(input: String): Payload | Parse GCF text back to a Payload |
| encodeWithSession(payload: Payload, session: Session?): String | Encode with session deduplication |
| encodeDelta(delta: DeltaPayload): String | Encode a delta (added/removed only) |
| Session() | Create a new session tracker (thread-safe) |

| Type | Purpose |
|---|---|
| Payload | Full GCF payload: tool, budget, symbols, edges, pack root |
| Symbol | Graph node: qualified name, kind, score, provenance, distance |
| Edge | Directed relationship: source, target, edge type |
| DeltaPayload | Diff between two packs: added/removed symbols and edges |
| Session | Thread-safe tracker for multi-call deduplication |
| Components | Score breakdown: blast radius, confidence, recency, distance |
| DecodeException | Thrown on invalid GCF input |
| kindAbbrev / kindExpand | Bidirectional kind abbreviation maps |

1,300+ LLM evaluations across 10 models, 3 providers, and 51 independent test runs.
| | GCF | TOON | JSON |
|---|---|---|---|
| Comprehension (23 runs, 10 models) | 90.7% | 68.5% | 53.6% |
| Generation (28 runs, 9 models) | 5/5 | 1.0/5 | 5.0/5 |
| Input tokens (500 symbols) | 11,090 | 16,378 | 53,341 |
| Output tokens (100 symbols) | 5,976 | 8,937 | 16,121 |
GCF wins all 6 datasets on TOON's own benchmark. Full results: gcformat.com/guide/benchmarks
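The headline percentages can be checked against the token counts in the table. The short sketch below does that arithmetic; `savingsPercent` is a helper introduced here for illustration, and the numbers are copied from the table rows for input tokens (500 symbols) and output tokens (100 symbols).

```kotlin
import java.util.Locale

// Percentage of tokens saved by `candidate` relative to `baseline`.
fun savingsPercent(baseline: Int, candidate: Int): Double =
    (1.0 - candidate.toDouble() / baseline) * 100.0

fun main() {
    // Token counts copied from the benchmark table.
    println("input vs JSON:  %.1f%%".format(Locale.ROOT, savingsPercent(53_341, 11_090))) // 79.2%
    println("output vs JSON: %.1f%%".format(Locale.ROOT, savingsPercent(16_121, 5_976)))  // 62.9%
    println("input vs TOON:  %.1f%%".format(Locale.ROOT, savingsPercent(16_378, 11_090))) // 32.3%
}
```

The first two results round to the 79% and 63% figures quoted at the top of this README.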
- Documentation
- Playground
- Specification
- Go library
- TypeScript library
- Python library
- Rust library
- MCP Proxy (zero-code adoption)
- GCF vs TOON
MIT - Dayna Blackwell