Skip to content

blackwell-systems/gsm

Repository files navigation

gsm - Governed State Machines

Blackwell Systems™ Go Reference Go Report Card CI DOI

Act like UDP, receive like TCP.

What if distributed systems don't have to coordinate - because they agree on the rules ahead of time, so ordering and compensations are deterministic? The underlying theory - normalization confluence - proves that compensation is sufficient for convergence when two algebraic properties hold, without the expressiveness limits of CRDTs or the latency cost of consensus.

gsm is a Go library for constructing state machines where events may arrive out of order and violate business rules, but automatic compensation ensures all replicas converge to the same valid state. Convergence is verified at build time via exhaustive state-space enumeration. Runtime event application is O(1) table lookup with zero compensation overhead.

CRDTs solve convergence by requiring operations to commute. But when your operations can violate business invariants - shipping an unpaid order, overdrawing an account - commutativity alone isn't enough. gsm provides convergence through compensation: declare what valid means and how to repair violations, and the library proves that all event orderings converge to the same valid state.

Example: Order Fulfillment

Full example from the paper (see gsm_test.go):

r := gsm.NewRegistry("order_fulfillment")

status := r.Enum("status", "pending", "paid", "shipped", "cancelled")
paid := r.Bool("paid")
inventory := r.Int("inventory", 0, 5)

// Invariant: can't ship unpaid orders
r.Invariant("no_ship_unpaid").
    Watches(status, paid).
    Holds(func(s gsm.State) bool {
        return s.Get(status) != "shipped" || s.GetBool(paid)
    }).
    Repair(func(s gsm.State) gsm.State {
        return s.Set(status, "pending")
    }).
    Add()

// Invariant: inventory can't go negative
r.Invariant("stock_non_negative").
    Watches(inventory).
    Holds(func(s gsm.State) bool {
        return s.GetInt(inventory) >= 0
    }).
    Repair(func(s gsm.State) gsm.State {
        return s.SetInt(inventory, 0)
    }).
    Add()

r.Event("process_payment").
    Writes(status, paid).
    Guard(func(s gsm.State) bool {
        return s.Get(status) == "pending"
    }).
    Apply(func(s gsm.State) gsm.State {
        return s.Set(status, "paid").SetBool(paid, true)
    }).
    Add()

r.Event("ship_item").
    Writes(status, inventory).
    Guard(func(s gsm.State) bool {
        return s.Get(status) == "paid" && s.GetInt(inventory) > 0
    }).
    Apply(func(s gsm.State) gsm.State {
        return s.Set(status, "shipped").SetInt(inventory, s.GetInt(inventory)-1)
    }).
    Add()

r.Event("restock").
    Writes(inventory).
    Apply(func(s gsm.State) gsm.State {
        return s.SetInt(inventory, s.GetInt(inventory)+1)
    }).
    Add()

// Only check independent pairs (restock comes from different source)
r.Independent("process_payment", "restock")
r.Independent("ship_item", "restock")

machine, report, err := r.Build() // Verifies convergence
if err != nil {
    panic(fmt.Sprintf("convergence not guaranteed: %v\n%s", err, report))
}

// Runtime: O(1) table lookup, no compensation logic runs
s := machine.NewState()
s = machine.Apply(s, "ship_item")       // Arrives before payment
s = machine.Apply(s, "process_payment") // Arrives after shipment
// Compensation fired automatically - converges to valid state

New to convergent systems? See CONCEPTS.md for foundational definitions, theory explanations, and a glossary mapping paper terms to code.

Want rigorous mathematical foundations? See THEORY.md for formal definitions, proofs, and connections to rewriting theory.

Research paper: Normalization Confluence in Federated Registry Networks (Blackwell, 2026)

When to Use gsm

Use gsm when:

  • Building event-sourced systems with out-of-order events
  • Operations can violate invariants (need compensation/repair)
  • State space is finite and enumerable (< ~1M states)
  • You want mathematical convergence guarantees
  • You're using Go

Don't use gsm when:

  • Operations already commute (use CRDTs instead)
  • Operations preserve invariants in all orderings (use invariant confluence)
  • State space is unbounded or infinite
  • Real-time latency requirements conflict with build-time verification cost

How It Works

Build Time (Verification)

When you call registry.Build():

  1. Enumerate state space - All combinations of variable values (must be finite)
  2. Compute normal forms - For every state, apply compensation until valid
  3. Verify WFC - Compensation terminates and reaches valid states
  4. Build step table - For every (event, state) pair, precompute the normal form after applying the event
  5. Verify CC - Different event orderings reach the same normal form (with footprint optimization)

If verification passes, you get an immutable Machine with precomputed lookup tables.

If verification fails, you get a detailed report showing:

  • Which events violate CC
  • At which state the violation occurs
  • The divergent traces (order 1 vs order 2)

Runtime (O(1) Execution)

machine.Apply(state, "ship_item")

This does one table lookup: step[event_index][state_id] returns the precomputed normal form.

No compensation logic runs at runtime. All the complexity is resolved at build time.

Core Concepts

Invariants

An invariant is a property that must always hold on valid states. Each invariant has three parts:

  • Watches(vars...): Declares which variables the invariant depends on (its "footprint"). The repair function can only modify these variables.
  • Holds(func): The boolean condition that must be true. When this returns false, compensation fires.
  • Repair(func): How to fix states where the invariant is violated. This is the compensation function.

Invariants fire in declaration order (priority). If multiple invariants are violated, the first one repairs first, then the next, until all hold.

Compensation

Compensation is the automatic repair process that fires when events violate invariants:

  1. Event modifies state (e.g., "withdraw $100")
  2. Invariant check fails (e.g., balance becomes -50)
  3. Repair function fires (e.g., set balance to 0)
  4. Repeat until all invariants hold

For convergence, compensation must be:

  • Well-Founded (WFC): Repairs must eventually terminate (no infinite loops)
  • Commutative (CC): Event order shouldn't matter after compensation runs

The library verifies both properties at build time by exhaustively checking all possible states and event orderings.

Events

Events are operations that modify state. Each event declares:

  • Writes(vars...): Which variables this event modifies
  • Guard(func): Optional precondition - if false, event is a no-op
  • Apply(func): The effect function that transforms the state

Events can arrive in any order. The library verifies that different orderings converge to the same final state.

Independence Declarations

By default, gsm checks all event pairs for commutativity. For large systems, you can optimize by declaring which pairs are independent:

// Calling Independent() automatically switches to declared-only mode
r.Independent("deposit", "send_notification")
r.Independent("withdraw", "send_notification")

Independent events can arrive in either order (they're not causally related). Only declared pairs will be checked for commutativity.

Tip: Events with disjoint Writes() sets and non-overlapping invariant footprints are automatically proved commutative via footprint analysis (no exhaustive checking needed).

API Overview

Using Machines

// Create initial state (all variables at min/first value)
s := machine.NewState()

// Apply events (returns new state, original unchanged)
s = machine.Apply(s, "increment")
s = machine.Apply(s, "increment")

// Check validity
if machine.IsValid(s) {
    fmt.Println("State satisfies all invariants")
}

// Manually normalize (usually not needed - Apply does this)
s = machine.Normalize(s)

// Get event list
events := machine.Events() // ["increment", "enable", "disable"]

Reading State

// Enum variables
status := s.Get(statusVar)           // returns string

// Bool variables
enabled := s.GetBool(enabledVar)     // returns bool

// Int variables
count := s.GetInt(countVar)          // returns int (adjusted for min offset)

Writing State

// Enum (panics if value not in declared set)
s = s.Set(statusVar, "active")

// Bool
s = s.SetBool(enabledVar, true)

// Int (silently clamped to declared range - SetInt(countVar, 999) on [0,100] becomes 100)
s = s.SetInt(countVar, 42)

Verification Report

The Report returned by Build() shows:

Machine: order_fulfillment
  Variables: 3
  States: 48
  Events: 5

  WFC: PASS (max repair depth: 1)
  CC (Compensation Commutativity): PASS (3 pairs: 3 disjoint, 0 brute-force)

  Convergence: GUARANTEED
Checked in: 234µs

WFC (Well-Founded Compensation): Compensation terminates from every state. The report shows the maximum number of repair steps needed.

CC (Compensation Commutativity): Different event orderings converge to the same normal form. The report shows:

  • Disjoint pairs - Proved by footprint analysis (no exhaustive check needed)
  • Brute-force pairs - Checked by testing all states

If verification fails, you get a counterexample:

CC (Compensation Commutativity): FAIL
  Events: (grant_read, grant_write)
  State:  {can_read=false, can_write=false}
  grant_read→grant_write: {can_read=true, can_write=true}
  grant_write→grant_read: {can_read=true, can_write=false}

This shows the exact state and event pair where CC fails, plus the divergent traces.

Performance

Build Time

Verification cost depends on state space size:

States Variables Build Time Notes
6 2 bools < 1ms Instant
48 1 enum(5) + 1 bool + 1 int[0..5] < 1ms Instant
1,024 10 bools ~5ms Very fast
1M ~20 bits total ~500ms Acceptable

State space grows as the product of variable domains: 5 enums × 100 ints = 500 states.

Default limit: 2²⁰ ≈ 1M states. Configurable but exhaustive verification becomes slow beyond this.

Runtime

Event application: O(1) - single array lookup, no computation.

Memory: One uint64 per state for normal form table, plus one uint64 per (event, state) pair for step table. For 1M states × 10 events = ~80MB.

Relationship to the Paper

This library implements the single-registry governance model from Section 3 of the paper:

  • Registry = the machine definition (variables, invariants, compensation, events)
  • WFC (Definition 4.1) = well-founded measure on compensation depth
  • CC (Definition 4.3) = compensation commutativity (CC1 + CC2)
  • Theorem 5.1 = WFC + CC ⟹ unique normal forms (proven via Newman's Lemma)
  • Section 9 = verification calculus with footprint optimization (implemented in verify.go)

The paper proves: if WFC and CC hold, all processors consuming the same events converge to the same valid state regardless of application order.

This library verifies: does your machine satisfy WFC and CC?

Limitations

  • Finite state spaces only - Cannot model unbounded domains (arbitrary strings, lists)
  • Build-time cost - Large state spaces (> 1M states) verification becomes slow
  • Verification requires Go - Runtime portable via JSON export, but verification engine is Go-only
  • Single-registry - Federation (Section 7 of paper) not yet implemented
  • No runtime monitoring - Once built, machine is immutable (cannot add events/invariants dynamically)

Multi-Language Support

While verification requires Go, runtime is portable to any language. Use Machine.Export() to serialize the verified machine to JSON:

machine, _, err := registry.Build()
if err != nil {
    log.Fatal(err)
}

machine.Export("order.gsm.json")

The exported JSON contains:

  • Variable definitions (types, domains)
  • Event names (ordered)
  • Normal form table: nf[stateID] → normalized stateID
  • Step table: step[eventID][stateID] → normalized result stateID

Runtime Implementation (Python Example)

import json

class Machine:
    def __init__(self, path):
        with open(path) as f:
            d = json.load(f)
        self.events = {n: i for i, n in enumerate(d['events'])}
        self.step = d['step']
        self.nf = d['nf']

    def apply(self, state, event):
        """O(1) event application via table lookup"""
        return self.step[self.events[event]][state]

    def normalize(self, state):
        return self.nf[state]

# Use it
m = Machine('order.gsm.json')
s = 0
s = m.apply(s, 'ship_item')
s = m.apply(s, 'process_payment')

That's it - ~20 lines of code for a complete runtime. The same pattern works in JavaScript, Rust, Java, or any language that can:

  1. Load JSON
  2. Index arrays

Verification complexity stays in Go. Runtime simplicity is universal.

Installation

go get github.com/blackwell-systems/gsm

Testing

go test -v

Tests cover:

  • WFC verification (termination, cycles, depth)
  • CC verification (disjoint footprints, brute force, failures)
  • Event order independence
  • Compensation behavior
  • State encoding/decoding

License

MIT License - see LICENSE file

Citation

@techreport{blackwell2026nc,
  author = {Blackwell, Dayna},
  title = {Normalization Confluence in Federated Registry Networks},
  year = {2026},
  publisher = {Zenodo},
  doi = {10.5281/zenodo.18677400},
  url = {https://doi.org/10.5281/zenodo.18677400}
}

Related Tools

  • nccheck - YAML-based verifier for registry specs (reference implementation from the paper)
  • gsm (this library) - Go library for building verified convergent state machines

Further Reading

About

Governed state machines with build-time convergence verification. Process events in any order, get the same valid state. O(1) runtime via precomputed table lookup.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages