Sara Brain — A Measurement Instrument for Transformer Behavior

Sara Brain is a structured knowledge substrate that lets you teach arbitrary facts to an LLM session and measure how faithfully the model retrieves them versus confabulating from training weight. It is also a path-of-thought cognitive architecture: knowledge is stored as directed neuron-segment chains in SQLite, not as model weights.

Quickstart — Instrument Use

pip install sara-brain

sara-new --out my_brain.db                                                    # create fresh brain
sara-teach --subject zilkrap --relation is_a --object pruvnet --brain my_brain.db
sara-ask-stateless "what is zilkrap?" --brain my_brain.db                     # fresh session

For training-orthogonal validation substrates (concepts that cannot be in any model's training data):

sara-synth --out synth.db --seed 42    # generates random nonsense-word substrate + manifest

The A/B/C Measurement Protocol

Session	Command	Purpose
A — Teach	`sara-teach` or `sara-mcp`	Load substrate into a session
B — Test	`sara-ask-stateless` (fresh session)	Measure retrieval with substrate present
C — Control	Same question, no substrate	Bare model, training weights only

Divergence between B and C is the measurement. Content in B not present in the substrate is confabulation from training weight.

Why This Works

LLM sessions accumulate context. Teaching facts in Session A infects the session context. A truly fresh session (B or C) carries no teaching history — answers can only come from the substrate or model weights. The substrate is the ground truth; the model's answer is the measurement.

Paper

DOI: 10.5281/zenodo.19436522

What is Sara Brain?

A path-of-thought brain simulation. It learns facts expressed as natural-language statements, stores them as directed neuron-segment chains in SQLite, and recognizes concepts by launching parallel wavefronts from input neurons and finding intersections.

This is not a neural network, LLM, or activation-based system. Recognition follows actual recorded paths through neuron chains — no decay, no forgetting, no pattern matching.

Authenticity

Every essay I publish on Substack (at Path of Thought) or anywhere else is first committed to this repository, in the substack/ directory, before it appears on the public venue. The git history is a timestamped, public record of everything I have written under the Path of Thought name.

If any text appears attributed to me as a Path of Thought essay and is not in this repository, it is not from me. Screenshots can be faked. Substack accounts can be impersonated. AI-generated text can be attributed to anyone. But this repository has been visibly mine for a long time, with consistent work, published research, and real code — it cannot be spoofed by anyone who does not have years of my commit history behind them.

To verify a post: search this repository for a distinctive phrase from the post. If you find it, look at the commit that introduced it — the commit author will be me, and the timestamp will precede the Substack publication date. If you cannot find the text in this repository at all, the text did not come from me.

This authenticity claim applies to written essays under the Path of Thought name. It does not mean every word I have ever written lives in this repo — private conversations, unpublished drafts, and work in other contexts are not required to be here. The claim is narrower and stronger: anything I publish publicly as a Path of Thought essay will be committable-and-committed here first, and anything claimed to be a Path of Thought essay that is not here is not from me.

— Jennifer Pearl

Core Principles

Path-of-thought, not activation levels — Recognition traces real recorded paths, not spreading activation
Never forgets — Strength only increases. Path similarity replaces forgetting
Parallel wavefront propagation — Multiple wavefronts launch simultaneously; commonality across paths determines recognition

Quick Start

python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"

# Run the demo
.venv/bin/python demos/apple_demo.py

# Run tests
.venv/bin/pytest tests/ -v

# Launch the REPL
.venv/bin/sara

Requires Python 3.11+.

Documentation

Doc	Contents
docs/v001_foundation.md	Architecture foundation, data model, storage schema
docs/v002_user_guide.md	Design philosophy, algorithms, associations, categories, LLM, full REPL reference
docs/v003_perception.md	Vision perception, cognitive development model, tribal trust, correction, security
docs/v004_web_app.md	Web app guide: guided UI, image viewer, region selection, neural graph, vision proxy
docs/v005_design_philosophy.md	Design philosophy + user guide: origin story, why paths not activation, never forgets, parallel thought, tribal trust, "You Need More Than Attention" (transformers as sensory cortex, inventor skepticism, shared computational roots — 22 academic references), complete REPL reference

How It Works

Data Model

Neurons — four types: concept (apple), property (red), relation (apple_color), association (taste)
Segments — directed edges between neurons with strength (1 + ln(1 + traversals)) and traversal counts
Paths — recorded chains of segments representing a learned fact, with source text provenance

Learning

Teach it facts in natural language:

sara> teach an apple is red
sara> teach an apple is round
sara> teach a banana is yellow
sara> teach a banana is long

Each statement creates a 3-neuron chain: property → relation → concept.

Recognition

Give it properties and it finds matching concepts via parallel wavefront intersection:

sara> recognize red, round
Recognized: apple (score: 1.00)

Exploration

sara> trace apple       # All outgoing paths from a neuron
sara> why apple         # All paths leading to a neuron with provenance
sara> similar apple     # Neurons with shared downstream paths

Architecture

src/sara_brain/
  models/        — Pure dataclasses (Neuron, Segment, Path, PathTrace, RecognitionResult)
  storage/       — SQLite repos (NeuronRepo, SegmentRepo, PathRepo, AssociationRepo, CategoryRepo, SettingsRepo, Database)
  parsing/       — Statement parser and taxonomy
  core/          — Brain orchestrator, Learner, Recognizer, SimilarityAnalyzer, Perceiver
  nlp/           — VisionObserver (Claude Vision), LLM translator (Claude-only)
  visualization/ — ASCII tree and Graphviz DOT export
  repl/          — Interactive shell with commands and formatters

REPL Commands

Command	Description
`teach <statement>`	Learn a fact
`recognize <inputs>`	Recognize from comma-separated properties
`trace <label>`	Show all outgoing paths from a neuron
`why <label>`	Show all paths leading to a neuron
`similar <label>`	Find neurons with shared downstream paths
`analyze`	Scan all neurons for path similarities
`define <name> <qword>`	Create a new association with a question word
`describe <name> as <props>`	Register properties under an association
`associations`	List all associations and their properties
`<question_word> <concept> <assoc>`	Query properties (e.g., `what apple color`)
`questions`	List all available question words
`categorize <concept> <cat>`	Tag a concept with a category
`categories`	List all categories
`ask <question>`	Translate natural language via Claude LLM
`llm set <key> [model]`	Configure Claude API
`llm status` / `llm clear`	Check or remove LLM config
`perceive <path> [label]`	Run multi-phase image perception via Claude Vision
`no <correct_label>`	Correct a misidentification
`see <property>`	Point out a missed property
`stats`	Brain statistics
`neurons` / `paths`	List all
`save`	Force flush to disk
`quit` / `exit`	Exit

Tests

166 tests covering models, storage, parsing, learning, recognition, similarity, associations, categories, queries, LLM translation, vision, perception, proxy, and end-to-end integration.

.venv/bin/pytest tests/ -v

Storage

SQLite with WAL mode and foreign keys. Schema in src/sara_brain/storage/schema.sql. Current plan is SQLite now, with migration to data-nut-squirrel later.

License

See LICENSE if present.

Name		Name	Last commit message	Last commit date
Latest commit History 344 Commits
.amazonq/rules		.amazonq/rules
.claude		.claude
benchmarks		benchmarks
data		data
demos		demos
docs		docs
feedback_to_anthropic		feedback_to_anthropic
harness		harness
narrative		narrative
papers		papers
scripts		scripts
src		src
substack		substack
tests		tests
training_docs		training_docs
.gitignore		.gitignore
.mcp.json		.mcp.json
LICENSE		LICENSE
README.md		README.md
aptamer_exec.db.reverse		aptamer_exec.db.reverse
aptamer_exec.db.reverse-shm		aptamer_exec.db.reverse-shm
aptamer_exec.db.reverse-wal		aptamer_exec.db.reverse-wal
aptamer_full.db.bak		aptamer_full.db.bak
aptamer_full.db.reverse		aptamer_full.db.reverse
bio_full.db.crawl_progress		bio_full.db.crawl_progress
bio_v2.db.crawl_progress		bio_v2.db.crawl_progress
biology2e.db.regions.json		biology2e.db.regions.json
brain.db.bulk_ch10_nopartof		brain.db.bulk_ch10_nopartof
brain.db.bulk_reteach_backup		brain.db.bulk_reteach_backup
brain.db.ch10_expanded		brain.db.ch10_expanded
brain.db.ch10_expanded.reverse		brain.db.ch10_expanded.reverse
brain.db.drifted_s1		brain.db.drifted_s1
brain.db.flatten_lift_backup		brain.db.flatten_lift_backup
brain.db.hand_curated_nopartof		brain.db.hand_curated_nopartof
brain.db.hand_faithful		brain.db.hand_faithful
brain.db.hand_faithful.reverse		brain.db.hand_faithful.reverse
brain.db.openie_no_verb_teach		brain.db.openie_no_verb_teach
brain.db.openie_verbs		brain.db.openie_verbs
brain.db.reverse		brain.db.reverse
brain.db.with_part_of_leak		brain.db.with_part_of_leak
bundle.sh		bundle.sh
claude_taught.db.regions.json		claude_taught.db.regions.json
compartment.db.regions.json		compartment.db.regions.json
install-linux.sh		install-linux.sh
install-sara.sh		install-sara.sh
install.sh		install.sh
jkd_limitations.db.reverse		jkd_limitations.db.reverse
pyproject.toml		pyproject.toml
sara_q.py		sara_q.py
training_hamroby_extract_20260509_192958.log		training_hamroby_extract_20260509_192958.log
training_hamroby_extract_20260509_202404.log		training_hamroby_extract_20260509_202404.log
training_hamroby_extract_20260509_205629.log		training_hamroby_extract_20260509_205629.log
training_hamroby_extract_20260509_210102.log		training_hamroby_extract_20260509_210102.log
training_hamroby_extract_20260509_213142.log		training_hamroby_extract_20260509_213142.log
training_hamroby_extract_20260510_021735.log		training_hamroby_extract_20260510_021735.log
training_hamroby_extract_20260510_121151.log		training_hamroby_extract_20260510_121151.log
training_hamroby_extract_20260510_124404.log		training_hamroby_extract_20260510_124404.log
training_hamroby_extract_20260510_131829.log		training_hamroby_extract_20260510_131829.log
training_hamroby_extract_20260510_144553.log		training_hamroby_extract_20260510_144553.log
training_hamroby_extract_20260510_155406.log		training_hamroby_extract_20260510_155406.log
training_hamroby_extract_20260510_171919.log		training_hamroby_extract_20260510_171919.log
training_hamroby_extract_aug2_20260510_012818.log		training_hamroby_extract_aug2_20260510_012818.log
training_hamroby_extract_aug3_20260510_021049.log		training_hamroby_extract_aug3_20260510_021049.log
training_prod.log		training_prod.log
training_synth.log		training_synth.log
v005_design_philosophy.docx		v005_design_philosophy.docx
v005_design_philosophy_gdocs.docx		v005_design_philosophy_gdocs.docx
v005_design_philosophy_v2.docx		v005_design_philosophy_v2.docx
v006_design_philosophy.docx		v006_design_philosophy.docx
v007_design_philosophy.docx		v007_design_philosophy.docx
v008_design_philosophy.docx		v008_design_philosophy.docx
~$05_design_philosophy_v2.docx		~$05_design_philosophy_v2.docx

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sara Brain — A Measurement Instrument for Transformer Behavior

Quickstart — Instrument Use

The A/B/C Measurement Protocol

Why This Works

Paper

What is Sara Brain?

Authenticity

Core Principles

Quick Start

Documentation

How It Works

Data Model

Learning

Recognition

Exploration

Architecture

REPL Commands

Tests

Storage

License

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Sara Brain — A Measurement Instrument for Transformer Behavior

Quickstart — Instrument Use

The A/B/C Measurement Protocol

Why This Works

Paper

What is Sara Brain?

Authenticity

Core Principles

Quick Start

Documentation

How It Works

Data Model

Learning

Recognition

Exploration

Architecture

REPL Commands

Tests

Storage

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages