
Feature: Theorist Skill — Per-Repo Operating Theory Documents (inspired by blader/theorist) #381

@teknium1

Overview

Inspired by blader/theorist (MIT, by Siqi Chen), this proposes adding a Theorist skill that maintains a per-repo THEORY.MD — a living narrative document capturing the operating theory behind the current work. Unlike AGENTS.md (static project rules) or the memory tool (global operational notes), THEORY.MD is an agent-maintained strategic narrative that answers "why does this system work this way?" and "where does uncertainty remain?"

The concept fills a genuine gap in Hermes's context architecture. Today, Hermes auto-loads human-authored context (AGENTS.md, SOUL.md, .cursorrules) and maintains global memory (MEMORY.md/USER.md). But there's no per-project, agent-maintained document that captures evolving strategic understanding — the mental model of how a system works, why the current approach was chosen, and what's still uncertain. THEORY.MD provides exactly this.

The original skill (v1.3.0, 2026-02-28) is well-designed and MIT-licensed; the findings below are based on its full SKILL.md. Implementation would be a new bundled skill plus a minor core enhancement to auto-load THEORY.MD alongside AGENTS.md at session start.


Research Findings

How Theorist Works

The skill maintains a single THEORY.MD file at the repo root. Key behaviors:

What THEORY.MD Contains:

  • Problem thesis: What problem is being solved and why it matters — the structural reason the problem exists, not just symptoms
  • Operating theory: Current mental model of how the system works, what the leverage points are, what has been tried and learned
  • Systematic strategy: The higher-order approach — not tasks but principles connecting the changes
  • Key discoveries and pivots: Moments where understanding shifted, what the old theory was, what broke it, what replaced it
  • Open questions: What's still unknown, where the theory might be wrong, what would change the approach
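
To make the five parts above concrete, here is a minimal sketch of a scaffold for a fresh THEORY.MD. The function name `scaffold_theory`, the top-level title, and the use of HTML comments as writing prompts are assumptions for illustration; the original skill may lay the document out differently.

```python
# Section titles and prompts are taken from the list above; the layout
# (markdown headings plus HTML-comment prompts) is an assumption.
SECTIONS = [
    ("Problem thesis", "What problem is being solved, and the structural reason it exists."),
    ("Operating theory", "Current mental model, leverage points, what has been tried and learned."),
    ("Systematic strategy", "The principles connecting the changes, not a task list."),
    ("Key discoveries and pivots", "What the old theory was, what broke it, what replaced it."),
    ("Open questions", "What is unknown and what would change the approach."),
]


def scaffold_theory(project: str) -> str:
    """Return an empty THEORY.MD skeleton for the given project name."""
    parts = [f"# Operating theory: {project}", ""]
    for title, prompt in SECTIONS:
        parts += [f"## {title}", "", f"<!-- {prompt} -->", ""]
    return "\n".join(parts)
```

The prompts are placeholders only; the skill expects each section to be filled with prose and rewritten as understanding changes.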

What THEORY.MD Is NOT:

  • Not a changelog (no timestamped entries — holistic rewrites only)
  • Not a plan/todo list (no checkboxes or step-by-step instructions)
  • Not a postmortem (present tense of ongoing work)
  • Not a status report (no "today I did X")

Session Behavior:

  • At session start: read THEORY.MD silently, orient to the work
  • During work: update when understanding shifts (root cause found, strategy pivot, new uncertainty), NOT on every code change
  • Update cadence: after each investigate/implement/verify loop, every ~10min of active work, or when 2-3 learnings accumulate
  • Updates are holistic rewrites of relevant sections, keeping the full document coherent
  • Trivial sessions (one-liner fix, config change): leave the document untouched
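
The update cadence above can be sketched as a small bookkeeping helper. `UpdateCadence` and its fields are hypothetical, not part of the original skill, which expresses this as prose instructions. The sketch also assumes that no update is due until at least one learning has been recorded, which is one reading of "update when understanding shifts, NOT on every code change".

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UpdateCadence:
    """Tracks whether a THEORY.MD rewrite is due (hypothetical helper)."""

    interval_s: float = 600.0       # "every ~10min of active work"
    learning_threshold: int = 2     # "when 2-3 learnings accumulate"
    last_update: float = field(default_factory=time.monotonic)
    learnings: List[str] = field(default_factory=list)

    def note_learning(self, text: str) -> None:
        """Record a shift in understanding (root cause, pivot, new uncertainty)."""
        self.learnings.append(text)

    def should_update(self, loop_finished: bool = False,
                      now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if not self.learnings:
            return False  # assumption: nothing learned means nothing to rewrite
        return (
            loop_finished
            or len(self.learnings) >= self.learning_threshold
            or now - self.last_update >= self.interval_s
        )

    def mark_updated(self, now: Optional[float] = None) -> None:
        """Reset the clock and the pending learnings after a rewrite."""
        self.last_update = time.monotonic() if now is None else now
        self.learnings.clear()
```

A trivial session never calls `note_learning`, so `should_update` stays false and the document is left alone.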

Practical constraints:

  • One THEORY.MD per repo, at repo root
  • Max ~200 lines — tighten prose if longer
  • Rewrite holistically, never append
  • Tone: thoughtful engineer explaining mental model to a peer, direct, specific, no filler

Key Design Decisions

  1. Narrative, not structured data: THEORY.MD is prose, not JSON/YAML/typed facts. This is deliberate — strategic understanding is inherently narrative, not decomposable into discrete facts.
  2. Holistic rewrites, not appends: Prevents the document from becoming a log. Forces coherent synthesis at every update.
  3. Theory-triggered, not code-triggered: Updates happen when understanding shifts, not when files change. This is the key distinction from changelogs.
  4. Superseded theories as pivots: Old theories aren't deleted, they're noted as pivots ("Initially the hypothesis was X, but Y revealed Z"). The evolution of understanding is part of the theory.
  5. Always active: The skill activates every session and stays active throughout, not invoked per-request.
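
Decision 2 can be illustrated by a write path that offers only whole-file replacement. This is a sketch under assumptions: the skill itself would use the existing write_file tool, and `rewrite_theory` is a hypothetical helper showing the intended semantics, with an atomic swap added so an interrupted write does not leave a partial document.

```python
import os
import tempfile
from pathlib import Path


def rewrite_theory(repo_root: Path, new_text: str) -> Path:
    """Replace THEORY.MD wholesale; there is deliberately no append mode."""
    target = repo_root / "THEORY.MD"
    # Write to a temp file in the same directory, then swap it into place,
    # so readers see either the old theory or the new one, never a mix.
    fd, tmp = tempfile.mkstemp(dir=repo_root, prefix=".theory-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(new_text)
        os.replace(tmp, target)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return target
```

Because the caller must supply the full new text, every update forces a fresh synthesis of the whole document rather than another entry at the bottom.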

Where Theorist Fits in the Context File Ecosystem

The AI coding agent ecosystem has converged on several per-repo context formats:

| Format | Purpose | Author | Static/Dynamic |
|---|---|---|---|
| AGENTS.md | Project rules, conventions, build commands | Human | Static |
| SOUL.md | Agent persona, tone, identity | Human | Static |
| .cursorrules | IDE-specific rules | Human | Static |
| MEMORY.md | Operational notes, facts | Agent | Append-based |
| THEORY.MD | Strategic understanding, mental model | Agent | Holistic rewrite |

THEORY.MD is the only format that captures the dynamic, synthesized understanding of a project. AGENTS.md tells the agent HOW to work; THEORY.MD tells it WHY the system is shaped this way.


Current State in Hermes Agent

What we have:

  • prompt_builder.py (build_context_files_prompt): Auto-loads AGENTS.md (recursive), .cursorrules and .cursor/rules/*.mdc, and SOUL.md (cwd, then ~/.hermes/ as fallback). Each file is capped at 20,000 chars. Content is injected into the system prompt under # Project Context.
  • Memory tool: Global MEMORY.md + USER.md with §-delimited entries. Not per-project, not narrative.
  • Issue #346 (Structured Memory System): Proposes typed memory nodes, graph edges, and hybrid/vector search. Fundamentally different: structured facts vs. narrative synthesis.

Gap: No per-project, agent-maintained document for strategic understanding. The prompt_builder loads human-authored context but nothing the agent itself maintains to capture evolving project comprehension.

Integration point: Adding THEORY.MD to build_context_files_prompt() in agent/prompt_builder.py — roughly 15 lines of code, following the SOUL.md loading pattern. This ensures every session starts with the latest theory, even without the skill being explicitly invoked.


Implementation Plan

Skill vs. Tool Classification

This should be a bundled skill because:

  • The core behavior is instructions for maintaining a document — expressible entirely as skill prose
  • Uses only existing tools (read_file, write_file, terminal for git status)
  • No custom Python integration or API key management
  • Per CONTRIBUTING.md: "Make it a Skill when the capability can be expressed as instructions + shell commands + existing tools"
  • Broadly useful — maintaining project understanding is valuable for any sustained development work, not niche

The auto-loading enhancement is a minor core code change (prompt_builder.py), not a new tool.

What We'd Need

  1. A new skill at skills/software-development/theorist/SKILL.md — adapted from the original (MIT-licensed)
  2. A small enhancement to agent/prompt_builder.py to auto-load THEORY.MD
  3. Documentation updates noting THEORY.MD as a recognized context file

Phased Rollout

Phase 1: Skill Only (Zero Code Changes)

  • Port the theorist SKILL.md to Hermes skill format
  • Adapt trigger conditions for Hermes skill loading (e.g., "maintain the theory", "update the operating theory", "what's the current theory?")
  • The skill instructs Hermes to create/maintain THEORY.MD using read_file/write_file
  • Users manually invoke the skill or the system loads it on matching phrases

Phase 2: Auto-Loading (Minor Core Change)

  • Add THEORY.MD to build_context_files_prompt() in agent/prompt_builder.py
  • Load from cwd (repo root), case-insensitive match for THEORY.MD/theory.md
  • Cap at 20,000 chars like other context files
  • Scan for prompt injection via _scan_context_content()
  • This ensures every session starts with project context even without explicitly invoking the skill
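
The Phase 2 steps above can be sketched as a standalone loader. This is not the actual prompt_builder.py code: `load_theory`, the section heading format, and the truncation marker are assumptions, and the `scan` parameter is a stand-in for `_scan_context_content()`, whose real signature is not shown in this issue.

```python
from pathlib import Path
from typing import Callable, Optional

CONTEXT_FILE_MAX_CHARS = 20_000  # same cap as the other context files


def load_theory(
    cwd: Path,
    # Stand-in for _scan_context_content(); the (content, filename) signature
    # is an assumption. The default passes content through unchanged.
    scan: Callable[[str, str], str] = lambda content, filename: content,
) -> Optional[str]:
    """Return a prompt section for THEORY.MD found in cwd, or None if absent."""
    for entry in sorted(cwd.iterdir()):
        # Case-insensitive match covers THEORY.MD, theory.md, Theory.md, ...
        if entry.is_file() and entry.name.lower() == "theory.md":
            content = entry.read_text(encoding="utf-8", errors="replace").strip()
            if not content:
                return None
            content = scan(content, entry.name)
            if len(content) > CONTEXT_FILE_MAX_CHARS:
                content = content[:CONTEXT_FILE_MAX_CHARS] + "\n[...truncated]"
            return f"## {entry.name}\n\n{content}"
    return None
```

In the real integration the returned section would be appended alongside the AGENTS.md and SOUL.md sections under # Project Context.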

Phase 3: Integration & Polish


Pros & Cons

Pros

  • Solves context drift: Long agent sessions and multi-session projects lose strategic coherence. THEORY.MD provides a persistent anchor.
  • Zero code changes for Phase 1: Just a new skill file — immediate value.
  • Proven concept: Siqi Chen reports it's a "game changer for keeping long running agent sessions on track." MIT licensed.
  • Complements existing context: Fills the gap between static AGENTS.md (human rules) and global MEMORY.md (operational notes).
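  • Bounded token overhead: ~200 lines max is likely a few thousand tokens of system prompt (about 4,000 at ~80 characters per line and ~4 characters per token), and the 20,000-char cap limits it to roughly 5,000.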
  • Universal applicability: Any project benefits from maintained strategic understanding.

Cons / Risks

  • Token cost: Adds to system prompt length every session, even if not actively needed.
  • Stale theory danger: If the agent doesn't update THEORY.MD after significant changes, it could mislead future sessions with outdated mental models.
  • Git noise: Frequent rewrites to THEORY.MD could clutter commit history. May want to recommend .gitignoring it or using a dedicated branch.
  • "Always active" challenge: Hermes skills are loaded on-demand, not "always active." The auto-loading in Phase 2 handles reading, but the "update during work" behavior requires the skill to be in context. This may need the concept of persistent/background skills (not currently supported).
  • Quality variance: The value of THEORY.MD depends heavily on the quality of the agent's synthesis. Poor models may produce generic or unhelpful theories.

Open Questions

  • Should THEORY.MD be tracked in git by default, or recommended for .gitignore?
  • How to handle the "always active" behavior in Hermes's on-demand skill loading model? Options:
    • Add THEORY.MD maintenance instructions to the base system prompt (heavy)
    • Add a "persistent skills" concept that loads certain skills every session
    • Rely on the auto-loaded THEORY.MD content to remind the agent to maintain it (self-bootstrapping)
  • Should THEORY.MD include a machine-readable header (YAML frontmatter with last-updated timestamp, session count)?
  • How to handle multi-project sessions where cwd changes?
  • Should the skill create THEORY.MD proactively or only when the user asks?
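
For the multi-project question, one possible approach is to resolve the theory file from the nearest enclosing git repository each time it is read or written, rather than once at session start. The sketch below is only an illustration of that option; `find_repo_root` and `theory_path_for` are hypothetical names, and treating a `.git` entry as the repo marker is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_repo_root(start: Path) -> Optional[Path]:
    """Walk upward from start until a directory containing .git is found."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        # .git is a directory in normal clones, a file in worktrees/submodules.
        if (candidate / ".git").exists():
            return candidate
    return None


def theory_path_for(cwd: Path) -> Optional[Path]:
    """Return where THEORY.MD would live for the repo that contains cwd."""
    root = find_repo_root(cwd)
    return root / "THEORY.MD" if root else None
```

Calling `theory_path_for` after every directory change would keep each repo's theory separate, at the cost of re-reading the file when the session moves between projects.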
