The Missing Context Layer for AI Agents
CoreText is a local-first, AI-native knowledge graph that automatically synchronizes with your Git repository. It solves the "Lost in the Middle" problem for AI coding agents by providing a structured, topologically-aware "Second Brain" that bridges the gap between your files and the AI's understanding.
AI agents struggle with large codebases. RAG (Retrieval Augmented Generation) helps, but it often misses the structure of your project—the dependencies between files, the hierarchy of documents, and the architectural constraints defined in your specs.
CoreText changes this by:
- Treating Markdown as Source Code: It parses your documentation (specs, architecture, stories) into a structured graph, not just text chunks.
- Invisible Synchronization: A git hook ensures your knowledge graph is always perfectly synced with your codebase. No manual updates required.
- Hybrid Search: Combines Vector Search (Meaning) with Graph Traversal (Topology) to give agents precise context.
- Local & Private: Everything runs locally on your machine using SurrealDB. No data leaves your perimeter.
- ⚡ Git-Native Sync: Automatically updates the graph on every git commit.
- 🧠 Hybrid Retrieval: Semantic search + Graph dependency traversal.
- 🤖 Agent-Ready (MCP): Exposes a Model Context Protocol (MCP) server for Claude, Gemini, and other agents.
- 🛡️ Integrity Checks: Lints your graph for broken links and dangling references before you commit.
- 📍 Topology Awareness: Understands depends_on, parent_of, and references relationships.
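To make the hybrid retrieval idea concrete, here is a minimal sketch in plain Python. It is not CoreText's implementation: the function names, the toy two-dimensional embeddings, and the file names are all hypothetical. It ranks nodes by cosine similarity to pick anchor nodes, then expands those anchors along graph edges.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query_vec, embeddings, edges, top_k=1, max_hops=1):
    """Pick the top_k most similar nodes, then walk outgoing edges up to max_hops."""
    ranked = sorted(
        embeddings, key=lambda n: cosine(query_vec, embeddings[n]), reverse=True
    )
    anchors = ranked[:top_k]

    # Breadth-first expansion from the anchors; value is the hop distance.
    seen = {anchor: 0 for anchor in anchors}
    queue = deque(anchors)
    while queue:
        node = queue.popleft()
        if seen[node] >= max_hops:
            continue
        for _relation, target in edges.get(node, []):
            if target not in seen:
                seen[target] = seen[node] + 1
                queue.append(target)

    related = [n for n in seen if n not in anchors]
    return anchors, related


# Hypothetical toy data: embeddings and typed edges keyed by file name.
embeddings = {
    "auth-spec.md": [0.9, 0.1],
    "billing-spec.md": [0.1, 0.9],
    "session-story.md": [0.5, 0.5],
}
edges = {
    "auth-spec.md": [("depends_on", "session-story.md")],
    "billing-spec.md": [("references", "auth-spec.md")],
}

anchors, related = hybrid_search([1.0, 0.0], embeddings, edges)
print(anchors, related)
```

The vector step answers "which node is this question about?", and the traversal step pulls in what that node depends on, which pure similarity ranking could miss.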
CoreText provides two distinct sets of tools: standard CLI commands for system management and "Thick Tools" for AI agents via the MCP server.
These are "Outer Loop" tools for system lifecycle, file operations, and infrastructure management. Available in your terminal via coretext <command>.
- init: Initializes the project, configuration, database binary, and embedding model.
- start/stop: Manages the background daemon (SurrealDB + MCP Server).
- status: Checks the health of the database and server.
- sync: Manually synchronizes Markdown files to the graph.
- new: Generates structured documentation from built-in BMAD templates (e.g., coretext new story ...).
- lint: Runs integrity checks on your knowledge graph (broken links, schema violations).
- inspect: Visualizes the dependency tree of a specific node in the terminal.
- apply-schema: Applies database schema updates.
- install-hooks: Installs Git hooks for automatic synchronization.
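As an illustration of what a broken-link integrity check involves, the sketch below scans Markdown text for relative links and reports targets that do not exist. It is an assumption about the general technique, not the actual logic behind the lint command; the function name, regex, and sample files are hypothetical.

```python
import posixpath
import re

# Matches [text](target) and [text](target#anchor); captures only the target path.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s#]+)(?:#[^)\s]*)?\)")
# Anything starting with a URL scheme (https:, mailto:, ...) is treated as external.
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def find_broken_links(files):
    """files maps a relative path to its Markdown text.

    Returns a list of (source_path, resolved_target) pairs whose target is missing.
    """
    known = set(files)
    broken = []
    for source, text in files.items():
        for target in LINK_RE.findall(text):
            if SCHEME_RE.match(target):
                continue  # external link, not part of the local graph
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(source), target)
            )
            if resolved not in known:
                broken.append((source, resolved))
    return broken


# Hypothetical knowledge directory contents.
files = {
    "docs/arch.md": "See [the auth spec](specs/auth.md#tokens) and [billing](specs/billing.md).",
    "docs/specs/auth.md": "Back to [architecture](../arch.md). Also [site](https://example.com).",
}

broken = find_broken_links(files)
print(broken)
```

Here only the link to specs/billing.md is reported, because no file with that resolved path is present.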
These are "Inner Loop" tools designed for AI Agents. They are exposed via the Model Context Protocol (MCP) and are not available as standalone CLI commands because they require the persistent state of the MCP server (loaded embedding models, active database connections) to function efficiently.
- query_knowledge (The "Thick Tool"):
  - Function: A universal context retrieval engine. It combines Vector Search (Semantic), Regex/Keyword Filtering, and Graph Traversal in a single round-trip.
  - Why Exclusive? It requires the embedding model (~300MB) to be resident in memory for sub-second performance. Running this via CLI would incur a massive "cold start" penalty (3-10s) per query.
- search_topology:
  - Function: Performs hybrid semantic search to find "Anchor Nodes" relevant to a natural language query.
  - Why Exclusive? Relies on the same resident embedding model as query_knowledge.
- get_dependencies:
  - Function: Retrieves direct and indirect dependencies for a node in structured JSON format.
  - Note: The CLI inspect command wraps this tool for human-readable output, but the raw tool is optimized for Agent consumption.
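The distinction between direct and indirect dependencies can be sketched as a small graph walk. This is an illustrative example only: the function name collect_dependencies, the JSON field names, and the sample files are hypothetical, and the real get_dependencies response format may differ.

```python
import json


def collect_dependencies(node, depends_on):
    """Split a node's dependencies into direct ones and those reached transitively."""
    direct = list(depends_on.get(node, []))
    indirect = []
    seen = set(direct) | {node}  # including the start node guards against cycles
    stack = list(direct)
    while stack:
        current = stack.pop()
        for dep in depends_on.get(current, []):
            if dep not in seen:
                seen.add(dep)
                indirect.append(dep)
                stack.append(dep)
    return {"node": node, "direct": direct, "indirect": sorted(indirect)}


# Hypothetical depends_on edges between documents.
depends_on = {
    "story.md": ["api-spec.md"],
    "api-spec.md": ["architecture.md", "data-model.md"],
    "data-model.md": ["architecture.md"],
}

result = collect_dependencies("story.md", depends_on)
print(json.dumps(result, indent=2))
```

A structured result like this is easy for an agent to consume directly, while a human-facing command could render the same data as a tree.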
CoreText is a Python application managed via poetry.
# Clone the repository
git clone https://github.com/bnmbanhmi/coretext.git
cd coretext
# Install dependencies
poetry install
Sets up the local SurrealDB instance, downloads the embedding model, and configures the project. You will be prompted to choose a directory (e.g., docs, wiki, or _coretext-knowledge) to serve as your knowledge graph.
poetry run coretext init
Runs the SurrealDB database and the MCP Server in the background.
poetry run coretext start
Verifies that the daemon is running and healthy.
poetry run coretext status
Use built-in templates to create structured documentation inside your configured knowledge directory (e.g., _coretext-knowledge).
poetry run coretext new story _coretext-knowledge/my-new-feature.md
Visualize the dependencies of any file or node.
poetry run coretext inspect _coretext-knowledge/my-new-feature.md
CoreText exposes a Model Context Protocol (MCP) server at http://localhost:8001/mcp. You can connect any MCP-compliant agent (like Claude Desktop or Gemini CLI) to give it access to your knowledge graph.
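For readers curious what an agent sends over MCP, the sketch below builds a JSON-RPC 2.0 tools/call request, which is the message shape the Model Context Protocol defines for invoking a tool. The argument name query is an assumption; CoreText's actual tool schemas are not documented here. A real client must also complete the MCP initialize handshake before calling tools, so this only constructs the payload and does not send it.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name, arguments):
    """Build an MCP tools/call request as a plain dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "query" is a hypothetical argument name used for illustration.
request = build_tool_call(
    "search_topology", {"query": "What does the Flux Capacitor depend on?"}
)
body = json.dumps(request)
print(body)
```

In practice an MCP client library or the agent itself handles this framing; the sketch is only meant to show that a tool call is an ordinary structured request.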
To install CoreText as a native extension in the Gemini CLI:
# Ensure the daemon is running
poetry run coretext start
# Link the extension from the project root (Development Mode)
gemini extensions link .
# Verify installation
gemini extensions list
Once linked, verify the connection:
gemini mcp list
You should see ✓ coretext ... - Connected. The Gemini Agent will now automatically discover and use the CoreText tools:
- search_topology: Semantic search across files and headers.
- get_dependencies: Analyze relationships between components.
- query_knowledge: Universal context retrieval for complex queries.
- Knowledge Query: Ask "What does the Flux Capacitor depend on?" and the agent will use search_topology or get_dependencies to find the answer from your synced docs.
- Structure Analysis: The agent can visualize your project's knowledge graph topology.
- Commands: Run any CoreText command (e.g., status, sync, lint) directly via the Gemini prompt.
CoreText operates as a background daemon composed of:
- Sync Engine: Watches your Git repository and uses AST parsing to transform Markdown files into graph nodes.
- SurrealDB: A multi-model database storing the graph (nodes/edges) and vector embeddings.
- MCP Server: A FastAPI server that provides the interface for AI agents.
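One way a commit-triggered sync could decide what to re-parse is to list the files touched by the latest commit and keep only the Markdown files inside the knowledge directory. The sketch below assumes that approach; the function names are hypothetical and this is not necessarily how CoreText's hook works.

```python
import subprocess


def files_in_last_commit():
    """Return the raw file list for HEAD (one path per line).

    Note: for the very first commit in a repository, git diff-tree
    also needs the --root flag to produce output.
    """
    completed = subprocess.run(
        ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


def changed_markdown(diff_output, knowledge_dir):
    """Keep only .md paths that live under the knowledge directory."""
    prefix = knowledge_dir.rstrip("/") + "/"
    return [
        path
        for path in diff_output.splitlines()
        if path.startswith(prefix) and path.endswith(".md")
    ]


# Sample output standing in for files_in_last_commit(), so no repository is needed.
sample = "src/app.py\n_coretext-knowledge/story.md\n_coretext-knowledge/diagram.png\n"
to_sync = changed_markdown(sample, "_coretext-knowledge")
print(to_sync)
```

Restricting the work to changed files keeps the hook fast, so the commit itself is not noticeably delayed.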
- Nodes: Files (.md) and Headers (# H1, ## H2).
- Edges: contains (File -> Header), parent_of (H1 -> H2), references (Link -> Target).
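The node and edge model can be illustrated with a small parser that turns one Markdown file into file and header nodes plus contains and parent_of edges. This sketch uses a simple regex rather than the AST parsing CoreText describes, ignores edge cases such as "#" lines inside code fences, and assumes every header is linked to its file by a contains edge. The function name and node ID scheme are hypothetical.

```python
import re

HEADER_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def markdown_to_graph(path, text):
    """Return (nodes, edges) for a single Markdown file."""
    nodes = [{"id": path, "type": "file"}]
    edges = []
    open_headers = []  # stack of (level, header_id) for the current ancestors

    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if not match:
            continue
        level = len(match.group(1))
        header_id = f"{path}#{match.group(2)}"
        nodes.append({"id": header_id, "type": "header", "level": level})
        edges.append((path, "contains", header_id))

        # Close headers at the same or deeper level; what remains is the parent.
        while open_headers and open_headers[-1][0] >= level:
            open_headers.pop()
        if open_headers:
            edges.append((open_headers[-1][1], "parent_of", header_id))
        open_headers.append((level, header_id))

    return nodes, edges


nodes, edges = markdown_to_graph("spec.md", "# Auth\nintro\n## Tokens\n## Sessions\n")
for edge in edges:
    print(edge)
```

Both H2 sections end up as children of the H1, which is the hierarchy a graph traversal can later follow.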
poetry run pytest
poetry run coretext lint
If you need to force a sync without committing:
poetry run coretext sync