
🧠 GraphBrain

A true graph database backend for GBrain — Neo4j-powered, one-click provisioning, drop-in compatible.

GBrain is an incredible knowledge graph. But under the hood, it stores links as rows in a Postgres table and does graph traversal with recursive CTEs. GraphBrain replaces that with real index-free adjacency — every link is a native Neo4j relationship, and traversal is O(1) per hop. One POST provisions a brain. You get a unique URL and API key. All of GBrain's operations work exactly the same.


Why This Exists

GBrain (by Garry Tan) is the best personal knowledge graph out there. Its engine interface is clean and well-designed. But it runs on Postgres + pgvector — a relational database with a graph-shaped API on top. That works fine at small scale, but:

  • Traversal uses recursive CTEs (O(log n) per hop instead of O(1))
  • "How am I connected to X?" requires painful multi-joins
  • Community detection / PageRank are impractical in SQL
  • Batch link creation takes individual INSERTs (slow at scale)
  • Full-graph visualization requires loading everything into memory

GraphBrain solves all of these by implementing GBrain's exact engine interface on Neo4j — the industry-standard native graph database. Same API, real graph performance.

| Operation | Postgres (GBrain) | Neo4j (GraphBrain) |
|---|---|---|
| Traversal (depth 5) | Recursive CTE, ~50ms+ | Index-free adjacency, ~1ms |
| Shortest path between two nodes | Multi-join quagmire | shortestPath() — one call |
| Batch link creation | Sequential INSERTs | Sub-second (UNWIND) |
| Community detection | N/A | Native Louvain algorithm |
| PageRank | Painful SQL | gds.pageRank() built-in |
| Full-graph visualization | Load all into memory | Stream from native graph |
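
The shortest-path row above is a single Cypher call. As an illustration (this is a hypothetical query shape, not the repo's actual code), the question "how am I connected to X?" reduces to `shortestPath()` with a variable-length relationship pattern:

```typescript
// Illustrative only: builds the Cypher that answers "how am I connected to X?"
// shortestPath() is a built-in Cypher function; the same question in Postgres
// needs a recursive CTE with explicit cycle handling.
function shortestPathQuery(maxHops: number): string {
  return [
    "MATCH (a:Page {slug: $from}), (b:Page {slug: $to}),",
    `  p = shortestPath((a)-[:LINKS_TO*..${maxHops}]-(b))`,
    "RETURN [n IN nodes(p) | n.slug] AS path",
  ].join("\n");
}

const cypher = shortestPathQuery(10);
console.log(cypher);
```

The `$from` / `$to` parameters are bound at execution time, so the query text is cacheable by Neo4j's planner.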

Current Status — v0.1.0

What works:

  • ✅ Full GBrain engine interface (pages, links, traversal, search, stats, timeline, tags)
  • ✅ Brain provisioning — POST /v1/brains creates an isolated Neo4j database
  • ✅ Per-brain API keys with auth on all endpoints
  • ✅ GBrain adapter — drop-in engine that talks to GraphBrain REST API (25/25 integration tests pass)
  • ✅ Native Neo4j graph traversal (BFS up to depth 10, cycle prevention)
  • ✅ Batch link creation via Cypher UNWIND
  • ✅ Full-text search via Neo4j fulltext indexes
  • ✅ Database-level isolation (each brain = separate Neo4j database)
  • ✅ Single-DB mode for AuraDB free tier (property-level isolation via brain_id)
  • ✅ Cloudflare Tunnel deployment (public HTTPS, no port forwarding)
  • ✅ One-click server setup script (curl | bash)
  • ✅ Custom domain support (live at graphbrain.belweave.ai)

Live instance: https://graphbrain.belweave.ai — running on a home server via Cloudflare Tunnel.

What's next (see Roadmap below):

  • 🔲 Persistent API key storage (currently in-memory)
  • 🔲 Migration tool — export from Postgres GBrain, import to GraphBrain Neo4j
  • 🔲 Rate limiting
  • 🔲 Horizontal scaling (Neo4j read replicas, multiple GraphBrain instances)
  • 🔲 Web dashboard for brain management

Architecture

┌──────────┐     ┌─────────────────┐     ┌──────────┐
│  GBrain  │────▶│  GraphBrain API │────▶│  Neo4j   │
│  (CLI)   │     │  (Hono / Bun)   │     │  (5.x)   │
└──────────┘     └─────────────────┘     └──────────┘
                       │
            ┌──────────┴────────────┐
            │  Multi-DB isolation   │
            │  (production)         │
            │                       │
            │  POST /v1/brains      │
            │  → CREATE DATABASE    │
            │    brain_abc123       │
            │  → CREATE DATABASE    │
            │    brain_def456       │
            │                       │
            │  Each brain is a      │
            │  separate Neo4j       │
            │  database. No shared  │
            │  namespace. No query  │
            │  filtering. Real      │
            │  walled-off isolation.│
            └───────────────────────┘

Data Model

(:Page {slug, title, type, content, frontmatter})
    -[:LINKS_TO {type: "knows|messaged|works_at|invested_in|...", context, created_at}]->
(:Page)
    -[:HAS_TIMELINE]->
(:TimelineEntry {date, summary, detail, source})

Every GBrain concept maps cleanly: pages are nodes, links are typed relationships, timeline entries are connected nodes. No impedance mismatch.
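
In TypeScript terms, the model above could be typed like this (field names come from the data model shown; the interface names themselves are illustrative, not exported by the repo):

```typescript
// Shapes mirroring the data model above. Names like PageNode are illustrative.
interface PageNode {
  slug: string;
  title: string;
  type: string;                         // e.g. "person", "company"
  content?: string;
  frontmatter?: Record<string, unknown>;
}

interface LinksTo {
  type: string;                         // "knows" | "messaged" | "works_at" | ...
  context?: string;
  created_at?: string;
}

interface TimelineEntry {
  date: string;
  summary: string;
  detail?: string;
  source?: string;
}

const alice: PageNode = { slug: "alice", title: "Alice Chen", type: "person" };
const knows: LinksTo = { type: "knows", context: "Met at a conference" };
console.log(alice.slug, knows.type);
```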


Quick Start

Prerequisites

  • Bun (runs the API server)
  • Docker (runs Neo4j locally)

1. Clone and Install

git clone https://github.com/pkyanam/graphbrain
cd graphbrain
bun install

2. Start Neo4j

docker run --name graphbrain-neo4j -d \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/graphbrain-dev \
  neo4j:5-community

Wait ~30 seconds for Neo4j to boot (docker logs graphbrain-neo4j).

3. Start GraphBrain

bun run dev
# → http://localhost:3000

4. Create Your First Brain

curl -X POST http://localhost:3000/v1/brains \
  -H "Content-Type: application/json" \
  -d '{"name": "my-brain"}'

Response:

{
  "brain_id": "brain_a1b2c3d4",
  "name": "my-brain",
  "url": "http://localhost:3000/v1/brain_a1b2c3d4",
  "api_key": "sk_...",
  "endpoints": {
    "pages":     "http://localhost:3000/v1/brain_a1b2c3d4/pages",
    "links":     "http://localhost:3000/v1/brain_a1b2c3d4/links",
    "traverse":  "http://localhost:3000/v1/brain_a1b2c3d4/traverse",
    "graph":     "http://localhost:3000/v1/brain_a1b2c3d4/graph",
    "stats":     "http://localhost:3000/v1/brain_a1b2c3d4/stats",
    "search":    "http://localhost:3000/v1/brain_a1b2c3d4/search"
  }
}

5. Use It

# Save these for convenience
KEY="sk_..."
BRAIN="http://localhost:3000/v1/brain_a1b2c3d4"

# Create pages
curl -X PUT "$BRAIN/pages/alice" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"title":"Alice Chen","type":"person","content":"Software engineer and open source contributor."}'

curl -X PUT "$BRAIN/pages/bob" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"title":"Bob Smith","type":"person","content":"Designer and co-founder."}'

# Create a typed link — this is a real Neo4j relationship
curl -X POST "$BRAIN/links" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"from_slug":"alice","to_slug":"bob","link_type":"knows","context":"Met at a conference"}'

# Traverse the graph — native Neo4j BFS, not SQL CTEs
curl -X POST "$BRAIN/traverse" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"start_slug":"alice","depth":3,"direction":"out"}'

# Full-text search
curl "$BRAIN/search?q=engineer" -H "X-API-Key: $KEY"
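
The traverse call above runs as a native BFS inside Neo4j. Its semantics — bounded depth, each node visited at most once — are roughly what this in-memory sketch does (illustrative only, not the server's implementation):

```typescript
// In-memory sketch of the traversal semantics: BFS from a start slug,
// bounded depth, visited-set cycle prevention.
type Edge = { from: string; to: string };

function traverse(edges: Edge[], start: string, depth: number): string[] {
  const out = new Map<string, string[]>();
  for (const { from, to } of edges) {
    if (!out.has(from)) out.set(from, []);
    out.get(from)!.push(to);
  }
  const visited = new Set<string>([start]); // cycle prevention
  let frontier = [start];
  const reached: string[] = [];
  for (let hop = 0; hop < depth && frontier.length > 0; hop++) {
    const next: string[] = [];
    for (const slug of frontier) {
      for (const neighbor of out.get(slug) ?? []) {
        if (visited.has(neighbor)) continue;
        visited.add(neighbor);
        reached.push(neighbor);
        next.push(neighbor);
      }
    }
    frontier = next;
  }
  return reached;
}

// alice -> bob -> carol -> alice forms a cycle; traversal still terminates.
const edges: Edge[] = [
  { from: "alice", to: "bob" },
  { from: "bob", to: "carol" },
  { from: "carol", to: "alice" },
];
console.log(traverse(edges, "alice", 3)); // ["bob", "carol"]
```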

Run the Smoke Test

chmod +x scripts/smoke-test.sh
./scripts/smoke-test.sh

GBrain Integration

GraphBrain implements the exact same engine interface as GBrain's src/core/engine.ts. To use it as GBrain's backend, drop in the adapter:

import { GraphBrainEngine } from "graphbrain/src/adapter";

const engine = new GraphBrainEngine({
  url: "https://your-graphbrain.example.com/v1/brain_abc123",
  apiKey: "sk_..."
});

// All existing GBrain code works unchanged:
await engine.putPage(brainId, { slug: "alice", title: "Alice Chen", type: "person" });
await engine.addLink(brainId, { from_slug: "alice", to_slug: "bob", link_type: "knows" });
await engine.traverseGraph(brainId, "alice", 3, "out");

The adapter handles all HTTP communication — your GBrain CLI, MCP server, and cron jobs don't change. Full integration test suite: 25/25 passing.


API Reference

All brain endpoints require authentication via X-API-Key: sk_... header (or Authorization: Bearer sk_...).
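
A minimal client-side helper for attaching the key — hypothetical, not part of the repo, covering both accepted header forms:

```typescript
// Builds request headers for a brain endpoint. Illustrative helper only;
// it mirrors the two auth forms described above (X-API-Key or Bearer).
function authHeaders(apiKey: string, bearer = false): Record<string, string> {
  return bearer
    ? { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` }
    : { "Content-Type": "application/json", "X-API-Key": apiKey };
}

// Usage: fetch(`${BRAIN}/stats`, { headers: authHeaders("sk_...") })
console.log(authHeaders("sk_test")["X-API-Key"]); // sk_test
```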

Provisioning

| Method | Path | Description |
|---|---|---|
| POST | /v1/brains | Create a new brain |
| GET | /v1/brains | List all brains |
| DELETE | /v1/brains/:brainId | Delete a brain |

Pages

| Method | Path | Description |
|---|---|---|
| PUT | /v1/:brainId/pages/:slug | Create or update a page |
| GET | /v1/:brainId/pages/:slug | Get a page by slug |
| DELETE | /v1/:brainId/pages/:slug | Delete a page |
| GET | /v1/:brainId/pages | List pages (?type=person&limit=50&offset=0) |

Put page:

{
  "title": "Alice Chen",
  "type": "person",
  "content": "Software engineer and open source contributor.",
  "frontmatter": { "tags": ["engineering"] }
}

Links (Graph Edges)

| Method | Path | Description |
|---|---|---|
| POST | /v1/:brainId/links | Create a typed edge |
| POST | /v1/:brainId/links/batch | Batch create edges (UNWIND — sub-second for 1K+) |
| DELETE | /v1/:brainId/links | Remove an edge |
| GET | /v1/:brainId/links/:slug | Get outgoing edges from a page |
| GET | /v1/:brainId/backlinks/:slug | Get incoming edges to a page |

Create link:

{
  "from_slug": "alice",
  "to_slug": "bob",
  "link_type": "knows",
  "context": "Met at ReactConf 2025"
}

Batch create:

{
  "links": [
    { "from_slug": "alice", "to_slug": "bob", "link_type": "knows" },
    { "from_slug": "alice", "to_slug": "carol", "link_type": "works_at" },
    { "from_slug": "bob", "to_slug": "carol", "link_type": "knows" }
  ]
}
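
Server-side, a batch like this typically becomes a single UNWIND statement: one round trip instead of one INSERT per link. A sketch of what that Cypher could look like (illustrative, not the repo's exact query):

```typescript
// One Cypher statement creates every edge in the batch. The $links parameter
// carries the whole array; UNWIND expands it into one row per link.
type LinkInput = { from_slug: string; to_slug: string; link_type: string };

function batchLinkQuery(): string {
  return [
    "UNWIND $links AS link",
    "MATCH (a:Page {slug: link.from_slug}), (b:Page {slug: link.to_slug})",
    "CREATE (a)-[:LINKS_TO {type: link.link_type}]->(b)",
  ].join("\n");
}

const links: LinkInput[] = [
  { from_slug: "alice", to_slug: "bob", link_type: "knows" },
  { from_slug: "alice", to_slug: "carol", link_type: "works_at" },
];
// session.run(batchLinkQuery(), { links })  // single round trip for all edges
console.log(batchLinkQuery().startsWith("UNWIND")); // true
```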

Graph Operations

| Method | Path | Description |
|---|---|---|
| POST | /v1/:brainId/traverse | BFS graph traversal |
| GET | /v1/:brainId/graph | Full graph for visualization |
| GET | /v1/:brainId/orphans | Pages with no inbound links |

Traverse:

{
  "start_slug": "alice",
  "depth": 3,
  "direction": "out",
  "link_type": "knows"
}

Timeline

| Method | Path | Description |
|---|---|---|
| POST | /v1/:brainId/timeline | Add a timeline entry |
| POST | /v1/:brainId/timeline/batch | Batch add timeline entries |
| GET | /v1/:brainId/timeline/:slug | Get timeline for a page |

Search & Stats

| Method | Path | Description |
|---|---|---|
| GET | /v1/:brainId/search?q=query&limit=20 | Full-text search across pages |
| GET | /v1/:brainId/stats | Brain statistics |

Stats response:

{
  "page_count": 47,
  "link_count": 68,
  "brain_score": 44,
  "pages_by_type": { "person": 18, "company": 16, "vc_firm": 5 },
  "most_connected": [{ "slug": "alice", "title": "Alice Chen", "link_count": 12 }]
}

Deployment

Option 1: One-Click Server Setup

For any Ubuntu/Debian server with Docker:

curl -fsSL https://raw.githubusercontent.com/pkyanam/graphbrain/main/scripts/setup-server.sh | bash

This installs Bun, Docker, Neo4j, GraphBrain, and creates a Cloudflare Tunnel — you get a public https://*.trycloudflare.com URL in ~5 minutes.

Option 2: Docker Compose (Local Dev)

docker compose up -d
# GraphBrain: http://localhost:3000
# Neo4j Browser: http://localhost:7474 (neo4j / graphbrain-dev)

Option 3: Railway + Neo4j

  1. Deploy from GitHub — Railway auto-detects Bun
  2. Add Neo4j as a service (or connect external)
  3. Set NEO4J_URI, NEO4J_USER, NEO4J_PASSWORD, PUBLIC_URL
  4. Multi-DB isolation is active by default

Option 4: AuraDB Free Tier (Single-DB Mode)

Set NEO4J_SINGLE_DB=true to work with AuraDB's free tier (limited to 1 database). Brains are isolated via a brain_id property on every node. Not for production.
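
In single-DB mode, isolation moves from the database boundary into the queries themselves: every node carries brain_id and every query filters on it. A sketch of the difference (query shapes are illustrative):

```typescript
// Multi-DB mode: isolation comes from running against a dedicated database,
// so the query carries no tenant filter. Single-DB mode: every node is
// stamped with brain_id and every query must filter on it.
function pageQuery(singleDb: boolean): string {
  return singleDb
    ? "MATCH (p:Page {slug: $slug, brain_id: $brainId}) RETURN p"
    : "MATCH (p:Page {slug: $slug}) RETURN p"; // runs in the brain's own DB

}

console.log(pageQuery(true).includes("brain_id"));  // true
console.log(pageQuery(false).includes("brain_id")); // false
```

This is why the README calls property-level isolation weaker: one missed filter leaks data across brains, whereas a separate database cannot.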


Roadmap

v0.2 — Production Readiness

  • Persistent API key storage — SQLite or Postgres backing for brain keys (currently in-memory, lost on restart)
  • Rate limiting — per-brain, per-endpoint request caps
  • Migration tooling — export from Postgres GBrain, import into GraphBrain Neo4j
  • Health endpoint — /health expanded with Neo4j connectivity, uptime, and throughput

v0.3 — Web Dashboard

  • Browser-based brain management — create/delete brains, view stats, explore graph
  • Graph visualization — interactive D3/Three.js force-directed graph of your brain
  • API key management — rotate, revoke, view usage per key
  • Custom domains — done: live at graphbrain.belweave.ai via Cloudflare Tunnel

v0.4 — Scale

  • Horizontal scaling — multiple GraphBrain API instances behind a load balancer
  • Neo4j read replicas — route read queries to replicas, writes to primary
  • Causal clustering — Neo4j Enterprise for HA and multi-region
  • Usage metering — per-brain page/link/traversal counts

v0.5 — Ecosystem

  • GBrain CLI plugin — gbrain engine use graphbrain without manual config
  • MCP server support — expose GraphBrain via Model Context Protocol for AI agents
  • LangChain / LlamaIndex integration — use your brain as a knowledge graph RAG backend
  • Community templates — pre-built brain schemas (CRM, investor CRM, research graph)

FAQ

Does this replace GBrain? No. GraphBrain is a backend for GBrain. GBrain's CLI, MCP server, sync, and enrichment pipeline remain unchanged. GraphBrain replaces the storage layer for graph operations.

Do I need to migrate my data? Yes — you'd export pages and links from GBrain's Postgres database and import them into GraphBrain. A migration script is planned (v0.2).

Can I use both Postgres and Neo4j? Yes. The hybrid approach is recommended for large brains: Postgres for content/search/embeddings, GraphBrain (Neo4j) for links/traversal/graph algorithms.

What about embeddings / pgvector? pgvector is excellent for vector search and GBrain should keep using it. GraphBrain handles the graph layer, not embeddings. A hybrid deployment pairs Postgres (embeddings + full-text) with Neo4j (graph traversal + algorithms).

Can I use a custom domain? Yes. This instance runs at https://graphbrain.belweave.ai on a home server behind a Cloudflare Tunnel. To set up your own:

  1. Add your domain to Cloudflare DNS (free tier)
  2. Create a named tunnel in the Cloudflare Zero Trust dashboard
  3. Install cloudflared as a systemd service with the tunnel token:
    sudo cloudflared service install <token>
  4. Add a public hostname in the dashboard pointing to localhost:3000

Zero port forwarding, auto-renewing SSL, survives reboots.

Is this production-ready? v0.1.0 — functional and tested at small scale. Needs persistent key storage, rate limiting, and migration tooling before production workloads. See the roadmap above.


License

MIT
