A true graph database backend for GBrain — Neo4j-powered, one-click provisioning, drop-in compatible.
GBrain is an incredible knowledge graph. But under the hood, it stores links as rows in a Postgres table and does graph traversal with recursive CTEs. GraphBrain replaces that with real index-free adjacency — every link is a native Neo4j relationship, and traversal is O(1) per hop. One POST provisions a brain. You get a unique URL and API key. All of GBrain's operations work exactly the same.
GBrain (by Garry Tan) is the best personal knowledge graph out there. Its engine interface is clean and well-designed. But it runs on Postgres + pgvector — a relational database with a graph-shaped API on top. That works fine at small scale, but:
- Traversal uses recursive CTEs (O(log n) per hop instead of O(1))
- "How am I connected to X?" requires painful multi-joins
- Community detection and PageRank are impractical to express in SQL
- Batch link creation takes individual INSERTs (slow at scale)
- Full-graph visualization requires loading everything into memory
GraphBrain solves all of these by implementing GBrain's exact engine interface on Neo4j — the industry-standard native graph database. Same API, real graph performance.
| Operation | Postgres (GBrain) | Neo4j (GraphBrain) |
|---|---|---|
| Traversal (depth 5) | Recursive CTE, ~50ms+ | Index-free adjacency, ~1ms |
| Shortest path between two nodes | Multi-join quagmire | shortestPath() — one call |
| Batch link creation | Sequential INSERTs | Sub-second (UNWIND) |
| Community detection | N/A | Native Louvain algorithm |
| PageRank | Painful SQL | gds.pageRank() built-in |
| Full-graph visualization | Load all into memory | Stream from native graph |
What works:
- ✅ Full GBrain engine interface (pages, links, traversal, search, stats, timeline, tags)
- ✅ Brain provisioning — `POST /v1/brains` creates an isolated Neo4j database
- ✅ Per-brain API keys with auth on all endpoints
- ✅ GBrain adapter — drop-in engine that talks to GraphBrain REST API (25/25 integration tests pass)
- ✅ Native Neo4j graph traversal (BFS up to depth 10, cycle prevention)
- ✅ Batch link creation via Cypher UNWIND
- ✅ Full-text search via Neo4j fulltext indexes
- ✅ Database-level isolation (each brain = separate Neo4j database)
- ✅ Single-DB mode for AuraDB free tier (property-level isolation via `brain_id`)
- ✅ Cloudflare Tunnel deployment (public HTTPS, no port forwarding)
- ✅ One-click server setup script (`curl | bash`)
- ✅ Custom domain support (live at `graphbrain.belweave.ai`)
Live instance: https://graphbrain.belweave.ai — running on a home server via Cloudflare Tunnel.
What's next (see Roadmap below):
- 🔲 Persistent API key storage (currently in-memory)
- 🔲 Migration tool — export from Postgres GBrain, import to GraphBrain Neo4j
- 🔲 Rate limiting
- 🔲 Horizontal scaling (Neo4j read replicas, multiple GraphBrain instances)
- 🔲 Web dashboard for brain management
```
┌──────────┐     ┌─────────────────┐     ┌──────────┐
│  GBrain  │────▶│ GraphBrain API  │────▶│  Neo4j   │
│  (CLI)   │     │ (Hono / Bun)    │     │  (5.x)   │
└──────────┘     └─────────────────┘     └──────────┘
                          │
               ┌──────────┴───────────┐
               │ Multi-DB isolation   │
               │ (production)         │
               │                      │
               │ POST /v1/brains      │
               │  → CREATE DATABASE   │
               │    brain_abc123      │
               │  → CREATE DATABASE   │
               │    brain_def456      │
               │                      │
               │ Each brain is a      │
               │ separate Neo4j       │
               │ database. No shared  │
               │ namespace. No query  │
               │ filtering. Real      │
               │ walled-off isolation.│
               └──────────────────────┘
```
```
(:Page {slug, title, type, content, frontmatter})
  -[:LINKS_TO {type: "knows|messaged|works_at|invested_in|...", context, created_at}]->
(:Page)
  -[:HAS_TIMELINE]->
(:TimelineEntry {date, summary, detail, source})
```

Every GBrain concept maps cleanly: pages are nodes, links are typed relationships, timeline entries are connected nodes. No impedance mismatch.
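The same schema can be read as plain records. A sketch of equivalent TypeScript types — these names mirror the properties above but are illustrative, not GraphBrain's actual exported types:

```ts
// Illustrative types matching the node/relationship properties in the schema.
interface Page {
  slug: string;
  title: string;
  type: string;                        // e.g. "person", "company"
  content?: string;
  frontmatter?: Record<string, unknown>;
}

interface Link {
  from_slug: string;
  to_slug: string;
  link_type: string;                   // e.g. "knows", "works_at"
  context?: string;
  created_at?: string;                 // ISO timestamp
}

interface TimelineEntry {
  date: string;
  summary: string;
  detail?: string;
  source?: string;
}

const alice: Page = { slug: "alice", title: "Alice Chen", type: "person" };
const knowsBob: Link = { from_slug: "alice", to_slug: "bob", link_type: "knows" };
```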
```sh
git clone https://github.com/pkyanam/graphbrain
cd graphbrain
bun install
```

Start Neo4j:

```sh
docker run --name graphbrain-neo4j -d \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/graphbrain-dev \
  neo4j:5-community
```

Wait ~30 seconds for Neo4j to boot (`docker logs graphbrain-neo4j`).

```sh
bun run dev
# → http://localhost:3000
```

Create a brain:

```sh
curl -X POST http://localhost:3000/v1/brains \
  -H "Content-Type: application/json" \
  -d '{"name": "my-brain"}'
```

Response:
```json
{
  "brain_id": "brain_a1b2c3d4",
  "name": "my-brain",
  "url": "http://localhost:3000/v1/brain_a1b2c3d4",
  "api_key": "sk_...",
  "endpoints": {
    "pages": "http://localhost:3000/v1/brain_a1b2c3d4/pages",
    "links": "http://localhost:3000/v1/brain_a1b2c3d4/links",
    "traverse": "http://localhost:3000/v1/brain_a1b2c3d4/traverse",
    "graph": "http://localhost:3000/v1/brain_a1b2c3d4/graph",
    "stats": "http://localhost:3000/v1/brain_a1b2c3d4/stats",
    "search": "http://localhost:3000/v1/brain_a1b2c3d4/search"
  }
}
```

```sh
# Save these for convenience
KEY="sk_..."
BRAIN="http://localhost:3000/v1/brain_a1b2c3d4"
```
```sh
# Create pages
curl -X PUT "$BRAIN/pages/alice" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"title":"Alice Chen","type":"person","content":"Software engineer and open source contributor."}'

curl -X PUT "$BRAIN/pages/bob" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"title":"Bob Smith","type":"person","content":"Designer and co-founder."}'

# Create a typed link — this is a real Neo4j relationship
curl -X POST "$BRAIN/links" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"from_slug":"alice","to_slug":"bob","link_type":"knows","context":"Met at a conference"}'

# Traverse the graph — native Neo4j BFS, not SQL CTEs
curl -X POST "$BRAIN/traverse" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"start_slug":"alice","depth":3,"direction":"out"}'

# Full-text search
curl "$BRAIN/search?q=engineer" -H "X-API-Key: $KEY"
```

To verify everything end to end, run the smoke test:

```sh
chmod +x scripts/smoke-test.sh
./scripts/smoke-test.sh
```

GraphBrain implements the exact same engine interface as GBrain's `src/core/engine.ts`. To use it as GBrain's backend, drop in the adapter:
```ts
import { GraphBrainEngine } from "graphbrain/src/adapter";

const engine = new GraphBrainEngine({
  url: "https://your-graphbrain.example.com/v1/brain_abc123",
  apiKey: "sk_..."
});

// All existing GBrain code works unchanged:
await engine.putPage(brainId, { slug: "alice", title: "Alice Chen", type: "person" });
await engine.addLink(brainId, { from_slug: "alice", to_slug: "bob", link_type: "knows" });
await engine.traverseGraph(brainId, "alice", 3, "out");
```

The adapter handles all HTTP communication — your GBrain CLI, MCP server, and cron jobs don't change. Full integration test suite: 25/25 passing.
All brain endpoints require authentication via an `X-API-Key: sk_...` header (or `Authorization: Bearer sk_...`).
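Since either header form is accepted, a client or middleware needs to check both. A minimal sketch of that check — a hypothetical helper, not GraphBrain's actual middleware:

```ts
// Accept either auth header form; header names lowercased as most
// frameworks (including Hono/Bun) normalize them. Illustrative only.
function extractApiKey(headers: Record<string, string | undefined>): string | null {
  const direct = headers["x-api-key"];
  if (direct) return direct;
  const bearer = headers["authorization"];
  if (bearer?.startsWith("Bearer ")) return bearer.slice("Bearer ".length);
  return null;
}

// Usage:
extractApiKey({ "x-api-key": "sk_abc" });          // → "sk_abc"
extractApiKey({ authorization: "Bearer sk_abc" }); // → "sk_abc"
extractApiKey({});                                 // → null
```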
| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/brains` | Create a new brain |
| `GET` | `/v1/brains` | List all brains |
| `DELETE` | `/v1/brains/:brainId` | Delete a brain |
| Method | Path | Description |
|---|---|---|
| `PUT` | `/v1/:brainId/pages/:slug` | Create or update a page |
| `GET` | `/v1/:brainId/pages/:slug` | Get a page by slug |
| `DELETE` | `/v1/:brainId/pages/:slug` | Delete a page |
| `GET` | `/v1/:brainId/pages` | List pages (`?type=person&limit=50&offset=0`) |
Put page:
```json
{
  "title": "Alice Chen",
  "type": "person",
  "content": "Software engineer and open source contributor.",
  "frontmatter": { "tags": ["engineering"] }
}
```

| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/:brainId/links` | Create a typed edge |
| `POST` | `/v1/:brainId/links/batch` | Batch create edges (UNWIND — sub-second for 1K+) |
| `DELETE` | `/v1/:brainId/links` | Remove an edge |
| `GET` | `/v1/:brainId/links/:slug` | Get outgoing edges from a page |
| `GET` | `/v1/:brainId/backlinks/:slug` | Get incoming edges to a page |
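The batch endpoint's sub-second claim rests on Cypher's `UNWIND`, which turns a parameter list into rows so all edges are created in one round trip instead of one query per link. A sketch of what such a query could look like — illustrative only, not GraphBrain's actual Cypher:

```ts
// Build a single UNWIND query for N links. The relationship type is
// stored as a property, matching the (:Page)-[:LINKS_TO {type}]->(:Page)
// schema. Names and shapes here are assumptions.
interface BatchLink {
  from_slug: string;
  to_slug: string;
  link_type: string;
}

function batchLinkQuery(links: BatchLink[]): { query: string; params: { links: BatchLink[] } } {
  const query = `
    UNWIND $links AS link
    MATCH (a:Page {slug: link.from_slug})
    MATCH (b:Page {slug: link.to_slug})
    MERGE (a)-[r:LINKS_TO {type: link.link_type}]->(b)
    ON CREATE SET r.created_at = datetime()
    RETURN count(r) AS created
  `;
  return { query, params: { links } };
}
```

The driver sends one statement with a `links` parameter array, so Neo4j plans the query once and streams the rows through it.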
Create link:
```json
{
  "from_slug": "alice",
  "to_slug": "bob",
  "link_type": "knows",
  "context": "Met at ReactConf 2025"
}
```

Batch create:

```json
{
  "links": [
    { "from_slug": "alice", "to_slug": "bob", "link_type": "knows" },
    { "from_slug": "alice", "to_slug": "carol", "link_type": "works_at" },
    { "from_slug": "bob", "to_slug": "carol", "link_type": "knows" }
  ]
}
```

| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/:brainId/traverse` | BFS graph traversal |
| `GET` | `/v1/:brainId/graph` | Full graph for visualization |
| `GET` | `/v1/:brainId/orphans` | Pages with no inbound links |
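`/traverse` is a breadth-first walk with cycle prevention (depth capped at 10). Neo4j does this natively over relationships; for intuition, here is a minimal in-memory sketch of the same algorithm over an adjacency map (outgoing direction only, names illustrative):

```ts
// BFS with a visited set for cycle prevention, mirroring what /traverse
// does natively over Neo4j relationships. In-memory sketch only.
type Graph = Record<string, string[]>; // slug -> outgoing neighbor slugs

function bfs(graph: Graph, start: string, maxDepth: number): string[] {
  const visited = new Set<string>([start]);
  let frontier = [start];
  const order: string[] = [start];
  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of graph[node] ?? []) {
        if (!visited.has(neighbor)) { // cycle prevention: never revisit
          visited.add(neighbor);
          next.push(neighbor);
          order.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return order;
}

// A cyclic graph: alice -> bob -> carol -> alice. The visited set
// stops the walk from looping forever.
const g: Graph = { alice: ["bob"], bob: ["carol"], carol: ["alice"] };
bfs(g, "alice", 3); // → ["alice", "bob", "carol"]
```

In Neo4j each hop is a pointer dereference on the relationship store (index-free adjacency), which is where the per-hop O(1) claim comes from.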
Traverse:
```json
{
  "start_slug": "alice",
  "depth": 3,
  "direction": "out",
  "link_type": "knows"
}
```

| Method | Path | Description |
|---|---|---|
| `POST` | `/v1/:brainId/timeline` | Add a timeline entry |
| `POST` | `/v1/:brainId/timeline/batch` | Batch add timeline entries |
| `GET` | `/v1/:brainId/timeline/:slug` | Get timeline for a page |
| Method | Path | Description |
|---|---|---|
| `GET` | `/v1/:brainId/search?q=query&limit=20` | Full-text search across pages |
| `GET` | `/v1/:brainId/stats` | Brain statistics |
Stats response:
```json
{
  "page_count": 47,
  "link_count": 68,
  "brain_score": 44,
  "pages_by_type": { "person": 18, "company": 16, "vc_firm": 5 },
  "most_connected": [{ "slug": "alice", "title": "Alice Chen", "link_count": 12 }]
}
```

For any Ubuntu/Debian server with Docker:
```sh
curl -fsSL https://raw.githubusercontent.com/pkyanam/graphbrain/main/scripts/setup-server.sh | bash
```

This installs Bun, Docker, Neo4j, GraphBrain, and creates a Cloudflare Tunnel — you get a public `https://*.trycloudflare.com` URL in ~5 minutes.
```sh
docker compose up -d
# GraphBrain: http://localhost:3000
# Neo4j Browser: http://localhost:7474 (neo4j / graphbrain-dev)
```

- Deploy from GitHub — Railway auto-detects Bun
- Add Neo4j as a service (or connect external)
- Set `NEO4J_URI`, `NEO4J_USER`, `NEO4J_PASSWORD`, `PUBLIC_URL`
- Multi-DB isolation is active by default
Set `NEO4J_SINGLE_DB=true` to work with AuraDB's free tier (limited to one database). Brains are isolated via a `brain_id` property on every node. Not for production.
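The reason single-DB mode is weaker: every query must carry the `brain_id` filter explicitly, whereas multi-DB mode simply runs each query against its own database. A sketch of what that scoping could look like — hypothetical helper, not GraphBrain's actual query layer:

```ts
// Single-DB mode sketch: isolation lives in a WHERE-style property
// filter, so one missed filter would leak across brains. Multi-DB mode
// has no such failure mode. Names here are assumptions.
function scopedPageQuery(brainId: string, slug: string): { query: string; params: Record<string, string> } {
  return {
    // Every MATCH in single-DB mode must include brain_id.
    query: "MATCH (p:Page {brain_id: $brainId, slug: $slug}) RETURN p",
    params: { brainId, slug },
  };
}
```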
- Persistent API key storage — SQLite or Postgres backing for brain keys (currently in-memory, lost on restart)
- Rate limiting — per-brain, per-endpoint request caps
- Migration tooling — export from Postgres GBrain, import into GraphBrain Neo4j
- Health dashboard — `/health` expanded with Neo4j connectivity, uptime, throughput
- Browser-based brain management — create/delete brains, view stats, explore graph
- Graph visualization — interactive D3/Three.js force-directed graph of your brain
- API key management — rotate, revoke, view usage per key
- Custom domains — done: live at `graphbrain.belweave.ai` via Cloudflare Tunnel
- Horizontal scaling — multiple GraphBrain API instances behind a load balancer
- Neo4j read replicas — route read queries to replicas, writes to primary
- Causal clustering — Neo4j Enterprise for HA and multi-region
- Usage metering — per-brain page/link/traversal counts
- GBrain CLI plugin — `gbrain engine use graphbrain` without manual config
- MCP server support — expose GraphBrain via Model Context Protocol for AI agents
- LangChain / LlamaIndex integration — use your brain as a knowledge graph RAG backend
- Community templates — pre-built brain schemas (CRM, investor CRM, research graph)
Does this replace GBrain? No. GraphBrain is a backend for GBrain. GBrain's CLI, MCP server, sync, and enrichment pipeline remain unchanged. GraphBrain replaces the storage layer for graph operations.
Do I need to migrate my data? Yes — you'd export pages and links from GBrain's Postgres database and import them into GraphBrain. A migration script is planned (v0.2).
Can I use both Postgres and Neo4j? Yes. The hybrid approach is recommended for large brains: Postgres for content/search/embeddings, GraphBrain (Neo4j) for links/traversal/graph algorithms.
What about embeddings / pgvector? pgvector is excellent for vector search and GBrain should keep using it. GraphBrain handles the graph layer, not embeddings. A hybrid deployment pairs Postgres (embeddings + full-text) with Neo4j (graph traversal + algorithms).
Can I use a custom domain?
Yes. This instance runs at https://graphbrain.belweave.ai on a home server behind a Cloudflare Tunnel. To set up your own:
- Add your domain to Cloudflare DNS (free tier)
- Create a named tunnel in the Cloudflare Zero Trust dashboard
- Install `cloudflared` as a systemd service with the tunnel token: `sudo cloudflared service install <token>`
- Add a public hostname in the dashboard pointing to `localhost:3000`
Zero port forwarding, auto-renewing SSL, survives reboots.
Is this production-ready? v0.1.0 — functional and tested at small scale. Needs persistent key storage, rate limiting, and migration tooling before production workloads. See the roadmap above.
MIT