Karpatheon

The Operating System for High-Velocity Intelligence (GraphRAG for Personal Knowledge)

💡 Inspiration: Solving "Signal Decay"

We are living in the age of infinite information but finite attention. For founders, researchers, and high-performers, the bottleneck isn't finding information; it's synthesizing it.

Existing tools fall into two traps:

Google: A map of what exists (filled with SEO slop).

Note Apps (Notion/Obsidian): Static graveyards where knowledge goes to die.

We wanted to build Search for the Curious. We hypothesized that if we could combine Vector Search (semantic understanding) with Knowledge Graphs (structured relationships), we could build a "Second Brain" that actually thinks with you.

Karpatheon is the "Palantir for Knowledge Workers"—a tool that turns passive consumption into active, compounded intelligence.

🚀 What it does

Karpatheon is an AI-native knowledge engine that automates the "Zettelkasten" method. It ingests your reading, maps it against what you already know, and helps you execute faster.

Auto-Ontology: You read; Karpatheon structures. It automatically chunks, vectorizes, and extracts entities from your notes and web sources to build a dynamic Knowledge Graph.

Context-Aware Search: We go beyond keyword matching and standard RAG (Retrieval-Augmented Generation): we search for concepts. If you search for "Mental Models," it pulls related nodes from your graph even if they don't share the exact text.

The "Curiosity" Engine: A Chrome Extension that acts as your research co-pilot. It doesn't just save links; it analyzes the page, compares it to your existing database, and tells you: "Here is what’s new to you, and here is how it connects to that paper you read three months ago."

Modes of Operation:

Execution Mode: Synthesizes context for 10x productivity.

Curiosity Mode: Explores adjacent possibilities and "side quests" in your knowledge graph.

⚙️ How we built it: The "Hybrid Memory" Architecture

We built a GraphRAG (Graph-based Retrieval-Augmented Generation) pipeline that outperforms standard vector search on our use case.

Ingestion Layer:

Chrome Extension: Scrapes and sanitizes web content.

FastAPI Backend: Handles the heavy lifting of text chunking (markdown-header based) and embedding.
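The markdown-header-based chunking step can be sketched as follows. This is a minimal sketch; the function name and chunk shape are ours for illustration, not the project's actual code:

```python
import re

def chunk_by_headers(markdown: str) -> list[dict]:
    """Split a markdown document into one chunk per header section.

    Each chunk keeps its header as metadata, so the embedding
    retains the section's topical context.
    """
    chunks = []
    current_header = ""
    current_lines: list[str] = []
    for line in markdown.splitlines():
        if re.match(r"^#{1,6}\s", line):  # a markdown header starts a new chunk
            if current_lines:
                chunks.append({"header": current_header,
                               "text": "\n".join(current_lines).strip()})
            current_header = line.lstrip("#").strip()
            current_lines = []
        else:
            current_lines.append(line)
    if current_lines:  # flush the final section
        chunks.append({"header": current_header,
                       "text": "\n".join(current_lines).strip()})
    return chunks
```

Chunking on headers rather than fixed token windows keeps each embedded unit topically coherent, which matters for retrieval quality.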

The "Dual-Brain" Database:

Supabase (pgvector): Stores high-dimensional vector embeddings of text chunks for semantic similarity search (Cosine Distance).
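For reference, pgvector's `<=>` operator computes the cosine distance shown below; the SQL in the comment assumes a hypothetical `chunks` table and column names, not our actual schema:

```python
import math

# In Postgres with pgvector, the same ranking is one ORDER BY, e.g.:
#   SELECT id, content FROM chunks
#   ORDER BY embedding <=> :query_embedding LIMIT 5;
# (table and column names here are illustrative)

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance = 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm
```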

Neo4j (Graph DB): Stores entities and their relationships. This allows the AI to "hop" between concepts (e.g., linking "Elon Musk" to "Mars" to "Propulsion") in a way pure vectors cannot.
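A toy version of that "hop", with an in-memory adjacency list standing in for Neo4j; the Cypher in the comment shows the shape such a query could take, not our production query:

```python
from collections import deque

# Stand-in for the Neo4j graph. In Cypher, a bounded hop might look like:
#   MATCH (a:Entity {name: $name})-[*1..2]-(b:Entity) RETURN DISTINCT b.name
GRAPH = {
    "Elon Musk": ["Mars"],
    "Mars": ["Elon Musk", "Propulsion"],
    "Propulsion": ["Mars"],
}

def neighbors_within(start: str, max_hops: int) -> set[str]:
    """Collect every entity reachable from `start` in <= max_hops edges (BFS)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for nxt in GRAPH.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    seen.discard(start)  # the start node is not its own neighbor
    return seen
```

This multi-hop reachability is exactly what a flat vector index cannot express: "Propulsion" may have low cosine similarity to "Elon Musk" yet be two edges away.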

Synthesis Layer:

LLM Integration: We inject retrieved context (top-K vector matches plus graph neighbors) into the context window to generate grounded insights with far fewer hallucinations.
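The context-injection step can be sketched as a simple prompt builder; the function name and prompt wording are illustrative, not our production template:

```python
def build_prompt(question: str,
                 top_chunks: list[str],
                 graph_neighbors: list[str]) -> str:
    """Merge top-K vector hits and graph-neighbor concepts into one
    grounded prompt for the LLM."""
    context = "\n".join(f"- {c}" for c in top_chunks)
    relations = ", ".join(graph_neighbors) if graph_neighbors else "none"
    return (
        "Answer using ONLY the context below.\n"
        f"Related concepts from the knowledge graph: {relations}\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```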

Infrastructure:

Dockerized containers for easy deployment.

Next.js frontend for a reactive, high-performance UI.

🚧 Challenges we ran into

The "Entity Resolution" Problem: Teaching the AI that "React.js" and "React" are the same node in the graph was difficult. We had to implement fuzzy-matching logic to prevent graph clutter.
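A minimal sketch of that fuzzy-matching idea using the standard-library `difflib`; the normalization rules and threshold here are illustrative, not our exact production logic:

```python
from difflib import SequenceMatcher

def canonicalize(name: str, known: list[str], threshold: float = 0.8) -> str:
    """Map a new entity mention onto an existing node when names are close,
    so "React.js" and "React" don't become separate graph nodes."""
    def norm(s: str) -> str:
        # illustrative normalization: lowercase, drop a common suffix
        return s.lower().replace(".js", "").strip()

    best, best_score = name, 0.0
    for candidate in known:
        score = SequenceMatcher(None, norm(name), norm(candidate)).ratio()
        if score > best_score:
            best, best_score = candidate, score
    # keep the new name as its own node if nothing is close enough
    return best if best_score >= threshold else name
```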

Orchestrating Hybrid Search: Balancing the weight between Vector Search results and Graph Traversal results took significant tuning. Sometimes the most semantically similar note isn't the most conceptually relevant one.
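One simple way to express that weighting, assuming a similarity score in [0, 1] and a hop count from graph traversal; `alpha` is the knob we tuned, and the exact formula here is illustrative:

```python
def hybrid_score(vector_sim: float, graph_hops: int, alpha: float = 0.7) -> float:
    """Blend semantic similarity (higher is better) with graph proximity
    (fewer hops is better). 1 / (1 + hops) maps hop count into (0, 1]."""
    graph_proximity = 1.0 / (1.0 + graph_hops)
    return alpha * vector_sim + (1.0 - alpha) * graph_proximity
```

With this blend, a direct graph neighbor with moderate similarity can legitimately outrank a distant note with slightly higher cosine similarity.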

Latency: Running embeddings and graph queries in real-time is expensive. We optimized by caching crawl jobs and using lightweight embedding models for the MVP.

🏆 Accomplishments that we're proud of

Functional GraphRAG: We moved beyond simple "chat with PDF" wrappers and built a genuine knowledge-graph implementation in under 24 hours.

Seamless Context Injection: The system successfully retrieves "forgotten" knowledge from previous sessions and weaves it into current conversations.

Founder Mode UI: We stripped away the clutter. The UI focuses purely on the signal, designed for flow state.

🧠 What we learned

Structure > Scale: A small, well-structured graph beats a massive, unstructured vector store. The relationships between data points are often more valuable than the data points themselves.

The Cold Start Problem: Users need immediate value. By implementing the "Curiosity Mode" extension, we allow users to get value from the web before they've even written their first note.

🔮 What's next for Karpatheon

Multi-Agent Research Teams: Spawning autonomous agents to "go read the internet" on a topic and report back with a synthesized syllabus.

Social Knowledge Graphs: Imagine "forking" Linus Torvalds’ knowledge graph to learn Linux, or syncing graphs with your co-founder to merge mental models.

Deep Learning Modes: Generating quizzes and spaced-repetition schedules automatically from the knowledge graph.

Built With

  • gemini-3-api
  • neo4j
  • s3
  • supabase