Inspiration
Most AI chats forget everything the moment you start a new thread. That means users keep re-explaining the same context, preferences, and decisions over and over. Cortex was built to fix that with a persistent memory layer and a brain-like visual map that makes old chats feel easy to explore.
What it does
Cortex turns past conversations into useful memory that can be re-used.
- Hybrid search (keyword + semantic): mixes exact matches with meaning-based recall (65/35 keyword-to-semantic) so it finds both facts and related ideas.
- Context selection: users can pick chats from the hybrid-search results, extract their context, and generate custom system prompts within seconds.
- MCP delivery: a custom Model Context Protocol (MCP) server injects retrieved memories directly into Claude.
- Quality checks: Backboard.io helps score memory results (relevance, redundancy, coverage) before they get returned.
- Visual interface: chats appear as nodes; search results light up so you can instantly see where the memory lives.
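The 65/35 keyword-to-semantic weighting above can be sketched as a simple score fusion. This is a hypothetical illustration, not Cortex's actual retrieval code: it assumes each chat already has a keyword score (e.g. from BM25) and a semantic score (e.g. cosine similarity of embeddings), normalizes both to a shared scale, and blends them.

```python
def normalize(scores):
    """Min-max normalize so both signals share a 0..1 scale."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(chats, keyword_scores, semantic_scores,
                kw_weight=0.65, sem_weight=0.35):
    """Blend keyword and semantic scores 65/35 and rank chats."""
    kw = normalize(keyword_scores)
    sem = normalize(semantic_scores)
    fused = [kw_weight * k + sem_weight * s for k, s in zip(kw, sem)]
    # Highest fused score first
    return sorted(zip(chats, fused), key=lambda p: p[1], reverse=True)
```

The keyword-heavy weighting is what makes exact facts (names, dates, decisions) win over merely related chats.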
How we built it
Cortex is four main pieces:
- Backend: stores chats, indexes them, and runs hybrid retrieval.
- Frontend: interactive node-based visualization for browsing and searching memory.
- Local model: Qwen 2.5 (8B) running locally for summarization + prompt generation.
- MCP server: two endpoints: `search` (find the best memories) and `fetch` (return a structured memory payload).
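The search/fetch contract can be illustrated with plain functions over an in-memory store. This is a hypothetical sketch of the two endpoints' shape only; the real server speaks the Model Context Protocol, and the store, field names, and matching logic here are illustrative assumptions.

```python
# Illustrative in-memory store (hypothetical data)
MEMORIES = {
    "chat-42": {"summary": "User prefers FastAPI over Flask",
                "topics": ["backend", "preferences"]},
}

def search(query: str) -> list[str]:
    """Return IDs of the best-matching memories (naive keyword match here)."""
    q = query.lower()
    return [mid for mid, m in MEMORIES.items()
            if q in m["summary"].lower()
            or any(q in t for t in m["topics"])]

def fetch(memory_id: str) -> dict:
    """Return a structured memory payload ready for prompt injection."""
    m = MEMORIES[memory_id]
    return {"id": memory_id, "summary": m["summary"], "topics": m["topics"]}
```

Splitting retrieval into a cheap `search` (IDs only) and an explicit `fetch` (full payload) keeps the client in control of how much context actually gets injected.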
Challenges we ran into
- Speed: PEGASUS (a fine-tuned summarization model) was solid for summarization but too slow for real-time topic extraction, so we moved core extraction to our local Qwen setup.
- Clean context: retrieved memory had to be short, structured, and actually useful, or it would hurt model performance instead of helping. This is what the whole project revolves around.
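The "short, structured" rule can be sketched as a small formatter that caps each retrieved memory to a fixed budget before it reaches the model. The function, field names, and budget below are illustrative assumptions, not Cortex's actual code.

```python
def format_context(memories, max_chars=300):
    """Render retrieved memories as a compact bulleted context block."""
    lines = []
    for m in memories:
        line = f"- [{m['topic']}] {m['summary']}"
        if len(line) > max_chars:
            # Truncate rather than let one verbose memory crowd the prompt
            line = line[:max_chars - 1] + "…"
        lines.append(line)
    return "\n".join(lines)
```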
Accomplishments that we’re proud of
- Built a model-agnostic memory layer that works externally and plugs in only when needed.
- Made hybrid retrieval feel reliable, especially for specific facts, not just “yap.”
- Created a brain-like visualization where relevant chats light up during search and are grouped into clusters.
- Got a custom MCP integration working.
What we learned
- The biggest key to success is planning (and a Claude Pro subscription).
- LLMs don’t just need more context, they need the right context.
- Hybrid retrieval beats pure semantic search in real usage.
- Ranking + formatting matter as much as the retrieval itself.
What’s next for Cortex
- User-controlled memory: edit, pin, tag, or delete what the system remembers.
- Smarter summarization + pruning: keep long-term memory compact without losing important details.
- Better relevance weighting: improve ranking using recency and usage over time.
- Support more AI clients: plug Cortex into multiple chat tools, not just one.
Built With
- 3D visualization
- Backend: FastAPI
- CSS
- HTML
- HTTP
- JavaScript
- Languages: Python
- nomic-embed-text
- Pydantic
- scikit-learn (KMeans)
- SQLAlchemy
- Three.js
- umap-learn
- LLM/embeddings: Ollama (Qwen 2.5 for summaries)
- Database: SQLite
- ML/vector: NumPy
- Uvicorn