Inspiration

Modern developers rely on a multitude of AI tools like Cursor, Claude, and Gemini, but each operates in isolation with no shared understanding of the codebase. This fragmentation makes onboarding and collaboration slow, especially in large repositories. We sought to solve this by using Chroma to vectorize and visualize your entire repo, revealing semantic relationships between files.

What it does

Contextualize parses any GitHub repository, leveraging Chroma Sync to intelligently chunk files and create vector embeddings, allowing for both semantic and regex search. Use the 3D interactive interface to hover over files represented as spheres, find closely related files, and view commit history in a sidebar. When you select a file, Contextualize uses Lava to find and highlight its most related files and generates a copyable context summary that can be pasted into any AI tool. This gives all your assistants a unified understanding of your codebase and creates a shared context layer between developers and AI.
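To make the "copyable context summary" concrete, here is a minimal sketch of how such a summary could be assembled once the related files are known. It is an illustration only: the names `RelatedFile`, `FileContext`, and `build_context_summary` are hypothetical, the similarity scores are assumed to be cosine similarities, and a plain string template stands in for the formatting step that Contextualize delegates to Lava.

```python
from dataclasses import dataclass, field


@dataclass
class RelatedFile:
    path: str
    similarity: float  # assumed: cosine similarity, higher means more related


@dataclass
class FileContext:
    path: str
    related: list[RelatedFile] = field(default_factory=list)
    recent_commits: list[str] = field(default_factory=list)


def build_context_summary(ctx: FileContext, max_related: int = 5) -> str:
    """Render a plain-text block that can be pasted into any AI assistant."""
    # Most similar files first, capped so the summary stays short.
    top = sorted(ctx.related, key=lambda r: r.similarity, reverse=True)[:max_related]
    lines = [f"Selected file: {ctx.path}", "Most related files:"]
    lines += [f"  - {r.path} (similarity {r.similarity:.2f})" for r in top]
    if ctx.recent_commits:
        lines.append("Recent commits:")
        lines += [f"  - {msg}" for msg in ctx.recent_commits]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical file paths and commit message, for illustration only.
    example = FileContext(
        path="src/api/search.py",
        related=[
            RelatedFile("src/db/client.py", 0.91),
            RelatedFile("README.md", 0.42),
        ],
        recent_commits=["Add regex search endpoint"],
    )
    print(build_context_summary(example))
```

Keeping the output as plain text means the same block can be pasted into Cursor, Claude, or Gemini without any tool-specific formatting.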

How we built it

  • Vector database built with Chroma Sync and hosted on Chroma Cloud; it automatically generates vector embeddings for k-NN and semantic/regex search
  • GitHub API enriches the vector database metadata with file diffs for added context
  • Frontend built with Vite and Three.js, styled with Tailwind and Framer Motion, allowing for interaction with the 3D representation of the repository
  • FastAPI Python backend connects the vector database to the frontend, and Lava converts the response into a human-readable format
  • Claude MCP offers agentic workflows with direct access to Chroma tools
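The combination of k-NN and regex search described above can be sketched in a few lines. This is a standard-library stand-in, not Chroma's API: the `Chunk` class, the `search` function, and the two-dimensional toy embeddings are assumptions made for illustration, whereas in the real pipeline Chroma produces the embeddings and runs the query.

```python
import math
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Chunk:
    path: str
    text: str
    embedding: list[float]  # toy vectors here; Chroma generates these in practice


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    chunks: list[Chunk],
    query_embedding: list[float],
    k: int = 3,
    pattern: Optional[str] = None,
) -> list[tuple[float, Chunk]]:
    """Return the k chunks nearest to the query, optionally regex-filtered."""
    regex = re.compile(pattern) if pattern else None
    # Regex acts as a hard filter on chunk text before semantic ranking.
    candidates = [c for c in chunks if regex is None or regex.search(c.text)]
    scored = [(cosine_similarity(c.embedding, query_embedding), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Applying the regex as a filter first and ranking the survivors by similarity is one reasonable design; a backend could equally run the two searches separately and merge the results.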

Challenges we ran into

  • Chroma Sync chunks files automatically, which gave us less flexibility over how files were split
  • Example resources were limited for all the new tools we used
  • We were unfamiliar with using Three.js at such an advanced level to create a user-friendly experience

Accomplishments that we're proud of

  • Seamless integration between Chroma, GitHub API, and Three.js on the first attempt
  • Shipping a working product despite being unfamiliar with many of the technologies we used
  • Coming from Canada for our first overseas hack!

What we learned

  • This was our first time building RAG tools, so we had to learn RAG architectures at a high level in a short time
  • How to use a vector DB (Chroma), where it applies, and how to leverage different types of search (semantic/regex)
  • We gained experience in developing a product to solve real developer pain points

What's next for Contextualize

  • Deploying to the cloud
  • Implementing OAuth to support a user’s private GitHub repositories
  • Real-time syncing so the Chroma index stays up to date as new commits land in a repository
