Inspiration
Doc Oc was made for two main reasons: to help people understand and learn how codebases work, and to make onboarding easier.
What it does
At a high level, Doc Oc is an LLM application that uses retrieval-augmented generation (RAG) to pull context from GitHub repositories. Its primary purpose is to answer specific questions about a codebase.
How we built it
On the frontend we used React, Next.js, and TypeScript. On the backend, we implemented two endpoints. The first indexes the specified GitHub repo: we used Cohere to generate code embeddings, which we then inserted into a Pinecone index. We paired Pinecone with a Postgres database (Supabase), in which we stored the content of each file. Our second endpoint retrieves context from the vector store and database and feeds it to our LLM, which was also from Cohere. We then sent the LLM's response as a stream to the frontend.
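The index-then-retrieve flow can be sketched in plain Python. This is a hypothetical, self-contained illustration, not our production code: `embed` is a toy stand-in for Cohere's code embeddings, the `vector_store` dict stands in for Pinecone, and `file_contents` stands in for the Supabase Postgres table.

```python
import math

def embed(text: str) -> list[float]:
    # Toy character-frequency embedding so the sketch runs without API keys;
    # in Doc Oc this step is a call to Cohere's embedding model.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Stand-ins for Pinecone (vectors) and Supabase Postgres (file contents).
vector_store: dict[str, list[float]] = {}
file_contents: dict[str, str] = {}

def index_repo(files: dict[str, str]) -> None:
    # First endpoint: embed each file and store the vector plus raw content.
    for path, content in files.items():
        vector_store[path] = embed(content)
        file_contents[path] = content

def retrieve_context(question: str, top_k: int = 1) -> list[str]:
    # Second endpoint: embed the question, rank files by similarity,
    # and return the best-matching file contents as LLM context.
    qvec = embed(question)
    ranked = sorted(vector_store,
                    key=lambda p: cosine(qvec, vector_store[p]),
                    reverse=True)
    return [file_contents[p] for p in ranked[:top_k]]
```

The real system swaps each stand-in for the managed service, but the shape of the pipeline is the same: embed once at index time, embed the query at chat time, and join the nearest vectors back to file contents.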
What we learned
Building Doc Oc was amazing! The process taught us a great deal about integrating multiple AI services into one application and about designing RAG systems, and it made us aware of all the places where our system could still improve.
What's next for Doc Oc
Doc Oc in its current form has room for growth on both the backend and the frontend. For example, on the backend we could improve our indexing by chunking source code into more digestible pieces for our retrieval system. On the frontend, we could add chat history.
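The chunking improvement could look something like this hypothetical sketch: split each file into overlapping windows of lines, so retrieval can surface the exact region of a file that answers a question instead of the whole file. The function name and parameters are our own illustration, not existing Doc Oc code.

```python
def chunk_lines(source: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split source code into overlapping line-based chunks for indexing.

    Each chunk holds up to `chunk_size` lines; consecutive chunks share
    `overlap` lines so that code spanning a chunk boundary still appears
    intact in at least one chunk.
    """
    lines = source.splitlines()
    if len(lines) <= chunk_size:
        return [source] if source else []
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + chunk_size]))
        if start + chunk_size >= len(lines):
            break
    return chunks
```

Each chunk would then be embedded and indexed individually, in place of the per-file vectors described above.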
Built With
- cohere
- fastapi
- langchain
- next.js
- pinecone
- postgresql
- python
- react.js
- supabase
- typescript