Inspiration
Doc Oc was made for two main reasons: to help people understand and learn how codebases work, and to make onboarding easier.
What it does
At a high level, Doc Oc is an LLM application that uses retrieval-augmented generation (RAG) to pull context from GitHub repositories. Its primary purpose is to answer specific questions about a codebase.
How we built it
On the frontend we used React, Next.js, and TypeScript. On the backend, we implemented two endpoints. The first indexes the specified GitHub repo: we used Cohere to generate code embeddings, which we then inserted into a Pinecone index. We paired Pinecone with a Postgres database (Supabase), in which we stored the content of each file. Our second endpoint retrieves context from the vector store and database and feeds it to our LLM, which was also from Cohere. We then sent the LLM's response as a stream to the frontend.
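The index-then-retrieve flow can be sketched in plain Python. This is a hypothetical, self-contained illustration, not our production code: `embed` is a toy stand-in for Cohere's code embeddings, the `vector_store` dict stands in for Pinecone, and `file_contents` stands in for the Supabase Postgres table.

```python
import math

def embed(text: str) -> list[float]:
    # Toy character-frequency embedding so the sketch runs without API keys;
    # in Doc Oc this step is a call to Cohere's embedding model.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Stand-ins for Pinecone (vectors) and Supabase Postgres (file contents).
vector_store: dict[str, list[float]] = {}
file_contents: dict[str, str] = {}

def index_repo(files: dict[str, str]) -> None:
    # First endpoint: embed each file and store the vector plus raw content.
    for path, content in files.items():
        vector_store[path] = embed(content)
        file_contents[path] = content

def retrieve_context(question: str, top_k: int = 1) -> list[str]:
    # Second endpoint: embed the question, rank files by similarity,
    # and return the best-matching file contents as LLM context.
    qvec = embed(question)
    ranked = sorted(vector_store,
                    key=lambda p: cosine(qvec, vector_store[p]),
                    reverse=True)
    return [file_contents[p] for p in ranked[:top_k]]
```

The real system swaps each stand-in for the managed service, but the shape of the pipeline is the same: embed once at index time, embed the query at chat time, and join the nearest vectors back to file contents.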
What we learned
Building Doc Oc was amazing! The process taught us a great deal about integrating multiple AI services into one application and about designing RAG systems, and it made us aware of all the places where our system could still improve.
What's next for Doc Oc
Doc Oc in its current form has room for growth on both the backend and the frontend. For example, on the backend we could improve our indexing by chunking source code into more digestible pieces for our retrieval system. On the frontend, we could add chat history.
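The chunking improvement could look something like this hypothetical sketch: split each file into overlapping windows of lines, so retrieval can surface the exact region of a file that answers a question instead of the whole file. The function name and parameters are our own illustration, not existing Doc Oc code.

```python
def chunk_lines(source: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split source code into overlapping line-based chunks for indexing.

    Each chunk holds up to `chunk_size` lines; consecutive chunks share
    `overlap` lines so that code spanning a chunk boundary still appears
    intact in at least one chunk.
    """
    lines = source.splitlines()
    if len(lines) <= chunk_size:
        return [source] if source else []
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + chunk_size]))
        if start + chunk_size >= len(lines):
            break
    return chunks
```

Each chunk would then be embedded and indexed individually, in place of the per-file vectors described above.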
Built With
- cohere
- fastapi
- langchain
- next.js
- pinecone
- postgresql
- python
- react.js
- supabase
- typescript