๐ก Inspiration
While navigating through dense research papers ๐, legal documents โ๏ธ, and lengthy technical manuals ๐ง , we found ourselves endlessly skimming and scrolling ๐ฉ. The traditional PDF readers felt static and unhelpful. So, we thought โ what if your documents could talk back? ๐ฃ๏ธ Thatโs how AskMyDoc was born โ a smart, interactive PDF assistant ๐งโโ๏ธ that helps users truly understand rather than just read their documents.
๐ What it does ๐ช
AskMyDoc is your AI-powered reading companion ๐ค๐. Hereโs what it brings to the table:
- ๐ค Upload any PDF document
- ๐ฅ๏ธ View it in a split-screen PDF reader
- ๐ฌ Ask questions about the whole document or specific pages
- ๐ Get context-aware, accurate answers that reference page numbers
- ๐ง Maintains a chat history linked to your reading journey
- ๐งญ Like having an expert co-reader guiding you in real-time
๐ ๏ธ How we built it ๐งช
- Frontend: Built with Streamlit ๐งโ๐จ for a clean and interactive split-screen interface ๐๐ฌ.
- Backend: FastAPI handles uploads, routing, and real-time responses โ๏ธ.
- PDF Processing: PyMuPDF extracts text from each page quickly and efficiently ๐โโ๏ธ.
- Embeddings: Open AI Embeddings
Vector Database: FAISS database to store vector embeddings of the PDF.
- Page-level and full-document embeddings created ๐งฉ
- Stored in vector databases for accurate semantic retrieval ๐๐ฆ
LLM Agent: Integrated a ReAct(Reasoning and Action)-style agent ๐ง using OpenAI APIs to answer, reason, and clarify ๐.
State Management: Chat history tracked ๐ฐ๏ธ with roles and page metadata, for a more personalized feel ๐ฅ๐.
โ๏ธ Challenges we ran into ๐งโโ๏ธ
- ๐งฉ Balancing page-level vs full-document embeddings without leaking context.
- ๐ Keeping latency low even with huge PDFs and long chats.
- ๐จ Designing a distraction-free UI that blends clarity and interactivity.
๐ Accomplishments that we're proud of ๐
- ๐ฏ Built an end-to-end RAG (Retrieval-Augmented Generation) pipeline.
- ๐ Developed a ReAct agent that not only answers but thinks and reasons.
- ๐งผ Created a smooth, intuitive UI merging reading & real-time assistance.
- ๐ Linked chat history to document structure for seamless reference.
๐ What we learned ๐จโ๐
- How to build a full-stack LLM-powered app from scratch ๐๏ธ.
- Deep understanding of document chunking ๐ vs global context ๐.
- Learned to blend AI capabilities with thoughtful UX for better collaboration ๐ฅโจ.
๐ฎ Whatโs next for AskMyDoc ๐
- ๐๏ธ Multi-document support: Compare and query across multiple PDFs!
- ๐ด Offline mode: Use open-source LLMs locally for privacy and speed ๐ก๏ธ.
- ๐๏ธ Interactive annotations: Get answers linked to text and export highlights!
Built With
- fastapi
- streamlit
Log in or sign up for Devpost to join the conversation.