πŸ•΅οΈ Private Document Detective (RAG Pipeline)

A Retrieval-Augmented Generation (RAG) application that allows users to perform semantic search over private PDF documents.

Unlike standard chatbots, this system "grounds" the AI's responses in specific, user-provided data, reducing hallucinations and enabling queries over domain-specific knowledge (contracts, manuals, research papers).

🔗 Live Demo: Deployed on Vercel


πŸ—οΈ Architecture

The system consists of two distinct pipelines:

1. Ingestion Pipeline (Python/LangChain)

  • Loads raw PDF data from the documents/ folder
  • Chunks text into manageable segments (1000 characters) with 200-character overlap to preserve context
  • Generates vector embeddings using text-embedding-3-small
  • Upserts vectors to Pinecone (Serverless)

2. Retrieval Pipeline (Next.js/Vercel AI SDK)

  • Converts user queries into vector embeddings
  • Performs a semantic similarity search in Pinecone to retrieve the top 3 relevant chunks
  • Injects these chunks as "System Context" into the LLM (GPT-4o-mini)
  • Streams the response back to the user in real-time

┌──────────────────────────────────────────────────────────────────┐
│                        INGESTION PIPELINE                        │
│  PDF → Chunking → Embeddings → Pinecone Vector DB                │
└──────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────────┐
│                          RETRIEVAL PIPELINE                          │
│  User Query → Embedding → Pinecone Search → Context + LLM → Response │
└──────────────────────────────────────────────────────────────────────┘
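
Conceptually, the similarity-search step in the retrieval pipeline ranks stored chunk vectors by cosine similarity to the query vector and keeps the top 3. A minimal pure-Python sketch of that idea (Pinecone does this server-side at scale; the 2-D vectors below are toy stand-ins for real 1,536-dimension embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(index,
                    key=lambda cid: cosine_similarity(query_vec, index[cid]),
                    reverse=True)
    return ranked[:k]

# Toy index mapping chunk ids to embedding vectors.
index = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
print(top_k([1.0, 0.0], index))  # the 3 chunks closest to the query
```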

💡 Engineering Decisions: Why not just use ChatGPT?

A common question is: "Why build this app when I can just upload a file to ChatGPT?"

This system addresses specific enterprise constraints that consumer tools do not:

| Challenge | ChatGPT | This System |
| --- | --- | --- |
| Scale | ~128K-token context limit | ✅ Scales to arbitrarily large document sets: only relevant chunks are retrieved |
| Cost | Expensive (entire document in the prompt) | ✅ ~95% cheaper: only the 3 most relevant chunks are sent |
| Data freshness | Manual re-uploads required | ✅ Programmatic, real-time updates |
| Embeddability | Locked to the ChatGPT interface | ✅ API-first; embed anywhere |
| Privacy | Data goes to OpenAI | ✅ Control over data flow |

πŸ› οΈ Tech Stack

| Layer | Technology |
| --- | --- |
| Frontend | Next.js 16 (App Router), React 19, Tailwind CSS 4 |
| AI Orchestration | Vercel AI SDK v6 (streaming responses) |
| Vector Database | Pinecone (Serverless) |
| LLM | OpenAI GPT-4o-mini (cost-optimized) |
| Embeddings | OpenAI text-embedding-3-small |
| Ingestion | Python, LangChain, PyPDF |
| Deployment | Vercel |

🚀 Getting Started

Prerequisites: Node.js and npm, Python 3, an OpenAI API key, and a Pinecone account (the free tier is enough to start).

1. Clone the Repository

git clone https://github.com/JithendraNara/rag-document-detective.git
cd rag-document-detective

2. Set Up Environment Variables

Create a .env file in the root directory:

OPENAI_API_KEY=sk-your-openai-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_INDEX_NAME=doc-chat
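
A minimal sketch of how a script might read these variables and fail fast when a key is missing (illustrative only; the exact handling in ingest.py and route.ts may differ, and load_config is a made-up name):

```python
import os

def load_config() -> dict:
    """Read the required settings from the environment.
    (Hypothetical helper, not the repo's actual code.)"""
    cfg = {
        "openai_api_key": os.environ.get("OPENAI_API_KEY", ""),
        "pinecone_api_key": os.environ.get("PINECONE_API_KEY", ""),
        # Falls back to the index name suggested in the README.
        "index_name": os.environ.get("PINECONE_INDEX_NAME", "doc-chat"),
    }
    missing = [key for key in ("openai_api_key", "pinecone_api_key") if not cfg[key]]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {missing}")
    return cfg
```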

3. Ingest Documents (Python)

Before running the app, populate the vector database with your documents:

# Place your PDF files in the documents/ folder
mkdir -p documents
cp your-file.pdf documents/

# Install Python dependencies
pip install -r requirements.txt

# Run the ingestion script
python ingest.py

The script will:

  • Create a Pinecone index if it doesn't exist
  • Process all PDFs in the documents/ folder
  • Chunk, embed, and upload to Pinecone

4. Run the Web App

# Install Node.js dependencies
npm install

# Start the development server
npm run dev

Open http://localhost:3000 to start chatting with your documents!


πŸ“ Project Structure

├── app/
│   ├── api/
│   │   └── chat/
│   │       └── route.ts      # Chat API - retrieval + LLM
│   ├── admin/
│   │   └── page.tsx          # Admin page with ingestion instructions
│   ├── page.tsx              # Main chat interface
│   ├── layout.tsx            # Root layout
│   └── globals.css           # Global styles
├── documents/                # Place PDFs here for ingestion
├── ingest.py                 # Python ingestion script
├── requirements.txt          # Python dependencies
├── package.json              # Node.js dependencies
└── README.md

🔧 Configuration

Chunk Settings (ingest.py)

chunk_size = 1000      # Characters per chunk
chunk_overlap = 200    # Overlap between chunks

Retrieval Settings (app/api/chat/route.ts)

topK: 3                // Number of chunks to retrieve
model: 'gpt-4o-mini'   // LLM model (cost-optimized)
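
After retrieval, the chunks are injected as system context before the LLM call. A hypothetical Python sketch of that assembly step (the real route.ts does this in TypeScript with the Vercel AI SDK; build_system_prompt and the instruction wording are illustrative):

```python
def build_system_prompt(chunks: list[str]) -> str:
    """Join the retrieved chunks into a grounded system prompt that tells
    the model to answer only from the provided context. (Illustrative;
    the repo's actual prompt text may differ.)"""
    context = "\n\n---\n\n".join(chunks)
    return (
        "Answer the user's question using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}"
    )

prompt = build_system_prompt(
    ["Chunk one text.", "Chunk two text.", "Chunk three text."]
)
```

Grounding the model this way is what reduces hallucinations: the instruction constrains answers to the retrieved chunks instead of the model's general knowledge.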

💰 Cost Optimization

This app is configured for minimal costs:

| Component | Model | Cost |
| --- | --- | --- |
| Chat | gpt-4o-mini | $0.15 / 1M input tokens, $0.60 / 1M output tokens |
| Embeddings | text-embedding-3-small | $0.02 / 1M tokens |
| Vector DB | Pinecone Serverless | Free tier available |

Estimated cost: < $0.01 per conversation for typical usage.
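
A back-of-envelope check of that estimate, using the per-token prices above and assumed token counts (roughly 1,000 input tokens for the question plus 3 retrieved chunks, a 500-token answer, and a 50-token query embedding; these counts are illustrative, not measured from the app):

```python
# Prices from the table above, converted to dollars per token.
INPUT_PRICE = 0.15 / 1_000_000    # gpt-4o-mini input
OUTPUT_PRICE = 0.60 / 1_000_000   # gpt-4o-mini output
EMBED_PRICE = 0.02 / 1_000_000    # text-embedding-3-small

# Assumed token counts for one chat turn.
input_tokens, output_tokens, embed_tokens = 1_000, 500, 50

cost_per_turn = (input_tokens * INPUT_PRICE
                 + output_tokens * OUTPUT_PRICE
                 + embed_tokens * EMBED_PRICE)
print(f"${cost_per_turn:.6f} per turn")  # a small fraction of a cent
```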


🚢 Deployment

Deploy to Vercel

  1. Push your code to GitHub
  2. Import the project in Vercel
  3. Add environment variables in Vercel dashboard:
    • OPENAI_API_KEY
    • PINECONE_API_KEY
    • PINECONE_INDEX_NAME
  4. Deploy!

Note: Document ingestion must be done locally using the Python script. The web app handles chat/retrieval only.


πŸ“ License

MIT License - feel free to use this for your own projects!


🤝 Contributing

Contributions are welcome! Please open an issue or submit a pull request.


Built with ❤️ using Next.js, Vercel AI SDK, and Pinecone
