📄 AI PDF Chatbot

This is an AI-powered chatbot that allows users to upload PDF documents and ask questions about the content. The chatbot extracts text from the PDF, creates a vector store using FAISS, and generates responses using OpenAI GPT models.

🚀 Features

Upload PDF files and extract text.
Store and retrieve text using FAISS vector search.
Ask questions based on the document content.
Uses LangChain and OpenAI API for question answering.
FastAPI backend with a React.js frontend.
Stylish UI with dark mode and modern design.

🛠️ Tech Stack

Backend:

FastAPI (for API endpoints)
FAISS (for vector storage and similarity search)
LangChain (for text chunking and embedding generation)
OpenAI GPT-4 (for answering questions)
pdfplumber & pytesseract (for text extraction)
Uvicorn (for running the FastAPI server)
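
The text chunking step can be pictured with a minimal sketch. This is a pure-Python stand-in and not the LangChain splitter the backend uses; the function name `chunk_text` and the default `chunk_size` and `overlap` values are assumptions chosen for illustration.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the sizes are illustrative defaults, not values
    taken from this project.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Overlapping windows keep a sentence that straddles a boundary visible in two neighbouring chunks, which tends to help retrieval at the cost of storing some text twice.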

Frontend:

React.js (for UI)
Axios (for API calls)
Styled Components / CSS (for UI styling)
React Loader Spinner (for better user experience)

📂 Folder Structure

AI-PDF-CHATBOT/
│── backend/
│   ├── main.py              # FastAPI backend logic
│   ├── pdf_processing.py    # PDF text extraction logic
│   ├── vector_store.py      # FAISS vector storage and retrieval
│   ├── settings.py          # Centralized configuration (env-based)
│   ├── uploads/             # Directory to store uploaded PDFs
│   ├── cache/               # Directory to store cached embeddings
│   ├── .env                 # API keys and environment variables (not committed)
│   ├── .env.example         # Example backend environment variables
│   ├── venv/                # Virtual environment (ignored in Git)
│
│── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── FileUpload.js  # Component to upload PDF
│   │   │   ├── Chat.js        # Component for asking questions
│   │   │   ├── ErrorMessage.js# Shared error display component
│   │   ├── App.js             # Main React app logic
│   │   ├── styles/App.css     # Styling for UI
│   ├── public/
│   ├── package.json           # Frontend dependencies
│   ├── .env.example           # Example frontend environment variables
│
│── README.md                 # Project documentation
│── requirements.txt           # Python dependencies
│── package.json               # React dependencies

🔧 Installation & Setup

Backend Setup:

Clone the repository:

git clone https://github.com/yourusername/AI-PDF-CHATBOT.git
cd AI-PDF-CHATBOT/backend

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # macOS/Linux
venv\Scripts\activate     # Windows

Install dependencies:
```
pip install -r requirements.txt
```

Set up your .env file (based on .env.example):

cp .env.example .env
# Then edit .env to include your real values

Run the FastAPI server (development):

uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Visit http://127.0.0.1:8000/docs to test the API, or http://127.0.0.1:8000/health for a simple health check.
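
If you prefer to check the health endpoint from a script, the sketch below uses only the Python standard library. It assumes that /health answers with HTTP 200 when the server is up; the response body is not inspected, and the function name `backend_is_healthy` is hypothetical.

```python
import urllib.error
import urllib.request


def backend_is_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling `backend_is_healthy()` with no arguments targets the local development server started above.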

For a production-style run, omit the --reload flag to disable auto-reload:

uvicorn main:app --host 0.0.0.0 --port 8000

Frontend Setup:

Navigate to the frontend folder:
```
cd ../frontend
```
Install dependencies:
```
npm install
```

(Optional) Create a .env file from the example to point to a non-local backend:

cp .env.example .env
# REACT_APP_API_URL defaults to http://localhost:8000 if not set

Start the React app:
```
npm start
```
Open http://localhost:3000 in your browser.

Environment Variables

Backend (backend/.env based on backend/.env.example):

OPENAI_API_KEY – your OpenAI API key (required).
OPENAI_MODEL – OpenAI model name to use (default: gpt-4-turbo).
ALLOWED_ORIGINS – comma-separated list of allowed frontend origins for CORS (default includes http://localhost:3000).
UPLOAD_DIR – directory for uploaded PDFs (default: uploads).
CACHE_DIR – directory for cached embeddings (default: cache).
MAX_PDF_SIZE_MB – maximum allowed PDF size in megabytes (default: 20).
MAX_PDF_PAGES – maximum number of pages to process (default: 200).
FAISS_K – number of chunks to retrieve from FAISS (default: 5).
CONTEXT_MAX_CHARS – maximum number of characters of context sent to the language model (default: 8000).
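
As a rough illustration of how these variables and defaults could be read in one place, here is a minimal sketch. The variable names and default values come from the list above; the `Settings` class and `load_settings` function are hypothetical, and the actual settings.py may be organized differently. The ALLOWED_ORIGINS default shown here contains only http://localhost:3000, which is an assumption, since the real default may include more origins.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    openai_model: str
    allowed_origins: tuple[str, ...]
    upload_dir: str
    cache_dir: str
    max_pdf_size_mb: int
    max_pdf_pages: int
    faiss_k: int
    context_max_chars: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, applying the documented defaults."""
    api_key = env.get("OPENAI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required but not set")

    # Assumption: only localhost:3000 is allowed when the variable is unset.
    origins = env.get("ALLOWED_ORIGINS", "http://localhost:3000")
    return Settings(
        openai_api_key=api_key,
        openai_model=env.get("OPENAI_MODEL", "gpt-4-turbo"),
        allowed_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
        upload_dir=env.get("UPLOAD_DIR", "uploads"),
        cache_dir=env.get("CACHE_DIR", "cache"),
        max_pdf_size_mb=int(env.get("MAX_PDF_SIZE_MB", "20")),
        max_pdf_pages=int(env.get("MAX_PDF_PAGES", "200")),
        faiss_k=int(env.get("FAISS_K", "5")),
        context_max_chars=int(env.get("CONTEXT_MAX_CHARS", "8000")),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"OPENAI_API_KEY": "sk-test", "FAISS_K": "3"})`.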

Frontend (frontend/.env based on frontend/.env.example):

REACT_APP_API_URL – base URL of the backend API (default: http://localhost:8000).

🚀 Usage

Upload a PDF file using the Upload PDF button.
Once uploaded, ask a question related to the PDF content.
The chatbot will retrieve the most relevant context and generate an AI-powered response.
You can clear the chat history and upload a different PDF.
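
The retrieval step described above can be sketched as follows. This is a simplified stand-in: word counts replace real embeddings, and a sorted list replaces the FAISS index, so it only illustrates the shape of the flow (rank chunks against the question, keep the top `k`, cap the context length). The function names are hypothetical; the defaults `k=5` and `max_chars=8000` mirror FAISS_K and CONTEXT_MAX_CHARS.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 5) -> list[str]:
    """Return the k chunks most similar to the question (stand-in for a FAISS search)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_context(chunks: list[str], max_chars: int = 8000) -> str:
    """Join chunks with blank lines, truncating so the result fits in max_chars."""
    parts, used = [], 0
    for chunk in chunks:
        sep = 2 if parts else 0  # length of the "\n\n" separator
        if used + sep + len(chunk) > max_chars:
            remaining = max_chars - used - sep
            if remaining > 0:
                parts.append(chunk[:remaining])
            break
        parts.append(chunk)
        used += sep + len(chunk)
    return "\n\n".join(parts)
```

The resulting context string, together with the user's question, is what would be sent to the language model.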

🛠️ Troubleshooting

1. Backend Issues

If FastAPI does not start, ensure your virtual environment is activated.
If you get an OpenAI API key error, confirm that OPENAI_API_KEY is set to a valid key in backend/.env.
If dependencies are missing, install them again:
```
pip install -r requirements.txt
```
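
For the API key error, a quick way to see whether the .env file actually defines the key is a small check like the one below. It is a deliberately simple KEY=VALUE parser written for illustration; it is not how the backend loads its configuration, and the function names are hypothetical.

```python
from pathlib import Path


def read_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(path: str, required=("OPENAI_API_KEY",)) -> list[str]:
    """Return the required keys that are absent or empty in the file."""
    env = read_env_file(path)
    return [key for key in required if not env.get(key)]
```

Running `missing_keys(".env")` from the backend folder returns an empty list when the key is present and non-empty.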

2. Frontend Issues

If the React app does not start, ensure all dependencies are installed:
```
npm install
```
If the backend is not reachable, ensure FastAPI is running at http://127.0.0.1:8000 and that REACT_APP_API_URL points to it.
If CORS issues occur, make sure CORS is enabled in main.py and that your frontend origin is listed in ALLOWED_ORIGINS.
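
A common cause of CORS errors is an origin that almost matches. Browsers treat http://localhost:3000 and http://127.0.0.1:3000 as different origins, and the Origin header never carries a trailing slash. The sketch below assumes the backend compares origins by exact string match against the comma-separated ALLOWED_ORIGINS value; `is_origin_allowed` is a hypothetical helper for checking a value by hand.

```python
def is_origin_allowed(origin: str, allowed_origins: str) -> bool:
    """Check an Origin header value against a comma-separated allow list.

    Assumption: matching is an exact string comparison, with no wildcard
    or pattern support.
    """
    allowed = {o.strip() for o in allowed_origins.split(",") if o.strip()}
    return origin in allowed
```

For example, `is_origin_allowed("http://127.0.0.1:3000", "http://localhost:3000")` is false, so opening the frontend via 127.0.0.1 would need that origin added to the list as well.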

🎯 Future Improvements

Multi-PDF Support: Handle multiple document uploads and queries.
Better Search Optimization: Improve FAISS vector search accuracy.
UI Enhancements: Dark mode toggle, better animations.
Deploy to Cloud: Host on AWS, Vercel, or Heroku.

🎉 Credits & Contributors

Developed by Rohan Jain. Contributions are welcome! Feel free to fork and improve the project.
