AI PDF Reader

A multi-tenant Flask app that turns a stack of PDFs into an embeddable, retrieval-augmented chatbot. Upload your documents, generate a knowledge base, and drop a one-line <script> tag into any website to render a ChatGPT-style widget that answers questions grounded in your PDFs.

Built in late 2023 as an early exploration of the RAG pattern over the (then-new) LangChain + OpenAI + FAISS stack, with a real account system and S3-backed storage so multiple users could maintain their own private assistants.

Why I built it

I wanted to feel the full RAG loop end to end, not just the "retrieve + answer" toy version: real user accounts, per-tenant document isolation, persisted vector stores, and an embed widget so the chatbot could actually live on someone else's page. The original use case was a chatbot for a small institute's website (see the footer of templates/index.html) that could answer questions about their programs from a handful of brochures.

How it works

PDF upload  ->  S3 (uploads/{user_id}/)
                     |
                     v
              PyPDF2 text extract
                     |
                     v
        CharacterTextSplitter (1000 / 200 overlap)
                     |
                     v
          OpenAI embeddings (batched, 5000 / call)
                     |
                     v
                FAISS vector store
                     |
                     v
   pickle.dumps -> Postgres (User.assistant_data column)

Question  ->  ConversationalRetrievalChain (ChatOpenAI + ConversationBufferMemory)
                     |
                     v
                Grounded answer
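
The ingestion half of the diagram can be approximated in plain Python. This is a minimal sketch, not the code in assistant.py: chunk_text and batched are hypothetical helper names, and the fixed-width slicing ignores the separator handling that LangChain's CharacterTextSplitter applies. Only the 1000 / 200 and 5000 figures come from the diagram above.

```python
from typing import Iterator, List


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into fixed-width chunks whose neighbours share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks


def batched(items: List[str], batch_size: int = 5000) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` items (one embedding call each)."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

The overlap means a sentence cut at one chunk boundary is likely to appear whole in the neighbouring chunk, at the cost of embedding some text twice.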

Each registered user gets a random 16-byte pin, which is the only key the embed widget needs. The chatbot is generated as a customized embedChatbot.js (with the user's pin, name, icon, greeting baked in), uploaded to S3, and served through CloudFront so any external page can include it with one tag.
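
A rough sketch of the "baked in" step, assuming the per-user values are injected as a JSON literal. The template text, the AI_PDF_CHAT_CONFIG global, and render_embed_script are all hypothetical; the real embedChatbot.js template and its placeholders are not shown here.

```python
import json
from string import Template

# Hypothetical stand-in for the embedChatbot.js template; the real file differs.
EMBED_TEMPLATE = Template(
    "window.AI_PDF_CHAT_CONFIG = $config;\n"
    "// ...widget code that reads AI_PDF_CHAT_CONFIG would follow here...\n"
)


def render_embed_script(pin: str, name: str, icon_url: str, greeting: str) -> str:
    """Return the widget script with one user's settings embedded as JSON."""
    # json.dumps escapes quotes and backslashes, so user-supplied text
    # cannot break out of the JavaScript literal.
    config = json.dumps(
        {"pin": pin, "name": name, "icon": icon_url, "greeting": greeting}
    )
    return EMBED_TEMPLATE.substitute(config=config)
```

The rendered string is what would be uploaded to S3 and referenced from the host page through the CloudFront domain.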

In-memory assistant instances are reaped after 10 minutes of inactivity via a threading.Timer to keep RAM bounded; they rehydrate lazily from Postgres on the next chat request.
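
The reap-and-rehydrate lifecycle might look roughly like the following. ExpiringCache and its loader argument are hypothetical names used for illustration; in the app the loader would correspond to reading and unpickling the user's assistant from Postgres, and the default of 600 seconds mirrors the 10-minute window described above.

```python
import threading
from typing import Callable, Dict, Generic, Tuple, TypeVar

T = TypeVar("T")


class ExpiringCache(Generic[T]):
    """Hold values in memory and evict each one after `ttl` idle seconds."""

    def __init__(self, loader: Callable[[str], T], ttl: float = 600.0) -> None:
        self._loader = loader
        self._ttl = ttl
        self._items: Dict[str, T] = {}
        self._timers: Dict[str, Tuple[threading.Timer, object]] = {}
        self._lock = threading.Lock()

    def __contains__(self, key: str) -> bool:
        with self._lock:
            return key in self._items

    def get(self, key: str) -> T:
        """Return the value, loading it lazily, and restart its idle timer."""
        with self._lock:
            if key not in self._items:
                self._items[key] = self._loader(key)
            self._restart_timer(key)
            return self._items[key]

    def _restart_timer(self, key: str) -> None:
        previous = self._timers.pop(key, None)
        if previous is not None:
            previous[0].cancel()
        token = object()  # identifies this timer so a stale one cannot evict
        timer = threading.Timer(self._ttl, self._evict, args=(key, token))
        timer.daemon = True  # do not keep the process alive just for eviction
        self._timers[key] = (timer, token)
        timer.start()

    def _evict(self, key: str, token: object) -> None:
        with self._lock:
            current = self._timers.get(key)
            if current is None or current[1] is not token:
                return  # a newer access replaced this timer; keep the entry
            del self._timers[key]
            self._items.pop(key, None)
```

The token check covers the case where a timer fires at the same moment a new request arrives for the same key. Loading under the lock keeps the sketch short but would serialize slow loads across users.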

Stack

Backend: Flask, Flask-Login, Flask-SQLAlchemy, Postgres (psycopg2)
LLM: OpenAI ChatOpenAI via LangChain ConversationalRetrievalChain
Embeddings: OpenAIEmbeddings (text-embedding-ada-002 era)
Vector store: FAISS (CPU), pickled into a LargeBinary column per user
Memory: ConversationBufferMemory (full history, no summarization)
Storage: S3 for PDFs and generated JS, CloudFront for widget delivery
PDF parsing: PyPDF2

Quickstart

git clone https://github.com/dparikh79/AI-PDF-Reader.git
cd AI-PDF-Reader
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
touch .env             # no .env.example exists; add the variables listed below
python app.py          # http://127.0.0.1:5000

Required environment variables

# LLM
OPENAI_API_KEY=sk-...
OPENAI_API_ENDPOINT=https://api.openai.com/v1   # optional override

# Flask
FLASK_SECRET_KEY=replace-with-a-long-random-string

# Postgres
DATABASE_URL=postgresql://user:pass@host:5432/aipdfreader

# Local working dir for the embed JS template
BASE_UPLOAD_FOLDER=./uploads
ALLOWED_EXTENSIONS=pdf

# AWS (S3 + CloudFront for storage + widget delivery)
AWS_ACCESS_KEY_ID=...
AWS_SECRET_ACCESS_KEY=...
AWS_REGION=us-east-1
AWS_BUCKET_NAME=your-bucket
CLOUDFRONT_DOMAIN=https://xxxx.cloudfront.net/

There is no .env.example checked in; treat the block above as the source of truth.
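
Because nothing validates that block for you, a fail-fast check at startup can help. The sketch below is an assumption about how such a check could look, not code from app.py: load_config is a hypothetical helper, every variable except OPENAI_API_ENDPOINT is treated as required, and the endpoint default is simply the example value shown above.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "FLASK_SECRET_KEY",
    "DATABASE_URL",
    "BASE_UPLOAD_FOLDER",
    "ALLOWED_EXTENSIONS",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_BUCKET_NAME",
    "CLOUDFRONT_DOMAIN",
]

# Assumed default, taken from the example value in the variable list.
OPTIONAL_DEFAULTS = {"OPENAI_API_ENDPOINT": "https://api.openai.com/v1"}


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect settings from `env`, raising if any required variable is unset or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_VARS}
    for name, default in OPTIONAL_DEFAULTS.items():
        config[name] = env.get(name) or default
    return config
```

Calling load_config() once before the app starts serving turns a missing key into an immediate, readable error instead of a failure on the first upload or chat request.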

What I would change today

This was a 2023 build. With hindsight:

Drop the pickle-into-Postgres pattern. Storing FAISS indexes as pickle.dumps output in a LargeBinary column is convenient, but it couples deserialization to the exact LangChain/FAISS version, and pickle.loads on user-scoped data is a footgun because unpickling can execute arbitrary code. A purpose-built vector store (pgvector, Pinecone, Qdrant) is the right answer.
Replace CharacterTextSplitter with RecursiveCharacterTextSplitter and chunk on semantic boundaries rather than raw \n. Current chunking can split mid-sentence on densely formatted PDFs.
Swap ConversationBufferMemory for a summarizing or windowed memory. Long sessions will eventually blow past context limits.
Pin the LangChain version explicitly and migrate to langchain-openai + langchain-community. This repo is on langchain==0.0.312, which predates the v0.1 split.
Move PDF parsing off PyPDF2. pypdf, pdfplumber, or unstructured handle layout and tables better.
Rate-limit the chat endpoint and add per-user spend caps. Right now a single rogue embed page could pull on someone's OpenAI key indefinitely.
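
As one illustration of the last item, a per-pin sliding-window limiter could look like this. It is a sketch under assumptions: SlidingWindowLimiter is a hypothetical class, the limits are arbitrary, and keeping state in process memory shares the single-process limitation of the in-memory assistant dict.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` calls per key within any `window` seconds."""

    def __init__(
        self,
        limit: int,
        window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a call for `key` and report whether it is within the limit."""
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

A chat handler would call allow(pin) before touching the chain and return HTTP 429 when it comes back False. A spend cap would need token accounting on top of this, which the sketch does not attempt.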

Known limits

Designed for short brochures and handbooks, not large corpora. Batching sleeps 60 s between 5000-chunk batches to stay under rate limits, so very large uploads are slow.
Live assistants are held in a single global in-memory dict, so horizontal scaling would need a shared cache (such as Redis) and a redesigned inactivity timer.
No SSE / streaming on the chat endpoint; the widget waits for the full response.
Pin generation runs in a while True loop that retries on a (vanishingly unlikely) collision, with no upper bound on attempts.
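
The unbounded retry in the last item is simple to cap. The sketch below assumes the pin is 16 random bytes rendered as hex and that uniqueness is checked through a caller-supplied predicate; generate_unique_pin and is_taken are hypothetical names, and the real encoding and database lookup may differ.

```python
import secrets
from typing import Callable


def generate_unique_pin(is_taken: Callable[[str], bool], max_attempts: int = 5) -> str:
    """Return an unused 16-byte pin (32 hex characters), giving up after `max_attempts`."""
    for _ in range(max_attempts):
        pin = secrets.token_hex(16)
        if not is_taken(pin):
            return pin
    # With 128 bits of randomness, reaching this point points to a broken
    # uniqueness check rather than bad luck.
    raise RuntimeError(f"no unused pin found after {max_attempts} attempts")
```

In the app, is_taken would wrap a query against the users table; a unique constraint on the pin column would still be the final safeguard against two concurrent registrations.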

Repo layout

app.py            Flask routes, auth, S3 + Postgres glue, lifecycle of in-memory assistants
assistant.py      VectorStore (PDF -> chunks -> FAISS) and Assistant (LangChain chain)
templates/        Jinja templates for auth pages and the admin / chatbot-creation UI
static/           embedChatbot.js template + base stylesheet
requirements.txt  Pinned dependency set from late 2023

License

MIT. See LICENSE.
