AI Chatbot Architecture

A custom chatbot, built on retrieval-augmented generation (RAG), that answers questions about my work and experience. It is not an external service: I designed and implemented every component.

How it works

Your question → Hybrid Search (FAISS + BM25) → LLM (Groq) → Response

When you ask a question, the system searches my knowledge base using both semantic similarity (understanding meaning) and keyword matching. The top results are combined using Reciprocal Rank Fusion, then passed to an LLM to generate a natural response.
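The fusion step described above can be sketched in a few lines of Python. This is a minimal illustration rather than the service's actual code: the function name `reciprocal_rank_fusion`, the chunk IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for the example. Each ranked list contributes `1 / (k + rank)` to a document's score, so chunks that rank well in both the semantic and the keyword results rise to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several ranked lists of document IDs into a single ranking.

    `rankings` is a list of lists, each ordered best-first.
    `k` dampens the influence of top ranks (60 is an assumed default).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical result lists: one from vector search, one from BM25.
semantic_hits = ["chunk_a", "chunk_b", "chunk_c"]
keyword_hits = ["chunk_b", "chunk_d", "chunk_a"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Because the fusion uses only rank positions, the raw FAISS distances and BM25 scores never need to be normalized onto a common scale.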

Technology Stack

FastAPI: Async Python API
FAISS: Vector similarity search
BM25: Keyword-based ranking
Sentence Transformers: HuggingFace embeddings
Groq: Fast LLM inference
Redis: Response caching
PyMuPDF: Document processing
Livewire Volt: Real-time chat widget
Streaming: Server-sent events
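To make the streaming entry concrete, here is a rough sketch of how tokens could be framed as server-sent events. The helper names `format_sse` and `stream_tokens`, the JSON payload shape, and the final `done` event are assumptions for illustration, not the service's actual protocol. The framing itself follows the SSE wire format: each payload line becomes a `data:` field, and a blank line ends the event.

```python
import json
from typing import Iterable, Iterator, Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Frame a payload as one server-sent event."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # A multi-line payload needs one `data:` field per line.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


def stream_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one SSE frame per token, then a closing `done` event."""
    for token in tokens:
        # JSON-encoding keeps newlines inside a token from breaking the frame.
        yield format_sse(json.dumps({"token": token}))
    yield format_sse("[DONE]", event="done")


for frame in stream_tokens(["Hello", ", world"]):
    print(frame, end="")
```

In a FastAPI app, a generator like this could be passed to `StreamingResponse` with `media_type="text/event-stream"`; whether the real service is wired that way is an assumption.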

Key Features

Hybrid search — Combines semantic understanding with keyword matching using Reciprocal Rank Fusion for better results
Streaming responses — Answers appear in real-time via Server-Sent Events for a natural conversation feel
Conversation history — Maintains context across messages for follow-up questions
Response caching — Redis caching for fast repeated queries and cost optimization
Document ingestion — Processes PDFs, Markdown, and text files with intelligent chunking
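As an illustration of the response caching feature, the sketch below derives a cache key from a normalized question and skips the LLM call on a hit. Everything here is hypothetical: the names `cache_key` and `cached_answer`, the `chat:` key prefix, and the normalization rules are assumptions, and a plain dict stands in for Redis so the example runs without a server.

```python
import hashlib
import json


def cache_key(question, history=()):
    """Build a stable key from the question and prior messages."""
    # Lowercase and collapse whitespace so trivial variations share a key.
    normalized = " ".join(question.lower().split())
    payload = json.dumps({"q": normalized, "h": list(history)}, sort_keys=True)
    return "chat:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_answer(store, question, generate, history=()):
    """Return a cached answer if present; otherwise generate and store it."""
    key = cache_key(question, history)
    hit = store.get(key)
    if hit is not None:
        return hit
    answer = generate(question)
    store[key] = answer
    return answer


calls = []


def fake_llm(question):
    # Stand-in for the real LLM call; records how often it is invoked.
    calls.append(question)
    return f"answer to: {question}"


store = {}  # a dict stands in for Redis in this sketch
first = cached_answer(store, "What is  RAG?", fake_llm)
second = cached_answer(store, "what is rag?", fake_llm)
print(len(calls))
```

Including the conversation history in the key keeps a follow-up question from reusing an answer generated in a different context. With a real Redis client, the dict write would typically become a `set` call with an expiry so stale answers age out.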

View the RAG service code
