How it works
Your question → Hybrid Search (FAISS + BM25) → LLM (Groq) → Response
When you ask a question, the system searches my knowledge base in two ways: semantic similarity (vector search with FAISS, which captures meaning) and keyword matching (BM25). The top results from both searches are merged using Reciprocal Rank Fusion, then passed to an LLM to generate a natural response.
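To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is an illustration under stated assumptions, not the code this site runs: the function name, the document IDs, and the constant `k = 60` (a commonly used default) are all hypothetical choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k = 60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from the two retrievers, best match first.
semantic_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. from vector search
keyword_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. from BM25

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Because the fusion only looks at rank positions, the two retrievers do not need comparable score scales, which is one reason this approach suits mixing vector and keyword search.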
Technology Stack
- FastAPI: Async Python API
- FAISS: Vector similarity search
- BM25: Keyword-based ranking
- Sentence Transformers: HuggingFace embeddings
- Groq: Fast LLM inference
- Redis: Response caching
- PyMuPDF: Document processing
- Livewire Volt: Real-time chat widget
- Streaming: Server-sent events
Key Features
- Hybrid search — Combines semantic understanding with keyword matching using Reciprocal Rank Fusion for better results
- Streaming responses — Answers appear in real-time via Server-Sent Events for a natural conversation feel
- Conversation history — Maintains context across messages for follow-up questions
- Response caching — Redis caching for fast repeated queries and cost optimization
- Document ingestion — Processes PDFs, Markdown, and text files with intelligent chunking
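As an illustration of the streaming feature, the sketch below shows how chunks of an answer could be framed as Server-Sent Events. It is a rough example under assumptions: the `token` payload key and the closing `done` event name are hypothetical, and the token list stands in for output arriving from the LLM.

```python
import json
from typing import Iterable, Iterator


def sse_frames(tokens: Iterable[str]) -> Iterator[str]:
    """Yield one Server-Sent Events frame per token.

    Each frame is a "data:" line followed by a blank line. The payload is
    JSON-encoded so that newlines inside a token cannot break the framing.
    """
    for token in tokens:
        payload = json.dumps({"token": token})  # "token" key is an assumption
        yield f"data: {payload}\n\n"
    # Hypothetical final event so the client knows the answer is complete.
    yield "event: done\ndata: {}\n\n"


# Stand-in for tokens streamed back from the LLM.
frames = list(sse_frames(["Hel", "lo"]))
for frame in frames:
    print(frame, end="")
```

In a web framework, a generator like this would typically be returned as a streaming response with the `text/event-stream` content type, and the chat widget would append each token as it arrives.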