A sophisticated Retrieval-Augmented Generation (RAG) application powered by NVIDIA AI Endpoints and LangChain for intelligent document question-answering.
- Powered by NVIDIA's state-of-the-art language models
- Support for multiple models: Llama 3 70B/8B, Mixtral 8x7B
- Configurable temperature and generation parameters
- PDF Directory: Process entire directories of PDF files
- File Upload: Support for PDF and text file uploads
- Web Scraping: Extract content from web URLs
- Flexible Processing: Configurable chunk size and overlap
- Semantic search using FAISS vector database
- NVIDIA embeddings for high-quality document representations
- Context-aware answer generation
- Track query performance and response times
- Interactive charts and metrics dashboard
- Usage statistics and optimization insights
- Modern, responsive chat UI with animations
- Source document citations
- Chat history management
- Real-time streaming responses
- Dark mode with gradient backgrounds
- Glassmorphism design elements
- Smooth animations and transitions
- Mobile-responsive layout
| Component | Technology |
|---|---|
| Framework | Streamlit |
| AI Models | NVIDIA AI Endpoints |
| Document Processing | LangChain |
| Vector Database | FAISS |
| Visualizations | Plotly |
| PDF Processing | PyPDF |
| Web Scraping | LangChain WebBaseLoader |
- Python 3.8+
- NVIDIA API Key (required to call NVIDIA AI Endpoints)
- pip package manager
    git clone https://github.com/yourusername/nvidia-rag-assistant.git
    cd nvidia-rag-assistant
    pip install -r requirements.txt

Create a .env file in the root directory:

    NVIDIA_API_KEY=your_nvidia_api_key_here

Run the application:

    streamlit run app2.py

The application will open in your browser at http://localhost:8501
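
For the .env step, here is a minimal sketch of how the key could be parsed and checked at startup using only the standard library. The helper names `parse_env_text` and `require_api_key` are hypothetical and are not taken from app2.py, which may load the file differently (for example with a package such as python-dotenv).

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines; blank lines and '#' comments are skipped."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(env):
    """Return the NVIDIA API key, or raise a clear error if it is unusable."""
    key = env.get("NVIDIA_API_KEY", "")
    if not key or key == "your_nvidia_api_key_here":
        raise RuntimeError("NVIDIA_API_KEY is missing or still a placeholder")
    return key


# Illustrative input only; a real key would come from the .env file.
sample = "# local settings\nNVIDIA_API_KEY=abc123\n"
print(require_api_key(parse_env_text(sample)))
```

Failing early with an explicit message makes a missing or placeholder key easier to spot than an authentication error from the first model call.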
- Navigate to the "Document Upload" tab
- Choose your document source:
  - PDF Directory: Enter the path to your PDF folder
  - Upload Files: Drag and drop PDF/text files
  - Web URL: Enter a webpage URL
- Click "Process Documents" to create embeddings
- Go to the "Chat" tab
- Type your question in the input field
- Click "Search" to get AI-powered answers
- View source documents for context
- Check the "Analytics" tab for:
  - Response time trends
  - Query statistics
  - Performance metrics
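
The response-time figures shown in the Analytics tab could be collected along the following lines. This is a sketch under assumptions: the `QueryTimer` class and its methods are hypothetical names, and the lambda stands in for the real question-answering call.

```python
import statistics
import time


class QueryTimer:
    """Collect per-query response times in memory."""

    def __init__(self):
        self.records = []  # list of (question, seconds) tuples

    def timed(self, question, answer_fn):
        """Run answer_fn(question), record how long it took, return the answer."""
        start = time.perf_counter()
        answer = answer_fn(question)
        self.records.append((question, time.perf_counter() - start))
        return answer

    def summary(self):
        """Aggregate the recorded times into simple statistics."""
        times = [seconds for _, seconds in self.records]
        if not times:
            return {"queries": 0}
        return {
            "queries": len(times),
            "mean_s": statistics.mean(times),
            "max_s": max(times),
        }


timer = QueryTimer()
timer.timed("What is RAG?", lambda q: q.upper())  # stand-in for the real chain
print(timer.summary())
```

The list of records can be turned into a chart directly, since each entry already pairs a question with its latency.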
Choose from available NVIDIA models:
- meta/llama3-70b-instruct (Recommended)
- meta/llama3-8b-instruct (Faster)
- mistralai/mixtral-8x7b-instruct-v0.1
- Chunk Size: 100-2000 characters (default: 700)
- Chunk Overlap: 0-500 characters (default: 50)
- Temperature: 0.0-1.0 (default: 0.7)
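
To illustrate how chunk size and chunk overlap interact, here is a simplified fixed-window splitter. It is an assumption-laden sketch, not the application's splitter: LangChain's text splitters also try to break on separators such as paragraphs, while the hypothetical `chunk_text` below cuts purely by character count.

```python
def chunk_text(text, chunk_size=700, chunk_overlap=50):
    """Split text into character windows that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


document = "x" * 2000
chunks = chunk_text(document)  # defaults: 700 characters, 50 overlap
print(len(chunks), [len(c) for c in chunks])
```

With the defaults each window advances by 650 characters, so a larger overlap produces more chunks (and more embedding calls) for the same document.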
nvidia-rag-assistant/
├── app2.py # Main Streamlit application
├── requirements.txt # Python dependencies
├── .env # Environment variables (create this)
├── us_census/ # Sample PDF directory
├── README.md # This file
└── docs/ # Documentation
The application uses optimized prompts for:
- Context-aware responses
- Source attribution
- Concise but thorough answers
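
A prompt pursuing these goals might be assembled roughly as follows. The template wording and the `build_prompt` helper are hypothetical illustrations, not the prompt actually used in app2.py.

```python
# Hypothetical template: restricts the answer to retrieved context and asks
# for source labels so that answers can be attributed.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n"
    "Cite the source label of each passage you rely on.\n"
    "Be concise but complete. If the context is insufficient, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)


def build_prompt(question, passages):
    """Build a prompt from (source_label, text) pairs returned by retrieval."""
    context = "\n".join(f"[{source}] {text}" for source, text in passages)
    return PROMPT_TEMPLATE.format(context=context, question=question)


print(build_prompt(
    "What trends are mentioned in the economic section?",
    [("report.pdf", "Example passage text.")],  # illustrative passage
))
```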
- Search Strategy: Similarity search with k=5
- Embedding Model: NVIDIA embeddings
- Index Type: FAISS for efficient retrieval
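
Conceptually, a similarity search with k=5 ranks every stored chunk vector against the query vector and keeps the five best. The brute-force sketch below shows that idea in plain Python; it is a stand-in for FAISS, not its API, and the use of cosine similarity is an assumption, since the actual metric depends on how the index is built.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, doc_vecs, k=5):
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-d vectors; real embeddings have hundreds or thousands of dimensions.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k([1.0, 0.0], doc_vecs, k=2))
```

A vector index exists to avoid exactly this full scan once the collection grows large.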
- Caching of embeddings and vector stores
- Async processing for better UX
- Memory-efficient document chunking
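
Embedding caching can be pictured as memoization keyed by chunk content. The sketch below assumes a hypothetical `EmbeddingCache` wrapper and a toy `fake_embed` function in place of a real embedding call; how app2.py actually caches its embeddings and vector store is not shown here.

```python
import hashlib


class EmbeddingCache:
    """Memoize embeddings by a hash of the chunk text."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.store = {}
        self.misses = 0  # number of times embed_fn was actually called

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            self.misses += 1
            self.store[key] = self.embed_fn(text)
        return self.store[key]


def fake_embed(text):
    # Stand-in for a real embedding call; returns a toy 2-d vector.
    return [float(len(text)), float(text.count(" "))]


cache = EmbeddingCache(fake_embed)
cache.get("hello world")
cache.get("hello world")  # second call is served from the cache
print(cache.misses)
```

Hashing the text rather than using it as the key directly keeps the cache keys small even for long chunks.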
- Research & Analysis: Query large document collections
- Customer Support: AI-powered knowledge base
- Legal Document Review: Search through contracts and legal texts
- Academic Research: Literature review and citation finding
- Technical Documentation: API docs and manual querying
"What are the main findings about population growth?"
"Summarize the key statistics from the census data"
"What trends are mentioned in the economic section?"
"Compare the demographic changes between regions"
1. NVIDIA API Key Error
Solution: Ensure your NVIDIA_API_KEY is correctly set in the .env file
2. Memory Issues with Large Documents
Solution: Reduce chunk_size or process documents in smaller batches
3. Slow Response Times
Solution: Switch to the smaller meta/llama3-8b-instruct model for faster responses
- Use smaller chunk sizes for faster processing
- Enable GPU acceleration if available
- Process documents in batches for large collections
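
Batch processing for large collections can be as simple as feeding files to the pipeline in fixed-size groups. The `batched` helper and the file names below are hypothetical; the per-batch work (loading, chunking, embedding) is left out.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


paths = [f"us_census/file_{i}.pdf" for i in range(7)]  # hypothetical file names
for batch in batched(paths, 3):
    # A real pipeline would load, chunk, and embed each batch here.
    print(len(batch), batch[0])
```

Only one batch is held in memory at a time, which is what keeps peak memory bounded for large directories.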
We welcome contributions! Please see our contributing guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- NVIDIA for providing powerful AI endpoints
- LangChain for the excellent RAG framework
- Streamlit for the amazing web app framework
- FAISS for efficient vector similarity search
- 📧 Email: quachphuwork@gmail.com
- 💬 Issues: GitHub Issues
- 📖 Documentation: Wiki
- Support for more document formats (DOCX, HTML)
- Multi-language support
- Advanced filtering and search options
- API endpoint for programmatic access
- Integration with more AI models
- Collaborative document annotation
⭐ Star this repo if you find it helpful! ⭐
Made with ❤️ by Phu Quach