A voice-enabled application that helps users fill out PDF forms using natural language conversation with an AI assistant. No typing or reading required!
- 🎤 Voice Input: Speak your answers instead of typing
- 🔊 Voice Output: AI responses are read aloud
- 🤖 AI-Powered: Uses Google Gemini AI for natural conversation
- 📄 PDF Form Filling: Automatically extracts and fills PDF form fields
- ♿ Accessible: Designed for users who prefer voice interaction
- FastAPI (Python) - High-performance web framework
- Google Gemini API - AI conversation engine
- PyPDF2 - PDF processing and form field extraction
- React with TypeScript - Modern UI framework
- Web Speech API - Browser-native speech recognition and synthesis
- Vite - Fast build tool
Before you begin, make sure you have:
- Python 3.8+ installed
- Node.js 16+ and npm installed
- Git installed
- A Google Gemini API key
git clone https://github.com/yourusername/HackHive2026.git
cd HackHive2026
Navigate to the backend directory:
cd backend
Create a virtual environment (recommended):
# On Windows (PowerShell)
python -m venv venv
.\venv\Scripts\Activate.ps1
# On macOS/Linux
python3 -m venv venv
source venv/bin/activate
Install Python dependencies:
pip install -r requirements.txt
Create a .env file in the backend directory:
# Windows
copy .env.example .env
# macOS/Linux
cp .env.example .env
Add your Gemini API key to .env:
GEMINI_API_KEY=your_actual_api_key_here
Start the backend server with uvicorn:
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
✅ Backend will run at http://localhost:8000
Open a new terminal and navigate to the frontend directory:
cd frontend
Install Node dependencies:
npm install
Start the development server:
npm run dev
✅ Frontend will run at http://localhost:5173
Open your browser and go to:
http://localhost:5173
🎉 You're ready to use the app!
Backend won't start:
- Make sure Python 3.8+ is installed: python --version
- Check that all dependencies are installed: pip install -r requirements.txt
- Make sure port 8000 is not in use, or change it: uvicorn app.main:app --port 8001
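To check the port before starting uvicorn, something like the following sketch may help. It uses only the Python standard library; the function name `port_in_use` is made up for this example and is not part of the project.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8000 is the default port from this README, 8001 the suggested fallback
    for candidate in (8000, 8001):
        state = "in use" if port_in_use(candidate) else "free"
        print(f"port {candidate}: {state}")
```

If port 8000 is reported as in use, either stop the other process or start the backend on the free port with the --port option shown above.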
Frontend won't start:
- Make sure Node.js 16+ is installed: node --version
- Delete node_modules and try again:
rm -r node_modules
npm install
npm run dev
API Key errors:
- Verify your .env file is in the backend folder
- Make sure the API key is valid: GEMINI_API_KEY=your_key_here
- The app works without a key but with limited AI responses
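The first two checks can be scripted. The sketch below is a minimal, standard-library-only helper, assuming the .env file uses plain KEY=VALUE lines; the function names are hypothetical and not part of the backend. It only confirms that the key is present and not one of the placeholder values from this README. It cannot tell whether Google will accept the key.

```python
from pathlib import Path

# Placeholder values that appear in this README and should be replaced
PLACEHOLDERS = {"your_actual_api_key_here", "your_key_here"}


def read_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_gemini_key(text: str) -> bool:
    """True when GEMINI_API_KEY is set to something other than a placeholder."""
    key = read_env_file(text).get("GEMINI_API_KEY", "")
    return bool(key) and key not in PLACEHOLDERS


if __name__ == "__main__":
    env_path = Path("backend") / ".env"  # assumes you run this from the repository root
    if not env_path.is_file():
        print(f"{env_path} not found")
    elif has_gemini_key(env_path.read_text(encoding="utf-8")):
        print("GEMINI_API_KEY is set")
    else:
        print("GEMINI_API_KEY is missing or still a placeholder")
```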
Speech recognition not working:
- Use a modern browser (Chrome or Edge recommended)
- Ensure your microphone is connected and permitted
- Check browser console for errors (F12 → Console)
- Upload a PDF: Click "Choose PDF File" and select a PDF form
- Start Voice Input: Click the microphone button to start speaking
- Answer Questions: The AI will ask you questions about each form field
- Review Progress: See which fields have been filled
- Download: Once complete, download your filled PDF
- POST /api/upload-pdf - Upload and process a PDF
- POST /api/chat - Send a message to the AI assistant
- POST /api/update-field - Manually update a form field
- POST /api/fill-pdf - Generate and download filled PDF
- GET /api/session/{sessionId} - Get session status
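As a rough illustration of calling the API from a script, the sketch below queries the session endpoint with the Python standard library. It assumes the backend is running at http://localhost:8000 and that the endpoint returns a JSON object; the response fields are not documented here, so the code only decodes and returns whatever comes back. The helper names are hypothetical.

```python
import json
from urllib import request
from urllib.parse import quote

BASE_URL = "http://localhost:8000"  # backend address from the setup steps


def session_url(session_id: str, base_url: str = BASE_URL) -> str:
    """Build the URL for GET /api/session/{sessionId}."""
    # quote() with safe="" escapes characters such as "/" inside the session ID
    return f"{base_url}/api/session/{quote(session_id, safe='')}"


def get_session_status(session_id: str, base_url: str = BASE_URL) -> dict:
    """Fetch the session status; assumes the response body is a JSON object."""
    with request.urlopen(session_url(session_id, base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the backend running, `get_session_status("<your session ID>")` would return the decoded response. The POST endpoints are left out because their request bodies are not specified in this README.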
- Speech Recognition: Works best in Chrome/Edge (uses the webkit-prefixed SpeechRecognition interface of the Web Speech API)
- Speech Synthesis: Supported in all modern browsers
- PDF Upload: Supported in all modern browsers
- PDF form field extraction works best with PDFs that have proper form fields (AcroForm)
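To see whether a PDF has AcroForm fields before uploading it, a check along these lines may be useful. It relies on PyPDF2's `PdfReader.get_fields()`, which returns None when the document has no form fields. The helper names are hypothetical, and this is not necessarily how the backend extracts fields.

```python
def form_field_names(reader) -> list:
    """Return sorted AcroForm field names, or [] when the PDF has none."""
    # get_fields() returns a dict keyed by field name, or None without an AcroForm
    fields = reader.get_fields() or {}
    return sorted(fields)


def list_fields_in(path: str) -> list:
    """Open a PDF with PyPDF2 and list its form field names."""
    from PyPDF2 import PdfReader  # third-party; installed via requirements.txt

    return form_field_names(PdfReader(path))
```

An empty list from `list_fields_in("form.pdf")` suggests the file is a flat (non-fillable) PDF, where field extraction is unlikely to work well.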
- Raihan Carder
- Tony Park
- Suhi Kasim
- Aarnav Shrivastava