FujiVoice AI Tutor 🎓

An intelligent AI-powered learning platform that provides real-time feedback on student work through voice assistance, PDF annotation, and conversational AI tutoring.

✨ Features

📄 Smart PDF Workspace - Upload PDFs and annotate them with an integrated drawing canvas
🎨 Real-time Drawing - Draw, highlight, and mark up documents with multi-page support
🔊 Voice Interaction - Ask questions using voice input and receive audio feedback
🤖 AI-Powered Analysis - Automatic OCR processing and error detection using Mathpix & Google Gemini
💬 Conversational AI Tutor - Context-aware tutoring with conversation memory
🎵 Natural Voice Feedback - Text-to-speech responses using ElevenLabs
📤 Export Functionality - Save annotated PDFs with your work
🎯 Auto-capture - Automatic screenshot capture 5 seconds after drawing stops

🏗️ Tech Stack

Frontend

Next.js 15.5 - React framework with Turbopack
React 19 - UI library
Tailwind CSS 4 - Styling
React PDF - PDF rendering
Fabric.js - Canvas drawing
React Dropzone - File uploads

Backend

FastAPI - Python web framework
Google Gemini AI - Conversational AI and analysis
Mathpix OCR - Handwriting and math formula recognition
ElevenLabs - Text-to-speech synthesis
Uvicorn - ASGI server

🚀 Getting Started

Prerequisites

Node.js 18.18 or higher (the minimum supported by Next.js 15)
Python 3.8 or higher
npm or yarn
API Keys (see Configuration section)

Installation

Clone the repository

git clone https://github.com/your-username/FUJI-hackathon.git
cd FUJI-hackathon

Set up the Frontend

```
cd tutor
npm install
```

Set up the Backend

cd ../backend
pip install -r requirements.txt

Configuration

Backend Environment Variables

Create a .env file in the backend/ directory:

# Required API Keys
GEMINI_API_KEY=your_google_gemini_api_key_here
MATHPIX_APP_ID=your_mathpix_app_id_here
MATHPIX_APP_KEY=your_mathpix_app_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here

# Optional: ElevenLabs Voice ID (default provided in code)
ELEVENLABS_VOICE_ID=your_preferred_voice_id

Get your API keys:

Google Gemini: https://makersuite.google.com/app/apikey
Mathpix: https://mathpix.com/
ElevenLabs: https://elevenlabs.io/
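
A quick pre-flight check can catch a missing or whitespace-padded key before the server starts. The sketch below is illustrative and not part of the repository: `check_backend_env` is a hypothetical helper name, and it assumes the variables have already been loaded into the process environment (or are passed in as a plain dict).

```python
import os

# Variable names taken from the backend .env template.
REQUIRED_KEYS = (
    "GEMINI_API_KEY",
    "MATHPIX_APP_ID",
    "MATHPIX_APP_KEY",
    "ELEVENLABS_API_KEY",
)


def check_backend_env(env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_KEYS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is missing or empty")
        elif value != value.strip():
            # Stray spaces around a key are a common copy-paste mistake.
            problems.append(f"{name} has leading or trailing whitespace")
    return problems
```

Calling `check_backend_env()` with no arguments inspects the current environment; ELEVENLABS_VOICE_ID is left out on purpose because it is optional.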

Frontend Environment Variables

Create a .env.local file in the tutor/ directory:

NEXT_PUBLIC_API_URL=http://localhost:8000

Running the Application

1. Start the Backend Server

cd backend
uvicorn main:app --reload --port 8000

The API will be available at http://localhost:8000

2. Start the Frontend Development Server

In a new terminal:

cd tutor
npm run dev

The application will be available at http://localhost:3000

📖 Usage

Smart Workspace

Navigate to the Workspace page
Upload a PDF document using drag-and-drop or the "Add PDF" button
Enable Drawing Mode to annotate the document
Your work will be automatically analyzed after 5 seconds of inactivity
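
The 5-second rule behaves like a debounce: every new stroke restarts the countdown, and the capture fires once the canvas has been quiet long enough. The following is a minimal Python sketch of that timing logic only. It is not the frontend code (which lives in the Next.js app), and the `InactivityTrigger` class and its method names are hypothetical.

```python
import time


class InactivityTrigger:
    """Fires once after `delay` seconds with no new activity."""

    def __init__(self, delay=5.0, clock=time.monotonic):
        self.delay = delay
        self.clock = clock          # injectable so the logic can be tested
        self.last_activity = None   # None means nothing is pending

    def touch(self):
        """Record a drawing event; this restarts the countdown."""
        self.last_activity = self.clock()

    def poll(self):
        """Return True exactly once when the quiet period has elapsed."""
        if self.last_activity is None:
            return False
        if self.clock() - self.last_activity >= self.delay:
            self.last_activity = None  # reset so the capture is not repeated
            return True
        return False
```

A caller would invoke `touch()` on each stroke and check `poll()` periodically, taking the screenshot when it returns True.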

Voice Interaction

Click the "Talk" button to start voice recognition
Ask your question or describe your problem
The system will:
- Capture the current PDF page
- Process handwriting with OCR
- Analyze your work with AI
- Provide audio and text feedback

AI Tutor Chat

Navigate to the Tutor page
Type your questions in the chat interface
Receive instant AI-powered responses
Context is maintained throughout the conversation

Export Your Work

Click the "Export" button in the workspace
Choose to export:
- Current page only
- All pages with annotations
Download the annotated PDF

🎨 Color Scheme & Branding

Primary Colors:

Indigo: #4F46E5 (indigo-600)
Purple: #9333EA (purple-600)
Pink: #EC4899 (pink-600)

Gradients:

Primary: from-indigo-600 to-purple-600
Extended: from-indigo-600 via-purple-600 to-pink-600

🏗️ Project Structure

FUJI-hackathon/
├── backend/                 # FastAPI backend
│   ├── main.py             # Main API routes and logic
│   ├── requirements.txt    # Python dependencies
│   └── .env                # Environment variables (create this)
│
├── tutor/                  # Next.js frontend
│   ├── src/
│   │   ├── app/           # Next.js app directory
│   │   │   ├── page.js           # Home page
│   │   │   ├── work/page.js      # PDF workspace
│   │   │   ├── tutor/page.js     # AI chat tutor
│   │   │   └── login/page.js     # Login page
│   │   └── components/    # React components
│   │       ├── Navbar.js         # Navigation bar
│   │       └── DrawingCanvas.js  # Canvas component
│   ├── public/            # Static assets
│   └── .env.local         # Environment variables (create this)
│
└── README.md              # This file

🔧 API Endpoints

Backend (FastAPI)

POST /api/ocr - Process image with Mathpix OCR
POST /api/analyze - Analyze student work with Gemini AI
GET /api/voice - Get voice settings
POST /api/voice - Update voice settings
POST /api/chat - Chat with AI tutor
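
As a rough illustration of calling one of these routes from a script, the sketch below builds a JSON POST request for /api/chat using only the Python standard library. The request schema is not documented in this README, so the `message` field name is an assumption, and `build_chat_request` is a hypothetical helper; check backend/main.py for the real payload shape.

```python
import json
import urllib.request

API_URL = "http://localhost:8000"  # same value as NEXT_PUBLIC_API_URL


def build_chat_request(message, base_url=API_URL):
    """Build (but do not send) a JSON POST request for /api/chat."""
    # ASSUMPTION: the "message" field name is hypothetical; the actual
    # schema is defined in backend/main.py.
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running, the request can be sent with `urllib.request.urlopen(build_chat_request("..."))`. The response format is likewise not specified here.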

🎯 Workflow

Student uploads PDF → Renders in workspace
Student draws/annotates → Auto-capture after 5 seconds
Screenshot sent to Mathpix → OCR extracts LaTeX/text
Data sent to Gemini AI → Analyzes work and context
AI generates feedback → Text response created
ElevenLabs TTS → Converts to natural speech
Student receives → Audio playback + visual feedback panel
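
The server-side part of this workflow is a three-stage chain. The sketch below shows only its shape, with each external service passed in as a plain callable so that no real Mathpix, Gemini, or ElevenLabs call is implied. The `Feedback` type and `run_feedback_pipeline` function are hypothetical names, not taken from the repository.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Feedback:
    text: str     # written feedback shown in the panel
    audio: bytes  # synthesized speech for playback


def run_feedback_pipeline(
    screenshot: bytes,
    ocr: Callable[[bytes], str],
    analyze: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Feedback:
    """Chain the three stages: OCR -> analysis -> text-to-speech."""
    extracted = ocr(screenshot)   # Mathpix stage: image -> LaTeX/text
    text = analyze(extracted)     # Gemini stage: extracted work -> feedback
    audio = synthesize(text)      # ElevenLabs stage: feedback -> audio
    return Feedback(text=text, audio=audio)
```

Keeping the stages as injected callables makes it easy to swap in stubs while developing without spending API credits.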

🐛 Troubleshooting

Backend Issues

Port already in use (start the backend on another port, then update NEXT_PUBLIC_API_URL in tutor/.env.local to match):

uvicorn main:app --reload --port 8001
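
To confirm that a port is actually taken before switching, a small standard-library check can help. This is a sketch under the assumption that the backend listens on the local loopback address; `port_in_use` is a hypothetical helper name.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, `port_in_use(8000)` returning True while uvicorn is stopped suggests another process holds the port.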

API keys not working:

Double-check .env file is in the backend/ directory
Ensure no extra spaces in API keys
Verify API keys are active and have credits

Frontend Issues

PDF not rendering:

Check that pdf.worker.min.js is in the public/ folder
Clear browser cache and reload

Voice not working:

Use Chrome or Safari (voice input relies on the Web Speech API, which not every browser supports)
Grant microphone permissions
Check if backend is running

📝 License

This project was created for the FUJI Hackathon.

👥 Contributors

Usmaan Sayed
Iqbal Ghanci
Furqan Ahcom
Jonathan Robin

🙏 Acknowledgments

Google Gemini AI for intelligent tutoring
Mathpix for OCR capabilities
ElevenLabs for natural voice synthesis
Next.js and FastAPI communities

Made with ❤️ for students everywhere
