iqbatrg/FUJI-hackathon

FujiVoice AI Tutor 🎓

An intelligent AI-powered learning platform that provides real-time feedback on student work through voice assistance, PDF annotation, and conversational AI tutoring.

✨ Features

  • 📄 Smart PDF Workspace - Upload PDFs and annotate them with an integrated drawing canvas
  • 🎨 Real-time Drawing - Draw, highlight, and mark up documents with multi-page support
  • 🔊 Voice Interaction - Ask questions using voice input and receive audio feedback
  • 🤖 AI-Powered Analysis - Automatic OCR processing and error detection using Mathpix & Google Gemini
  • 💬 Conversational AI Tutor - Context-aware tutoring with conversation memory
  • 🎵 Natural Voice Feedback - Text-to-speech responses using ElevenLabs
  • 📤 Export Functionality - Save annotated PDFs with your work
  • 🎯 Auto-capture - Automatic screenshot capture 5 seconds after drawing stops

🏗️ Tech Stack

Frontend

  • Next.js 15.5 - React framework with Turbopack
  • React 19 - UI library
  • Tailwind CSS 4 - Styling
  • React PDF - PDF rendering
  • Fabric.js - Canvas drawing
  • React Dropzone - File uploads

Backend

  • FastAPI - Python web framework
  • Google Gemini AI - Conversational AI and analysis
  • Mathpix OCR - Handwriting and math formula recognition
  • ElevenLabs - Text-to-speech synthesis
  • Uvicorn - ASGI server

🚀 Getting Started

Prerequisites

  • Node.js 18.x or higher
  • Python 3.8 or higher
  • npm or yarn
  • API Keys (see Configuration section)

Installation

  1. Clone the repository

    git clone https://github.com/iqbatrg/FUJI-hackathon.git
    cd FUJI-hackathon
  2. Set up the Frontend

    cd tutor
    npm install
  3. Set up the Backend

    cd ../backend
    pip install -r requirements.txt

Configuration

Backend Environment Variables

Create a .env file in the backend/ directory:

# Required API Keys
GEMINI_API_KEY=your_google_gemini_api_key_here
MATHPIX_APP_ID=your_mathpix_app_id_here
MATHPIX_APP_KEY=your_mathpix_app_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here

# Optional: ElevenLabs Voice ID (default provided in code)
ELEVENLABS_VOICE_ID=your_preferred_voice_id
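
A quick sanity check on these values can save a debugging session later. The sketch below is not part of the repository; it assumes the variables have already been loaded into the process environment (for example by a dotenv loader, which this README does not confirm is used). The variable names come from the example above, and `find_config_problems` is a hypothetical helper name.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = (
    "GEMINI_API_KEY",
    "MATHPIX_APP_ID",
    "MATHPIX_APP_KEY",
    "ELEVENLABS_API_KEY",
)


def find_config_problems(env):
    """Return a list of human-readable problems found in a mapping of env vars."""
    problems = []
    for name in REQUIRED_KEYS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is missing or empty")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace")
        elif value.startswith("your_"):
            # The example file uses placeholders such as your_mathpix_app_id_here.
            problems.append(f"{name} still holds the placeholder value")
    return problems


if __name__ == "__main__":
    # os.environ only contains .env values if something loaded the file first.
    for problem in find_config_problems(os.environ):
        print(problem)
```

An empty list means all four required keys are present and look plausible; it does not verify that the keys are valid with the providers.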

Get your API keys from Google (Gemini), Mathpix, and ElevenLabs.

Frontend Environment Variables

Create a .env.local file in the tutor/ directory:

NEXT_PUBLIC_API_URL=http://localhost:8000

Running the Application

1. Start the Backend Server

cd backend
uvicorn main:app --reload --port 8000

The API will be available at http://localhost:8000

2. Start the Frontend Development Server

In a new terminal:

cd tutor
npm run dev

The application will be available at http://localhost:3000

📖 Usage

Smart Workspace

  1. Navigate to the Workspace page
  2. Upload a PDF document using drag-and-drop or the "Add PDF" button
  3. Enable Drawing Mode to annotate the document
  4. Your work will be automatically analyzed after 5 seconds of inactivity
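
The 5-second rule in step 4 is a debounce: every new stroke restarts the countdown, and analysis runs once the countdown completes. The actual logic lives in the frontend; the Python sketch below only illustrates the idea, and `IdleTrigger` and its method names are hypothetical. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
class IdleTrigger:
    """Fires once after `delay` seconds have passed with no new activity."""

    def __init__(self, delay=5.0):
        self.delay = delay
        self.last_activity = None  # time of the most recent stroke
        self.fired = False         # whether this idle period already triggered

    def record_activity(self, now):
        """Call on every stroke; restarts the idle countdown."""
        self.last_activity = now
        self.fired = False

    def should_capture(self, now):
        """Return True exactly once per idle period, after `delay` seconds."""
        if self.last_activity is None or self.fired:
            return False
        if now - self.last_activity >= self.delay:
            self.fired = True
            return True
        return False


trigger = IdleTrigger(delay=5.0)
trigger.record_activity(now=10.0)          # student draws at t = 10 s
early = trigger.should_capture(now=12.0)   # only 2 s idle, no capture yet
trigger.record_activity(now=13.0)          # another stroke restarts the timer
ready = trigger.should_capture(now=18.0)   # 5 s idle, capture now
```

Firing only once per idle period keeps the same page from being sent for analysis repeatedly while the student is thinking.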

Voice Interaction

  1. Click the "Talk" button to start voice recognition
  2. Ask your question or describe your problem
  3. The system will:
    • Capture the current PDF page
    • Process handwriting with OCR
    • Analyze your work with AI
    • Provide audio and text feedback

AI Tutor Chat

  1. Navigate to the Tutor page
  2. Type your questions in the chat interface
  3. Receive instant AI-powered responses
  4. Context is maintained throughout the conversation

Export Your Work

  1. Click the "Export" button in the workspace
  2. Choose to export:
    • Current page only
    • All pages with annotations
  3. Download the annotated PDF

🎨 Color Scheme & Branding

Primary Colors:

  • Indigo: #4F46E5 (indigo-600)
  • Purple: #9333EA (purple-600)
  • Pink: #EC4899 (pink-600)

Gradients:

  • Primary: from-indigo-600 to-purple-600
  • Extended: from-indigo-600 via-purple-600 to-pink-600

🏗️ Project Structure

FUJI-hackathon/
├── backend/                 # FastAPI backend
│   ├── main.py             # Main API routes and logic
│   ├── requirements.txt    # Python dependencies
│   └── .env                # Environment variables (create this)
│
├── tutor/                  # Next.js frontend
│   ├── src/
│   │   ├── app/           # Next.js app directory
│   │   │   ├── page.js           # Home page
│   │   │   ├── work/page.js      # PDF workspace
│   │   │   ├── tutor/page.js     # AI chat tutor
│   │   │   └── login/page.js     # Login page
│   │   └── components/    # React components
│   │       ├── Navbar.js         # Navigation bar
│   │       └── DrawingCanvas.js  # Canvas component
│   ├── public/            # Static assets
│   └── .env.local         # Environment variables (create this)
│
└── README.md              # This file

🔧 API Endpoints

Backend (FastAPI)

  • POST /api/ocr - Process image with Mathpix OCR
  • POST /api/analyze - Analyze student work with Gemini AI
  • GET /api/voice - Get voice settings
  • POST /api/voice - Update voice settings
  • POST /api/chat - Chat with AI tutor

🎯 Workflow

  1. Student uploads PDF β†’ Renders in workspace
  2. Student draws/annotates β†’ Auto-capture after 5 seconds
  3. Screenshot sent to Mathpix β†’ OCR extracts LaTeX/text
  4. Data sent to Gemini AI β†’ Analyzes work and context
  5. AI generates feedback β†’ Text response created
  6. ElevenLabs TTS β†’ Converts to natural speech
  7. Student receives β†’ Audio playback + visual feedback panel
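
Steps 3 to 6 form a simple chain in which each stage consumes the previous stage's output. The sketch below shows that shape with stand-in functions instead of the real Mathpix, Gemini, and ElevenLabs calls; every function name here is hypothetical and none of it is taken from backend/main.py.

```python
from typing import Callable


def run_feedback_pipeline(
    screenshot: bytes,
    ocr: Callable[[bytes], str],
    analyze: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> dict:
    """Chain the stages: image -> text -> feedback -> audio."""
    extracted = ocr(screenshot)    # step 3: Mathpix in the real app
    feedback = analyze(extracted)  # steps 4-5: Gemini in the real app
    audio = synthesize(feedback)   # step 6: ElevenLabs in the real app
    return {"ocr_text": extracted, "feedback": feedback, "audio": audio}


# Stand-in stages so the sketch runs without API keys or network access.
def fake_ocr(image: bytes) -> str:
    return "x^2 - 9 = (x - 3)(x - 3)"


def fake_analyze(text: str) -> str:
    return f"Check the signs in: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


result = run_feedback_pipeline(b"png-bytes", fake_ocr, fake_analyze, fake_synthesize)
```

Passing the stages in as functions makes it possible to exercise the flow without spending API credits.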

🐛 Troubleshooting

Backend Issues

Port already in use (start the backend on another port and update NEXT_PUBLIC_API_URL in tutor/.env.local to match):

uvicorn main:app --reload --port 8001
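
To find out which port is available before restarting, a small standard-library check can help. This is only a sketch with hypothetical helper names; it treats a port as busy when something accepts a TCP connection on it.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds, i.e. the port is taken.
        return sock.connect_ex((host, port)) != 0


def first_free_port(start=8000, attempts=10):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if is_port_free(port):
            return port
    return None
```

Pass the value returned by `first_free_port()` to uvicorn's `--port` option.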

API keys not working:

  • Double-check .env file is in the backend/ directory
  • Ensure no extra spaces in API keys
  • Verify API keys are active and have credits

Frontend Issues

PDF not rendering:

  • Check that pdf.worker.min.js is in the public/ folder
  • Clear browser cache and reload

Voice not working:

  • Use Chrome or Safari (required for Web Speech API)
  • Grant microphone permissions
  • Check if backend is running

📝 License

This project was created for the FUJI Hackathon.

👥 Contributors

  • Usmaan Sayed
  • Iqbal Ghanci
  • Furqan Ahcom
  • Jonathan Robin

🙏 Acknowledgments

  • Google Gemini AI for intelligent tutoring
  • Mathpix for OCR capabilities
  • ElevenLabs for natural voice synthesis
  • Next.js and FastAPI communities

Made with ❤️ for students everywhere
