RukAI is an AI agent that uses multimodal inputs and outputs to help users track their skipping-rope workouts. It is a next-generation, real-time fitness agent that watches, analyzes, and coaches your jump rope form using continuous video and audio. Built for the Gemini Live Agents hackathon track, RukAI moves beyond text-in/text-out interactions by acting as a true physical-world companion.
TRY: https://jumprope-coach.web.app/
Traditional fitness apps just count reps. RukAI is built to actually see you. By leveraging device vision and the Gemini Live API, RukAI tracks your biomechanics in real-time. If your form breaks down, the AI instantly interrupts your workout with live, spoken audio corrections to help you fix your technique before you injure yourself.
Here is the data flow for the RukAI application:
```mermaid
graph TD
    subgraph Client [Client-Side: React.js Web App]
        UI[React UI Dashboard & State]
        Cam[Live Webcam Feed]
        MP[MediaPipe Pose Landmarker]
        Audio[Web Audio API]
        WS_C[WebSocket Client]
    end
    subgraph Firebase [Firebase / Google Cloud]
        Auth[Google OAuth 2.0]
        DB[(Firestore NoSQL DB)]
        Hosting[Firebase Hosting]
    end
    subgraph Backend [Backend Server: Node.js]
        WS_S[WebSocket Server]
        ADK[Google ADK]
        GeminiClient[Gemini API Controller]
    end
    subgraph External [Google AI Services]
        Gemini[Gemini Live API]
    end
    Cam -->|30fps Video Frames| MP
    MP -->|Skip Count Updates| UI
    MP -->|Form Correction JSON Flags| WS_C
    UI <-->|Read/Write User Stats| DB
    UI <-->|Authenticate| Auth
    Hosting -->|Serves App To| Client
    WS_C <-->|Bi-directional Connection| WS_S
    WS_S -->|Passes Form Warnings| GeminiClient
    GeminiClient <-->|Prompts & Audio Streams| Gemini
    GeminiClient -->|Raw Audio Byte Arrays| WS_S
    WS_S -->|Pipes Audio| WS_C
    WS_C -->|Plays Coaching Voice| Audio
    WS_S -->|Triggers Post-Workout Summary| ADK
    ADK -->|Saves Summary| DB
```
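The relay at the center of this diagram can be sketched as plain routing logic: form warnings from the client become spoken-correction prompts for Gemini Live, while audio chunks coming back from Gemini are piped to the browser untouched. The function names and message shapes below are illustrative assumptions, not the actual RukAI source.

```javascript
// Sketch of the backend routing implied by the diagram above.
// Message shapes ({ type, flags }) are assumptions for illustration.
function routeClientMessage(raw) {
  const msg = JSON.parse(raw);
  if (msg.type === 'FORM_WARNING') {
    // Turn heuristic flags into a short spoken-correction request.
    return {
      target: 'gemini',
      prompt: `The athlete's form broke down (${msg.flags.join(', ')}). ` +
              'Speak one short, encouraging correction.',
    };
  }
  // Anything else (pings, stats) is not forwarded to the model.
  return { target: 'ignore' };
}

function routeGeminiChunk(chunk) {
  // Raw audio byte arrays from Gemini Live are forwarded to the browser as-is.
  return { target: 'client', payload: chunk };
}
```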
- Real-time Interaction (Vision/Audio): Processes 30fps webcam feed to analyze physical movement.
- Interruptible Audio Coaching: Pushes real-time heuristic form flags via WebSocket to the Node.js backend, triggering Gemini Live to speak form corrections out loud.
- Mandatory Tech: Powered by the Gemini Live API for real-time conversational audio and hosted entirely on Google Cloud / Firebase. Generates post-workout analytics using Google's ADK (Agent Development Kit).
- Live Audio Feedback Loop: Streams generated coaching audio to the browser via WebSockets for real-time interventions.
- Developer Testing Mode (Current Vision Engine): The current computer-vision engine uses lightweight MediaPipe heuristics tuned for a deliberately "permissive testing mode." This lets judges easily trigger form corrections (like elbow flares or head-bobbing) and experience the Gemini Live audio loop while sitting at a desk, without performing a rigorous jump rope routine.
- Dynamic Training Calendar: A dot-matrix activity heatmap that automatically cross-references your workout history against custom-generated Firebase regimens (Beginner, Intermediate, Advanced).
- Smart Streak Tracking: Calculates daily consistency and visualizes it with SVG progress rings.
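As a rough illustration of the heuristic layer described above, a "permissive" elbow-flare check could look like the sketch below. The landmark fields, threshold, and message shape are assumptions for illustration, not the shipped engine.

```javascript
// MediaPipe Pose landmarks arrive as normalized [0, 1] image coordinates.
// Elbow flare: the elbows drift horizontally away from the shoulders.
// The 0.12 threshold is an illustrative "permissive" value.
function checkElbowFlare(landmarks, threshold = 0.12) {
  const { leftShoulder, leftElbow, rightShoulder, rightElbow } = landmarks;
  const flags = [];
  if (Math.abs(leftElbow.x - leftShoulder.x) > threshold) flags.push('LEFT_ELBOW_FLARE');
  if (Math.abs(rightElbow.x - rightShoulder.x) > threshold) flags.push('RIGHT_ELBOW_FLARE');
  return flags;
}

// When a flag fires, the client would push it over the open WebSocket:
function sendFormFlags(socket, flags) {
  if (flags.length === 0) return;
  socket.send(JSON.stringify({ type: 'FORM_WARNING', flags, ts: Date.now() }));
}
```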
- Frontend (React.js): A split-screen dashboard featuring a cinematic live-camera feed and a sleek metrics panel using Google Sans typography and custom SVG icons.
- Computer Vision Engine (MediaPipe): Runs entirely client-side for minimal analysis latency, piping telemetry data directly to the backend.
- Backend Engine (Node.js & WebSockets): Maintains a persistent socket connection to stream raw byte-array audio from the Gemini Live API to the browser's AudioContext.
- Database (Firestore): A NoSQL architecture managing `users`, `workouts`, and seeded `regimens`.
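The audio leg of that pipeline can be sketched as follows: the Gemini Live API emits 16-bit PCM (24 kHz output, per the Live API docs), which the browser must convert to Float32 samples before queuing on an `AudioContext`. Function names here are illustrative, not the actual RukAI source.

```javascript
// Convert little-endian Int16 PCM bytes to the Float32 range [-1, 1]
// that Web Audio buffers expect.
function pcm16ToFloat32(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(bytes.byteLength / 2);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// In the browser, each WebSocket audio message would then be queued:
function playChunk(audioCtx, floatSamples, sampleRate = 24000) {
  const buffer = audioCtx.createBuffer(1, floatSamples.length, sampleRate);
  buffer.copyToChannel(floatSamples, 0);
  const source = audioCtx.createBufferSource();
  source.buffer = buffer;
  source.connect(audioCtx.destination);
  source.start();
}
```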
- Frontend: React.js, HTML5 Canvas, Web Audio API
- Backend: Node.js, WebSockets (`ws`)
- AI & Vision: Gemini Live API, ADK, Google MediaPipe Tasks Vision
- Cloud & DB: Firebase Hosting, Firestore Database, Google Cloud Platform (OAuth 2.0)
- Node.js (v16+)
- A Firebase Project with Firestore enabled
- A Google Cloud Project with OAuth 2.0 Client IDs configured
- Clone the repository:
  ```bash
  git clone https://github.com/mikitoxo/RukAI.git
  cd RukAI
  ```
- Install Frontend Dependencies:
  ```bash
  cd frontend
  npm install
  ```
- Install Backend Dependencies:
  ```bash
  cd ../backend
  npm install
  ```
- Seed the Database: Ensure your Firebase `serviceAccountKey.json` is in the `backend` folder, then run:
  ```bash
  node seedRegimens.js
  ```
- Run the App:
  - a.) Start the backend socket server:
    ```bash
    node server.js
    ```
  - b.) Start the React frontend:
    ```bash
    npm start
    ```
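For reference, the seeding step above could look like the sketch below. The regimen field names and daily targets are assumptions for illustration; the real schema lives in the repo's `seedRegimens.js`.

```javascript
// Hedged sketch of a Firestore seeding script in the spirit of seedRegimens.js.
function buildRegimens() {
  // One document per difficulty tier, keyed by a lowercase id.
  return ['Beginner', 'Intermediate', 'Advanced'].map((level, i) => ({
    id: level.toLowerCase(),
    level,
    dailyTargetSkips: 200 * (i + 1), // illustrative daily skip targets
    restDays: [0],                   // Sundays off (illustrative)
  }));
}

async function seed() {
  // firebase-admin authenticates with the service account key in backend/.
  const admin = require('firebase-admin');
  admin.initializeApp({
    credential: admin.credential.cert(require('./serviceAccountKey.json')),
  });
  const db = admin.firestore();
  for (const regimen of buildRegimens()) {
    await db.collection('regimens').doc(regimen.id).set(regimen);
  }
  console.log('Seeded regimens.');
}

// seed(); // invoke once serviceAccountKey.json is in place
```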
- Dedicated Computer Vision AI: Replacing the current heuristic testing engine with a custom-trained, lightweight ML model specifically optimized for jump rope biomechanics and complex maneuvers (like crossovers and double-unders).
- Native Mobile Port: Wrapping the React application in React Native for offline-first iOS/Android support.