EdelWise — AI-Powered Tech Guidance for Elderly Users

Google Gemini Challenge 2026 submission

What is EdelWise?

EdelWise is a mobile AI assistant that helps elderly users learn to use smartphone apps through real-time voice guidance. Powered by the Gemini Live API, it watches the user's screen and speaks step-by-step instructions in a natural, adaptive conversation.

The agent guides rather than operates: instead of automating tasks for the user, EdelWise teaches them to complete each task themselves, building confidence and digital literacy.

Example

"I want to send a photo to my friend on Instagram."

EdelWise sees the user's screen, identifies the current state, and speaks: "Great! I can see you've opened Instagram. Now tap the paper airplane icon at the top right corner to open your messages."

Features

  • Real-time voice guidance — Gemini Live API for natural, conversational coaching
  • Screen understanding — Captures and analyzes screenshots to know exactly where the user is
  • Step-by-step task tracking — Progress visualization with advance/retry/abandon logic
  • Adaptive pacing — Adjusts instruction detail based on user ability and responses
  • Elderly-accessible UI — Large text, high contrast, haptic feedback, minimal cognitive load
  • Resilient connectivity — Automatic reconnection, session recovery, and TTS fallback
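Automatic reconnection is commonly built on capped exponential backoff with jitter. The TypeScript sketch below shows that general technique only; it is not taken from the EdelWise code, and the constants and the names `reconnectDelayMs` and `planReconnect` are assumptions made for illustration.

```typescript
// All constants are illustrative assumptions, not values from the EdelWise codebase.
const BASE_DELAY_MS = 500;
const MAX_DELAY_MS = 15000;
const MAX_ATTEMPTS = 6;

// Delay before reconnect attempt `attempt` (0-based): exponential growth, capped,
// then scaled by a random factor ("full jitter") so clients do not retry in lockstep.
function reconnectDelayMs(
  attempt: number,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// What a client could do after a dropped connection (hypothetical shape).
type ReconnectPlan =
  | { action: "retry"; delayMs: number }
  | { action: "give_up" };

function planReconnect(
  attempt: number,
  random: () => number = Math.random,
): ReconnectPlan {
  if (attempt >= MAX_ATTEMPTS) return { action: "give_up" };
  return { action: "retry", delayMs: reconnectDelayMs(attempt, random) };
}
```

Passing `random` as a parameter keeps the delay calculation deterministic in tests while defaulting to `Math.random` in normal use.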

Tech Stack

Layer      Technology
---------  ------------------------------------------------------------------------
Frontend   React Native 0.81 + Expo SDK 54, React 19, Expo Router 6, TypeScript
Backend    Node.js + WebSocket + Gemini Live API + Google Cloud TTS
Database   Supabase (PostgreSQL) + Edge Functions
Native     Android MediaProjection for screen capture

Prerequisites

  • Node.js 20+
  • npm
  • Expo CLI (npx expo)
  • Supabase CLI (npx supabase)
  • Google Cloud project with:
    • Gemini API key
    • Cloud Text-to-Speech API enabled

Getting Started

1. Clone & Install

git clone https://github.com/wenn00/EdelWise-GoogleGeminiChallenge.git
cd EdelWise-GoogleGeminiChallenge
npm install

2. Environment Variables

Create .env in the project root (frontend):

SUPABASE_URL=<your-supabase-url>
SUPABASE_ANON_KEY=<your-supabase-anon-key>
GEMINI_API_KEY=<your-gemini-api-key>
API_BASE_URL=http://localhost:8080
WS_URL=ws://localhost:8080/v1/ws

Create backend/.env:

GEMINI_API_KEY=<your-gemini-api-key>
SUPABASE_URL=<your-supabase-url>
SUPABASE_SERVICE_ROLE_KEY=<your-supabase-service-role-key>
SESSION_JWT_SECRET=<random-secret>
PORT=8080
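A server that depends on these variables can check them at startup and fail fast with a clear message. The following is a minimal sketch of that idea, not the project's actual startup code: the variable names come from the backend/.env listing above, while `missingBackendVars` and `resolvePort` are hypothetical helpers.

```typescript
// Names taken from the backend/.env listing above.
const REQUIRED_BACKEND_VARS = [
  "GEMINI_API_KEY",
  "SUPABASE_URL",
  "SUPABASE_SERVICE_ROLE_KEY",
  "SESSION_JWT_SECRET",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: names of required variables that are unset or blank.
function missingBackendVars(env: Env): string[] {
  return REQUIRED_BACKEND_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Hypothetical helper: parses PORT, defaulting to 8080 as in the listing above.
function resolvePort(env: Env): number {
  const raw = env["PORT"] ?? "8080";
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${raw}`);
  }
  return port;
}
```

In Node.js, both helpers would typically be called with `process.env` before the WebSocket server starts listening.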

3. Supabase Setup

npx supabase start           # Start local Supabase
npx supabase db push          # Apply migrations
npx supabase functions serve  # Start edge functions

4. Start Backend

cd backend
npm install
npm run dev

5. Start Frontend

npx expo start
# Press 'a' for Android, 'i' for iOS, 'w' for Web

Project Structure

├── app/                    # Expo Router screens
│   ├── (tabs)/             # Bottom tab navigator (Home + History)
│   ├── guidance/           # Guidance session flow
│   └── task-select.tsx     # App and task selection
├── backend/                # Node.js WebSocket server
│   ├── server.ts           # Entry point
│   ├── services/           # Gemini Live, TTS, step engine, prompt builder
│   └── prompts/            # System prompt for agent behavior
├── components/             # React Native components
│   └── ui/                 # Elderly-accessible UI primitives
├── services/               # Frontend service layer
│   ├── websocket-manager.ts
│   ├── audio-capture.ts    # PCM 16kHz microphone streaming
│   ├── audio-player.ts     # TTS playback with jitter buffer
│   ├── session-orchestrator.ts
│   └── ...
├── supabase/               # Database migrations & edge functions
├── modules/                # Native modules (screen capture)
├── context/                # React context (session state)
├── hooks/                  # Custom hooks
├── types/                  # TypeScript type definitions
└── constants/              # Theme, colors, config
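The tree notes that audio-capture.ts streams 16 kHz PCM. As a rough illustration of what preparing such audio can involve, the sketch below converts floating-point samples to signed 16-bit PCM and sizes a chunk. It assumes samples arrive as floats in [-1, 1] and that the target is 16-bit mono; this README states neither detail, and the function names are hypothetical.

```typescript
const SAMPLE_RATE_HZ = 16000; // from the audio-capture.ts comment in the tree above
const BYTES_PER_SAMPLE = 2;   // assumption: signed 16-bit mono

// Converts float samples in [-1, 1] to signed 16-bit PCM, clamping out-of-range input.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  samples.forEach((sample, i) => {
    const clamped = Math.max(-1, Math.min(1, sample));
    // Two's complement is asymmetric: -32768 on the low end, 32767 on the high end.
    pcm[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  });
  return pcm;
}

// Number of bytes in one chunk of the given duration.
function chunkBytes(durationMs: number): number {
  return Math.round((SAMPLE_RATE_HZ * durationMs) / 1000) * BYTES_PER_SAMPLE;
}
```

Under these assumptions, a 100 ms chunk holds 1600 samples, or 3200 bytes.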

Architecture

┌──────────────┐     WebSocket      ┌──────────────────┐
│   Mobile App │ ◄────────────────► │  Backend Server   │
│  (Expo/RN)   │  audio + screens   │  (Node.js + WS)   │
└──────┬───────┘                    └────────┬─────────┘
       │                                     │
       │ UI / Audio                          │ Gemini Live API
       │ Haptics                             │ Google Cloud TTS
       ▼                                     ▼
┌──────────────┐                    ┌──────────────────┐
│   User's     │                    │  Gemini + TTS    │
│   Phone      │                    │  (Google Cloud)  │
└──────────────┘                    └────────┬─────────┘
                                             │
                                    ┌────────▼─────────┐
                                    │    Supabase       │
                                    │  (PostgreSQL +    │
                                    │   Edge Functions) │
                                    └──────────────────┘
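The diagram shows audio and screen captures sharing one WebSocket connection. One common way to multiplex several kinds of data over a single socket is a tagged JSON envelope. The sketch below is purely illustrative: the message names, the `seq` and `dataBase64` fields, and `parseClientMessage` are assumptions, not the protocol EdelWise actually uses.

```typescript
// Hypothetical wire format: message names and fields are assumptions for illustration.
type ClientMessage =
  | { type: "audio_chunk"; seq: number; dataBase64: string }
  | { type: "screenshot"; seq: number; dataBase64: string }
  | { type: "end_of_turn" };

// Parses and validates one text frame; returns null for anything malformed.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;

  const { type, seq, dataBase64 } = data as Record<string, unknown>;
  if (type === "end_of_turn") return { type: "end_of_turn" };

  // Both payload-carrying messages need a numeric sequence and a string body.
  if (typeof seq !== "number" || typeof dataBase64 !== "string") return null;
  if (type === "audio_chunk") return { type: "audio_chunk", seq, dataBase64 };
  if (type === "screenshot") return { type: "screenshot", seq, dataBase64 };
  return null; // unknown message type
}
```

Validating at the boundary lets the rest of a server switch on `type` and rely on the compiler to check that each variant is handled.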

Flow:

  1. User speaks a goal → mic audio and screen captures stream to the backend via WebSocket
  2. Backend forwards the audio and screenshots to the Gemini Live API for understanding
  3. Gemini analyzes user intent + screenshot context → generates guidance
  4. Backend evaluates step progress, converts response to speech via Cloud TTS
  5. Audio instruction streams back to the mobile app
  6. App displays visual overlay + plays voice guidance
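Step 4 mentions evaluating step progress, and the feature list describes advance/retry/abandon logic. A minimal sketch of such a tracker as a pure state transition might look like the following. The types, the retry limit, and the function names are hypothetical; the real step engine in backend/services/ may work differently.

```typescript
// Hypothetical types: the real step engine may model this differently.
type StepOutcome = "success" | "failure";
type SessionStatus = "in_progress" | "completed" | "abandoned";

interface TrackerState {
  stepIndex: number; // index of the step currently being coached
  retries: number;   // failed attempts on the current step
  status: SessionStatus;
}

const MAX_RETRIES = 3; // assumed limit before the task is abandoned

function startTask(): TrackerState {
  return { stepIndex: 0, retries: 0, status: "in_progress" };
}

// Pure transition: returns the next state without mutating the input.
function applyOutcome(
  state: TrackerState,
  totalSteps: number,
  outcome: StepOutcome,
): TrackerState {
  if (state.status !== "in_progress") return state; // finished sessions stay frozen

  if (outcome === "success") {
    const isLastStep = state.stepIndex + 1 >= totalSteps;
    if (isLastStep) {
      return { stepIndex: state.stepIndex, retries: 0, status: "completed" };
    }
    // Advance and reset the retry counter for the new step.
    return { stepIndex: state.stepIndex + 1, retries: 0, status: "in_progress" };
  }

  // Failure: retry the same step until the limit is reached, then abandon.
  const retries = state.retries + 1;
  const status: SessionStatus = retries >= MAX_RETRIES ? "abandoned" : "in_progress";
  return { stepIndex: state.stepIndex, retries, status };
}
```

Keeping the transition pure makes it easy to drive a progress display from the returned state and to unit-test each branch.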

Development Commands

npx expo start              # Start dev server
npx expo start --android    # Android
npx expo start --ios        # iOS
npx expo start --web        # Web
npx expo lint               # ESLint
npm run typecheck           # TypeScript type checking

Team

  • hjcloog — Frontend architecture, UI components, E2E integration
  • wenn00 — Backend services, Gemini integration, iOS fixes

License

Built for the Google Gemini Challenge 2026.
