A voice-first AI companion for visually impaired people. Jia uses computer vision and natural language to describe the environment, answer questions, and provide accessibility assistance.
- 🎤 Voice-First Interface - Speak naturally to interact with the AI
- 👁️ Computer Vision - Describes what the camera sees in real-time
- 🔊 Natural Text-to-Speech - Uses the best available natural-sounding voices
- ⚡ Fast Response - Low-latency speech recognition and AI responses
- 🛡️ Safety First - Proactively warns about hazards and obstacles
Ngrok allows you to expose your local server to the internet, enabling voice interactions from any device.
- Node.js 18+
- An OpenAI API key
- Ngrok account (free tier works)

1. Clone the repository

       git clone https://github.com/so-nerdyy/Jia.git
       cd Jia

2. Install dependencies

       npm install

3. Set up your OpenAI API key

       # Create .env file
       cp .env.example .env
       # Edit .env and add your OpenAI API key
       OPENAI_API_KEY=your_api_key_here

4. Start the development server

       npm run dev

5. Expose with Ngrok (in a new terminal)

       ngrok http 5173

6. Access Jia
   - Copy the ngrok URL (e.g., https://abc123.ngrok.io)
   - Open it in your browser
   - Grant camera and microphone permissions

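
A common setup mistake is leaving the placeholder value in `.env`. The sketch below shows one way to check for that before starting the server. It is not part of the Jia codebase: `parseEnv` and `hasOpenAIKey` are hypothetical helper names, and the parser only handles plain `KEY=value` lines (no quoting or multi-line values).

```javascript
// Hypothetical helpers, not part of the Jia codebase.
// Parses simple KEY=value lines; quotes and multi-line values are not handled.
function parseEnv(text) {
  const env = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const eq = line.indexOf('=');
    if (eq === -1) continue; // ignore lines without an assignment
    env[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return env;
}

// Returns true only when a key is present and is not the template placeholder.
function hasOpenAIKey(env) {
  const key = env.OPENAI_API_KEY;
  return typeof key === 'string' && key !== '' && key !== 'your_api_key_here';
}
```

In Node.js, this could be called as `hasOpenAIKey(parseEnv(fs.readFileSync('.env', 'utf8')))` to print a warning before `npm run dev`.
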
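If you want to use ngrok with the dev server on a specific port:

    # Terminal 1: Start the app on port 5173
    npm run dev
    # Terminal 2: Start ngrok
    ngrok tcp 5173

Then access using the TCP URL provided by ngrok. Note that browsers only grant camera and microphone access on secure origins (HTTPS or localhost), so the HTTPS URL from `ngrok http 5173` is the safer choice when you need those permissions.
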
- Frontend: React + Vite (PWA capable)
- AI Chat: OpenAI GPT-4o (configurable to GPT-5 when available)
- Voice: Web Speech API (SpeechRecognition + SpeechSynthesis)
- Vision: Camera API + OpenAI Vision
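
To illustrate how the Camera API and OpenAI Vision pieces can fit together, here is a minimal sketch of the message payload for one question about a camera frame. The helper name `buildVisionMessages` is hypothetical and the actual code in `useConversation.js` may differ. The sketch assumes the frame has already been captured as a data URL (for example with `canvas.toDataURL('image/jpeg')`) and uses the text plus `image_url` content format of the Chat Completions API.

```javascript
// Hypothetical helper, not taken from the Jia source.
// Builds the `messages` array for a single question about one camera frame.
// `imageDataUrl` is assumed to be a data URL such as "data:image/jpeg;base64,...".
function buildVisionMessages(question, imageDataUrl, systemPrompt) {
  const messages = [];
  if (systemPrompt) {
    // Optional instructions, e.g. asking the model to mention hazards first.
    messages.push({ role: 'system', content: systemPrompt });
  }
  messages.push({
    role: 'user',
    content: [
      { type: 'text', text: question },
      { type: 'image_url', image_url: { url: imageDataUrl } },
    ],
  });
  return messages;
}
```

The returned array would be sent as the `messages` field of a Chat Completions request body, alongside the `model` name.
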
| File | Purpose |
|---|---|
| vite.config.js | API proxy and model configuration |
| src/hooks/useConversation.js | Main conversation logic |
| src/hooks/useSpeech.js | Voice input/output handling |
| src/components/Camera.jsx | Camera feed component |

In vite.config.js, modify the DEFAULT_MODEL:

    const DEFAULT_MODEL = 'gpt-5'; // or 'gpt-4o', 'gpt-4o-mini'

Note: GPT-5 requires API access. Check your OpenAI dashboard for availability.

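
If the model name is ever read from somewhere other than a hard-coded constant, a small guard can fall back to a known model when the requested one is mistyped. This is only a sketch: `SUPPORTED_MODELS` and `resolveModel` are hypothetical names that do not appear in the Jia source, and the list simply repeats the model names mentioned above.

```javascript
// Hypothetical guard, not part of vite.config.js as shipped.
// Model names are the ones listed in this README.
const SUPPORTED_MODELS = ['gpt-4o', 'gpt-4o-mini', 'gpt-5'];

// Returns the requested model if it is in the list, otherwise the fallback.
function resolveModel(requested, fallback = 'gpt-4o') {
  return SUPPORTED_MODELS.includes(requested) ? requested : fallback;
}
```

For example, `resolveModel('gpt-5')` keeps the requested model, while an unknown name such as `resolveModel('gpt4o')` falls back to `'gpt-4o'`.
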
The app automatically selects the best natural-sounding voice available on your device. Voice selection happens in useConversation.js.
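
One way such a selection could work is to score each voice returned by `speechSynthesis.getVoices()` and keep the highest-scoring one. The sketch below is an illustration under stated assumptions, not the logic Jia actually ships: `pickVoice`, `scoreVoice`, `normalizeLang`, the name hints, and the score weights are all hypothetical, and real voice names vary by browser and operating system.

```javascript
// Hypothetical heuristic, not taken from the Jia source.
// Name fragments that often mark higher-quality voices (an assumption).
const NATURAL_HINTS = ['natural', 'neural', 'enhanced', 'premium'];

// Normalize tags such as "en_US" (used by some Android builds) to "en-us".
function normalizeLang(tag) {
  return String(tag || '').replace('_', '-').toLowerCase();
}

// Language match dominates; a "natural" name hint and the default flag break ties.
function scoreVoice(voice, wantedLang) {
  const lang = normalizeLang(voice.lang);
  const wanted = normalizeLang(wantedLang);
  const name = String(voice.name || '').toLowerCase();
  let score = 0;
  if (lang === wanted) score += 100;
  else if (lang.split('-')[0] === wanted.split('-')[0]) score += 50;
  if (NATURAL_HINTS.some((hint) => name.includes(hint))) score += 10;
  if (voice.default) score += 1;
  return score;
}

// Returns the best-scoring voice, or null when the list is empty.
function pickVoice(voices, wantedLang = 'en-US') {
  let best = null;
  let bestScore = -1;
  for (const voice of voices) {
    const score = scoreVoice(voice, wantedLang);
    if (score > bestScore) {
      best = voice;
      bestScore = score;
    }
  }
  return best;
}
```

In the browser, the result can be assigned to `utterance.voice` on a `SpeechSynthesisUtterance`. Because `getVoices()` may return an empty list until the `voiceschanged` event fires, the selection is usually repeated in that event handler.
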

    # Start development server
    npm run dev
    # Build for production
    npm run build
    # Preview production build
    npm run preview

- React 18
- Vite
- OpenAI API (GPT-4o Vision)
- Web Speech API
- MediaDevices API
MIT