Jia - AI Visual Assistant

A voice-first AI companion for visually impaired people. Jia uses computer vision and natural language to describe the environment, answer questions, and provide accessibility assistance.

Features

- 🎤 Voice-First Interface - Speak naturally to interact with the AI
- 👁️ Computer Vision - Describes what the camera sees in real-time
- 🔊 Natural Text-to-Speech - Uses the best available natural-sounding voices
- ⚡ Fast Response - Instant speech recognition and AI responses
- 🛡️ Safety First - Proactively warns about hazards and obstacles

Quick Start with Ngrok

Ngrok allows you to expose your local server to the internet, enabling voice interactions from any device.

Prerequisites

- Node.js 18+
- An OpenAI API key
- An ngrok account (free tier works)

Setup Steps

Clone the repository

```
git clone https://github.com/so-nerdyy/Jia.git
cd Jia
```

Install dependencies
```
npm install
```

Set up your OpenAI API key

```
# Create .env file from the template
cp .env.example .env
# Edit .env and set your OpenAI API key (this line goes in the file, it is not a shell command)
OPENAI_API_KEY=your_api_key_here
```

Start the development server
```
npm run dev
```
Expose with Ngrok (in a new terminal)
```
ngrok http 5173
```
Access Jia
- Copy the ngrok URL (e.g., https://abc123.ngrok.io)
- Open it in your browser
- Grant camera and microphone permissions

Using Ngrok TCP for dev server (alternative)

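If you prefer a raw TCP tunnel to the dev server port instead of an HTTP tunnel:

```
# Terminal 1: Start the app on port 5173
npm run dev

# Terminal 2: Start ngrok
ngrok tcp 5173
```

ngrok prints an address of the form `tcp://HOST:PORT`. Open `http://HOST:PORT` in your browser. Because a TCP tunnel serves the page over plain HTTP, browsers will generally block camera and microphone access, which require a secure context (HTTPS or localhost). Use `ngrok http 5173` from the Setup Steps when you need those features.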

How It Works

Architecture

- Frontend: React + Vite (PWA capable)
- AI Chat: OpenAI GPT-4o (configurable to GPT-5 when available)
- Voice: Web Speech API (SpeechRecognition + SpeechSynthesis)
- Vision: Camera API + OpenAI Vision
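
To illustrate how the Vision and AI Chat pieces could fit together, here is a minimal sketch that packages one camera frame and a spoken question into a Chat Completions request body. The function name `buildVisionRequest` and the system prompt are hypothetical and not taken from the Jia source. Only the message shape (a `text` part plus an `image_url` part carrying a base64 data URL) follows OpenAI's documented format.

```javascript
// Hypothetical helper: not part of the Jia codebase.
// Builds a Chat Completions request body that pairs a user question
// with one JPEG frame encoded as a base64 data URL.
function buildVisionRequest(question, jpegBase64, model = 'gpt-4o') {
  return {
    model,
    messages: [
      {
        role: 'system',
        // Assumed prompt wording, for illustration only.
        content: 'You describe surroundings for a visually impaired user. Mention hazards first.',
      },
      {
        role: 'user',
        content: [
          { type: 'text', text: question },
          {
            type: 'image_url',
            image_url: { url: `data:image/jpeg;base64,${jpegBase64}` },
          },
        ],
      },
    ],
  };
}

// 'AAAA' stands in for real base64 image data.
const body = buildVisionRequest('What is in front of me?', 'AAAA');
console.log(JSON.stringify(body, null, 2));
```

In the browser, a frame could be drawn from the camera `<video>` element onto a `<canvas>` and exported with `canvas.toDataURL('image/jpeg')`. That call already returns a full data URL, so its `data:image/jpeg;base64,` prefix would need to be stripped before passing the result to this helper.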

Key Files

| File | Purpose |
| --- | --- |
| `vite.config.js` | API proxy and model configuration |
| `src/hooks/useConversation.js` | Main conversation logic |
| `src/hooks/useSpeech.js` | Voice input/output handling |
| `src/components/Camera.jsx` | Camera feed component |
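
As a rough idea of what an API proxy in `vite.config.js` can look like, the sketch below builds a value for Vite's `server.proxy` option. The helper name `openAiProxy` and the `/api/openai` path prefix are assumptions made for this example, and the real configuration in Jia may be organised differently.

```javascript
// Hypothetical sketch: not copied from Jia's vite.config.js.
// Returns a value suitable for Vite's `server.proxy` option. Requests to
// /api/openai/* are forwarded to https://api.openai.com/v1/* with the API
// key attached by the dev server, so the key is not shipped in browser code.
function openAiProxy(apiKey) {
  return {
    '/api/openai': {
      target: 'https://api.openai.com',
      changeOrigin: true,
      // Map the local prefix onto the upstream /v1 path.
      rewrite: (path) => path.replace(/^\/api\/openai/, '/v1'),
      headers: { Authorization: `Bearer ${apiKey}` },
    },
  };
}

// 'sk-example' is a placeholder, not a real key.
const proxy = openAiProxy('sk-example');
console.log(proxy['/api/openai'].rewrite('/api/openai/chat/completions'));
// prints "/v1/chat/completions"
```

In a Vite config this could be wired up as `server: { proxy: openAiProxy(env.OPENAI_API_KEY) }`, with `env` read from `.env` using Vite's `loadEnv`.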

Configuration

Changing the AI Model

In `vite.config.js`, modify `DEFAULT_MODEL`:

```
const DEFAULT_MODEL = 'gpt-5'; // or 'gpt-4o', 'gpt-4o-mini'
```

Note: GPT-5 requires API access. Check your OpenAI dashboard for availability.
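
Since model availability depends on the account, one option is to fall back to the next model when the preferred one is missing. The sketch below is a hypothetical helper, not something Jia is known to implement. It takes a list of model IDs the account can use and returns the first match from a preference order.

```javascript
// Hypothetical helper, not part of Jia: choose the first preferred
// model that appears in the list of IDs available to the account.
function chooseModel(availableIds, preferred = ['gpt-5', 'gpt-4o', 'gpt-4o-mini']) {
  const available = new Set(availableIds);
  // Array.prototype.find returns undefined when nothing matches; normalise to null.
  return preferred.find((id) => available.has(id)) ?? null;
}

console.log(chooseModel(['gpt-4o-mini', 'gpt-4o'])); // prints "gpt-4o"
```

The list of IDs could be taken from the `data[].id` fields returned by OpenAI's `GET /v1/models` endpoint.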

Voice Settings

The app automatically selects the best natural-sounding voice available on your device. Voice selection happens in `src/hooks/useConversation.js`.
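
One way such a selection could work is a simple ranking over the voices the browser reports. The heuristic below is an assumption made for illustration (the name `pickVoice` and the list of name hints are hypothetical), and the actual logic in Jia may differ. It accepts objects shaped like the entries returned by `speechSynthesis.getVoices()`.

```javascript
// Hypothetical ranking heuristic: the actual logic in Jia may differ.
// Takes a list shaped like speechSynthesis.getVoices() output and returns
// the best-scoring voice for the given language prefix, or null if none match.
function pickVoice(voices, lang = 'en') {
  // Name fragments that often mark higher-quality voices (an assumption).
  const hints = ['natural', 'neural', 'enhanced', 'premium'];
  const score = (v) => {
    let s = 0;
    if (hints.some((h) => v.name.toLowerCase().includes(h))) s += 2;
    if (v.default) s += 1;
    return s;
  };
  const candidates = voices.filter((v) =>
    v.lang.toLowerCase().startsWith(lang.toLowerCase())
  );
  if (candidates.length === 0) return null;
  // Keep the earlier voice on ties so the result is stable.
  return candidates.reduce((best, v) => (score(v) > score(best) ? v : best));
}

const sample = [
  { name: 'Basic English', lang: 'en-US', default: true },
  { name: 'Aria Natural', lang: 'en-US', default: false },
  { name: 'Hortense', lang: 'fr-FR', default: false },
];
console.log(pickVoice(sample).name); // prints "Aria Natural"
```

In a browser, `speechSynthesis.getVoices()` can return an empty array until the `voiceschanged` event has fired, so a real implementation would typically wait for that event before ranking.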

Development

```
# Start development server
npm run dev

# Build for production
npm run build

# Preview production build
npm run preview
```

Tech Stack

- React 18
- Vite
- OpenAI API (GPT-4o Vision)
- Web Speech API
- MediaDevices API

License

MIT
