NOVA - Navigation and Obstacle Voice Assistant

NOVA is an Android application designed to help visually impaired users navigate their surroundings safely. It uses real-time object detection and depth estimation to identify obstacles and announce them through audio feedback.

How It Works

Android Camera → JPEG Frame → Python Server (YOLO + MiDaS) → Detected Objects + Distances → Text-to-Speech → Audio Output

1. The app captures frames from the phone's camera and sends them to a Python server as JPEG images.
2. The server runs YOLOv26 object detection to identify objects in each frame.
3. MiDaS depth estimation is used to estimate the distance to each detected object.
4. Results are sent back to the app and converted to speech (e.g. "person at 0.2 meters detected") using the ElevenLabs TTS API.
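
The distance step above can be sketched as follows. This is a minimal illustration, not the code in distanceCalculator.py: it assumes the depth model returns a relative inverse-depth map (larger values mean closer, which is how MiDaS output is usually described) and that a calibration factor, here the hypothetical `scale` parameter, maps it to meters. The function name and the box layout `(x1, y1, x2, y2)` are also assumptions.

```python
from statistics import median


def estimate_distance(inverse_depth, box, scale=1.0):
    """Return an approximate distance for one bounding box, or None.

    inverse_depth: 2D list of relative inverse-depth values (rows of floats).
    box: (x1, y1, x2, y2) pixel bounds, x2 and y2 exclusive (assumed layout).
    scale: hypothetical calibration factor converting 1/depth to meters.
    """
    x1, y1, x2, y2 = box
    # Collect the positive inverse-depth samples inside the box.
    samples = [
        inverse_depth[y][x]
        for y in range(y1, y2)
        for x in range(x1, x2)
        if inverse_depth[y][x] > 0
    ]
    if not samples:
        return None
    # The median is less sensitive than the mean to background pixels
    # that fall inside the box.
    return round(scale / median(samples), 1)
```

Using the median rather than a single center pixel is a design choice for this sketch; the real project may sample the depth map differently.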

Tech Stack

| Component | Technologies |
| --- | --- |
| Android App | Kotlin, Jetpack Compose, CameraX, Retrofit, OkHttp, Coroutines |
| Backend Server | Python, FastAPI, Uvicorn |
| AI Models | YOLOv26 (ultralytics), Intel MiDaS (depth estimation) |
| Audio | ElevenLabs TTS API, Android MediaPlayer |

Project Structure

NOVA/
├── app/                        # Android application
│   └── app/src/main/java/com/example/impairedapp/
│       └── MainActivity.kt     # Main app logic (camera, networking, TTS)
└── server/                     # Python backend
    ├── server.py               # FastAPI server with YOLO detection
    ├── distanceCalculator.py   # MiDaS depth-to-distance conversion
    ├── yolo.py                 # Standalone YOLO test script
    └── *.pt                    # YOLO model weights (nano, medium, large)

Setup

Prerequisites

Android Studio
Python 3.8+
Android device with camera (API 24+)

Server

cd server
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

pip install fastapi uvicorn ultralytics torch torchvision opencv-python numpy matplotlib

# Create config.py with the address the server binds to (0.0.0.0 listens on all network interfaces)
echo 'SERVER_IP = "0.0.0.0"' > config.py

python server.py
# Server listens on 0.0.0.0:8000; from the phone, reach it at port 8000 on the server machine's IP address

Android App

1. Open the app/ directory in Android Studio.
2. Update the server URL in MainActivity.kt to point to your server's IP address.
3. Add your ElevenLabs API key.
4. Build and run on a physical device:
```
./gradlew installDebug
```

Note: A physical Android device is recommended since the app relies on the camera for real-time object detection.

Permissions

The app requires:

Camera - for real-time environment scanning
Internet - for server communication and TTS API calls

API Endpoint

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | `/send` | Accepts an image upload, returns detected objects with distances |
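
For testing the endpoint without the Android app, a request can be built with the Python standard library alone. The sketch below is only an assumption about the upload format: it guesses a multipart/form-data body with a form field named `file`, which may not match the parameter name used in server.py. The function name is hypothetical.

```python
import urllib.request
import uuid


def build_upload_request(url, jpeg_bytes, field_name="file"):
    """Build (but do not send) a multipart/form-data POST carrying one JPEG.

    field_name is an assumed form field name; adjust it to match the server.
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="frame.jpg"\r\n'
        "Content-Type: image/jpeg\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return urllib.request.Request(
        url,
        data=head + jpeg_bytes + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` sends it; the URL would be the server machine's address on port 8000 followed by `/send`.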

Response format (each entry is a [label, distance in meters] pair):

{
  "objects": [["person", 0.2], ["car", 3.6]]
}
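
A client can turn this response into the spoken phrases described under How It Works. The sketch below is illustrative only: the function name, the nearest-first ordering, and the `max_items` limit are assumptions, not behavior confirmed by the app.

```python
import json


def build_announcements(response_text, max_items=3):
    """Convert the /send JSON response into phrases, nearest object first."""
    payload = json.loads(response_text)
    # Each entry is a [label, distance] pair; sort by distance so the
    # closest obstacle is announced first (an assumed priority rule).
    objects = sorted(payload.get("objects", []), key=lambda item: item[1])
    return [
        f"{label} at {distance} meters detected"
        for label, distance in objects[:max_items]
    ]
```

Limiting the number of phrases keeps the audio feedback short when many objects are in view.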
