NaviSense: Point-Guided Navigation for Independent Shopping

🌍 Inspiration

NaviSense was inspired by a simple but powerful question:

Why is physical navigation still inaccessible in an AI-driven world?

While digital spaces have screen readers and accessibility layers, physical retail environments remain largely unstructured and visually dependent. Grocery shopping — something most people take for granted — requires constant spatial awareness, product identification, and shelf-level confirmation.

We were particularly inspired by:

Real-world accessibility gaps in retail

Google Lookout’s perception architecture

The idea of combining goal-oriented navigation with real-time hand guidance

Rather than just describing a scene, we wanted to build a system that could guide someone’s hand to a specific item and confirm collection.

That became the core idea behind NaviSense.

💡 The Core Idea

NaviSense turns a smartphone into a spatial accessibility layer.

Users can:

Ask for an item using voice

Navigate aisles with real-time guidance

Point their hand toward a shelf

Receive directional cues until they reach the target

Confirm pickup and track quantities

The system improves with each visit by learning store layouts and user preferences.
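The directional-cue step above boils down to a simple loop: compare the detected hand position with the target item's position in the camera frame and speak or buzz a cue until they converge. A minimal sketch of that rule in Python (the function name, coordinate convention, and `reach_px` threshold are our own illustrative assumptions, not NaviSense's actual implementation):

```python
def directional_cue(hand, target, reach_px=40):
    """Map the hand-to-target offset (image pixels) to a directional cue.

    hand, target: (x, y) tuples in image coordinates, origin at top-left.
    reach_px: hypothetical radius inside which the hand counts as on-target.
    """
    dx = target[0] - hand[0]
    dy = target[1] - hand[1]
    # Within reach: tell the user to grab and move to pickup confirmation.
    if dx * dx + dy * dy <= reach_px * reach_px:
        return "grab"
    # Otherwise cue along the axis with the larger error, so the user
    # corrects left/right before up/down (or vice versa).
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```

For example, `directional_cue((100, 100), (300, 120))` returns `"right"`, and once the hand is within `reach_px` of the target the cue switches to `"grab"`, which is where pickup confirmation would take over.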

🏗 How We Built It

We designed NaviSense as a two-tier architecture:

1️⃣ Frontend (React Native + Expo)

Voice input via Modulate Velma-2 STT

Emotion detection to adapt guidance

Real-time TTS feedback

Haptic directional cues

Persistent store and user memory (AsyncStorage)

2️⃣ Backend (FastAPI + GPU Vision Pipeline)

The backend handles perception and reasoning:

🧠 Object Segmentation

We use SAM 2.1 to segment all objects in a scene.

Each mask m_i represents a candidate object region.
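To turn those masks into candidate regions the guidance loop can act on, each boolean H×W mask (the format SAM emits) can be summarized by its centroid, bounding box, and area. A small NumPy sketch of that reduction (the function name and returned fields are our own, shown here only to illustrate the idea):

```python
import numpy as np

def mask_to_region(mask):
    """Summarize one boolean H x W segmentation mask m_i as a candidate region.

    Returns the pixel centroid (x, y), bounding box (x0, y0, x1, y1),
    and area in pixels, or None for an empty mask.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # empty mask: nothing to target
    centroid = (float(xs.mean()), float(ys.mean()))
    bbox = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return {"centroid": centroid, "bbox": bbox, "area": int(xs.size)}
```

The centroid is what a hand-guidance cue would steer toward, while the area lets the pipeline discard tiny spurious masks before reasoning about which region matches the requested item.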
