NaviSense: Point-Guided Navigation for Independent Shopping

🌍 Inspiration
NaviSense was inspired by a simple but powerful question:
Why is physical navigation still inaccessible in an AI-driven world?
While digital spaces have screen readers and accessibility layers, physical retail environments remain largely unstructured and visually dependent. Grocery shopping — something most people take for granted — requires constant spatial awareness, product identification, and shelf-level confirmation.
We were particularly inspired by:
Real-world accessibility gaps in retail
Google Lookout’s perception architecture
The idea of combining goal-oriented navigation with real-time hand guidance
Rather than just describing a scene, we wanted to build a system that could guide someone’s hand to a specific item and confirm collection.
That became the core idea behind NaviSense.
💡 The Core Idea
NaviSense turns a smartphone into a spatial accessibility layer.
Users can:
Ask for an item using voice
Navigate aisles with real-time guidance
Point their hand toward a shelf
Receive directional cues until they reach the target
Confirm pickup and track quantities
The system improves with each visit by learning store layouts and user preferences.
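To make that learning concrete, the per-store memory involved might look roughly like the sketch below; the field names are a hypothetical shape for illustration, not our exact schema.

```typescript
// Hypothetical shape of the per-store memory NaviSense accumulates across visits.
// Field names are illustrative, not the actual schema.
interface StoreMemory {
  storeId: string;
  lastVisited: string;                   // ISO timestamp of the most recent visit
  aisleForItem: Record<string, string>;  // e.g. { "oat milk": "aisle 7, left shelf" }
  preferredItems: string[];              // items the user asks for most often
}
```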
🏗 How We Built It
We designed NaviSense as a two-tier architecture:
1️⃣ Frontend (React Native + Expo)
Voice input via Modulate Velma-2 STT
Emotion detection to adapt guidance
Real-time TTS feedback
Haptic directional cues
Persistent store and user memory (AsyncStorage)
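As a rough illustration of how these frontend pieces fit together, here is a minimal sketch of a directional guidance cue built on Expo's Haptics and Speech modules plus AsyncStorage; the function names, stored keys, and `Direction` type are illustrative rather than taken from our actual codebase.

```typescript
// Minimal sketch: speak a directional cue, fire a matching haptic pulse,
// and remember how often a shelf location was visited (AsyncStorage).
// Keys, function names, and the Direction type are illustrative.
import * as Haptics from 'expo-haptics';
import * as Speech from 'expo-speech';
import AsyncStorage from '@react-native-async-storage/async-storage';

type Direction = 'left' | 'right' | 'forward' | 'stop';

export async function giveDirectionalCue(direction: Direction): Promise<void> {
  // Spoken feedback mirrors the haptic pattern so users can rely on either channel.
  Speech.speak(`Move ${direction}`);

  if (direction === 'stop') {
    await Haptics.notificationAsync(Haptics.NotificationFeedbackType.Success);
  } else {
    await Haptics.impactAsync(Haptics.ImpactFeedbackStyle.Light);
  }
}

export async function recordVisit(storeId: string, aisle: string): Promise<void> {
  // Persist a simple visit count per store/aisle so guidance can improve over time.
  const key = `navisense:visits:${storeId}:${aisle}`;
  const previous = await AsyncStorage.getItem(key);
  const count = previous ? Number(previous) + 1 : 1;
  await AsyncStorage.setItem(key, String(count));
}
```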
2️⃣ Backend (FastAPI + GPU Vision Pipeline)
The backend handles perception and reasoning:
🧠 Object Segmentation
We use SAM 2.1 to segment all objects in a scene.
Each mask $m_i$ represents a candidate object region.
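To show how the two tiers connect, here is a sketch of the frontend requesting candidate masks from the backend. The `/segment` endpoint path, request format, and response fields are assumptions made for illustration, not the actual API.

```typescript
// Sketch of the frontend asking the FastAPI backend for candidate object masks.
// The endpoint path, request body, and response fields are assumptions.
type CandidateMask = {
  id: number;
  label?: string;          // optional label if the backend has already classified the region
  bbox: [number, number, number, number]; // x, y, width, height in image pixels
  score: number;           // segmentation confidence for this candidate mask m_i
};

export async function segmentFrame(
  backendUrl: string,
  frameBase64: string,
): Promise<CandidateMask[]> {
  const response = await fetch(`${backendUrl}/segment`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ image: frameBase64 }),
  });
  if (!response.ok) {
    throw new Error(`Segmentation request failed: ${response.status}`);
  }
  const { masks } = (await response.json()) as { masks: CandidateMask[] };
  return masks;
}
```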