AI-Powered Voice-Controlled Spot Robot
Abstract:
This project aims to assist visually impaired individuals, including those who are blind, by leveraging the Boston Dynamics Spot robot and AI technology to help them navigate complex environments. Through voice commands and intelligent decision-making, the robot autonomously moves in response to user inputs. The system utilizes Deepgram's speech-to-text API to process voice commands and Groq’s AI API to determine appropriate movement actions. The commands are then executed using the Boston Dynamics Spot SDK. Although network connectivity challenges limited testing, the project lays a solid foundation for further development into a practical assistive technology.
Features:
- Voice Command Input: Users provide voice commands to control the robot’s movements.
- Speech-to-Text Processing: Audio is converted to text using the Deepgram API.
- AI-Powered Decision Making: Groq's AI API processes text commands to determine movement sequences.
- Robot Movement Execution: Commands are sent to Spot using the Boston Dynamics Spot SDK.
- Docker Integration: The application is containerized using Docker and runs on Spot’s onboard computer.
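As a rough sketch of the movement-execution step (the function and action names here are hypothetical, not taken from the project code), the AI's chosen action can be reduced to a lookup that yields a velocity command for the robot:

```python
# Hypothetical sketch: map an AI-selected action word to a (v_x, v_y, v_rot)
# velocity tuple. Action names and speed values are illustrative only.

ACTIONS = {
    "forward":    (0.5, 0.0, 0.0),   # m/s forward
    "backward":   (-0.5, 0.0, 0.0),
    "left":       (0.0, 0.5, 0.0),   # strafe left
    "right":      (0.0, -0.5, 0.0),
    "turn_left":  (0.0, 0.0, 0.5),   # rad/s yaw
    "turn_right": (0.0, 0.0, -0.5),
    "stop":       (0.0, 0.0, 0.0),
}

def action_to_velocity(action: str):
    """Return a velocity tuple for a known action; default to stop."""
    return ACTIONS.get(action.strip().lower(), ACTIONS["stop"])
```

In the real system, a tuple like this would feed the Spot SDK's velocity command builder; defaulting to "stop" on unrecognized input is a safety-minded choice for an assistive device.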
Tech Stack:
- Hardware: Boston Dynamics Spot robot
- APIs:
  - Deepgram: Speech-to-text conversion
  - Groq: AI-based decision making for movement
  - Boston Dynamics Spot SDK: Robot control
- Programming Language: Python
- Containerization: Docker
- Platform: Runs on Spot’s onboard computer
About the Project:
Inspiration:
Our team was inspired by the challenges faced by visually impaired individuals in navigating unfamiliar or complex environments. We aimed to use cutting-edge robotics and AI technology to create a guide dog-like experience that could improve independence and safety for blind and visually impaired users.
What We Learned:
We learned the complexities of working with robotics hardware in real-world conditions, especially when integrating multiple APIs for voice control and AI-based decision-making. Handling real-time input and making decisions autonomously based on that data was both a rewarding and educational experience.
How We Built the Project:
We developed a Python application, packaged as a Docker image, that runs on the Spot robot's onboard computer. The system captures a voice command, converts it to text using Deepgram, and passes the text to Groq's AI to decide on the best movement action. The Boston Dynamics Spot SDK then executes that movement.
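The decision step can be sketched as below, assuming Groq's OpenAI-compatible chat API (the model name and prompt wording are our illustrative assumptions, not the project's actual code). Constraining the model to a fixed vocabulary lets the reply be parsed deterministically:

```python
# Hypothetical sketch of the Groq decision step. The system prompt restricts
# the model to a fixed action vocabulary so its reply can be parsed safely.
# Model name and prompt wording are illustrative, not from the project code.

VALID_ACTIONS = {
    "forward", "backward", "left", "right", "turn_left", "turn_right", "stop",
}

SYSTEM_PROMPT = (
    "You control a quadruped guide robot. Reply with exactly one word from: "
    + ", ".join(sorted(VALID_ACTIONS))
)

def parse_action(reply: str) -> str:
    """Extract a valid action from the model's reply; default to 'stop'."""
    word = reply.strip().lower().rstrip(".")
    return word if word in VALID_ACTIONS else "stop"

def decide_action(client, transcript: str) -> str:
    """Ask a Groq chat model which action matches the spoken command."""
    response = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # illustrative model name
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": transcript},
        ],
    )
    return parse_action(response.choices[0].message.content)
```

Falling back to "stop" whenever the reply is not in the vocabulary keeps an unexpected model output from producing an unintended movement.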
Challenges We Faced:
Our main challenge was network connectivity with the Spot robot. Because of these issues, we could not test our program on the actual robot until late on the second day of the hackathon, which delayed our progress and limited our ability to fine-tune the robot's real-world interactions. Despite this setback, we successfully implemented a functional voice-controlled movement system.
Things to Consider:
- The current version supports only basic voice commands and robot movement.
- Further development could involve more complex navigation logic, multi-modal feedback (audio/tactile), and integration with Fetch.ai for real-time path planning.
Future Work:
- Enhanced Command Recognition: Add context awareness or sentiment detection so verbal inputs map more reliably to actions.
- Multi-modal Feedback: Implement audio and tactile feedback to improve user interaction.
- Path Planning: Integrate Fetch.ai for dynamic navigation and obstacle avoidance.