
Misty_demos

Multi-modal Human-Robot Interaction Demos

Implemented Functions

Multi-turn Conversation: Users can talk to Misty; Misty processes the input and responds verbally using a language model (e.g., Llama3.2)

Example:

python examples/multi_turn_conversation.py
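The multi-turn loop above can be sketched against Ollama's local HTTP API. This is a minimal sketch, not the repo's implementation: the endpoint and payload shape follow Ollama's /api/chat API, and the helper names (`build_chat_payload`, `chat_once`) are hypothetical.

```python
# Hypothetical sketch of a multi-turn chat loop against a local Ollama server.
# Endpoint and payload shape follow Ollama's /api/chat API; helper names are
# illustrative, not taken from this repo.
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"  # Ollama's default port

def build_chat_payload(history, user_text, model="llama3.2"):
    """Append the user's turn to the running history and build a request body."""
    history.append({"role": "user", "content": user_text})
    return {"model": model, "messages": history, "stream": False}

def chat_once(history, user_text):
    """Send one turn to Ollama and record the assistant's reply in history."""
    body = json.dumps(build_chat_payload(history, user_text)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_CHAT_URL, data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())["message"]
    history.append(reply)  # keep the assistant turn for the next round
    return reply["content"]

# Usage (requires a running Ollama server):
#   history = []
#   print(chat_once(history, "Hello, Misty!"))
```

Keeping the full `history` list in every request is what makes the conversation multi-turn: the model sees all previous user and assistant messages each time.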

Scene Understanding: Misty takes a picture and uses a language model (e.g., Gemma3:4b) to interpret and describe the scene

Example:

python examples/scene_understanding.py
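The scene-description step can be sketched as sending a photo to a local Ollama vision model. This is a hedged sketch, not the repo's code: Ollama's /api/generate accepts a base64-encoded image list in its "images" field, and the function names here are hypothetical.

```python
# Hypothetical sketch of scene description: send a photo to a local Ollama
# vision model. The /api/generate "images" field takes base64 strings;
# function names are illustrative, not from this repo.
import base64
import json
import urllib.request

def build_vision_payload(image_bytes, model="gemma3:4b",
                         prompt="Describe this scene in one paragraph."):
    """Build an Ollama /api/generate body with the photo attached as base64."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }

def describe_scene(image_path):
    """Read a photo taken by Misty and ask the model to describe it."""
    with open(image_path, "rb") as f:
        body = json.dumps(build_vision_payload(f.read())).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate", data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running Ollama server and a photo on disk):
#   print(describe_scene("images/misty_photos/photo.jpg"))
```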

Package Requirements

Examples of using Ollama with Misty:

  • Text/voice interaction: misty_llama.py

Install dependencies

Using a virtual environment or conda environment is recommended to keep dependencies isolated.

  1. Create and activate a conda environment
conda create -n misty python=3.9 -y
conda activate misty
  2. Install all dependencies
pip install mistyPy
pip install -r requirements.txt

Quickstart: Run the basic connection test

  1. Configure your Misty IP
  • Open connection_testing.py and set the ip_address variable to your robot's IP (ensure your computer and Misty are on the same network).
  2. Run
python connection_testing.py
  3. What to expect
  • Arms move (movement test)
  • An audio.mp3 is generated locally and sent to Misty; you should hear it speak
  • A photo is taken and saved locally
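Before running the full test, a quick reachability check can save debugging time. This is a minimal sketch under stated assumptions: the /api/device path is an assumption based on Misty's REST API naming, and the IP and helper names are placeholders.

```python
# Hypothetical sketch of a minimal reachability check before running the full
# connection test. The /api/device path is an assumption based on Misty's
# REST API; adjust if your firmware differs.
import urllib.request

MISTY_IP = "192.168.1.100"  # placeholder: replace with your robot's IP

def device_info_url(ip):
    """Build the URL for Misty's device-information endpoint (assumed path)."""
    return f"http://{ip}/api/device"

def check_connection(ip, timeout=5):
    """Return True if Misty answers on its REST port, False otherwise."""
    try:
        with urllib.request.urlopen(device_info_url(ip), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False

# Usage:
#   if check_connection(MISTY_IP):
#       print("Misty reachable; run: python connection_testing.py")
```

If this check fails, fix the network setup first; the movement, speech, and photo tests cannot succeed without a reachable robot.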

Output locations

  • Photos: images/misty_photos/
  • Audio files: audio/

Troubleshooting

  • If you don't hear speech, verify that Misty's speakers are working and that gTTS saved audio.mp3 successfully.
  • If you cannot connect, recheck the IP address and that both devices share the same network.

About

LLM-powered Misty robot demos
