Strictly for academic and non-commercial use
If you use RoboKit itself or its upstream components in academic work, please cite the toolkit and any relevant works. BibTeX for RoboKit:
```bibtex
@misc{p2024robokit,
  title  = {Robokit: A toolkit for robotic tasks},
  author = {Jishnu Jaykumar P},
  year   = {2024},
  note   = {\url{https://github.com/jishnujayakumar/robokit}},
}
```

Chronologically listed (latest first):
- Perception: MRVG
- Mobile Manipulation: HRT1
- Interactive Robot Teaching: iTeach
- Robot Exploration and Navigation: AutoX-SemMap
- Perception Research: NIDS-Net
- Grasp Trajectory Optimization: GTO
Docker Support
- Base image with ROS Noetic + CUDA 11.8 + Ubuntu 20.04 + Gazebo 11
- Dockerfile: refer to BundleSDF's Docker setup
- Quickstart script: `run_container.sh`
Zero-Shot Capabilities
- 🔍 CLIP-based classification
- 🎯 Text-to-BBox: GroundingDINO
- 🧼 BBox-to-Mask: Segment Anything (MobileSAM)
- 📏 Image-to-Depth: Depth Anything
- 🔼 Feature Upsampling: FeatUp
- 🚪 Door Handle Detection: iTeach-DHYOLO (demo)
- 📽️ Mask Propagation for Videos: SegmentAnythingV2 (SAMv2)
  - Input: `jpg` or `mp4`
  - Supports:
    - Point/BBox prompts across video frames
    - Multi-object point collection
  - Note: as of 11/06/2024, SAMv2 only supports `mp4` or `jpg` files
  - If you have an `mp4` file, extract its frames as `jpg`s into a directory for frame-wise prediction
  - For single-image mask prediction, there is no need to convert to `jpg`
Requirements:
- Python 3.7 or higher (tested with 3.9.18)
- torch (tested with 2.0)
- torchvision
- pytorch-cuda=11.8 (tested)
- SAMv2 upstream requires Python >= 3.10.0 (the installation here has been tweaked to remove this constraint)
RoboKit relies on upstream git repositories (GroundingDINO, MobileSAM, FeatUp, etc.), so the supported workflow is to clone the repository and install from source.
```shell
git clone https://github.com/IRVLUTD/robokit.git
cd robokit
pip install -e .
# or selectively: pip install -e '.[gdino,sam2]'
# or configure extras via robokit.install.cfg and rerun `pip install -e .`
```

To use the config-based workflow, edit `robokit.install.cfg` and set the comma-separated extras you want:
```ini
[extras]
include = gdino, sam2
```
Running `pip install -e .` then installs the listed extras automatically, which is handy for team-wide setups without long pip commands.
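As an illustration of how such a config can be consumed (a sketch only; RoboKit's actual setup logic may differ), the `[extras]` section parses with the standard library alone. `read_extras` is a hypothetical helper name:

```python
import configparser


def read_extras(cfg_path):
    """Return the comma-separated extras listed under [extras] include."""
    cfg = configparser.ConfigParser()
    cfg.read(cfg_path)  # silently yields defaults if the file is missing
    raw = cfg.get("extras", "include", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

With the config shown above, `read_extras("robokit.install.cfg")` would return `["gdino", "sam2"]`.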
Available extras:

| Extra | Includes |
|---|---|
| `gdino` | GroundingDINO + CLIP |
| `sam` | MobileSAM |
| `sam2` | SAM v2 + Hydra |
| `depthany` | Depth Anything (Transformers) |
| `dhyolo` | Ultralytics + DHYOLO toolkit |
| `featup` | FeatUp (requires CUDA toolkit and `CUDA_HOME`) |
| `all` | Installs every extra supported on your environment |
Fetching checkpoints
The default install skips large asset downloads so it works in offline/CI environments. When you need RoboKit to fetch and stage pretrained checkpoints (GDINO, MobileSAM, DHYOLO, SAMv2), set the flag before installation and rerun the install:

```shell
ROBOTKIT_ENABLE_FILEFETCH=1 pip install -e .
```

`ROBOTKIT_ENABLE_FILEFETCH=1 python setup.py install` works as well. Without the flag, installation skips the heavy downloads, and you can trigger them manually later when needed. If a CUDA toolkit is not detected, the FeatUp install is skipped with a warning.
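A minimal sketch of how an install script can gate downloads on that environment flag (illustrative only; RoboKit's actual setup logic may differ, and `filefetch_enabled` is a hypothetical name):

```python
import os


def filefetch_enabled(env=None):
    """True when ROBOTKIT_ENABLE_FILEFETCH is set to a truthy value."""
    env = os.environ if env is None else env
    return str(env.get("ROBOTKIT_ENABLE_FILEFETCH", "")).lower() in {"1", "true", "yes"}


def checkpoints_to_fetch(checkpoints, env=None):
    """Return the checkpoints to download, or none when the flag is unset."""
    if not filefetch_enabled(env):
        return []  # offline/CI-friendly default: skip heavy downloads
    return list(checkpoints)
```

This keeps the default `pip install -e .` lightweight while letting opted-in users pull all checkpoints in one pass.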
🧩 Known Installation Issues
- GroundingDINO: check the installation if you see `NameError: name '_C' is not defined`
- SAMv2: if you see `ModuleNotFoundError: No module named 'omegaconf.vendor'`, run:

  ```shell
  pip install --upgrade --force-reinstall hydra-core
  ```

🧪 Usage
- Note: all test scripts are located in the `test` directory. Place the respective test script in the root directory to run it.
- SAM: `test_sam.py`
- GroundingDINO + SAM: `test_gdino_sam.py`
- GroundingDINO + SAM + CLIP: `test_gdino_sam_clip.py`
- Depth Anything: `test_depth_anything.py`
- FeatUp: `test_featup.py`
- iTeach-DHYOLO: `test_dhyolo.py`
- SAMv2:
- Test Datasets: `test_dataset.py`

  ```shell
  python test_dataset.py --gpu 0 --dataset <ocid_object_test/osd_object_test>
  ```
Planned improvements:
- Config-based pretrained checkpoint switching
- ✨ More features coming soon...
This project is based on the following repositories (license check mandatory):
Special thanks to Dr. Yu Xiang, Sai Haneesh Allu, and Itay Kadosh for their early feedback.
This project is licensed under the MIT License. However, before using this tool, please check the respective upstream works for their specific licenses.