🤖 RoboKit

robokit-banner

Strictly for academic and non-commercial use

📚 Citations

If you use RoboKit or its upstream components in academic work, please cite the toolkit and any relevant upstream works. BibTeX for RoboKit:

@misc{p2024robokit,
  title  = {Robokit: A toolkit for robotic tasks},
  author = {Jishnu Jaykumar P},
  year   = {2024},
  note   = {\url{https://github.com/jishnujayakumar/robokit}},
}

🚀 Projects Using RoboKit

Chronologically listed (latest first):

  • Perception: MRVG
  • Mobile Manipulation: HRT1
  • Interactive Robot Teaching: iTeach
  • Robot Exploration and Navigation: AutoX-SemMap
  • Perception Research: NIDS-Net
  • Grasp Trajectory Optimization: GTO

✨ Features

  • Docker Support

  • Zero-Shot Capabilities

    • 🔍 CLIP-based classification
    • 🎯 Text-to-BBox: GroundingDINO
    • 🧼 BBox-to-Mask: Segment Anything (MobileSAM)
    • 📏 Image-to-Depth: Depth Anything
    • 🔼 Feature Upsampling: FeatUp
    • 🚪 DoorHandle Detection: iTeach–DHYOLO (demo)
    • 📽️ Mask Propagation for Videos: SegmentAnythingV2 (SAMv2)
      • Input: jpg or mp4
      • Supports:
        • Point/BBox prompts across video frames
        • Multi-object point collection
      • Note: As of 11/06/2024, SAMv2 accepts only mp4 or jpg input.
      • For frame-wise prediction on a video, extract its frames as jpg files into a directory.
      • Single-image mask prediction works on a jpg directly; no conversion is needed.
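For the mp4-to-jpg step above, frames can be dumped with ffmpeg (assumed to be installed and on PATH; the helper names below are illustrative, not part of RoboKit). A minimal sketch:

```python
import os
import subprocess

def ffmpeg_frame_cmd(video_path, out_dir, pattern="%05d.jpg"):
    """Build an ffmpeg command that dumps every frame of `video_path`
    into `out_dir` as sequentially numbered jpg files."""
    return ["ffmpeg", "-i", video_path, os.path.join(out_dir, pattern)]

def extract_frames(video_path, out_dir):
    """Run ffmpeg (must be on PATH) to extract all frames as jpg."""
    os.makedirs(out_dir, exist_ok=True)
    subprocess.run(ffmpeg_frame_cmd(video_path, out_dir), check=True)
```

For example, `extract_frames("demo.mp4", "frames/")` fills `frames/` with `00001.jpg`, `00002.jpg`, and so on, ready for SAMv2's frame-wise prediction.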

⚙️ Getting Started

🧰 Prerequisites

  • Python 3.7 or higher (tested 3.9.18)
  • torch (tested 2.0)
  • torchvision
  • pytorch-cuda=11.8 (tested)
  • SAMv2 upstream requires Python >= 3.10.0 (here the installation has been tweaked to remove this constraint)
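A quick sanity check against the Python requirement above (a minimal sketch; the version bounds are the ones listed, and the helper name is illustrative):

```python
import sys

def meets_minimum(version=None, minimum=(3, 7)):
    """Return True when the given (or running) interpreter version
    satisfies RoboKit's minimum Python requirement (3.7+; 3.9.18 tested)."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= minimum

print(meets_minimum())  # check the current interpreter
```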

🛠️ Installation

RoboKit relies on upstream git repositories (GroundingDINO, MobileSAM, FeatUp, etc.), so the supported workflow is to clone the repository and install from source.

git clone https://github.com/IRVLUTD/robokit.git
cd robokit
pip install -e .
# or configure extras via robokit.install.cfg and rerun `pip install -e .`
# or selectively: pip install -e '.[gdino,sam2]'

To use the config-based workflow, edit robokit.install.cfg and set the comma-separated extras you want:

[extras]
include = gdino, sam2

Running pip install -e . then installs the listed extras automatically, which is handy for team-wide setups that want to avoid long pip commands.
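Assuming robokit.install.cfg is standard INI syntax (the actual parsing logic inside the build scripts may differ), reading the comma-separated extras list can be sketched with the stdlib configparser:

```python
import configparser

def read_extras(cfg_path="robokit.install.cfg"):
    """Parse the comma-separated `include` list from the [extras] section."""
    parser = configparser.ConfigParser()
    parser.read(cfg_path)
    raw = parser.get("extras", "include", fallback="")
    return [name.strip() for name in raw.split(",") if name.strip()]
```

With the example file above, `read_extras()` would return `["gdino", "sam2"]`.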

Available extras:

| Extra    | Includes                                              |
| -------- | ----------------------------------------------------- |
| gdino    | GroundingDINO + CLIP                                  |
| sam      | MobileSAM                                             |
| sam2     | SAM v2 + Hydra                                        |
| depthany | Depth Anything (Transformers)                         |
| dhyolo   | Ultralytics + DHYOLO toolkit                          |
| featup   | FeatUp (requires CUDA toolkit and CUDA_HOME)          |
| all      | Installs every extra supported on your environment    |

Fetching checkpoints

The default install skips large asset downloads so it works in offline/CI environments. When you need RoboKit to fetch and stage pretrained checkpoints (GDINO, MobileSAM, DHYOLO, SAMv2), rerun the install with:

ROBOTKIT_ENABLE_FILEFETCH=1 pip install -e .

The flag works with either entry point (e.g., ROBOTKIT_ENABLE_FILEFETCH=1 python setup.py install). Without it, installation skips the heavy downloads, and you can trigger them manually later when needed. If a CUDA toolkit is not detected, the FeatUp install is skipped with a warning.

🧩 Known Installation Issues

  • NameError: name '_C' is not defined
  • For SAMv2, ModuleNotFoundError: No module named 'omegaconf.vendor'; fix by reinstalling Hydra:

pip install --upgrade --force-reinstall hydra-core

🧪 Usage

🛣️ Roadmap

Planned improvements:

  • Config-based pretrained checkpoint switching
  • ✨ More features coming soon...

🙏 Acknowledgments

This project is based on the following repositories (license check mandatory):

Special thanks to Dr. Yu Xiang, Sai Haneesh Allu, and Itay Kadosh for their early feedback.

📜 License

This project is licensed under the MIT License. However, before using this tool, please check the upstream works for their specific licenses.
