RoboFactory is a benchmark for embodied multi-agent manipulation built on ManiSkill. Leveraging compositional constraints and purpose-built interfaces, it provides an automated data collection framework for embodied multi-agent systems.
First, clone this repository to your local machine, then install Vulkan and the following dependencies.
git clone git@github.com:MARS-EAI/RoboFactory.git
conda create -n RoboFactory python=3.9
conda activate RoboFactory
cd RoboFactory
pip install -r requirements.txt
# (optional): conda install -c conda-forge networkx=2.5
Then download the 3D assets used by the RoboFactory tasks:
python script/download_assets.py
Now, try to run the task with just a line of code:
python script/run_task.py configs/table/lift_barrier.yaml
For more complex scenes such as RoboCasa, you can download them using the following command. Note that if you use these scenes in your work, please cite the scene dataset authors.
python -m mani_skill.utils.download_asset RoboCasa
After downloading the scene dataset, you can try to run a task in it:
python script/run_task.py configs/robocasa/lift_barrier.yaml
If you are running simulation environments on a headless Debian server without a graphical desktop, you will need to install a minimal set of OpenGL and EGL libraries to ensure compatibility.
Run the following commands to install the necessary runtime libraries:
sudo apt update
sudo apt install libgl1 libglvnd0 libegl1-mesa libgles2-mesa libopengl0
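If you want to quickly confirm that these libraries are visible to the dynamic loader, the short Python check below attempts to load them. This is a minimal sketch of our own (not part of the repo scripts), and the library file names may differ across distributions.
import ctypes

# Try to load the EGL/OpenGL runtime libraries installed above.
for lib in ("libEGL.so.1", "libOpenGL.so.0", "libGL.so.1"):
    try:
        ctypes.CDLL(lib)
        print(f"{lib}: OK")
    except OSError as err:
        print(f"{lib}: NOT FOUND ({err})")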
You can use the following script to generate data. The generated data is usually placed in the demos/ folder.
# Format: python script/generate_data.py --config {config_path} --num {traj_num} [--save-video]
python script/generate_data.py --config configs/table/lift_barrier.yaml --num 150 --save-video
The data generated by the ManiSkill framework is stored in .h5 format. To adapt it to the training code, we need to convert it to .zarr format.
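If you want to peek at a generated trajectory before converting it, a quick h5py inspection works. This is a minimal sketch: the file path is only an example (check the actual name under demos/), and the exact key layout depends on the ManiSkill trajectory format and your config.
import h5py

# Example path only -- point this at the .h5 file generated above.
with h5py.File("demos/LiftBarrier-rf/trajectory.h5", "r") as f:
    def show(name, obj):
        # Print every dataset with its shape and dtype.
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    f.visititems(show)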
You can convert it according to the following steps.
# 1. Create the data folders (first time only).
mkdir data
mkdir -p data/{h5_data,pkl_data,zarr_data}
# 2. Move your .h5 and .json files into the data/h5_data folder.
mv {your_h5_file}.h5 data/h5_data/{task_name}.h5
mv {your_h5_file}.json data/h5_data/{task_name}.json
# 3. run the script to process the data.
# NOTE: This script assumes the default config. If you add additional cameras in the config yaml, modify the script to handle the extra data.
# Example:
python script/parse_h5_to_pkl_multi.py --task_name LiftBarrier-rf --load_num 150 --agent_num 2
# For a 2-agent task, convert the two .pkl files into .zarr files, one per agent.
# Example:
python script/parse_pkl_to_zarr_dp.py --task_name LiftBarrier-rf --load_num 150 --agent_id 0
python script/parse_pkl_to_zarr_dp.py --task_name LiftBarrier-rf --load_num 150 --agent_id 1
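To sanity-check the conversion output for each agent, you can open the resulting .zarr store and list its arrays, as in the sketch below. The store path is hypothetical (check where parse_pkl_to_zarr_dp.py actually writes its output, likely under data/zarr_data/), and the snippet assumes the zarr v2-style group API.
import zarr

# Hypothetical path -- check the actual output location of parse_pkl_to_zarr_dp.py.
root = zarr.open("data/zarr_data/LiftBarrier-rf_agent0.zarr", mode="r")

def show(group, prefix=""):
    # Recursively print every array with its shape and dtype.
    for name, arr in group.arrays():
        print(f"{prefix}{name}: shape={arr.shape}, dtype={arr.dtype}")
    for name, sub in group.groups():
        show(sub, prefix=f"{prefix}{name}/")

show(root)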
We currently provide training code for Diffusion Policy (DP), and we plan to provide more policies in the future. You can train the DP model with the following commands:
bash policy/Diffusion-Policy/train.sh ${task_name} ${load_num} ${agent_id} ${seed} ${gpu_id}
# Example:
bash policy/Diffusion-Policy/train.sh LiftBarrier-rf 150 0 100 0
bash policy/Diffusion-Policy/train.sh LiftBarrier-rf 150 1 100 0
After training completes, use the saved .ckpt file to evaluate your model. Setting DEBUG_MODE to 1 opens a visualization window and prints more information.
bash policy/Diffusion-Policy/eval_multi.sh ${config_name} ${DATA_NUM} ${CHECKPOINT_NUM} ${DEBUG_MODE} ${TASK_NAME}
# Example
bash policy/Diffusion-Policy/eval_multi.sh configs/table/lift_barrier.yaml 150 300 1 LiftBarrier-rf
For any questions or research collaboration opportunities, please don't hesitate to reach out: yiranqin@link.cuhk.edu.cn, faceong02@gmail.com, akikaze@sjtu.edu.cn.
If you find RoboFactory useful in your research, please consider citing:
@article{qin2025robofactory,
title={RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints},
author={Qin, Yiran and Kang, Li and Song, Xiufeng and Yin, Zhenfei and Liu, Xiaohong and Liu, Xihui and Zhang, Ruimao and Bai, Lei},
journal={arXiv preprint arXiv:2503.16408},
year={2025}
}