# EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow

**ICCV 2025**

[🏠Project Page] [📄Paper] [📊Dataset] [🤗Checkpoints]

**TL;DR:** A method for learning robotic manipulation policies solely from action-unlabeled videos, enabling versatile control over deformable objects, occluded environments, and non-object-displacement tasks.
## Installation

```bash
# Clone the repository
git clone https://github.com/YixiangChen515/EC-Flow.git
cd EC-Flow

# Download pretrained checkpoints (SAM, GroundingDINO, Co-Tracker)
wget https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt -O sam_and_track/checkpoints/sam2.1_hiera_large.pt
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth -O sam_and_track/gdino_checkpoints/groundingdino_swint_ogc.pth
wget https://huggingface.co/facebook/cotracker3/resolve/main/scaled_offline.pth -O sam_and_track/co-tracker/checkpoints/scaled_offline.pth

# Create conda environment
conda create -n ecflow python=3.8
conda activate ecflow

# Install dependencies
bash install.sh
```

## Dataset

We provide the Meta-World dataset in our Hugging Face repo. Please download the dataset and place it under the `data` directory.
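For example, the downloaded archive can be extracted into `data/` like so (a minimal sketch — the archive filename and extraction layout are assumptions; check the Hugging Face repo for the exact names):

```shell
# Extract the pre-processed dataset into the data/ directory.
# The archive name is an assumption; adjust it to the file you downloaded.
mkdir -p data
if [ -f metaworld.tar.gz ]; then
    tar -xzf metaworld.tar.gz -C data/
fi
```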
There are two ways to prepare the training data:

- Use the `metaworld.tar.gz` file, which contains the pre-processed dataset with ground-truth point-tracking results. This version is ready for training out of the box.
- Alternatively, you can start with the original Meta-World dataset by using `metaworld_original.tar.gz`. To generate the processed dataset from it, run:

  ```bash
  python -m data_gen.gen_metaworld_all
  ```

## Training

Once the dataset is prepared, you can start training the flow prediction module by running:
```bash
# Note: The global batch size should be divisible by the number of devices.
# We trained on 8 NVIDIA RTX 4090 GPUs (24GB) with a batch size of 7 per GPU.
torchrun --nnodes=1 --nproc_per_node=8 train.py --results-dir ckpt --global-batch-size=56 --data-path=data/metaworld

# Single-GPU example (an illustration, not a tested configuration -- scale the batch size down accordingly):
# torchrun --nnodes=1 --nproc_per_node=1 train.py --results-dir ckpt --global-batch-size=7 --data-path=data/metaworld
```

## Inference

You can download the pretrained checkpoints from our Hugging Face repo and place them in the `ckpt` directory. To evaluate both the flow prediction and goal-image prediction results, run the following command:

```bash
python inference.py --ckpt ckpt/flow.pt --img-ckpt ckpt/goal_img.pt
```

## Evaluation

To evaluate EC-Flow in the Meta-World environment, follow these steps:
- Download the pretrained checkpoints as described above.
- Apply the necessary environment modifications by following the instructions in `modify_env.md` (**IMPORTANT**).

Once the setup is complete, run the following command to start evaluation:

```bash
cd experiment
bash eval_policy.sh
```

Note: To speed up the evaluation process, you can use multiple GPUs by specifying the device IDs:

```bash
# Example usage
bash eval_policy.sh "0,1,2,3"
```

## License

This repository is released under the MIT license.
## Acknowledgements

We extend our deepest thanks to the creators of these remarkable projects:

## Contact

If you have any questions about the code, please contact yixiang.chen [AT] cripac.ia.ac.cn.

## Citation

Please consider citing EC-Flow if it benefits your research:
```bibtex
@InProceedings{Chen_2025_ICCV,
    author    = {Chen, Yixiang and Li, Peiyan and Huang, Yan and Yang, Jiabing and Chen, Kehan and Wang, Liang},
    title     = {EC-Flow: Enabling Versatile Robotic Manipulation from Action-Unlabeled Videos via Embodiment-Centric Flow},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2025},
    pages     = {11958-11968}
}
```
