Andreas Sochopoulos1,2,
Nikolay Malkin1,
Nikolaos Tsagkas1,
João Moura1,
Michael Gienger2,
Sethu Vijayakumar1
1University of Edinburgh, 2Honda Research Institute Europe,
(Animation: moons_animation.mp4)
Diffusion and flow matching policies have recently demonstrated remarkable performance in robotic applications by accurately capturing multimodal robot trajectory distributions. However, their computationally expensive inference, due to the numerical integration of an ODE or SDE, limits their applicability as real-time controllers for robots. We introduce a methodology that utilizes conditional Optimal Transport couplings between noise and samples to enforce straight solutions in the flow ODE for robot action generation tasks. We show that naively coupling noise and samples fails in conditional tasks and propose incorporating condition variables into the coupling process to improve few-step performance. The proposed few-step policy achieves a 4% higher success rate with a 10x speed-up compared to Diffusion Policy on a diverse set of simulation tasks. Moreover, it produces high-quality and diverse action trajectories within 1-2 steps on a set of real-world robot tasks. Our method also retains the same training complexity as Diffusion Policy and vanilla Flow Matching, in contrast to distillation-based approaches.
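To give an intuition for the coupling step, here is a minimal sketch of a conditional OT pairing on a minibatch. It is illustrative only: the helper name `conditional_ot_pairing`, the weighting `lam`, and the use of SciPy's Hungarian solver are assumptions and do not reproduce the exact recipe used in the paper or this repository.

```python
# Minimal sketch of a minibatch conditional OT coupling (for intuition only).
import numpy as np
from scipy.optimize import linear_sum_assignment


def conditional_ot_pairing(noise, actions, conds, lam=1.0):
    """Re-pair prior samples with (action, condition) pairs inside a batch.

    noise:   (B, D) samples from the prior; noise[i] starts paired with conds[i]
    actions: (B, D) flattened expert action trajectories
    conds:   (B, C) condition features (e.g. encoded observations)
    lam:     weight of the condition term in the transport cost
    """
    # Pairwise squared distances in action space ...
    action_cost = ((noise[:, None, :] - actions[None, :, :]) ** 2).sum(-1)
    # ... plus a penalty for matching noise across dissimilar conditions.
    cond_cost = ((conds[:, None, :] - conds[None, :, :]) ** 2).sum(-1)
    cost = action_cost + lam * cond_cost
    rows, cols = linear_sum_assignment(cost)  # exact OT for uniform minibatch weights
    return noise[rows], actions[cols], conds[cols]


# The re-paired (noise, action, condition) triples then serve as endpoints of
# straight-line probability paths in standard conditional flow matching.
rng = np.random.default_rng(0)
z, a, c = conditional_ot_pairing(rng.normal(size=(64, 16)),
                                 rng.normal(size=(64, 16)),
                                 rng.normal(size=(64, 8)))
print(z.shape, a.shape, c.shape)  # (64, 16) (64, 16) (64, 8)
```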
sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3
conda env create -f environment_<suite>.yml
conda activate cot-policy-<suite>
where <suite> needs to be replaced with mimicgen, metaworld, or mujoco. To train on the D4RL Maze tasks, you must install D4RL manually following the instructions in its official repository. To use the maze environments from our experiments, download our modifications to D4RL from the links below.
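As an optional sanity check after installation, a short script like the following can confirm that headless rendering works. This is a hypothetical snippet: it assumes the activated environment provides the mujoco Python bindings, and the minimal XML model is made up for the test.

```python
# Hypothetical smoke test for headless MuJoCo rendering after installing the
# system libraries above. Adjust MUJOCO_GL to "egl" or "glfw" if OSMesa is not desired.
import os

os.environ.setdefault("MUJOCO_GL", "osmesa")  # must be set before importing mujoco

import mujoco

model = mujoco.MjModel.from_xml_string(
    "<mujoco><worldbody><geom type='sphere' size='0.1'/></worldbody></mujoco>"
)
data = mujoco.MjData(model)
mujoco.mj_forward(model, data)

renderer = mujoco.Renderer(model, height=64, width=64)
renderer.update_scene(data)
print("Rendered frame shape:", renderer.render().shape)  # expect (64, 64, 3)
```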
| Environment | Instructions | Link | Command to Generate Data |
|---|---|---|---|
| Metaworld | Use the built-in functionality of the suite to generate expert demos. | GitHub (Link to pvrobo, not Metaworld) | python -m expert_demos.generate* |
| MimicGen | Install from source and follow official instructions for dataset generation. | GitHub, Docs | python download_datasets.py --dataset_type core --tasks <task> |
| Mujoco tasks | Download pre-generated data for ball-in-cup and maze tasks. Custom mazes require changes in D4RL. | Download Link | — |
| Real-world | Download teleoperated data for three real-world tasks used in the paper. | Download Link | — |
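As a quick check that a MimicGen download succeeded, the HDF5 file can be inspected as below. This is a minimal sketch: the file path and key layout are assumptions based on the standard robomimic format, so adjust them to the file you actually downloaded.

```python
# Hypothetical inspection of a downloaded MimicGen dataset (robomimic-style HDF5 layout assumed).
import h5py

with h5py.File("datasets/core/coffee_d0.hdf5", "r") as f:
    demos = sorted(f["data"].keys())
    print(f"{len(demos)} demonstrations")
    first = f["data"][demos[0]]
    print("actions shape:", first["actions"].shape)
    print("observation keys:", list(first["obs"].keys()))
```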
Example of training a COT Policy on the push-t task:
./train.sh pusht_image cot_policy_full_pca
See train.sh for the full set of available options.
You can evaluate the trained policy using variations of the following script:
python eval.py --checkpoint /path/to/checkpoint/ -o /output/path/ -gs 10000 -es 100 -is desired_inference_steps
The -is argument determines the number of time steps into which the interval [0,1] is partitioned; it does not directly translate to the number of neural function evaluations (NFE). With the Euler solver, NFE = desired_inference_steps - 1, while with the midpoint solver, NFE = 2*(desired_inference_steps - 1).
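The bookkeeping above can be summarized in a small helper. This is a hypothetical function for illustration only, not part of the released code.

```python
# Hypothetical helper mirroring the NFE bookkeeping described above.
def nfe(inference_steps: int, solver: str) -> int:
    """Neural function evaluations for a given -is value and ODE solver."""
    segments = inference_steps - 1        # intervals in the [0, 1] partition
    if solver == "euler":
        return segments                   # one evaluation per interval
    if solver == "midpoint":
        return 2 * segments               # two evaluations per interval
    raise ValueError(f"unknown solver: {solver}")


print(nfe(2, "euler"))     # 1 NFE -> single-step generation
print(nfe(3, "midpoint"))  # 4 NFEs
```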
To compute the Trajectory Variance (TV) metric, run variations of the following command:
python eval_tv.py --checkpoint /path/to/checkpoint/ -o /output/path/ -gs 10000 -es 100 -is desired_inference_steps
This codebase is based on the Diffusion Policy repository. Our work and code have also drawn inspiration from many other excellent works, such as AdaFlow, torchcfm, OT Conditional Flow Matching, and more.
@article{sochopoulos2025cot,
title = {Fast Flow-based Visuomotor Policies via Conditional Optimal Transport Couplings},
author = {Sochopoulos, Andreas and Malkin, Nikolay and Tsagkas, Nikolaos and Moura, João and Gienger, Michael and Vijayakumar, Sethu},
journal = {arXiv preprint arXiv:2505.01179},
year = {2025}
}