🚗 DriveWorld-VLA: Unified Latent-Space World Modeling with Vision–Language–Action for Autonomous Driving
Feiyang Jia*, Lin Liu*, Ziying Song, Caiyan Jia†, Hangjun Ye, Xiaoshuai Hao† and Long Chen⊥ [📄 Paper (arXiv)]
We present DriveWorld-VLA, a tightly coupled framework where a world model serves as the reasoning engine bridging action and prospective imagination.
Feb. 1st, 2026: We released our paper on arXiv. NAVSIM code and models are released!
- Release Paper
- Release NAVSIM Models and Training/Evaluation Framework
- Release nuScenes Models and Training/Evaluation Framework
| Method | NC | DAC | EP | TTC | Comfort | PDMS | Training Time | GPU Memory | Checkpoint |
|---|---|---|---|---|---|---|---|---|---|
| DriveWorld-VLA | 99.1 | 98.2 | 81.9 | 96.1 | 100 | 91.3 | 24 hrs | 80 GB | 📥 Download |
Training conducted on 8 NVIDIA H20 GPUs.
Legend • NC: No Collision • DAC: Drivable Area Compliance • EP: Ego Progress • TTC: Time to Collision • Comfort: Comfort • PDMS: Predictive Driver Model Score
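For reference, the sub-scores in the legend combine into PDMS as follows. This formula is not stated in this README; it is the NAVSIM v1 benchmark definition as we understand it, and the weights (5, 5, 2) should be checked against the devkit version in use. Here C denotes Comfort.

$$
\mathrm{PDMS} = \mathrm{NC} \times \mathrm{DAC} \times \frac{5\,\mathrm{EP} + 5\,\mathrm{TTC} + 2\,\mathrm{C}}{12}
$$

The score is computed per scenario and then averaged, so plugging the averaged sub-scores from the table into this formula will not reproduce the reported PDMS of 91.3 exactly.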
root/
├── ckpts/
│   └── resnet34.pth
├── internvl_chat/
│   └── Internvlm checkpoint
├── dataset/
│   ├── maps/
│   ├── navsim_logs/
│   │   ├── test/
│   │   └── trainval/
│   └── sensor_blobs/
│       ├── test/
│       └── trainval/
└── exp/
    └── metric_cache/

To obtain the NAVSIM dataset:
bash download/download_maps.sh
bash download/download_navtrain.sh
bash download/download_test.sh

Refer to https://github.com/xiaomi-research/recogdrive to download the checkpoint.

bash scripts/evaluation/run_metric_caching.sh

Create the conda environment:
conda env create -f environment.yml
conda activate Driveworld-vla

Install dependencies:
pip install -r requirements.txt
pip install git+https://github.com/motional/nuplan-devkit.git@nuplan-devkit-v1.2#egg=nuplan-devkit

Add environment variables to ~/.bashrc (modify paths as needed):
export NUPLAN_MAP_VERSION="nuplan-maps-v1.0"
export NUPLAN_MAPS_ROOT="$HOME/navsim_workspace/dataset/maps"
export NAVSIM_EXP_ROOT="$HOME/navsim_workspace/exp"
export NAVSIM_DEVKIT_ROOT="$HOME/navsim_workspace/"
export OPENSCENE_DATA_ROOT="$HOME/navsim_workspace/dataset"

Update paths in:
- navsim/agents/WoTE/configs/default_stage1.py
- navsim/agents/WoTE/configs/default_stage2.py
- navsim/agents/WoTE/configs/default_stage3.py

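Before launching training, it may help to confirm that the environment variables and config files listed above are actually in place. The following is a minimal sketch, not part of the repository: `check_setup` is a hypothetical helper name, and it assumes the four `*_ROOT` variables should point at existing directories and that the three stage configs sit under the repository root passed as the first argument.

```shell
# Hypothetical helper (not shipped with the repository).
# Prints one line per problem and returns non-zero if anything is missing.
check_setup() {
    repo_root="${1:-.}"
    status=0

    # NUPLAN_MAP_VERSION is a plain string, so only check that it is set.
    if [ -z "${NUPLAN_MAP_VERSION:-}" ]; then
        echo "unset: NUPLAN_MAP_VERSION"
        status=1
    fi

    # The remaining variables are assumed to name existing directories.
    for name in NUPLAN_MAPS_ROOT NAVSIM_EXP_ROOT NAVSIM_DEVKIT_ROOT OPENSCENE_DATA_ROOT; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "unset: $name"
            status=1
        elif [ ! -d "$value" ]; then
            echo "not a directory: $name=$value"
            status=1
        fi
    done

    # The three stage configs whose paths need updating.
    for stage in 1 2 3; do
        cfg="$repo_root/navsim/agents/WoTE/configs/default_stage${stage}.py"
        if [ ! -f "$cfg" ]; then
            echo "missing config: $cfg"
            status=1
        fi
    done

    return "$status"
}
```

Running `check_setup .` from the repository root prints nothing and returns 0 when everything is found. It does not inspect the paths written inside the config files themselves.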
Then launch training stage 1:
bash scripts/training/run_ImagineWorld_stage1.sh # stage1_training

Then launch training stage 2:
bash scripts/training/run_ImagineWorld_stage2.sh # stage2_training

Then launch training stage 3:
bash scripts/training/run_ImagineWorld_stage3.sh # stage3_training

Evaluation (stage 3):
bash scripts/evaluation/eval_driveworld_vla.sh

Visualization examples on the NAVSIM dataset. Top label: source of the trajectory.
Visualization examples on the nuScenes validation set. Top label: source of the trajectory.
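The three training stages and the evaluation step described earlier run one after another, so they could be chained in a small wrapper that stops at the first failure. This is only a sketch: `run_all_stages` is a hypothetical name, not a script in the repository, and it assumes it is called from the repository root. Whether a later stage finds the earlier stage's checkpoint automatically depends on the paths set in the stage configs, which this wrapper does not touch.

```shell
# Hypothetical wrapper (not shipped with the repository).
# The optional first argument is the command used to launch each script;
# it defaults to bash. Passing "echo" gives a dry run that only prints paths.
run_all_stages() {
    runner="${1:-bash}"

    for stage in 1 2 3; do
        echo "== training stage ${stage} =="
        # Stop immediately if a stage fails, propagating its exit code.
        "$runner" "scripts/training/run_ImagineWorld_stage${stage}.sh" || return $?
    done

    echo "== evaluation =="
    "$runner" scripts/evaluation/eval_driveworld_vla.sh
}
```

Calling `run_all_stages echo` first is a cheap way to confirm the script paths before committing to a full training run.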
DriveWorld-VLA is greatly inspired by the following outstanding contributions to the open-source community: NAVSIM, DPPO, LightningDiT, DiffusionDrive, WoTE.
If you find DriveWorld-VLA useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
@article{
update soon
}

