Official implementation of the AAAI 2025 paper "World Knowledge-Enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving" [paper link]
- train code
- inference code
- eval code
- dataset
- CUDA and cuDNN
  We use CUDA 11.8 and cuDNN 8.7.0, via NVIDIA's CUDA Docker image:
  docker pull nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04
  CUDA 12 works as well.
- Create a conda virtual environment and activate it:
  conda create -n kad python=3.10
  conda activate kad
- Basic requirements
  pip install --upgrade pip
  pip install transformers
  pip install torch torchvision xformers --index-url https://download.pytorch.org/whl/cu118
- Install flash-attention
  # https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features
  pip install packaging
  pip install flash-attn --no-build-isolation
- Install KAD and other requirements
  git clone https://github.com/KAD.git
  cd KAD
  pip install -e .
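After the install steps above, it can be useful to confirm that the packages are actually present in the `kad` environment before starting a training run. The following is a minimal sketch, not part of the KAD codebase: the helper names `installed_versions` and `report` and the `REQUIRED` list are illustrative assumptions, and the list simply mirrors the package names passed to `pip install` above. It uses only the standard-library `importlib.metadata` module, and queries `torch` for CUDA details only when `torch` is installed.

```python
from importlib import metadata

# Distribution names as passed to `pip install` in the steps above
# (an assumption of this sketch; adjust to match your environment).
REQUIRED = ["torch", "torchvision", "xformers", "transformers", "flash-attn"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def report(versions):
    """Format the version mapping as one aligned line per package."""
    lines = []
    for name, version in versions.items():
        status = version if version is not None else "MISSING"
        lines.append(f"{name:<14}{status}")
    return "\n".join(lines)


if __name__ == "__main__":
    found = installed_versions(REQUIRED)
    print(report(found))
    # Only touch torch if it is installed, so the script runs anywhere.
    if found["torch"] is not None:
        import torch

        print("CUDA available:", torch.cuda.is_available())
        print("torch built for CUDA:", torch.version.cuda)
```

Saving this as, say, `check_env.py` (a hypothetical filename) and running `python check_env.py` inside the activated environment prints one line per package. Any line marked `MISSING` points to an install step that needs to be repeated.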
- LoRA finetune
  sh ./script/train/finetune_lora.sh
- Multi-node LoRA finetune
  sh ./script/train/finetune_lora_multi-nodes.sh
If you find our paper and code useful in your research, please consider giving a star ⭐ and citation 📝 (´▽`ʃ♡ƪ)
@inproceedings{zhai2025world,
title={World knowledge-enhanced reasoning using instruction-guided interactor in autonomous driving},
author={Zhai, Mingliang and Li, Cheng and Guo, Zengyuan and Yang, Ningrui and Qin, Xiameng and Zhao, Sanyuan and Han, Junyu and Tao, Ji and Wu, Yuwei and Jia, Yunde},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={39},
number={9},
pages={9842--9850},
year={2025}
}
This work was supported by the Natural Science Foundation of Shenzhen under Grant No. JCYJ20230807142703006, the National Natural Science Foundation of China (NSFC) under Grants No. 62176021 and No. 62172041, and the Key Research Platforms and Projects of the Guangdong Provincial Department of Education under Grant No. 2023ZDZX1034.
Our project is built upon LLaVA and Bunny-Llama, leveraging their robust codebases and the strong language capabilities of their base models.
