World Knowledge-Enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving

Official implementation of the AAAI 2025 paper "World Knowledge-Enhanced Reasoning Using Instruction-guided Interactor in Autonomous Driving" [paper link]

Todo List

train code
inference code
eval code
dataset

Quick Start

Environments

CUDA and cuDNN

We use CUDA 11.8 and cuDNN 8.7.0, via NVIDIA's CUDA Docker image: `docker pull nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu20.04`. CUDA 12 also works.
Create a conda virtual environment and activate it:
```
conda create -n kad python=3.10
conda activate kad
```

Basic requirements

pip install --upgrade pip
pip install transformers
pip install torch torchvision xformers --index-url https://download.pytorch.org/whl/cu118
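
After the step above, it can help to confirm that the installed PyTorch wheel is the CUDA 11.8 build. The following is a minimal sketch, not part of KAD: the helper names `cuda_tag` and `installed_cuda_tag` are hypothetical, and it assumes the wheel carries a local version suffix such as `+cu118`, as wheels from the PyTorch download index typically do. The version `2.1.0` in the docstring is only an illustration.

```python
import re
from importlib import metadata


def cuda_tag(version):
    """Return the CUDA tag from a version like '2.1.0+cu118', else None."""
    match = re.search(r"\+(cu\d+)$", version)
    return match.group(1) if match else None


def installed_cuda_tag(dist_name="torch"):
    """Look up the installed distribution and extract its CUDA tag, if any."""
    try:
        return cuda_tag(metadata.version(dist_name))
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None


if __name__ == "__main__":
    tag = installed_cuda_tag()
    print("torch CUDA tag:", tag or "not found (no CUDA suffix or torch missing)")
```

With the install command above, the expected tag is `cu118`. If you installed from a CUDA 12 index instead, a different `cu…` tag is expected.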

Install flash-attention

# https://github.com/Dao-AILab/flash-attention?tab=readme-ov-file#installation-and-features
pip install packaging
pip install flash-attn --no-build-isolation

Install KAD and other requirements

git clone https://github.com/zmling22/KAD.git
cd KAD
pip install -e .
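
Once the steps above have run, a quick sanity check of the environment can save a failed training launch. The following is a minimal sketch, not part of KAD: `REQUIRED` and `check_packages` are hypothetical names, and the list covers only the packages installed explicitly in this README, not every dependency pulled in by `pip install -e .`.

```python
import importlib.util
from importlib import metadata

# Import name -> distribution name for the packages installed above.
# "flash_attn" is the import name of the "flash-attn" distribution.
REQUIRED = {
    "torch": "torch",
    "torchvision": "torchvision",
    "xformers": "xformers",
    "transformers": "transformers",
    "flash_attn": "flash-attn",
}


def check_packages(required=REQUIRED):
    """Map each import name to its version, "unknown", or None if missing."""
    report = {}
    for module_name, dist_name in required.items():
        # find_spec locates the module without importing it.
        if importlib.util.find_spec(module_name) is None:
            report[module_name] = None
            continue
        try:
            report[module_name] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            # Importable, but no matching distribution metadata was found.
            report[module_name] = "unknown"
    return report


if __name__ == "__main__":
    for name, version in check_packages().items():
        print(f"{name:<14}{version if version is not None else 'MISSING'}")
```

Run it inside the `kad` conda environment. Any line reporting `MISSING` points to an install step that needs to be repeated.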

LoRA fine-tuning
```
sh ./script/train/finetune_lora.sh
```

Multi-node LoRA fine-tuning

sh ./script/train/finetune_lora_multi-nodes.sh

Dataset

Baidu Cloud: [Link] (extraction code: ura9)
Google Drive: [Link]
Hugging Face: [Link]

Citation

If you find our paper and code useful in your research, please consider giving a star ⭐ and citation 📝 (´▽`ʃ♡ƪ)

@inproceedings{zhai2025world,
  title={World knowledge-enhanced reasoning using instruction-guided interactor in autonomous driving},
  author={Zhai, Mingliang and Li, Cheng and Guo, Zengyuan and Yang, Ningrui and Qin, Xiameng and Zhao, Sanyuan and Han, Junyu and Tao, Ji and Wu, Yuwei and Jia, Yunde},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={9},
  pages={9842--9850},
  year={2025}
}

Acknowledgements

This work was supported by the Natural Science Foundation of Shenzhen under Grant No. JCYJ20230807142703006, the Natural Science Foundation of China (NSFC) under Grants No. 62176021 and No. 62172041, and the Key Research Platforms and Projects of the Guangdong Provincial Department of Education under Grant No. 2023ZDZX1034.

Our project is built upon LLaVA and Bunny-Llama, leveraging their robust codebases and the exceptional language capabilities of their base models.
