
🧠 Backdoor Token Unlearning (BTU)

Code for the AAAI 2025 paper
"Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models"
📄 arXiv:2501.03272
📘 AAAI Proceedings


📝 Overview

Backdoor Token Unlearning (BTU) is a novel anti-backdoor learning method designed to train clean language models from poisoned datasets.
The method identifies and neutralizes backdoor triggers by unlearning their influence in token representations, achieving robust defense with minimal performance degradation on clean tasks.
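
Purely as intuition for the token-level unlearning idea — a minimal sketch, not the paper's actual algorithm; the model name, the embedding-only fine-tuning step, and the thresholding rule below are all assumptions — one could fine-tune only the word-embedding layer on the poisoned data, flag the tokens whose embeddings drift the most, and reset those rows to their pretrained values:

```python
import torch
from transformers import AutoModelForSequenceClassification

# Illustrative sketch only: detect suspicious tokens by embedding drift,
# then "unlearn" them by restoring their pretrained embedding vectors.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4  # 4 classes, as in AGNews
)
pretrained_emb = model.get_input_embeddings().weight.detach().clone()

# ... fine-tune ONLY the embedding layer on the (possibly poisoned) training
# set here, keeping all other parameters frozen ...

drift = (model.get_input_embeddings().weight.detach() - pretrained_emb).norm(dim=1)
threshold = drift.mean() + 3 * drift.std()      # hypothetical threshold rule
suspect_ids = torch.nonzero(drift > threshold).squeeze(-1)

with torch.no_grad():                           # reset suspected trigger tokens
    model.get_input_embeddings().weight[suspect_ids] = pretrained_emb[suspect_ids]
```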


📂 Dataset

The AGNews dataset used in our experiments is not included in this repository due to its size.
Please download the dataset from the OpenBackdoor repository by THUNLP, which includes the same data splits used in our paper.
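
If the downloaded split arrives as a tab-separated file with one `text<TAB>label` pair per line (an assumption — check the actual files; the path below is hypothetical), it can be read with the standard library:

```python
import csv

def load_split(path):
    """Hypothetical loader: one (text, label) pair per line, tab-separated.
    Adjust the delimiter and column order to match the downloaded files."""
    examples = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            text, label = row[0], int(row[1])
            examples.append((text, label))
    return examples

train = load_split("data/agnews/train.tsv")  # hypothetical path
```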


⚙️ Installation

Ensure you're using Python 3.9. Then install the required dependencies:

```bash
pip install -r requirements.txt
```

The requirements.txt file contains all necessary libraries and specific version constraints for reproducibility.


⚙️ Configuration

Customize your training setup by modifying the config.json file. You can specify:

  • Dataset paths (tasks and datasets)
  • Model paths (pretrained checkpoints)
  • Training hyperparameters, such as:
    • learning_rate
    • epochs
    • batch_size
  • Unlearning parameters:
    • threshold

Ensure all paths and settings reflect your actual environment before running the script; an illustrative example follows.
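
Key names in this example are inferred from the list above and may not match the shipped schema exactly; treat it as a hypothetical illustration and consult the repository's config.json for the authoritative keys:

```json
{
  "task": "agnews",
  "dataset_path": "data/agnews",
  "model_path": "bert-base-uncased",
  "learning_rate": 2e-5,
  "epochs": 3,
  "batch_size": 32,
  "threshold": 0.5,
  "output_dir": "outputs/"
}
```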


🚀 Usage

To start the BTU pipeline, simply run:

```bash
python BTU.py
```

Intermediate logs, model checkpoints, and evaluation results are saved to the output directory specified in your configuration.


📈 Results Summary

Our BTU method drastically reduces the backdoor attack success rate (ASR) while incurring only a marginal loss in clean-task accuracy.

For more results, refer to Table 1 in the paper.
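
For reference, these are the standard metric definitions (not code from this repository): clean accuracy is measured on the untouched test set, while ASR is the fraction of trigger-inserted test samples, originally not of the target class, that the model assigns to the attacker's target label. A minimal sketch:

```python
def attack_success_rate(preds, target_label):
    """ASR over trigger-inserted samples whose true label != target_label."""
    return sum(p == target_label for p in preds) / len(preds)

def clean_accuracy(preds, labels):
    """Accuracy on the clean (trigger-free) test set."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)
```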


📖 Citation

If you use this codebase or method in your research, please cite the following work:

```bibtex
@inproceedings{jiang2025backdoor,
  title={Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models},
  author={Jiang, Peihai and Lyu, Xixiang and Li, Yige and Ma, Jing},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={39},
  number={23},
  pages={24285--24293},
  year={2025}
}
```


📬 Contact

For questions or collaborations, please reach out to the authors via the contact information provided in the paper.

