EVA

This is the official repository for "EVA: Editing for Versatile Alignment against Jailbreaks" (accepted by IEEE TPAMI 2026) 🎉🎉.

arXiv IEEE TPAMI License: MIT

Overview

In this work, we propose EVA, an editing-based framework for versatile alignment against jailbreak attacks. EVA aims to improve the robustness and safety alignment of LLMs and VLMs under diverse jailbreak scenarios, while maintaining their general capabilities on benign tasks.

News

  • 🎉 EVA: Editing for Versatile Alignment against Jailbreaks has been accepted by IEEE TPAMI 2026.
  • 🚧 The code for the VLM part is currently being organized and will be released soon.
  • 🚧 For the LLM part, please first refer to our previous repository: DELMAN.

Code Release Status

LLM Part

The LLM-related implementation of EVA is closely related to our previous work DELMAN. Before the EVA code is fully released, you may refer to the DELMAN repository for the LLM model editing pipeline:

VLM Part

The VLM-related code is currently being organized and will be released soon.

Data

The images used by the three HarmBench image data files in data/ (HarmBench_images_llava.json, HarmBench_images_qwen.json, and HarmBench_images_internvl.json) are available in this Google Drive folder:
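As a quick sanity check after cloning, the three data files named above can be inspected with a short script. This is only a minimal sketch: the helper name `summarize_data_files` is hypothetical and not part of the EVA code, and since the JSON schema is not documented here, it reports only the top-level type and entry count of each file.

```python
import json
from pathlib import Path

# File names as listed in this README; data/ is assumed to be relative to the repo root.
HARMBENCH_FILES = [
    "HarmBench_images_llava.json",
    "HarmBench_images_qwen.json",
    "HarmBench_images_internvl.json",
]


def summarize_data_files(data_dir="data", names=HARMBENCH_FILES):
    """Hypothetical helper: map each file name to (top-level JSON type, entry count).

    A missing file maps to None instead of raising an error.
    """
    summary = {}
    for name in names:
        path = Path(data_dir) / name
        if not path.is_file():
            summary[name] = None
            continue
        with path.open(encoding="utf-8") as f:
            content = json.load(f)
        # The schema is an unknown here, so only the top-level shape is reported.
        count = len(content) if isinstance(content, (list, dict)) else 1
        summary[name] = (type(content).__name__, count)
    return summary


if __name__ == "__main__":
    for name, info in summarize_data_files().items():
        print(name, "->", info if info is not None else "not found")
```

Run it from the repository root so that the relative `data/` path resolves. Any file reported as "not found" is missing from the local checkout.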

TODO

  • Release LLM-related code
  • Release the EVA paper link
  • Release VLM-related code

Citation

@ARTICLE{11523146,
  author={Wang, Yi and Qiu, Hongye and Xu, Yue and Yang, Sibei and Qin, Zhan and Huang, Minlie and Wang, Wenjie},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={EVA: Editing for Versatile Alignment against Jailbreaks}, 
  year={2026},
  volume={},
  number={},
  pages={1-16},
  keywords={Automatic speech recognition;Modeling;Safety;Visualization;Large language models;Conferences;Optimization;Educational institutions;Light emitting diodes;Tuning;Safety Alignment;Jailbreak Attacks;Model Editing;Large Language Models;Vision Language Models},
  doi={10.1109/TPAMI.2026.3694189}}
