RuijieH/LNE-Blocking

LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models

The poster of the paper is available in the assets directory.

Configure Environment and Download Model

VENV

uv venv LNE-Blocking --python 3.11.13
source LNE-Blocking/bin/activate
uv pip install -r requirements.txt

Download Model

Follow the instructions at https://github.com/YihongDong/CDD-TED4LLMs?tab=readme-ov-file#contaminated-models to download the simulated contaminated LoRA weights. After downloading, the file structure should look as follows:

pretrain_outputs
├── CodeGen6B
├── CodeLlama7B
├── CodeLlama7B_50k
├── CodeLlama7B_lr1e_3
├── CodeLlama7B_lr4e_5
├── gpt-3.5
├── GroundTruth_Probs_CodeGen6B
├── GroundTruth_Probs_CodeLlama7B
├── GroundTruth_Probs_Variants_CodeGen6B
├── GroundTruth_Probs_Variants_CodeLlama7B
├── Llama2
├── Outputs_CodeGen6B
├── Outputs_CodeLlama7B
├── Outputs_CodeLlama7B_lr1e_3
├── Outputs_CodeLlama7B_lr4e_5
├── Outputs_Llama2
├── Outputs_Variants_CodeGen6B
├── Outputs_Variants_CodeLlama7B
├── Variants_CodeGen6B
└── Variants_CodeLlama7B
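
As an optional sanity check of the layout above, a small sketch like the following can report missing entries. The `check_layout` helper is hypothetical and not part of this repository; it assumes `pretrain_outputs` sits in the current directory and only checks a few of the entries listed in the tree.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): print every expected
# entry that is missing under the given root and return the number missing.
check_layout() {
    local root="$1"; shift
    local missing=0
    local name
    for name in "$@"; do
        if [ ! -e "$root/$name" ]; then
            echo "missing: $root/$name"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Only a subset of the tree is checked here; extend the list as needed.
check_layout pretrain_outputs \
    CodeLlama7B Outputs_CodeLlama7B GroundTruth_Probs_CodeLlama7B \
    || echo "Some entries are missing; re-check the download step."
```

The function returns a non-zero status when anything is missing, so it can also gate later steps in a script.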

Then run

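mkdir -p saves
# Use an absolute target: a relative one would be resolved from inside saves/ and dangle
ln -s "$(pwd)/pretrain_outputs/CodeLlama7B" saves/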
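
To confirm that the link created in this step actually resolves, a quick check along these lines may help. The `link_ok` helper is hypothetical (not part of the repository), and the path `saves/CodeLlama7B` is assumed from the commands in this step.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the path is a symlink whose target exists.
link_ok() {
    [ -L "$1" ] && [ -e "$1" ]
}

if link_ok saves/CodeLlama7B; then
    echo "saves/CodeLlama7B resolves"
else
    echo "saves/CodeLlama7B is missing or dangling"
fi
```

A dangling link typically means the link target was given as a relative path, which is resolved relative to the directory containing the link rather than the directory where `ln` was run.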

Evaluate Using the Provided Logs

bash batch_eval.sh

Infer and Evaluate Using the Provided LoRA Adapter

Merge the LoRA adapter into the base model

python merge_script/makeyaml_codellama1k.py
python merge_script/batch_merge.py

Infer using merged weights

# Note: configure the experiment directory in the script before running
bash batch_infer.sh

Evaluate using merged weights

# Note: configure the experiment directory in the script before running
bash batch_eval.sh

Citation

@misc{hou2025lneblocking,
    title={LNE-Blocking: An Efficient Framework for Contamination Mitigation Evaluation on Large Language Models},
    author={Ruijie Hou and Yueyang Jiao and Hanxu Hu and Yingming Li and Wai Lam and Huajian Zhang and Hongyuan Lu},
    year={2025},
    eprint={2509.15218},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
