Warning
This is the old version of the CoSIL implementation. Read the latest source code.
In the experiments, the name afl refers to our approach, CoSIL.
conda create -n cosil python=3.11
conda activate cosil
pip install -r requirements.txt
To reproduce the full SWE-bench Lite/Verified experiments, first set your API key in afl/util/api_requests.py:
client = openai.OpenAI(api_key="sk-xxxx", base_url="https://xxx/v1")
Then generate the repository structure by running the following commands:
python get_lite_structure.py # For SWE-bench Lite
python get_verified_structure.py # For SWE-bench Verified
To avoid regenerating the repository structure files repeatedly, you can download the cache provided by the Agentless team.
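For the API key step above, hardcoding the key in afl/util/api_requests.py works but makes it easy to commit a secret by accident. The following is a minimal sketch of one alternative, not the repository's actual code: it reads the key and endpoint from environment variables. The helper name client_kwargs_from_env and the variable names OPENAI_API_KEY and OPENAI_BASE_URL are assumptions chosen for illustration.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for openai.OpenAI from environment variables.

    Hypothetical helper: the variable names below are assumptions,
    not something the repository defines.
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")

    kwargs = {"api_key": api_key}

    # The base URL is optional; pass it only when a custom endpoint is used.
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs


# In afl/util/api_requests.py the client line could then become:
#   client = openai.OpenAI(**client_kwargs_from_env())
```

With this in place, the key would be exported in the shell before running the scripts rather than edited into the source file.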
After that, set the following environment variable at line 2 of run_lite.sh, run_verified.sh, patch_gen.sh, and ablation.sh:
export PROJECT_FILE_LOC=<path to your repo structures>
To reproduce RQ1's results, run the full SWE-bench Lite/Verified experiments with the following commands:
bash run_lite.sh
bash run_verified.sh
The results are stored in the results folder.
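Since these scripts depend on PROJECT_FILE_LOC, a quick pre-flight check can catch a missing or mistyped path before a long run starts. The sketch below is illustrative only: the function name resolve_project_file_loc is hypothetical, and it assumes the variable points to a directory of repository structure files.

```python
import os
from pathlib import Path


def resolve_project_file_loc(env=None):
    """Return PROJECT_FILE_LOC as a Path, or raise if it is unusable.

    Assumption: the variable should point to an existing directory.
    """
    env = os.environ if env is None else env

    value = env.get("PROJECT_FILE_LOC")
    if not value:
        raise RuntimeError("PROJECT_FILE_LOC is not set")

    path = Path(value).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"PROJECT_FILE_LOC is not a directory: {path}")
    return path
```

Calling resolve_project_file_loc() in the same shell environment used for the scripts either returns the resolved path or fails immediately with a readable message.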
To reproduce RQ2's results, run the following command:
bash ablation.sh
To reproduce RQ3's results, run the following command:
bash patch_gen.sh
You can then evaluate the generated patches on SWE-bench using the official evaluation method.
To reproduce RQ4's results, run the following command:
bash sample.sh
To evaluate the localization results on SWE-bench Lite or SWE-bench Verified, run the following commands:
cd evaluation
python FLEvalNew.py --dataset ["lite"/"verified"] --loc_file ["path to your localization results"]
This repository is partially based on OpenAutoCoder/Agentless.
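For the localization evaluation command above, a small wrapper can make it easier to evaluate several result files in a row. This is a rough sketch rather than part of the repository: the names build_eval_command and run_eval are hypothetical, and only the --dataset and --loc_file flags shown above are used.

```python
import subprocess
import sys

# The two dataset names accepted by the command shown above.
VALID_DATASETS = ("lite", "verified")


def build_eval_command(dataset, loc_file):
    """Build the argument list for one FLEvalNew.py invocation."""
    if dataset not in VALID_DATASETS:
        raise ValueError(f"dataset must be one of {VALID_DATASETS}, got {dataset!r}")
    return [sys.executable, "FLEvalNew.py",
            "--dataset", dataset,
            "--loc_file", loc_file]


def run_eval(dataset, loc_file, evaluation_dir="evaluation"):
    """Run the evaluation from inside the evaluation directory.

    A relative loc_file is resolved against evaluation_dir, so an
    absolute path is the safer choice.
    """
    return subprocess.run(build_eval_command(dataset, loc_file),
                          cwd=evaluation_dir, check=True)
```

For example, run_eval("lite", "/abs/path/to/localization_results") would correspond to the cd and python commands given above, with the path being a placeholder for your own results file.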