ProjectTest

This is the repository for the paper "ProjectTest: A Project-level LLM Unit Test Generation Benchmark and Impact of Error Fixing Mechanisms" (https://arxiv.org/abs/2502.06556).

Dataset

The dataset is in the folder "./dataset".

Dataset statistics:

Language	Avg. #Files	Avg. LOC	Avg. #Stars	Avg. #Forks
Python	6.10	654.60	5810.30	996.90
Java	4.65	282.60	3306.05	1347.65
JavaScript	4.00	558.05	17242.30	5476.45

Detailed information on each project is in Appendix A of the paper.

The dataset is also available on Hugging Face:

from datasets import load_dataset

# Login using e.g. `huggingface-cli login` to access this dataset
ds = load_dataset("yibowang214/ProjectTest")

Requirements

Python: numpy, scipy, pytest, coverage, text_unidecode, rlcard, rapidfuzz
Java: java 17.0.13, maven 3.6.3
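
Before running anything, it can help to confirm that the listed dependencies are visible from the current environment. The sketch below is not part of the repository; it assumes the import names match the package names listed above and that Maven is exposed as `mvn` on the PATH. It only checks presence, not the versions given above.

```python
import importlib.util
import shutil

# Assumption: import names are identical to the package names in the list above.
PYTHON_MODULES = ["numpy", "scipy", "pytest", "coverage",
                  "text_unidecode", "rlcard", "rapidfuzz"]
# Assumption: the Java toolchain is exposed as `java` and `mvn` on the PATH.
JAVA_TOOLS = ["java", "mvn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def missing_tools(names):
    """Return the command names that are not found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    print("Missing Python modules:", missing_modules(PYTHON_MODULES))
    print("Missing command-line tools:", missing_tools(JAVA_TOOLS))
```

An empty list in both lines of output means every dependency was found.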

Preprocessing

See data_preprocess.ipynb.

The processed data is saved in the folder "./ProjectTest".

Unit Test Generation

See the folder "./generation".

To generate unit tests for Python/JS/Java with API-based models (GPT-3.5, GPT-4, O1, Gemini-1.5-Pro, Claude-3.5-Sonnet, etc.), run the script:
sh ./generation/vanilla/my_test.sh

To generate unit tests for Python/JS/Java with open-source models (deepseek-coder-6.7b-instruct, CodeLlama-7b-Instruct-hf, codegemma-7b-it, CodeQwen1.5-7B-Chat, etc.), run the script for the target language:
sh ./generation/vanilla/my_open_py.sh
sh ./generation/vanilla/my_open_js.sh
sh ./generation/vanilla/my_open_java.sh

The generated data is saved in the folder "./generated_tests".
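
To run several of these scripts in one go, a small Python driver can be used. This is a minimal sketch, not something shipped with the repository: the script paths are copied from the commands above, while the function names and the language keys are hypothetical. It assumes the scripts take no arguments and are run from the repository root.

```python
import subprocess

# Script paths copied from the commands above.
API_SCRIPT = "./generation/vanilla/my_test.sh"
OPEN_SOURCE_SCRIPTS = {
    "python": "./generation/vanilla/my_open_py.sh",
    "javascript": "./generation/vanilla/my_open_js.sh",
    "java": "./generation/vanilla/my_open_java.sh",
}


def build_commands(languages, use_api=False):
    """Return the `sh` command lines to run, in order."""
    if use_api:
        # The API script is a single entry point in the instructions above.
        return [["sh", API_SCRIPT]]
    return [["sh", OPEN_SOURCE_SCRIPTS[lang.lower()]] for lang in languages]


def run_generation(languages, use_api=False):
    """Run each script in turn and stop at the first one that fails."""
    for command in build_commands(languages, use_api):
        subprocess.run(command, check=True)
```

For example, `run_generation(["python", "java"])` would run the Python and Java open-source scripts one after the other.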

LLM Self-fixing

See the folder "./generation/self_fix".

To generate unit tests for Python/JS/Java with LLM self-fixing, run the scripts:
sh ./generation/self_fix/my_test.sh
sh ./generation/self_fix/my_open_py.sh
sh ./generation/self_fix/my_open_js.sh
sh ./generation/self_fix/my_open_java.sh
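
As a rough illustration of the general idea behind self-fixing (generate tests, run them, and hand the errors back to the model), here is a minimal sketch. It is not the implementation in "./generation/self_fix": the `generate` and `run_tests` callables, the prompt wording, and the `max_rounds` limit are all hypothetical placeholders.

```python
from typing import Callable, Tuple


def self_fix(
    generate: Callable[[str], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    prompt: str,
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Generate tests, then ask the model to repair them while they fail.

    generate:  hypothetical wrapper around an LLM call (prompt -> test code).
    run_tests: hypothetical runner returning (passed, error_output).
    """
    tests = generate(prompt)
    for _ in range(max_rounds):
        passed, errors = run_tests(tests)
        if passed:
            return tests, True
        # Feed the failing tests and their error output back to the model.
        fix_prompt = (
            f"{prompt}\n\nPrevious tests:\n{tests}\n\n"
            f"Errors:\n{errors}\n\nFix the tests."
        )
        tests = generate(fix_prompt)
    # Check the result of the final repair attempt.
    passed, _ = run_tests(tests)
    return tests, passed
```

Capping the number of rounds keeps the cost bounded when the model cannot repair its own output.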

Evaluation

See the folders "./pytest", "./jstest", and "./javatest".

1. Extract the generated tests from the output: data_preprocess.ipynb
2. Move the cleaned tests to the folders: data_preprocess.ipynb
3. Run the testing frameworks to get the test results:
sh ./generation/others/run_all.sh

Citation

If you find this work useful, please consider citing:

@article{wang2025projecttest,
  title={ProjectTest: A Project-level Unit Test Generation Benchmark and Impact of Error Fixing Mechanisms},
  author={Wang, Yibo and Xia, Congying and Zhao, Wenting and Du, Jiangshu and Miao, Chunyu and Deng, Zhongfen and Yu, Philip S and Xing, Chen},
  journal={arXiv preprint arXiv:2502.06556},
  year={2025}
}
