Skip to content

emugnier/DafnyGym

Repository files navigation

DafnyGym

DafnyGym is a benchmarking suite for evaluating the ability to generate single assertions.

Our benchmark is also available on HuggingFace.

Set-Up

First build the docker image using:

docker buildx build -t dgym .

Then you can evaluate your predictions using the following command:

cat perfect_predictions.json | docker run -i dgym python3 run_eval.py | python3 analysis_results.py

where perfect_predictions.json is your prediction file.

This script might take a while since it is verifying all the different case of DafnyGym.

Content

Our benchmark contains assertions from these 3 repositories:

Citation

@misc{mugnier2024laurelgeneratingdafnyassertions,
      title={Laurel: Generating Dafny Assertions Using Large Language Models},
      author={Eric Mugnier and Emmanuel Anaya Gonzalez and Ranjit Jhala and Nadia Polikarpova and Yuanyuan Zhou},
      year={2024},
      eprint={2405.16792},
      archivePrefix={arXiv},
      primaryClass={cs.LO},
      url={https://arxiv.org/abs/2405.16792},
}

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages