AL-Bench: Automatic Logging Benchmark

Overview

AL-Bench is a benchmark for automatic logging tools. It pairs a high-quality dataset with a novel dynamic evaluation method focused on runtime logs, addressing key limitations of prior studies and bridging the gap between real-world requirements and existing evaluation frameworks.

Project Structure

.
├── Static_Evaluation/    # Scripts and results for static evaluation
│   ├── eval/            # Evaluation scripts for each logging tool
│   └── data/            # Evaluation result data
└── Dynamic_Evaluation/  # Scripts and results for dynamic evaluation
    ├── dynamic_evaluation/  # Core scripts for dynamic evaluation
    └── init_dynamic_evaluation/  # Dataset construction scripts

Dataset

The complete evaluation dataset can be accessed at: https://drive.google.com/drive/u/1/folders/1eoK7SaYTuwqcAe9T3ddjeU5oGLRDX2Ps

Evaluation Methods

Static Evaluation

Static evaluation focuses on the following aspects:

Log Level Accuracy (LA)
Log Position Accuracy (PA)
Log Message Accuracy (MA)
Average Level Distance (ALD)
Dynamic Expression Accuracy (DEA)
Static Text BLEU/ROUGE Score (STS)

Figure 2: Static evaluation process and metrics calculation
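
To make two of the metrics above concrete, here is a minimal sketch of how Log Level Accuracy (LA) and Average Level Distance (ALD) could be computed. It is not taken from the AL-Bench scripts: the ordinal ranking in `LEVEL_RANK`, the function names, and the definition of ALD as the mean absolute rank gap are all assumptions made for illustration.

```python
from typing import Sequence

# Assumed ordinal ranking of log levels (illustrative, not taken from AL-Bench).
LEVEL_RANK = {"trace": 0, "debug": 1, "info": 2, "warn": 3, "error": 4, "fatal": 5}


def _check(predicted: Sequence[str], reference: Sequence[str]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if not reference or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")


def level_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Fraction of predictions whose log level exactly matches the reference."""
    _check(predicted, reference)
    matches = sum(p.lower() == r.lower() for p, r in zip(predicted, reference))
    return matches / len(reference)


def average_level_distance(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Mean absolute gap between predicted and reference level ranks (assumed definition)."""
    _check(predicted, reference)
    total = sum(
        abs(LEVEL_RANK[p.lower()] - LEVEL_RANK[r.lower()])
        for p, r in zip(predicted, reference)
    )
    return total / len(reference)


predicted_levels = ["info", "warn", "error", "debug"]
reference_levels = ["info", "error", "error", "info"]
la = level_accuracy(predicted_levels, reference_levels)
ald = average_level_distance(predicted_levels, reference_levels)
print(f"LA={la:.2f} ALD={ald:.2f}")
```

Under this assumed definition, ALD complements LA by showing how far off a wrong level is: predicting `warn` for an `error` statement costs less than predicting `trace`.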

Dynamic Evaluation

Dynamic evaluation assesses the performance of logging tools in actual runtime environments:

Compilation Success Rate
Log Similarity

Figure 3: Dynamic evaluation process
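
As a rough illustration of the two dynamic metrics, the sketch below computes a compilation success rate and a line-level similarity between two runtime log outputs. The similarity measure here (`difflib.SequenceMatcher` over log lines) is a stand-in chosen for the example; this README does not specify the measure AL-Bench uses, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable


def compilation_success_rate(compiled_flags: Iterable[bool]) -> float:
    """Share of predictions whose patched code compiled successfully."""
    flags = list(compiled_flags)
    if not flags:
        raise ValueError("at least one compilation result is required")
    return sum(1 for ok in flags if ok) / len(flags)


def log_similarity(predicted_log: str, reference_log: str) -> float:
    """Line-level similarity in [0, 1] between two runtime log outputs.

    Assumption: a generic sequence-matching ratio, not AL-Bench's own measure.
    """
    pred_lines = predicted_log.splitlines()
    ref_lines = reference_log.splitlines()
    return SequenceMatcher(None, pred_lines, ref_lines, autojunk=False).ratio()


rate = compilation_success_rate([True, False, True, True])
sim = log_similarity(
    "start\nloading config\ndone",
    "start\nloading settings\ndone",
)
print(f"compilation success rate={rate:.2f}, log similarity={sim:.2f}")
```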

Evaluation Results

Static Evaluation

Dynamic Evaluation

Quick Start

Static Evaluation

Enter the Static_Evaluation directory:

cd Static_Evaluation

Run the evaluation script:

python eval/[tool_name]/run_eval.py

Dynamic Evaluation

We strongly recommend using Docker to run the dynamic evaluation.

Pull the Docker image:

docker pull boyintan/al-bench:hadoop-build

Run the Docker container:

docker run -it -v $(pwd):/home/al-bench boyintan/al-bench:hadoop-build /bin/bash

Enter the Dynamic_Evaluation directory:

cd Dynamic_Evaluation

Run the evaluation script:

python Dynamic_Evaluation/get_logs_output/execute_unittest.py --execute_id [execute_id] --results_dir [results_dir] --json_path [json_path] --use_catch_point [use_catch_point] --record_error [record_error] --num_thread [num_thread]

Note:

Prepare the data for dynamic evaluation in the following format:

[{
    "uuid": "uuid",
    "prediction": "prediction",
    "predicted_log_statement": {
        "log_statement": "log_statement",
        "log_position": "log_position"
    }
}]

"prediction" should be the code in standard format, with '\n' as the line break. "log_statement" should be the log statement as it appears in that code, and "log_position" should be the line number of the log statement in the code.

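The input format above can be produced with a short script. The following is a minimal sketch, not part of AL-Bench: `build_entry` is a hypothetical helper, and it assumes that "log_position" is a 1-based integer line number, which the format description does not state explicitly.

```python
import json


def build_entry(uuid: str, code: str, log_line: int) -> dict:
    """Build one input record for dynamic evaluation.

    Assumption: log_line is 1-based and stored as an integer.
    """
    lines = code.split("\n")
    if not 1 <= log_line <= len(lines):
        raise ValueError(f"log_line {log_line} is outside the code (1..{len(lines)})")
    return {
        "uuid": uuid,
        "prediction": code,
        "predicted_log_statement": {
            # Take the statement directly from the code so the two fields cannot drift apart.
            "log_statement": lines[log_line - 1].strip(),
            "log_position": log_line,
        },
    }


code = 'public void run() {\n    LOG.info("Starting task");\n    doWork();\n}'
entries = [build_entry("example-uuid", code, 2)]

# json.dumps escapes real newlines as \n, which matches the required line break.
payload = json.dumps(entries, indent=4)
print(payload)
```

Writing `payload` to a file gives something that can be passed via `--json_path`. Adjust the line numbering if the evaluation script expects 0-based positions.
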
Evaluated Logging Tools

FastLog
UniLog
LANCE
LEONID

Citation

If you use AL-Bench in your research, please cite our paper:

    @misc{tan2025albenchbenchmarkautomaticlogging,
        title={AL-Bench: A Benchmark for Automatic Logging}, 
        author={Boyin Tan and Junjielong Xu and Zhouruixing Zhu and Pinjia He},
        year={2025},
        eprint={2502.03160},
        archivePrefix={arXiv},
        primaryClass={cs.SE},
        url={https://arxiv.org/abs/2502.03160}, 
    }

License

This project is licensed under the MIT License - see the LICENSE file for details.
