Is Mutation Score a Fair Metric?

Abstract

Mutation score can be used to compare test suites in terms of mutant detection. However, it is not known whether the mutation score, being a summary of the detection ratios of different mutation types, is a fair metric for such comparisons. In this paper, we present an empirical study with 10 open-source projects that compares developer-written and automatically generated test suites in terms of mutation score and of the detection ratios of 7 mutation types. Our results indicate that the mutation score is fair, but they also suggest equivalence among mutants generated by PIT with different mutation operators.

This page provides the experimental material and the statistical analysis used in this experiment.

Experimental Material

Test Generation Tools

EvoSuite's Maven Plugin (version 1.0.6)

We used the argument -Duse_separate_classloader=false to avoid problems when measuring code coverage; otherwise, it could conflict with PIT's bytecode instrumentation.

Randoop (version 4.1.1)

Changed arguments: --flaky-test-behavior=DISCARD (to discard flaky tests) and --randomseed=x (to generate different test suites, since Randoop is deterministic by default). For each execution, x was drawn from a pseudorandom integer generator. All the values that we used as random seeds are listed in RandomSeedsRandoopExecutions.csv.
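The seed handling described above can be sketched as follows. This is a minimal illustration, not the script used in the experiment: the helper names, the seed range, the classpath, and the class list file are all assumptions. Only the two Randoop options come from the description above.

```python
import random


def draw_seed(rng: random.Random) -> int:
    # Assumption: seeds are non-negative integers that fit in a Java int.
    return rng.randrange(0, 2**31)


def randoop_args(seed: int, classpath: str, classlist: str) -> list[str]:
    # Build the argument list for one Randoop run.
    # classpath and classlist are hypothetical placeholders.
    return [
        "java", "-classpath", classpath,
        "randoop.main.Main", "gentests",
        f"--classlist={classlist}",
        "--flaky-test-behavior=DISCARD",  # drop tests that behave flakily
        f"--randomseed={seed}",           # vary the otherwise deterministic run
    ]


rng = random.Random()
seed = draw_seed(rng)
args = randoop_args(seed, "target/classes:randoop-all-4.1.1.jar", "classes.txt")
print(" ".join(args))
```

Recording each drawn seed, as done in RandomSeedsRandoopExecutions.csv, keeps every generated test suite reproducible.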

Mutation Testing Tool

PITest's Maven Plugin (version 1.4.5)
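
The abstract describes the mutation score as a summary of the detection ratios of individual mutation types. The sketch below illustrates that relationship on made-up data. The function names and the sample mutants are hypothetical, the mutator labels are only illustrative, and the score here is simply detected mutants over all mutants, with no adjustment for equivalent mutants.

```python
from collections import defaultdict


def detection_ratios(mutants):
    # mutants: iterable of (mutator_name, was_detected) pairs.
    totals = defaultdict(int)
    detected = defaultdict(int)
    for mutator, was_detected in mutants:
        totals[mutator] += 1
        if was_detected:
            detected[mutator] += 1
    # Per-mutator ratio: detected mutants / generated mutants.
    return {m: detected[m] / totals[m] for m in totals}


def mutation_score(mutants):
    # Overall score: detected mutants / all mutants (0.0 for an empty list).
    mutants = list(mutants)
    if not mutants:
        return 0.0
    return sum(1 for _, d in mutants if d) / len(mutants)


# Hypothetical sample, not data from the study.
sample = [
    ("MATH", True),
    ("MATH", True),
    ("MATH", False),
    ("NEGATE_CONDITIONALS", True),
]
print(detection_ratios(sample))
print(mutation_score(sample))
```

In this sample the overall score is dominated by the mutator that produces the most mutants, which is the kind of effect the fairness question is about.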

Case Study Applications

We used 10 projects from the Apache Commons repository. These projects already include test suites manually written by their developers.

Applications with regression test suites for each test generation technique

Data Analysis Scripts

Data Analysis Results
