A Closer Look at AUROC and AUPRC under Class Imbalance

Paper

If you use this code in your research, please cite the following paper:

@article{mcdermott2024closer,
  title={A closer look at {AUROC} and {AUPRC} under class imbalance},
  author={McDermott, Matthew and Zhang, Haoran and Hansen, Lasse and Angelotti, Giovanni and Gallifant, Jack},
  journal={Advances in Neural Information Processing Systems},
  volume={37},
  pages={44102--44163},
  year={2024}
}

To replicate the experiments in the paper:

Setting Up

Run the following commands to clone this repo and create the Conda environment:

git clone git@github.com:hzhang0/auc_bias.git
cd auc_bias
conda env create -f environment.yml
conda activate auc_bias

Synthetic Experiments

To reproduce the experiments on synthetic data (Section 3.1 of the paper), run the notebooks/synthetic_exps.ipynb notebook top to bottom.

Training a Single Model

To train a single model, run the auc_biases.train module (train.py) with the appropriate arguments, for example:

python -m auc_biases.train \
    --output_dir /output/dir \
    --dataset adult \
    --algorithm xgb \
    --balance_groups \
    --attribute 0 \
    --higher_prev_group_weight 3
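
To try several values of one argument without launching a full sweep, the command above can be generated programmatically. The following is a minimal sketch and not part of the repository: the helper name build_train_command, the per-run output subdirectories, and the chosen weight values are illustrative assumptions, while the module path and flags are copied from the example above. It only prints the commands and does not run them.

```python
import shlex


def build_train_command(output_dir, weight):
    """Return the argv list for one training run (hypothetical helper)."""
    return [
        "python", "-m", "auc_biases.train",
        "--output_dir", output_dir,
        "--dataset", "adult",
        "--algorithm", "xgb",
        "--balance_groups",
        "--attribute", "0",
        "--higher_prev_group_weight", str(weight),
    ]


# Illustrative weights only; not necessarily the values used in the paper.
commands = [
    build_train_command(f"/output/dir/weight_{w}", w) for w in (1, 2, 3)
]

# Print each command as a shell-quoted string that can be pasted into a terminal.
for cmd in commands:
    print(shlex.join(cmd))
```

Writing each run to its own output directory is a design choice of this sketch, made so that results from different weights do not overwrite each other.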

To obtain the mimic dataset, see the instructions in ProcessingMIMIC.md. The other three datasets are either included in the repository or downloaded automatically.

Training a Grid of Models

To reproduce the experiments in the paper that involve training a grid of models with different hyperparameters, use sweep.py as follows:

python sweep.py launch \
    --experiment {experiment_name} \
    --output_dir {output_root} \
    --command_launcher {launcher}

where:

experiment_name is the name of an experiment defined as a class in experiments.py.
output_root is a directory where experimental results will be stored.
launcher is a string corresponding to a launcher defined in command_launchers.py (such as slurm or local).
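
As a concrete instance of the placeholders above, the sketch below fills in the template with the experiment vary_group_weight_with_seeds and the local launcher, both of which are named in this README. The function name build_sweep_command and the output directory /output/root are hypothetical; the sketch only assembles and prints the command rather than executing it.

```python
import shlex


def build_sweep_command(experiment_name, output_root, launcher):
    """Return the argv list for `sweep.py launch` (hypothetical helper)."""
    return [
        "python", "sweep.py", "launch",
        "--experiment", experiment_name,
        "--output_dir", output_root,
        "--command_launcher", launcher,
    ]


# "/output/root" is a placeholder path; replace it with a real directory.
sweep_cmd = build_sweep_command(
    "vary_group_weight_with_seeds", "/output/root", "local"
)

# Print the command as a single shell-quoted line.
print(shlex.join(sweep_cmd))
```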

The experiment vary_group_weight_with_seeds corresponds to Figure 3. We have also uploaded the results of this experiment here. You can download this pickle file and place it in the notebooks folder before continuing to the next step.

Aggregating Results

After an experiment has finished running, run notebooks/agg_results.ipynb to create Figures 3, 7, and 8.
