APPS

APPS finetuning

In this folder we show how to train an autoregressive language model on the APPS dataset, since a common way to evaluate on this benchmark is to first finetune the model on its training split. We use the Hugging Face Trainer, which supports distributed training on multiple GPUs.
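As a rough illustration of what configuring the Hugging Face Trainer involves, the sketch below collects typical hyperparameters into keyword arguments named after real `transformers.TrainingArguments` fields. The helper `training_kwargs` and the `output_dir` default are hypothetical, and the actual `apps_train.py` may set things up differently.

```python
def training_kwargs(num_epochs, batch_size, gradient_accumulation_steps,
                    learning_rate, eval_freq, fp16, output_dir="./results"):
    """Build keyword arguments for transformers.TrainingArguments.

    Hypothetical helper: the keys are real TrainingArguments fields, but how
    apps_train.py actually fills them in is an assumption.
    """
    return {
        "output_dir": output_dir,  # hypothetical default path
        "num_train_epochs": num_epochs,
        "per_device_train_batch_size": batch_size,
        "gradient_accumulation_steps": gradient_accumulation_steps,
        "learning_rate": learning_rate,
        # eval_steps only takes effect when evaluation is scheduled by steps.
        "eval_steps": eval_freq,
        "fp16": fp16,
        "report_to": "wandb",  # log metrics to Weights & Biases
    }


kwargs = training_kwargs(
    num_epochs=10,
    batch_size=2,
    gradient_accumulation_steps=16,
    learning_rate=5e-5,
    eval_freq=250,
    fp16=True,
)

# With transformers installed, these could be passed on as:
#   from transformers import TrainingArguments
#   args = TrainingArguments(**kwargs)
print(kwargs["per_device_train_batch_size"], kwargs["gradient_accumulation_steps"])
```

The batch size here is per device; the Trainer multiplies it by the number of GPUs and the accumulation steps to obtain the effective batch size.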

Setup

First, log in to Weights & Biases:

wandb login

You can finetune a model, gpt_345_python_any_license for example, by running:

# we use a global batch size of 256, here = 8 (GPUs) * 2 (batch_size_per_device) * 16 (gradient_accumulation)
python apps_train.py \
        --model_ckpt BigCode/gpt_345_python_any_license \
        --num_epochs 10 \
        --batch_size 2 \
        --gradient_accumulation_steps 16 \
        --learning_rate 5e-5 \
        --eval_freq 250 \
        --fp16

Fine-tuning takes 11 hours on 4 A100 GPUs.
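The global batch size quoted in the command's comment is the product of GPU count, per-device batch size, and gradient accumulation steps. The sketch below reproduces that arithmetic and shows how the accumulation steps would need to change to keep the same global batch size on a different number of GPUs. Both function names are hypothetical helpers, not part of `apps_train.py`.

```python
def global_batch_size(num_gpus: int, per_device_batch_size: int,
                      grad_accum_steps: int) -> int:
    """Number of samples that contribute to a single optimizer step."""
    return num_gpus * per_device_batch_size * grad_accum_steps


def accumulation_steps_for(target_global_batch: int, num_gpus: int,
                           per_device_batch_size: int) -> int:
    """Gradient accumulation steps needed to reach a target global batch size."""
    samples_per_forward = num_gpus * per_device_batch_size
    if target_global_batch % samples_per_forward != 0:
        raise ValueError(
            f"{target_global_batch} is not a multiple of {samples_per_forward}"
        )
    return target_global_batch // samples_per_forward


# Values from the command's comment: 8 GPUs, batch size 2, 16 accumulation steps.
print(global_batch_size(8, 2, 16))        # 256

# Keeping a global batch size of 256 on 4 GPUs with the same per-device batch size.
print(accumulation_steps_for(256, 4, 2))  # 32
```

Under these assumptions, halving the number of GPUs means doubling `--gradient_accumulation_steps` to keep the optimizer seeing the same number of samples per update.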

Acknowledgments

This script is adapted from the APPS repository.

