Finetuning

In this folder we show how to fine-tune an autoregressive language model on the following evaluation and downstream tasks, which together cover 7 programming languages:

  • APPS: Python benchmark for evaluating code generation. It is similar to HumanEval and MBPP, but it is more challenging and has more evaluation problems.
  • CodeComplex: Java benchmark framed as a classification problem: predict the algorithmic complexity of a Java program among 7 labels.
  • CodeClone: Java benchmark from the CodeXGLUE dataset, framed as a binary classification problem: predict whether two programs are semantically equivalent. [WIP]
  • CodeDefect: C benchmark from CodeXGLUE, framed as a binary classification problem: predict whether a piece of code is insecure, i.e. contains defects that could be used to attack software systems. [WIP]
  • Code-to-text: Dataset from CodeXGLUE for generating natural language comments from code in Python, Go, Java, JavaScript, PHP and Ruby. This task can also be done in a zero-shot setting, without fine-tuning. [WIP]

We use the Hugging Face Trainer API for all tasks; it supports distributed training on multiple GPUs.
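
As a rough illustration of what a Trainer-based setup can look like, here is a minimal sketch. The helper names `build_training_kwargs` and `make_trainer`, the default hyperparameter values, and the `./checkpoints` path are assumptions made for this example and are not taken from the task folders; only `Trainer` and `TrainingArguments` are actual `transformers` classes.

```python
# Hypothetical helpers for illustration; the scripts in each task folder may differ.

def build_training_kwargs(output_dir, num_epochs=3, batch_size=8, learning_rate=2e-5):
    """Collect keyword arguments for transformers.TrainingArguments.

    All default values are placeholders, not recommended settings.
    """
    return {
        "output_dir": output_dir,
        "num_train_epochs": num_epochs,
        "per_device_train_batch_size": batch_size,
        "per_device_eval_batch_size": batch_size,
        "learning_rate": learning_rate,
    }


def make_trainer(model, train_dataset, eval_dataset,
                 output_dir="./checkpoints", compute_metrics=None):
    """Wrap an already loaded model and tokenized datasets in a Trainer."""
    # Imported lazily so the configuration helper above works without transformers.
    from transformers import Trainer, TrainingArguments

    args = TrainingArguments(**build_training_kwargs(output_dir))
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        compute_metrics=compute_metrics,
    )
```

With a model and datasets in hand, `make_trainer(model, train_ds, eval_ds).train()` would start fine-tuning. For multi-GPU runs, a script built this way is commonly started with a distributed launcher such as `torchrun`; the README in each task folder remains the reference for the actual command and arguments.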

The evaluation score on the test set is reported at the end of fine-tuning. For implementation details, please refer to the README inside each folder.
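
For the classification tasks, one way such a score can be computed is through a `compute_metrics` callback passed to the Trainer. The sketch below assumes accuracy as the metric and uses plain Python so it has no dependencies; the metric actually reported by each task may differ, so treat this as an illustration only.

```python
def argmax(row):
    """Return the index of the largest value in a sequence of scores."""
    return max(range(len(row)), key=lambda i: row[i])


def compute_metrics(eval_pred):
    """Compute accuracy from a (logits, labels) pair.

    The Trainer passes an object that unpacks into predictions and labels;
    a plain tuple of nested lists works the same way here.
    """
    logits, labels = eval_pred
    predictions = [argmax(row) for row in logits]
    total = len(labels)
    if total == 0:
        return {"accuracy": 0.0}
    correct = sum(int(p) == int(y) for p, y in zip(predictions, labels))
    return {"accuracy": correct / total}
```

Passing this function as `compute_metrics=compute_metrics` when building a Trainer makes `trainer.evaluate()` include the accuracy alongside the loss in its returned metrics.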