Add Megatron support for the EleutherAI Evaluation Harness #137

@ibeltagy

Description

Add the ability to run the EleutherAI Evaluation Harness on Megatron checkpoints directly. Right now we rely on converting Megatron checkpoints to Hugging Face checkpoints, which is an error-prone process, and we have to use Megatron anyway to run the 200B model.

Implementation details:
Use the harness's HF gpt2 model implementation here as your reference. Here are more details:

  • Edit __init__ and create_from_arg_string to load the Megatron checkpoints
  • Edit the _model_call function to run the Megatron model and read the logits back
  • The functions loglikelihood, loglikelihoods, _loglikelihood_tokens might (or might not) require a little tweaking
  • Leave the function greedy_until unimplemented (raise an exception); we don't need it for now.
  • Check this test that shows how to load and call a Megatron checkpoint.
  • Here's one Megatron checkpoint that you can work with.
  • A relatively close implementation is already in the GPT-NeoX repo here and it might be helpful to check as well.
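To make the steps above concrete, here is a minimal sketch of what the adapter class could look like, modeled loosely on the harness's HF gpt2 wrapper. The class name MegatronLM, the checkpoint_path/max_length parameters, and the _load_checkpoint helper are all hypothetical; the real Megatron initialization and checkpoint-loading calls would come from Megatron-LM itself (see the linked test for how to load and call a checkpoint).

```python
# Hypothetical sketch of a Megatron adapter for the eval harness.
# Megatron-specific pieces (_load_checkpoint, the forward pass) are
# placeholders; the real code would use Megatron-LM's utilities.

class MegatronLM:
    """Wraps a Megatron model behind the interface the harness expects."""

    def __init__(self, model=None, checkpoint_path=None, max_length=2048):
        # Accept an already-built model (handy for testing) or a
        # checkpoint path to load from.
        if model is None and checkpoint_path is not None:
            model = self._load_checkpoint(checkpoint_path)
        self.model = model
        self.max_length = max_length

    @classmethod
    def create_from_arg_string(cls, arg_string, additional_config=None):
        # The harness passes comma-separated key=value pairs,
        # e.g. "checkpoint_path=/ckpts/gpt,max_length=2048".
        kwargs = {}
        for pair in filter(None, arg_string.split(",")):
            key, value = pair.split("=", 1)
            kwargs[key] = int(value) if value.isdigit() else value
        kwargs.update(additional_config or {})
        return cls(**kwargs)

    def _load_checkpoint(self, path):
        # Placeholder: real code would initialize Megatron (args, mpu,
        # model-parallel groups) and call its checkpoint loader here.
        raise NotImplementedError("wire up Megatron checkpoint loading")

    def _model_call(self, inps):
        # inps: [batch, seq] token ids; must return [batch, seq, vocab]
        # logits so the shared loglikelihood code can consume them.
        return self.model(inps)

    def greedy_until(self, requests):
        # Deliberately unimplemented, per the issue.
        raise NotImplementedError("greedy_until is not needed for now")
```

With this shape, the shared loglikelihood / _loglikelihood_tokens machinery in the harness base class should mostly work unchanged, since it only depends on _model_call returning per-token logits.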

Metadata

Labels

Good First Issue (Good for newcomers), arch&scale (Architecture and Scaling Modeling Group)
