Closed
Labels: Good First Issue (Good for newcomers), arch&scale (Architecture and Scaling Modeling Group)
Description
Add the ability to run the EleutherAI Evaluation Harness on Megatron checkpoints. Right now we rely on converting Megatron checkpoints to Hugging Face checkpoints, which is an error-prone process. We also have to use Megatron anyway to run the 200B model.
Implementation details:
You will use this HF gpt2 model implementation here as your reference. Here are more details:
- Edit `__init__` and `create_from_arg_string` to load the Megatron checkpoints.
- Edit the `_model_call` function to call the Megatron model and read the logits back.
- The functions `loglikelihood`, `loglikelihoods`, and `_loglikelihood_tokens` might (or might not) require a little tweaking.
- Leave the function `greedy_until` unimplemented (raise an exception); we don't need it for now.
- Check this test that shows how to load and call a Megatron checkpoint.
- Here's one Megatron checkpoint that you can work with.
- A relatively close implementation already exists in the GPT-NeoX repo here; it might be helpful to check as well.
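To make the steps above concrete, here is a rough, hypothetical sketch of what the adapter class could look like. In the real implementation it would subclass the harness's base LM class and build/load the actual Megatron model; the stubbed-out `_model_call` body and the `checkpoint_path` argument name are assumptions for illustration, not part of the issue.

```python
class MegatronLM:
    """Hypothetical adapter exposing a Megatron checkpoint to the eval harness."""

    def __init__(self, checkpoint_path=None):
        # In the real adapter this would construct the Megatron model and
        # load the checkpoint; here we only record the path.
        self.checkpoint_path = checkpoint_path
        self.model = None  # placeholder for the loaded Megatron model

    @classmethod
    def create_from_arg_string(cls, arg_string):
        # Parse "key=value,key=value" argument strings, mirroring the
        # convention used by the HF gpt2 reference implementation.
        kwargs = {}
        for pair in filter(None, arg_string.split(",")):
            key, _, value = pair.partition("=")
            kwargs[key.strip()] = value.strip()
        return cls(**kwargs)

    def _model_call(self, inputs):
        # Would run a forward pass through the Megatron model and return
        # the logits; left as a stub in this sketch.
        raise NotImplementedError("wire this to the Megatron forward pass")

    def greedy_until(self, requests):
        # Deliberately unimplemented, as the issue requests.
        raise NotImplementedError("greedy_until is not needed for now")
```

The `create_from_arg_string` parser matches how the harness passes model arguments on the command line, so a checkpoint could then be selected with something like `MegatronLM.create_from_arg_string("checkpoint_path=/path/to/ckpt")`.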