
train_batch_size + dataset + actual batch size #62

@agemagician

Hello,

I have 4 questions for clarification:

  1. Why should we pass the `training_data` to `deepspeed.initialize` to generate a new trainloader, rather than using a normal torch trainloader?
  2. Can we use a custom PyTorch trainloader in case we have a custom dataset that returns, for example, inputs, outputs, and a mask? (See the sketch after this list.)
  3. What will happen if the actual batch size passed to the model is different from the `train_batch_size` in the JSON config file?
  4. Can we define only `gradient_accumulation_steps` and `train_micro_batch_size_per_gpu` and let DeepSpeed calculate `train_batch_size` automatically? (See the config sketch below.)
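
For context, here is a minimal sketch of the two patterns questions 1 and 2 are contrasting. The dataset class, the config path, and the loss are made-up placeholders, and I am assuming the `deepspeed.initialize` signature that accepts `training_data` and returns `(engine, optimizer, trainloader, lr_scheduler)`:

```python
import torch
import deepspeed
from torch.utils.data import Dataset, DataLoader

# Hypothetical dataset returning (inputs, outputs, mask) triples,
# as described in question 2.
class MyTripleDataset(Dataset):
    def __init__(self, n=1024, dim=16):
        self.x = torch.randn(n, dim)
        self.y = torch.randn(n, dim)
        self.mask = torch.ones(n, dim)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, i):
        return self.x[i], self.y[i], self.mask[i]

model = torch.nn.Linear(16, 16)
dataset = MyTripleDataset()

# Pattern A: hand the dataset to DeepSpeed and let it build the loader.
engine, optimizer, trainloader, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    training_data=dataset,
    config="ds_config.json",  # hypothetical config path
)

# Pattern B: keep a plain PyTorch DataLoader and feed its batches
# to the engine manually (what question 2 asks about).
my_loader = DataLoader(dataset, batch_size=8, shuffle=True)
for x, y, mask in my_loader:
    x, y, mask = x.to(engine.device), y.to(engine.device), mask.to(engine.device)
    loss = ((engine(x) - y) * mask).pow(2).mean()
    engine.backward(loss)
    engine.step()
```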

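And for questions 3 and 4, this is the kind of config I have in mind (values made up). My understanding is that the batch keys should satisfy `train_batch_size = train_micro_batch_size_per_gpu * gradient_accumulation_steps * world_size`, so the first ought to be derivable from the other two:

```python
# Hypothetical DeepSpeed config with train_batch_size omitted on purpose.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 4,
    # On a 2-GPU run this would imply train_batch_size = 8 * 4 * 2 = 64.
    # Question 4 asks whether DeepSpeed fills that in automatically;
    # question 3 asks what happens if the batches I actually feed the
    # model do not match this arithmetic.
}
```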