
train_batch_size + dataset + actual batch size #62

@agemagician

Hello,

I have 4 questions for clarification:

  1. Why should we pass the `training_data` to `deepspeed.initialize` to generate a new trainloader, rather than using a normal torch trainloader?
  2. Can we use a custom PyTorch trainloader in case we have a custom dataset that returns, for example, inputs, outputs, and a mask? (See the sketch after this list.)
  3. What will happen if the actual batch size passed to the model is different from the `train_batch_size` in the JSON config file?
  4. Can we define only `gradient_accumulation_steps` and `train_micro_batch_size_per_gpu` and let DeepSpeed calculate `train_batch_size` automatically? (See the config sketch below.)
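
For context, here is a minimal sketch of the two patterns questions 1 and 2 are contrasting. The dataset class, the config path, and the loss are made-up placeholders, and I am assuming the `deepspeed.initialize` signature that accepts `training_data` and returns `(engine, optimizer, trainloader, lr_scheduler)`:

```python
import torch
import deepspeed
from torch.utils.data import Dataset, DataLoader

# Hypothetical dataset returning (inputs, outputs, mask) triples,
# as described in question 2.
class MyTripleDataset(Dataset):
    def __init__(self, n=1024, dim=16):
        self.x = torch.randn(n, dim)
        self.y = torch.randn(n, dim)
        self.mask = torch.ones(n, dim)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, i):
        return self.x[i], self.y[i], self.mask[i]

model = torch.nn.Linear(16, 16)
dataset = MyTripleDataset()

# Pattern A: hand the dataset to DeepSpeed and let it build the loader.
engine, optimizer, trainloader, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    training_data=dataset,
    config="ds_config.json",  # hypothetical config path
)

# Pattern B: keep a plain PyTorch DataLoader and feed its batches
# to the engine manually (what question 2 asks about).
my_loader = DataLoader(dataset, batch_size=8, shuffle=True)
for x, y, mask in my_loader:
    x, y, mask = x.to(engine.device), y.to(engine.device), mask.to(engine.device)
    loss = ((engine(x) - y) * mask).pow(2).mean()
    engine.backward(loss)
    engine.step()
```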

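And for questions 3 and 4, this is the kind of config I have in mind (values made up). My understanding is that the batch keys should satisfy `train_batch_size = train_micro_batch_size_per_gpu * gradient_accumulation_steps * world_size`, so the first ought to be derivable from the other two:

```python
# Hypothetical DeepSpeed config with train_batch_size omitted on purpose.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 4,
    # On a 2-GPU run this would imply train_batch_size = 8 * 4 * 2 = 64.
    # Question 4 asks whether DeepSpeed fills that in automatically;
    # question 3 asks what happens if the batches I actually feed the
    # model do not match this arithmetic.
}
```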