
Conversation


@jeffra jeffra commented Feb 8, 2020

No description provided.

@jeffra jeffra linked an issue Feb 8, 2020 that may be closed by this pull request
@jeffra jeffra requested a review from ShadenSmith February 9, 2020 05:57
@jeffra jeffra merged commit 20ff66a into master Feb 9, 2020
@jeffra jeffra deleted the jeffra/azure_updates branch February 9, 2020 06:00
See the [Megatron tutorial](tutorials/MegatronGPT2Tutorial.md) for more details.
* To fully train GPT2 with DeepSpeed and ZeRO, we recommend using 8 instances of
Azure's Standard_ND40rs_v2 SKU, for a total of 64 NVIDIA V100 GPUs. With this setup you
should be able to train on 153.6 million samples in less than 2 weeks (see the sketch below).
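
For readers following the doc snippet above, here is a minimal, illustrative sketch of what driving such a run with DeepSpeed might look like. It is not part of this PR: the script name, the `ds_config.json` filename, and the `build_gpt2_model` / `build_data_loader` helpers are assumptions for illustration; the tutorial linked above is the authoritative recipe.

```python
# Illustrative sketch (not from this PR): wrapping a GPT2-style model with
# DeepSpeed so ZeRO, fp16, and the effective batch size come from ds_config.json.
# Launched across 8 ND40rs_v2 nodes (8 V100s each, 64 GPUs total), e.g.:
#   deepspeed --hostfile=hostfile train.py --deepspeed --deepspeed_config ds_config.json
import argparse
import deepspeed

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--local_rank", type=int, default=-1)  # injected by the launcher
    parser = deepspeed.add_config_arguments(parser)  # adds --deepspeed, --deepspeed_config
    args = parser.parse_args()

    model = build_gpt2_model()          # hypothetical helper returning the model
    data_loader = build_data_loader()   # hypothetical helper yielding training batches

    # DeepSpeed sets up distributed training, the optimizer, and ZeRO partitioning.
    model_engine, optimizer, _, _ = deepspeed.initialize(
        args=args, model=model, model_parameters=model.parameters())

    for batch in data_loader:
        loss = model_engine(batch)      # assumes the model's forward returns the loss
        model_engine.backward(loss)
        model_engine.step()

if __name__ == "__main__":
    main()
```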
Contributor


I think it's important to mention the batch size since performance depends so much on it. Maybe say: with this setup you should be able to train for 100K steps at batch size 1536 (153.6 million samples) in less than 2 weeks?
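
For reference, the figures in this suggestion are internally consistent; a quick check (assuming the batch size counts samples per step, not tokens):

```python
# Sanity check of the numbers in the suggestion above.
steps = 100_000
batch_size = 1536
samples = steps * batch_size
assert samples == 153_600_000  # i.e. 153.6 million samples
print(f"{steps:,} steps x batch size {batch_size:,} = {samples:,} samples")
```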

kouml pushed a commit to kouml/DeepSpeed that referenced this pull request Apr 3, 2020
jeffra pushed a commit that referenced this pull request May 19, 2020
rraminen pushed a commit to rraminen/DeepSpeed that referenced this pull request Nov 18, 2021
delock referenced this pull request in delock/DeepSpeedSYCLSupport Sep 21, 2022
Liangliang-Ma added a commit to Liangliang-Ma/DeepSpeed that referenced this pull request Aug 2, 2024


Development

Successfully merging this pull request may close these issues.

Installation documentation needed

4 participants