
Conversation

@shjwudp (Contributor) commented Jul 28, 2022

Fix a BF16_Optimizer compatibility issue with 0-dim tensors in optimizer state: tensor.narrow does not support 0-dim tensors.
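
For context, here is a minimal repro of the failure plus a sketch of the kind of guard the fix needs. The copy_state_fragment helper below is only an illustration of the idea, not the actual code inside BF16_Optimizer:

```python
import torch

# A 0-dim (scalar) tensor, like the RMS value some optimizers keep in their state.
scalar_state = torch.tensor(3.14)

# narrow() requires at least one dimension, so calling it on a 0-dim tensor
# raises a RuntimeError.
try:
    scalar_state.narrow(0, 0, 1)
except RuntimeError as err:
    print(err)

# Sketch of a guard: skip narrow() for 0-dim state and copy the scalar whole.
# (copy_state_fragment is a hypothetical helper, for illustration only.)
def copy_state_fragment(src, start, length):
    if src.dim() == 0:
        return src.clone()
    return src.narrow(0, start, length).clone()

print(copy_state_fragment(scalar_state, 0, 1))       # tensor(3.1400)
print(copy_state_fragment(torch.arange(4.0), 1, 2))  # tensor([1., 2.])
```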

@tjruwase (Contributor)
@shjwudp, thanks for this PR. Can you share a bit more context on when this fails?

@shjwudp (Contributor, Author) commented Jul 28, 2022

> @shjwudp, thanks for this PR. Can you share a bit more context on when this fails?

OK, this is really an edge case: in fairseq's Adafactor implementation, the RMS is calculated with the formula tensor.norm(2) / (tensor.numel() ** 0.5), which produces a 0-dim tensor.
https://github.com/facebookresearch/fairseq/blob/main/fairseq/optim/adafactor.py#L223
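
For reference, a small snippet (my own illustration, not from fairseq) showing that this formula yields a 0-dim tensor, which then ends up in the optimizer state:

```python
import torch

# RMS as computed in fairseq's Adafactor: tensor.norm(2) / (tensor.numel() ** 0.5)
p = torch.randn(8, 16)
rms = p.norm(2) / (p.numel() ** 0.5)

print(rms.dim(), rms.shape)  # 0 torch.Size([]) -- a 0-dim tensor
```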

@tjruwase (Contributor)
Interesting, thanks for sharing. So, to clarify, are you applying bf16_optimizer to adafactor for large model training? If so, we would be very interested in your experience. I suspect there might be issues with checkpoint load/stores, perhaps :).

@tjruwase tjruwase merged commit 57140e8 into deepspeedai:master Jul 28, 2022