Commit adde81d
wayi
Update on "[Gradient Compression] Allow BatchedPowerSGD to run vanilla allreduce for the first K iterations"
Similar to #50973, allow the batched version to run vanilla allreduce for the first K iterations.
This may be useful if the batched version can be applied to some use cases where the accuracy requirement is not very strict.
Original PR issue: Investigate Applying PowerSGD to Communication Hook for Gradient Compression #47202
Differential Revision: [D26077709](https://our.internmc.facebook.com/intern/diff/D26077709/)
[ghstack-poisoned]1 file changed
Lines changed: 1 addition & 1 deletion
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
430 | 430 | | |
431 | 431 | | |
432 | 432 | | |
433 | | - | |
| 433 | + | |
434 | 434 | | |
435 | 435 | | |
436 | 436 | | |
| |||
0 commit comments