🚀 Feature
Can we make PrepareOutputShardingPropagation (https://github.com/pytorch/xla/blob/master/torch_xla/csrc/xla_graph_executor.cpp#L1076) asynchronous, to reduce the gaps between training steps?
@JackCaoG @yeounoh
Motivation
When using SPMD, the timeline shows gaps between training steps. These gaps may be caused by PrepareOutputShardingPropagation.