Empty batch support for SyncBatchNorm #36530
Closed
Labels
- enhancement: Not as big of a feature, but technically not a bug. Should be easy to fix
- module: nn: Related to torch.nn
- triaged: This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Description
🚀 Feature
Support empty batches in SyncBatchNorm.
Motivation
#36382 fixed SyncBatchNorm for cases where different workers have different batch sizes. But when some or all workers have a batch size of zero, the behavior is still incorrect.
Similar to how BatchNorm supports empty batch sizes now (in #12013 (comment)), the expected behavior for SyncBatchNorm should be:
- forward/backward should work properly when some or all workers have a zero batch size. In particular, inputs with no elements should receive non-None gradients, and parameters should receive zero gradients when the total batch size is 0.
- when the total batch size is 0, moving_mean/moving_var should not be updated.
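The expected semantics above can be sketched in pure Python (hypothetical helper, not PyTorch's actual implementation): each worker contributes a (sum, sum of squares, count) triple, so a worker with an empty batch is simply a no-op in the reduction, and the running statistics are left untouched when the global count is 0.

```python
def combine_worker_stats(worker_batches, running_mean, momentum=0.1):
    """Hypothetical sketch: combine per-worker batches for one channel.

    worker_batches: list of lists of values, one inner list per worker
    (an empty inner list models a worker with batch size 0).
    Returns (new_running_mean, (mean, var)) or (running_mean, None)
    when the total batch size across all workers is 0.
    """
    total_sum = sum(sum(b) for b in worker_batches)
    total_sq = sum(sum(v * v for v in b) for b in worker_batches)
    total_n = sum(len(b) for b in worker_batches)
    if total_n == 0:
        # Global batch is empty: running stats must not be updated.
        return running_mean, None
    mean = total_sum / total_n
    var = total_sq / total_n - mean * mean  # biased (population) variance
    new_running_mean = (1 - momentum) * running_mean + momentum * mean
    return new_running_mean, (mean, var)
```

With this shape, a worker holding an empty batch participates in the collective reduction like any other worker (so forward/backward do not deadlock), but contributes nothing to the statistics.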
However, currently using SyncBatchNorm with an empty batch produces this error:
File "..../torch/nn/modules/_functions.py", line 17, in forward
mean, invstd = torch.batch_norm_stats(input, eps)
RuntimeError: cannot reshape tensor of 0 elements into shape [0, 3, -1] because the unspecified dimension size -1 can be any value and is ambiguous
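The error comes from computing per-worker statistics on a zero-element input. One possible fix, sketched here in pure Python with hypothetical names (not PyTorch's actual code), is to guard the local stats computation so an empty batch reports a count of 0 with neutral placeholder values instead of raising:

```python
def local_stats(values, eps=1e-5):
    """Hypothetical sketch: per-worker (mean, invstd, count) for one channel.

    When the worker's batch is empty, return placeholder stats with
    count 0 so the cross-worker reduction can ignore this worker,
    rather than failing in the reshape as batch_norm_stats does today.
    """
    n = len(values)
    if n == 0:
        return 0.0, 0.0, 0
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased variance
    invstd = 1.0 / (var + eps) ** 0.5
    return mean, invstd, n
```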
Alternatives
We implement it here, but it is an inefficient Python-based implementation.
cc @jjsjann123