
convert_sync_batchnorm should respect device affinity #37930

@mrshenli

Description

torch.nn.SyncBatchNorm.convert_sync_batchnorm by default places newly created parameters on CPU even if all parameters of the input model are on GPU. It might be better to respect the input module's device affinity and place the new parameters on the same device where the original _BatchNorm module resides.
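
One way to do that, as a minimal sketch using only public APIs (the helper convert_respecting_device below is hypothetical, not an existing torch function), is to read the device from any tensor the original module owns and move the freshly constructed SyncBatchNorm there before copying state over:

import torch

def convert_respecting_device(bn, process_group=None):
    # Build a SyncBatchNorm with the same configuration as `bn`.
    sync_bn = torch.nn.SyncBatchNorm(
        bn.num_features, bn.eps, bn.momentum,
        bn.affine, bn.track_running_stats, process_group)
    # Learn the source device from any parameter or buffer of `bn`,
    # then move the new module there before copying state.
    src = next(bn.parameters(), None)
    if src is None:
        src = next(bn.buffers(), None)
    if src is not None:
        sync_bn = sync_bn.to(src.device)
    with torch.no_grad():
        if bn.affine:
            sync_bn.weight.copy_(bn.weight)
            sync_bn.bias.copy_(bn.bias)
        if bn.track_running_stats:
            sync_bn.running_mean.copy_(bn.running_mean)
            sync_bn.running_var.copy_(bn.running_var)
            sync_bn.num_batches_tracked.copy_(bn.num_batches_tracked)
    return sync_bn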

To reproduce:

import torch
module = torch.nn.Sequential(
    torch.nn.Linear(20, 100),
    torch.nn.BatchNorm1d(100),
).cuda()
# All parameters start on the GPU.
print({p.device for p in module.parameters()})
sync_bn_module = torch.nn.SyncBatchNorm.convert_sync_batchnorm(module)
# Some of the converted module's parameters end up on the CPU.
print({p.device for p in sync_bn_module.parameters()})

The output is:

{device(type='cuda', index=0)}
{device(type='cuda', index=0), device(type='cpu')}
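
Until the conversion respects device affinity, a minimal workaround (assuming, as in the repro above, that every parameter of the input model lives on a single device) is to move the converted module back to that device:

device = next(module.parameters()).device  # device of the original module
sync_bn_module = torch.nn.SyncBatchNorm.convert_sync_batchnorm(module).to(device)
print({p.device for p in sync_bn_module.parameters()})  # {device(type='cuda', index=0)}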

cc @albanD @mruberry

Labels

module: bootcamp (We plan to do a full writeup on the issue, and then get someone to do it for onboarding)
module: nn (Related to torch.nn)
module: regression (It used to work, and now it doesn't)
triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
