
fix corner case in kwargs for DataParallel #930

Merged
soumith merged 3 commits into master from dpfix on Mar 5, 2017

Conversation

@soumith (Collaborator) commented Mar 5, 2017

When an input with dim[0] = 5 is sent to DataParallel on 4 GPUs, it is scattered as chunks of size (2, 2, 1), and GPU-4 is unused.
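For reference, the uneven split is just how chunking along dim 0 behaves (a minimal sketch, not from this PR):

```python
import torch

# A batch of 5 chunked across 4 devices: chunk sizes round up to
# ceil(5 / 4) = 2, so only 3 chunks of sizes (2, 2, 1) are produced
# and the 4th device gets nothing.
batch = torch.randn(5, 10)
chunks = batch.chunk(4, dim=0)
print(len(chunks))                  # 3
print([c.size(0) for c in chunks])  # [2, 2, 1]
```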

In this case, the kwargs scattering logic is broken: it does not check how many GPUs were actually used and instead iterates over all of self.device_ids, which is wrong.
There is a second bug: the logic uses the values of self.device_ids, assuming they are given in ascending order starting from 0. If they are given as (1, 2, 3, 0), the kwargs are sent to the wrong GPUs.

Fix these corner cases.
cc: @csarofeen
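
A hypothetical repro of both corner cases (the module, shapes, and the `scale` kwarg are assumptions for illustration; needs 4 GPUs):

```python
import torch
import torch.nn as nn

class Net(nn.Module):
    def forward(self, x, scale=1.0):
        return x * scale

# Batch of 5 on 4 GPUs (the last scatter slot stays empty), with device_ids
# given out of order and not starting at 0.
model = nn.DataParallel(Net().cuda(1), device_ids=[1, 2, 3, 0])
out = model(torch.randn(5, 10).cuda(1), scale=2.0)  # `scale` travels through kwargs
```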

Comment thread on torch/nn/parallel/data_parallel.py (outdated):

```python
replicas = self.replicate(self.module, self.device_ids)
scattered = self.scatter(inputs, self.device_ids)

used_gpus = len(scattered)  # the last GPU might not be used, e.g. input of size 5 on 4 GPUs
```


Comment thread on torch/nn/parallel/data_parallel.py (outdated):

```python
    for i in self.device_ids
)

gpu_dicts = tuple()
```


Comment thread on torch/nn/parallel/data_parallel.py (outdated):

```python
)

gpu_dicts = tuple()
for i in range(used_gpus):
```


Comment thread on torch/nn/parallel/data_parallel.py (outdated):

```python
used_gpus = len(scattered)  # the last GPU might not be used, e.g. input of size 4 on 5 GPUs
gpu_dicts = None
if kwargs:
```
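Taken together, the fragments above amount to the following shape for the kwargs handling (a paraphrased sketch of the idea, not the exact merged diff):

```python
scattered = self.scatter(inputs, self.device_ids)
used_gpus = len(scattered)  # may be fewer than len(self.device_ids)

gpu_dicts = None
if kwargs:
    # Build one kwargs entry per GPU that actually received an input chunk,
    # indexing positionally (0 .. used_gpus - 1) rather than by the values
    # in self.device_ids. (The merged diff may scatter tensor-valued kwargs
    # per device; this sketch simply replicates them.)
    gpu_dicts = tuple(kwargs for _ in range(used_gpus))
```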

soumith merged commit 60736bd into master on Mar 5, 2017
soumith deleted the dpfix branch on March 5, 2017 at 19:27
jjsjann123 pushed a commit to jjsjann123/pytorch that referenced this pull request Aug 5, 2021
