Skip to content

Fix CUDA division by a scalar on large arrays.#12023

Closed
colesbury wants to merge 3 commits intopytorch:masterfrom
colesbury:cuda_div_scalar
Closed

Fix CUDA division by a scalar on large arrays.#12023
colesbury wants to merge 3 commits intopytorch:masterfrom
colesbury:cuda_div_scalar

Conversation

@colesbury
Copy link
Member

The gpu_unary_kernel function was not handling arrays that
cannot use 32-bit indexing. This functions was only called directly
by CUDA division by a scalar. Other arithmetic operations go through
gpu_binary_kernel, which already properly handled large arrays.

This bug sometimes manifested as a crash and sometimes as an incorrect
answer.

Fixes #11788

The gpu_unary_kernel function was not handling arrays that
cannot use 32-bit indexing. This functions was only called directly
by CUDA division by a scalar. Other arithmetic operations go through
gpu_binary_kernel, which already properly handled large arrays.

This bug sometimes manifested as a crash and sometimes as an incorrect
answer.

Fixes pytorch#11788
Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

colesbury has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

zdevito pushed a commit to zdevito/ATen that referenced this pull request Sep 25, 2018
Summary:
The gpu_unary_kernel function was not handling arrays that
cannot use 32-bit indexing. This functions was only called directly
by CUDA division by a scalar. Other arithmetic operations go through
gpu_binary_kernel, which already properly handled large arrays.

This bug sometimes manifested as a crash and sometimes as an incorrect
answer.

Fixes #11788
Pull Request resolved: pytorch/pytorch#12023

Differential Revision: D10034017

Pulled By: colesbury

fbshipit-source-id: b17300f327de54035746bf02f576766007c9b144
@ezyang ezyang added the merged label Jun 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

crash when trying to divide or multiply a large tensor on voltas, but not on pascal

4 participants