
Fix overflow in torch.remainder when dividend is very large #37758

Closed
xwang233 wants to merge 2 commits into pytorch:master from xwang233:remainder-fix

Conversation

Collaborator

@xwang233 xwang233 commented May 4, 2020

This will fix the GPU implementation in #37743 and #24861. Please also check my comment.

The fixed remainder_kernel follows a similar implementation in NumPy. See https://github.com/numpy/numpy/blob/79d7bc276afbe89c746e462d28d4bfbb4fc56148/numpy/core/src/npymath/npy_math_internal.h.src#L649-L658
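As a rough illustration of the NumPy approach the fix follows (a plain-Python sketch, not the actual CUDA kernel): compute fmod first, then adjust the sign, rather than something like a - floor(a / b) * b, which loses precision when the dividend is very large.

```python
import math

def safe_remainder(a: float, b: float) -> float:
    # Hypothetical helper sketching the numpy-style remainder:
    # fmod never magnifies the dividend, so it stays exact even
    # for very large a.
    mod = math.fmod(a, b)
    if mod != 0.0:
        # fmod takes the sign of the dividend; when that disagrees
        # with the divisor's sign, shift the result into the
        # divisor's half-open range by adding b once.
        if (b < 0) != (mod < 0):
            mod += b
    else:
        # keep a signed zero matching the divisor, as numpy does
        mod = math.copysign(0.0, b)
    return mod
```

Even a dividend like 1e18, where naive floating-point division/multiplication would round, comes out exact with this scheme.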

I also slightly updated the docs for torch.remainder to make them consistent with torch.fmod.

I'm not sure how to modify the Vec256 code of the CPU remainder_kernel, so I left it as-is.
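For context on why the two docs are worth aligning: the operations differ only in which operand's sign the result takes. This can be shown with plain Python (Python's % has remainder semantics and math.fmod has fmod semantics; torch itself is not needed for the illustration):

```python
import math

# remainder-style: result takes the sign of the divisor
print(-5 % 3)            # 1

# fmod-style: result takes the sign of the dividend
print(math.fmod(-5, 3))  # -2.0
```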

Collaborator Author

xwang233 commented May 4, 2020

cc @ptrblck

@xwang233 xwang233 requested a review from ngimel May 4, 2020 10:10

dr-ci Bot commented May 4, 2020

💊 Build failures summary and remediations

As of commit d4bbd70 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



Contributor

@facebook-github-bot left a comment


@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Collaborator Author

xwang233 commented May 8, 2020

@ngimel Is there anything I can help on the internal test failure?

Collaborator

ngimel commented May 8, 2020

I'll try landing it now.

Contributor

@ngimel merged this pull request in 63b1ae6.

facebook-github-bot pushed a commit that referenced this pull request May 14, 2020
Summary:
Together with #37758, this fixes #37743 and fixes #24861.

This follows the CUDA fix in #37758, vectorised using a `blendv` to replace the if conditionals.

Most of the complication is from `remainder` supporting `at::Half` where `fmod` doesn't. I've now got `fmod` working on `Vec256<at::Half>` as well as enabling half dispatch for `fmod` so it matches `remainder`.

I also added `fmod` support to `Vec256<at::BFloat16>` before realising that `remainder` doesn't support `BFloat16` anyway. I could also enable `BFloat16` if that's desirable. If not, I don't think `Vec256<BFloat16>` should be missing `fmod` anyway.
Pull Request resolved: #38293

Differential Revision: D21539801

Pulled By: ezyang

fbshipit-source-id: abac6a3ed2076932adc459174cd3d8d510f3e1d5
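The blendv idea in the commit above can be sketched in scalar Python (a hypothetical helper, not the actual Vec256 code): compute fmod for every lane, build a mask where the sign of the result disagrees with the divisor, and select mod + b through the mask instead of branching per element.

```python
import math

def remainder_vec(dividends, divisor):
    # select() stands in for Vec256::blendv: a per-lane choice
    # driven by a mask, with no data-dependent branch.
    def select(mask, if_true, if_false):
        return if_true if mask else if_false

    out = []
    for a in dividends:
        mod = math.fmod(a, divisor)
        # lane needs fixing when mod is nonzero and its sign
        # disagrees with the divisor's sign
        needs_fix = (mod != 0.0) and ((divisor < 0) != (mod < 0))
        out.append(select(needs_fix, mod + divisor, mod))
    return out
```

In the real SIMD kernel the mask and both candidate vectors are computed unconditionally and blendv picks per lane, which is what makes the if-free formulation vectorisable.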