
Implement NumPy-like function torch.fmax() & torch.fmin()#49312

Closed
Kiyosora wants to merge 8 commits into pytorch:master from Kiyosora:implement_fmax_fmin

Conversation

@Kiyosora
Contributor

@Kiyosora Kiyosora changed the title [WIP] Implement NumPy-like function torch.fmax() & torch.fmin() Implement NumPy-like function torch.fmax() & torch.fmin() Dec 14, 2020
@Kiyosora Kiyosora marked this pull request as ready for review December 14, 2020 09:27
Comment thread aten/src/ATen/native/BinaryOps.cpp Outdated
@codecov

codecov Bot commented Dec 14, 2020

Codecov Report

Merging #49312 (33af88b) into master (ce30dba) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master   #49312   +/-   ##
=======================================
  Coverage   80.65%   80.65%           
=======================================
  Files        1913     1913           
  Lines      208121   208145   +24     
=======================================
+ Hits       167859   167887   +28     
+ Misses      40262    40258    -4     

@mrshenli mrshenli added the module: numpy Related to numpy support, and also numpy compatibility of our operators label Dec 18, 2020
@mrshenli mrshenli requested a review from mruberry December 18, 2020 03:16
@mrshenli mrshenli added the triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module label Dec 18, 2020
@mruberry
Collaborator

Hey @Kiyosora! Thanks for another PR. We'll review this shortly.

@heitorschueroff heitorschueroff self-requested a review December 22, 2020 19:00
Comment thread docs/source/torch.rst Outdated
amin
max
min
fmax
Collaborator

These should be in the binary ops near minimum and maximum

Contributor Author

Addressed, thanks for the correction!

Comment thread torch/_torch_docs.py Outdated
Computes the element-wise maximum of :attr:`input` and :attr:`other`.

.. note::
If one of the elements being compared is a NaN, then the non-nan element is returned.
Collaborator

"non-nan" -> "non-NaN"

Collaborator

I think we can remove this note and just write the following:

Computes the element-wise maximum of :attr:`input` and :attr:`other`. This is like :func:`torch.maximum` except it handles NaNs differently: if exactly one of the two elements being compared is a NaN then the non-NaN element is taken as the maximum. Only if both elements are NaN is NaN propagated.

This function is a wrapper around C++'s ``std::fmax`` and is similar to NumPy's ``fmax`` function.

Supports :ref:broadcasting to a common shape <broadcasting-semantics>,
:ref:type promotion <type-promotion-doc>, and integer and floating-point inputs.
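The NaN-handling rule in the suggested docstring can be illustrated with a small pure-Python sketch (the `fmax` helper below is a hypothetical stand-in for C++'s `std::fmax`, not PyTorch code):

```python
import math

def fmax(a, b):
    # NumPy/std::fmax semantics: NaN is returned only when both
    # inputs are NaN; otherwise the non-NaN element is the maximum.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)

nan = float('nan')
print(fmax(1.0, nan))              # 1.0  (non-NaN element wins)
print(fmax(nan, 2.0))              # 2.0
print(math.isnan(fmax(nan, nan)))  # True (NaN only if both are NaN)
```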

Contributor Author

Addressed!

Contributor

@heitorschueroff heitorschueroff left a comment

This PR is looking great. I left a suggestion to simplify the mask logic. This also needs to be made c10 compliant (see comments on native_functions) and needs a few more tests.

Comment thread aten/src/ATen/native/BinaryOps.cpp Outdated
}

Tensor& fmax_out(Tensor& result, const Tensor& self, const Tensor& other) {
auto self_isnan = self.isnan();
Contributor

Include a fast-path for integer dtypes. Since integer cannot be nan, you can simply return at::gt_out(result, self, other).

Collaborator

at::maximum_out, not at::gt_out, though

Collaborator

But see my latest comment below about switching to use TensorIterator for simplicity and performance.

Contributor Author

I've improved the code to use TensorIterator instead.

Comment thread aten/src/ATen/native/BinaryOps.cpp Outdated
auto self_isnan = self.isnan();
auto other_isnan = other.isnan();
auto both_nan = self_isnan.logical_and(other_isnan);
return at::maximum(at::where(self_isnan == both_nan, self, other.to(self.dtype())),
Contributor

This can be written more efficiently as: other_isnan || (!self_isnan && self > other)

auto mask = other.isnan().logical_or_(self.isnan().logical_not_().logical_and_(self > other));
return at::where(mask, self, other);

You'd have to do the type promotion beforehand which I think is preferable.
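The suggested mask can be checked element-wise in plain Python (a list-based sketch for illustration, not the tensor code):

```python
import math

def fmax_via_mask(self_vals, other_vals):
    # mask = other_isnan or (not self_isnan and self > other)
    # result = self where mask is true, else other
    out = []
    for s, o in zip(self_vals, other_vals):
        mask = math.isnan(o) or (not math.isnan(s) and s > o)
        out.append(s if mask else o)
    return out

nan = float('nan')
res = fmax_via_mask([1.0, nan, 3.0, nan], [2.0, 2.0, 1.0, nan])
# res -> [2.0, 2.0, 3.0, nan]
```

Note how the mask resolves each NaN case: a NaN in `other` always selects `self`, and a NaN in `self` falls through to `other`.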

Contributor

You could do something similar to the out version, though torch.where does not have an out version so you'd have to copy the result.

Collaborator

This is an interesting idea, but I think it'll be easier to write this using TensorIterator, which will handle broadcasting and type promotion and be much faster. See my comment below.

Contributor Author

I've improved the code to use TensorIterator instead.

Comment thread aten/src/ATen/native/BinaryOps.cpp Outdated
auto self_isnan = self.isnan();
auto other_isnan = other.isnan();
auto both_nan = self_isnan.logical_and(other_isnan);
return at::minimum(at::where(self_isnan == both_nan, self, other.to(self.dtype())),
Contributor

Similarly, this can be written more efficiently as: other_isnan || (!self_isnan && self < other)

auto mask = other.isnan().logical_or_(self.isnan().logical_not_().logical_and_(self < other));
return at::where(mask, self, other);

You'd have to do the type promotion beforehand which I think is preferable.

Collaborator

In this case I think (self >= other).logical_or(other.isnan()) would work?

However, I would like to recommend this be written like minimum and maximum to use TensorIterator. That will be much more performant and readable. It will require, however, that a gradient be added. The gradient for maximum and minimum should be a good guide. The dispatch will also have to change to no longer be a Math kernel.

The TensorIterator kernels can just be wrappers around std::fmin and std::fmax in C++. I propose we adopt Python's standard for floating-point NaN handling and treat all NaNs as identical, so in this case we don't need to worry about which NaN to return.

Contributor

Your equation works if we can assume that comparisons between NaN always return False. This is usually the case but I believe it is not guaranteed in the C++ standard. I agree that writing using TensorIterator will be better and more future proof.
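The assumption under discussion is easy to demonstrate in CPython, which follows IEEE 754 semantics for `float`:

```python
nan = float('nan')
# Every ordered comparison involving NaN is False, and NaN compares
# unequal even to itself; only != is True.
checks = [nan > 1.0, nan < 1.0, nan >= nan, nan == nan, nan != nan]
# checks -> [False, False, False, False, True]
```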

Collaborator

That's true, we should probably add a note that we assume IEEE 754.

Comment thread aten/src/ATen/native/native_functions.yaml
Comment thread aten/src/ATen/native/BinaryOps.cpp Outdated
return at::minimum(self, other);
}

Tensor& fmin_out(Tensor& result, const Tensor& self, const Tensor& other) {
Contributor

The result tensor must be the last parameter (see comments on native_functions.yaml)

Contributor Author

Addressed!

Comment thread aten/src/ATen/native/BinaryOps.cpp Outdated
return at::maximum(self, other);
}

Tensor& fmax_out(Tensor& result, const Tensor& self, const Tensor& other) {
Contributor

The result tensor must be the last parameter (see comments on native_functions.yaml)

Contributor Author

Addressed!

Comment thread aten/src/ATen/native/native_functions.yaml
Comment thread test/test_binary_ufuncs.py Outdated
ops = ((torch.maximum, torch.max, np.maximum), (torch.minimum, torch.min, np.minimum),
(torch.fmax, None, np.fmax), (torch.fmin, None, np.fmin))
a_vals = (float('inf'), -float('inf'), float('nan'), float('nan'))
b_vals = (-float('inf'), float('inf'), float('inf'), float('nan'))
Contributor

We should also test a combination of nan and non-nan values to ensure we cover the cases where the value returned should come from one tensor vs the other.

Collaborator

I think the current test does check this with a_vals and b_vals?

Contributor

I think the only comparison is between float('nan') and float('inf') but what about where the only nan value is in the second tensor?

Collaborator

Good point. Looks like the test could be more thorough.

Contributor Author

I've added the case of float('inf') and float('nan') to cover the situation where the only NaN value is in the second tensor.
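A value matrix covering all four NaN placements might look like this (a pure-Python sketch; `ref_fmax` is a hypothetical reference, whereas the real test compares torch results against `np.fmax`):

```python
import math

def ref_fmax(a, b):
    # Reference with NumPy fmax semantics (illustration only).
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)

nan, inf = float('nan'), float('inf')
# NaN in neither input, first only, second only, and both,
# plus a NaN paired with an ordinary real number.
a_vals = (inf, -inf, nan, inf, nan, 0.5)
b_vals = (-inf, inf, inf, nan, nan, nan)
results = [ref_fmax(a, b) for a, b in zip(a_vals, b_vals)]
# results -> [inf, inf, inf, inf, nan, 0.5]
```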

@mruberry
Collaborator

Hey @Kiyosora!

Thanks for implementing torch.fmax and torch.fmin. This looks good, but there are a few changes I'd like to propose:

  • as @heitorschueroff points out, dispatch will need to start using c10: full and the argument order needs to be the same going forward between signatures in native_functions.yaml and in the *.cpp files like BinaryOps.cpp. This is a new change and the documentation is not in the build yet. See Enforce c10-fullness for all ops #49619 where the documentation is being added.
  • I think we can implement this more efficiently and readably as a wrapper around C++'s fmin and fmax
  • Some documentation updates

Let me know your thoughts. Looking forward to adding these functions to PyTorch!

@Kiyosora Kiyosora force-pushed the implement_fmax_fmin branch 4 times, most recently from 9f2feb5 to d041f7f Compare December 25, 2020 01:22
Collaborator

@mruberry mruberry Dec 25, 2020

There are a few updates needed here:

  • use iter.common_dtype() to test for whether it's a floating type or not and to dispatch
  • if iter.common_dtype() is not a floating point type then just call maximum
  • in the floating point kernel I don't think you need the NaN checks, I think this can just call std::fmax

Similar changes will need to be made for fmin and the CUDA code.
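The three bullet points above can be summarized as a scalar sketch (Python stands in for the C++ kernel; the dtype flag and helper name are illustrative, not PyTorch APIs):

```python
import math

def fmax_dispatch_sketch(is_floating_dtype, a, b):
    # Integral dtypes cannot hold NaN, so fall back to plain maximum;
    # floating dtypes take the std::fmax-style path, where NaN is
    # returned only if both operands are NaN.
    if not is_floating_dtype:
        return max(a, b)
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)
```

A mirror-image sketch (with `min` in place of `max`) would cover fmin.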

Contributor Author

Gotcha, I'll fix it right away.

Contributor Author

Addressed!

Comment thread test/test_binary_ufuncs.py Outdated
Collaborator

The extension is good. Also pair a float('nan') with a real number like 0, 1, or .5.

Contributor Author

Addressed!

Comment thread tools/autograd/derivatives.yaml Outdated
Collaborator

This is the same formula as for maximum:

https://github.com/pytorch/pytorch/blob/d041f7f2fd46f7f646d959d2ea7c6ca9613d7211/tools/autograd/derivatives.yaml#L714

That seems odd since these are different functions.

Let's look at the truth table:

  • if self and other are numbers, then gradient goes to self if self is > other, sure
  • if self and other are nan then this comparison is False and self and other both receive gradient, which is interesting
  • if either of self or other are nan then this comparison is false and both receive gradient, that seems odd

In particular, if self is a number and other is NaN then it seems like gradient should go to self but not to other? And, symmetrically, if self is NaN and other is a number then it seems like gradient should go to other and not to self?

I think we should also bias towards self receiving grad, all else being equal, and rewrite these expressions as:

(self >= other).logical_or_(other.isnan()).logical_not_()

Then the masked_fill for other would be:

(self >= other).logical_or_(other.isnan())

@heitorschueroff Does that look correct to you? What do you think, @Kiyosora?
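The proposed bias can be checked per element; since `self >= other` is False whenever either operand is NaN (IEEE 754), the mask reduces to a one-liner (a scalar sketch, not the autograd code):

```python
import math

def grad_goes_to_self(s, o):
    # True where self receives the gradient of fmax; other receives
    # it in the complementary positions, so exactly one input gets
    # the gradient, with ties and the both-NaN case biased to self.
    return (s >= o) or math.isnan(o)

nan = float('nan')
assert grad_goes_to_self(2.0, 1.0)          # self is the max
assert grad_goes_to_self(1.0, 1.0)          # tie -> bias to self
assert grad_goes_to_self(1.0, nan)          # other is NaN -> self
assert not grad_goes_to_self(nan, 1.0)      # self is NaN -> other
assert grad_goes_to_self(nan, nan)          # both NaN -> bias to self
```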

Contributor Author

It does make sense to me. Thanks for pointing this out, I will fix it soon.

Contributor Author

Addressed!

Collaborator

@mruberry mruberry left a comment

Updates look really good, @Kiyosora!

I think the implementation of the function can be simplified by relying on minimum/maximum, and I have some questions about the gradients, too. Looking forward to hearing your thoughts!

@Kiyosora Kiyosora force-pushed the implement_fmax_fmin branch from d041f7f to 1863539 Compare December 31, 2020 05:53
@Kiyosora Kiyosora requested a review from mruberry December 31, 2020 09:29
@Kiyosora
Contributor Author

Hi @mruberry, sorry for the delay, and thank you so much for your advice!
I've improved the implementation and gradients of the function; it looks much better now, I think.
Please take a look at your convenience. 😃

@mruberry
Collaborator

mruberry commented Jan 2, 2021

Thanks @Kiyosora! @heitorschueroff and I will take a look when we're back at work next week.

Contributor

@heitorschueroff heitorschueroff left a comment

This PR is looking great overall except for some minor fixes for which I left comments. The test failures are unrelated.

Comment thread torch/_torch_docs.py Outdated
Contributor

This is causing formatting issues, just a little tweak will do it as follows:

Supports :ref:`broadcasting to a common shape <broadcasting-semantics>`,
:ref:`type promotion <type-promotion-doc>`, and integer and floating-point inputs.

Contributor Author

Addressed!

Comment thread torch/_torch_docs.py Outdated
Contributor

See comment above for fmax on the formatting.

Comment thread torch/_torch_docs.py Outdated
Contributor

Good example.

Comment thread torch/_torch_docs.py Outdated
Contributor

Good example.

Comment thread aten/src/ATen/native/cpu/BinaryOpsKernel.cpp Outdated
Contributor

I believe the linter will complain that the return statement is on the same line. Could you move it to its own line?

Contributor Author

Addressed!

Contributor

These kernels are looking good, just the same changes as for the fmax kernel.

Contributor

iter.dtype -> iter.common_dtype

Contributor Author

Addressed!

Contributor

double -> scalar_t ? or do we need double here?

Contributor Author

It should be scalar_t here; I misunderstood. Thanks for correcting!

Contributor

Same changes as for the fmax.

@Kiyosora Kiyosora force-pushed the implement_fmax_fmin branch 3 times, most recently from 80d56a2 to ef015b6 Compare January 11, 2021 07:05
@Kiyosora
Contributor Author

Thank you so much for being so patient with the code review, @heitorschueroff. 👍
I have corrected my code; please kindly take a look at your convenience.

Contributor

@heitorschueroff heitorschueroff left a comment

@mruberry and I reviewed it and it's looking good to go. Thank you for this great contribution @Kiyosora.

Contributor

@facebook-github-bot facebook-github-bot left a comment

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@heitorschueroff
Contributor

@Kiyosora There were some internal issues preventing this PR from landing and looks like it picked up a merge conflict now. Do you mind rebasing and fixing the conflict?

@Kiyosora Kiyosora force-pushed the implement_fmax_fmin branch from ef015b6 to 87927d9 Compare January 19, 2021 05:19
Contributor

@facebook-github-bot facebook-github-bot left a comment

@heitorschueroff has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@heitorschueroff merged this pull request in 4803eaf.

laurentdupin pushed a commit to laurentdupin/pytorch that referenced this pull request Apr 24, 2026

Summary:
- Implementing the NumPy-like functions `torch.fmax()` and `torch.fmin()` recommended in pytorch#48440

Pull Request resolved: pytorch#49312

Reviewed By: izdeby

Differential Revision: D25887246

Pulled By: heitorschueroff

fbshipit-source-id: d762eeff8b328bfcbe7d48b7ee9d2da72c249691

Labels

cla signed Merged module: numpy Related to numpy support, and also numpy compatibility of our operators open source triaged This issue has been looked at by a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants