[LangRef] Clarify specification for float min/max operations#172012
[LangRef] Clarify specification for float min/max operations#172012
Conversation
|
@llvm/pr-subscribers-llvm-ir Author: Nikita Popov (nikic) ChangesThis implements some clarifications for the specification of floating point min/max operations based on the discussion in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006. The key changes are:
Full diff: https://github.com/llvm/llvm-project/pull/172012.diff 1 Files Affected:
diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 5fa3a4ebb2472..3ce5244a39b31 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -17288,96 +17288,6 @@ The returned value is completely identical to the input except for the sign bit;
in particular, if the input is a NaN, then the quiet/signaling bit and payload
are perfectly preserved.
-.. _i_fminmax_family:
-
-'``llvm.min.*``' Intrinsics Comparation
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-Standard:
-"""""""""
-
-IEEE754 and ISO C define some min/max operations, and they have some differences
-on working with qNaN/sNaN and +0.0/-0.0. Here is the list:
-
-.. list-table::
- :header-rows: 2
-
- * - ``ISO C``
- - fmin/fmax
- - fmininum/fmaximum
- - fminimum_num/fmaximum_num
-
- * - ``IEEE754``
- - minNum/maxNum (2008)
- - minimum/maximum (2019)
- - minimumNumber/maximumNumber (2019)
-
- * - ``+0.0 vs -0.0``
- - either one
- - +0.0 > -0.0
- - +0.0 > -0.0
-
- * - ``NUM vs sNaN``
- - qNaN, invalid exception
- - qNaN, invalid exception
- - NUM, invalid exception
-
- * - ``qNaN vs sNaN``
- - qNaN, invalid exception
- - qNaN, invalid exception
- - qNaN, invalid exception
-
- * - ``NUM vs qNaN``
- - NUM, no exception
- - qNaN, no exception
- - NUM, no exception
-
-LLVM Implementation:
-""""""""""""""""""""
-
-LLVM implements all ISO C flavors as listed in this table, except in the
-default floating-point environment exceptions are ignored. The constrained
-versions of the intrinsics respect the exception behavior.
-
-.. list-table::
- :header-rows: 1
- :widths: 16 28 28 28
-
- * - Operation
- - minnum/maxnum
- - minimum/maximum
- - minimumnum/maximumnum
-
- * - ``NUM vs qNaN``
- - NUM, no exception
- - qNaN, no exception
- - NUM, no exception
-
- * - ``NUM vs sNaN``
- - qNaN, invalid exception
- - qNaN, invalid exception
- - NUM, invalid exception
-
- * - ``qNaN vs sNaN``
- - qNaN, invalid exception
- - qNaN, invalid exception
- - qNaN, invalid exception
-
- * - ``sNaN vs sNaN``
- - qNaN, invalid exception
- - qNaN, invalid exception
- - qNaN, invalid exception
-
- * - ``+0.0 vs -0.0``
- - +0.0(max)/-0.0(min)
- - +0.0(max)/-0.0(min)
- - +0.0(max)/-0.0(min)
-
- * - ``NUM vs NUM``
- - larger(max)/smaller(min)
- - larger(max)/smaller(min)
- - larger(max)/smaller(min)
-
.. _i_minnum:
'``llvm.minnum.*``' Intrinsic
@@ -17413,30 +17323,26 @@ type.
Semantics:
""""""""""
-Follows the semantics of minNum in IEEE-754-2008, except that -0.0 < +0.0 for the purposes
-of this intrinsic. As for signaling NaNs, per the minNum semantics, if either operand is sNaN,
-the result is qNaN. This matches the recommended behavior for the libm
-function ``fmin``, although not all implementations have implemented these recommended behaviors.
-If either operand is a qNaN, returns the other non-NaN operand. Returns NaN only if both operands are
-NaN or if either operand is sNaN. Note that arithmetic on an sNaN doesn't consistently produce a qNaN,
-so arithmetic feeding into a minnum can produce inconsistent results. For example,
-``minnum(fadd(sNaN, -0.0), 1.0)`` can produce qNaN or 1.0 depending on whether ``fadd`` is folded.
+If both operands are qNaNs, returns a :ref:`NaN <floatnan>`. If one operand is
+qNaN and another operand is a number, returns the number. If both operands are
+numbers, returns the lesser of the two arguments. -0.0 is considered to be less
+than +0.0 for this intrinsic.
-IEEE-754-2008 defines minNum, and it was removed in IEEE-754-2019. As the replacement, IEEE-754-2019
-defines :ref:`minimumNumber <i_minimumnum>`.
+If an operand is a signaling NaN, then the intrinsic will non-deterministically
+either:
-If the intrinsic is marked with the nsz attribute, then the effect is as in the definition in C
-and IEEE-754-2008: the result of ``minnum(-0.0, +0.0)`` may be either -0.0 or +0.0.
+ * Return a :ref:`NaN <floatnan>`.
+ * Or treat the signaling NaN as a quiet NaN. In this case the intrinsic will
+ behave the same as ``llvm.minimumnum``.
-Some architectures, such as ARMv8 (FMINNM), LoongArch (fmin), MIPSr6 (min.fmt), PowerPC/VSX (xsmindp),
-have instructions that match these semantics exactly; thus it is quite simple for these architectures.
-Some architectures have similar ones while they are not exact equivalent. Such as x86 implements ``MINPS``,
-which implements the semantics of C code ``a<b?a:b``: NUM vs qNaN always return qNaN. ``MINPS`` can be used
-if ``nsz`` and ``nnan`` are given.
+If the ``nsz`` flag is specified, ``llvm.minnum`` with one +0.0 and one
+-0.0 operand may non-deterministically return either operand. Contrary to normal
+``nsz`` semantics, if both operands have the same sign, the result must also
+have the same sign.
-For existing libc implementations, the behaviors of fmin may be quite different on sNaN and signed zero behaviors,
-even in the same release of a single libm implementation.
+When used with the ``nsz`` flag, this intrinsics follows the semantics of
+``fmin`` in C.
.. _i_maxnum:
@@ -17473,30 +17379,26 @@ type.
Semantics:
""""""""""
-Follows the semantics of maxNum in IEEE-754-2008, except that -0.0 < +0.0 for the purposes
-of this intrinsic. As for signaling NaNs, per the maxNum semantics, if either operand is sNaN,
-the result is qNaN. This matches the recommended behavior for the libm
-function ``fmax``, although not all implementations have implemented these recommended behaviors.
-If either operand is a qNaN, returns the other non-NaN operand. Returns NaN only if both operands are
-NaN or if either operand is sNaN. Note that arithmetic on an sNaN doesn't consistently produce a qNaN,
-so arithmetic feeding into a maxnum can produce inconsistent results. For example,
-``maxnum(fadd(sNaN, -0.0), 1.0)`` can produce qNaN or 1.0 depending on whether ``fadd`` is folded.
+If both operands are qNaNs, returns a :ref:`NaN <floatnan>`. If one operand is
+qNaN and another operand is a number, returns the number. If both operands are
+numbers, returns the greater of the two arguments. -0.0 is considered to be
+less than +0.0 for this intrinsic.
-IEEE-754-2008 defines maxNum, and it was removed in IEEE-754-2019. As the replacement, IEEE-754-2019
-defines :ref:`maximumNumber <i_maximumnum>`.
+If an operand is a signaling NaN, then the intrinsic will non-deterministically
+either:
-If the intrinsic is marked with the nsz attribute, then the effect is as in the definition in C
-and IEEE-754-2008: the result of maxnum(-0.0, +0.0) may be either -0.0 or +0.0.
+ * Return a :ref:`NaN <floatnan>`.
+ * Or treat the signaling NaN as a quiet NaN. In this case the intrinsic will
+ behave the same as ``llvm.maximumnum``.
-Some architectures, such as ARMv8 (FMAXNM), LoongArch (fmax), MIPSr6 (max.fmt), PowerPC/VSX (xsmaxdp),
-have instructions that match these semantics exactly; thus it is quite simple for these architectures.
-Some architectures have similar ones while they are not exact equivalent. Such as x86 implements ``MAXPS``,
-which implements the semantics of C code ``a>b?a:b``: NUM vs qNaN always return qNaN. ``MAXPS`` can be used
-if ``nsz`` and ``nnan`` are given.
+If the ``nsz`` flag is specified, ``llvm.maxnum`` with one +0.0 and one
+-0.0 operand may non-deterministically return either operand. Contrary to normal
+``nsz`` semantics, if both operands have the same sign, the result must also
+have the same sign.
-For existing libc implementations, the behaviors of fmin may be quite different on sNaN and signed zero behaviors,
-even in the same release of a single libm implementation.
+When used with the ``nsz`` flag, this intrinsics follows the semantics of
+``fmax`` in C.
.. _i_minimum:
@@ -17538,6 +17440,11 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2019.
+If the ``nsz`` flag is specified, ``llvm.maximum`` with one +0.0 and one
+-0.0 operand may non-deterministically return either operand. Contrary to normal
+``nsz`` semantics, if both operands have the same sign, the result must also
+have the same sign.
+
.. _i_maximum:
'``llvm.maximum.*``' Intrinsic
@@ -17578,6 +17485,11 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this
intrinsic. Note that these are the semantics specified in the draft of
IEEE 754-2019.
+If the ``nsz`` flag is specified, ``llvm.maximum`` with one +0.0 and one
+-0.0 operand may non-deterministically return either operand. Contrary to normal
+``nsz`` semantics, if both operands have the same sign, the result must also
+have the same sign.
+
.. _i_minimumnum:
'``llvm.minimumnum.*``' Intrinsic
@@ -17619,12 +17531,17 @@ one operand is NaN (including sNaN) and another operand is a number,
return the number. Otherwise returns the lesser of the two
arguments. -0.0 is considered to be less than +0.0 for this intrinsic.
+If the ``nsz`` flag is specified, ``llvm.minimumnum`` with one +0.0 and one
+-0.0 operand may non-deterministically return either operand. Contrary to normal
+``nsz`` semantics, if both operands have the same sign, the result must also
+have the same sign.
+
Note that these are the semantics of minimumNumber specified in
IEEE-754-2019 with the usual :ref:`signaling NaN <floatnan>` exception.
-It has some differences with '``llvm.minnum.*``':
-1)'``llvm.minnum.*``' will return qNaN if either operand is sNaN.
-2)'``llvm.minnum*``' may return either one if we compare +0.0 vs -0.0.
+This intrinsic differs from ``llvm.minnum`` in that it is guaranteed to treat
+sNaN the same way as qNaN. ``llvm.minnum`` will instead non-deterministically
+either act like ``llvm.minimumnum`` or return a :ref:`NaN <floatnan>`.
.. _i_maximumnum:
@@ -17668,12 +17585,17 @@ another operand is a number, return the number. Otherwise returns the
greater of the two arguments. -0.0 is considered to be less than +0.0
for this intrinsic.
+If the ``nsz`` flag is specified, ``llvm.maximumnum`` with one +0.0 and one
+-0.0 operand may non-deterministically return either operand. Contrary to normal
+``nsz`` semantics, if both operands have the same sign, the result must also
+have the same sign.
+
Note that these are the semantics of maximumNumber specified in
IEEE-754-2019 with the usual :ref:`signaling NaN <floatnan>` exception.
-It has some differences with '``llvm.maxnum.*``':
-1)'``llvm.maxnum.*``' will return qNaN if either operand is sNaN.
-2)'``llvm.maxnum*``' may return either one if we compare +0.0 vs -0.0.
+This intrinsic differs from ``llvm.maxnum`` in that it is guaranteed to treat
+sNaN the same way as qNaN. ``llvm.maxnum`` will instead non-deterministically
+either act like ``llvm.maximumnum`` or return a :ref:`NaN <floatnan>`.
.. _int_copysign:
@@ -20379,9 +20301,17 @@ The '``llvm.vector.reduce.fmax.*``' intrinsics do a floating-point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.
-This instruction has the same comparison semantics as the '``llvm.maxnum.*``'
-intrinsic. If the intrinsic call has the ``nnan`` fast-math flag, then the
-operation can assume that NaNs are not present in the input vector.
+This instruction has the same comparison and ``nsz`` semantics as the
+'``llvm.maxnum.*``' intrinsic.
+
+If any of the vector elements is a signaling NaN, the intrinsic will
+non-deterministically either:
+
+ * Return a :ref:`NaN <floatnan>`.
+ * Treat the signaling NaN as a quiet NaN.
+
+If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
+assume that NaNs are not present in the input vector.
Arguments:
""""""""""
@@ -20408,9 +20338,17 @@ The '``llvm.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.
-This instruction has the same comparison semantics as the '``llvm.minnum.*``'
-intrinsic. If the intrinsic call has the ``nnan`` fast-math flag, then the
-operation can assume that NaNs are not present in the input vector.
+This instruction has the same comparison and ``nsz`` semantics as the
+'``llvm.minnum.*``' intrinsic.
+
+If any of the vector elements is a signaling NaN, the intrinsic will
+non-deterministically either:
+
+ * Return a :ref:`NaN <floatnan>`.
+ * Treat the signaling NaN as a quiet NaN.
+
+If the intrinsic call has the ``nnan`` fast-math flag, then the operation can
+assume that NaNs are not present in the input vector.
Arguments:
""""""""""
|
|
This sounds good to me! I don't have a strong opinion on whether |
jcranmer-intel
left a comment
There was a problem hiding this comment.
I do somewhat miss having the table that organizes all of the different flavors of min/max operations to help identify which ones are in use.
That said, I've long been considering the need to have a dedicated docs page on floating-point semantics for LLVM IR and what different stakeholders need to understand about those semantics (like the existing atomics page), and the table would definitely fit there better. This entire saga is upping work on that document in my priority list, but nothing's going to happen before the end of the calendar year.
arsenm
left a comment
There was a problem hiding this comment.
So if it's nondeterministic, what do we constant fold to for a signaling nan?
|
We can do either. Is there one option that's better for backends / follow-up optimizations? What other operations do ( We also need to ensure that scalar evolution is aware of all this non-determinism and doesn't try to predict what these operations do. With |
Folding to the qNaN will allow more downstream operations to fold, since just about everything else simply propagates input nans |
Either option would be legal. I think using the IEEE 2008 sNaN semantics might be slightly preferable to match the constrained variants. Or we could also uphold #170181 as a permanent rather than temporary change, and just not fold sNaN arguments for these intrinsics at all.
Currently we conservatively assume that all FP libcalls/intrinsics are non-deterministic: llvm-project/llvm/lib/Analysis/ConstantFolding.cpp Lines 4540 to 4544 in f3c1645 |
It could also be: fold to QNaN if the input is known to be an SNaN, otherwise fold to other operand if the input is just known to be any NaN. (Not sure if that's something that can happen with how LLVM represents partial knowledge of floating-point values.) |
llvm/docs/LangRef.rst
Outdated
| If both operands are qNaNs, returns a :ref:`NaN <floatnan>`. If one operand is | ||
| qNaN and another operand is a number, returns the number. If both operands are | ||
| numbers, returns the lesser of the two arguments. -0.0 is considered to be less | ||
| than +0.0 for this intrinsic. | ||
|
|
||
| IEEE-754-2008 defines minNum, and it was removed in IEEE-754-2019. As the replacement, IEEE-754-2019 | ||
| defines :ref:`minimumNumber <i_minimumnum>`. | ||
| If an operand is a signaling NaN, then the intrinsic will non-deterministically | ||
| either: | ||
|
|
||
| If the intrinsic is marked with the nsz attribute, then the effect is as in the definition in C | ||
| and IEEE-754-2008: the result of ``minnum(-0.0, +0.0)`` may be either -0.0 or +0.0. | ||
| * Return a :ref:`NaN <floatnan>`. | ||
| * Or treat the signaling NaN as a quiet NaN. In this case the intrinsic will | ||
| behave the same as ``llvm.minimumnum``. |
There was a problem hiding this comment.
It took me a while to work out what's allowed if both inputs are sNaN. Does it have to return a qNaN? On x86, this operation can be implemented in a much faster way if we allow a signaling NaN input to be passed through to the output. I think this phrasing allows that behavior, but you have to read between the lines a bit.
Right now, it's clear that if both operands are qNaN, it returns "a NaN" (following LLVM semantics). If one operand is sNaN and the other is a number, we are free to return either the number (following llvm.minimumnum) or "a NaN" (like IEEE754-2008, but we don't have to quiet the NaN).
However, it doesn't explicitly enumerate the case where both operands are sNaN. If we assume it's a subset of the case where "an operand is a signaling NaN", we can non-deterministically choose to return "a NaN" and pass through the sNaN. I think it would be helpful to make that case explicit.
There was a problem hiding this comment.
Note that returning "a NaN" also allows passing through the SNaN, so I think both possible interpretations of the two-SNaN case yield the same overall semantics.
|
I've tried to address the review feedback, please take a look. |
|
Based on https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006/61?u=nikic, I've dropped the requirement for zero ordering from maxnum/minnum. So basically now minnum == fmin == IEEE 754-2008 minNum modulo usual sNaN exception. I can revert that commit if the discussion reaches a different conclusion... |
llvm/docs/LangRef.rst
Outdated
| have the same sign. | ||
|
|
||
| Note that these are the semantics of minimumNumber specified in | ||
| IEEE-754-2019 with the usual :ref:`signaling NaN <floatnan>` exception. |
There was a problem hiding this comment.
FWIW, there's now a mix of "IEEE-754" and "IEEE 754". Not sure that it matters...
There was a problem hiding this comment.
Yes, there is a mix of both forms throughout the document. We should normalize to one of them in a separate change.
There was a problem hiding this comment.
I see at present 3 uses of IEEE754, 6 of IEEE 754, 32 of IEEE-754, 0 of any flavor of ISO/IEC 60559.
(Of the three forms, while IEEE-754 is the dominant form right now, IEEE 754- appears to be the most correct form.)
Okay, I did end up reverting this based on https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006/69?u=nikic. |
|
I've added a warning to indicate that the specified signed zero ordering hasn't been implemented yet. |
@nikic Can we revert it to the last commit? It looks to me the only motivation is to use signed zero minnum/maxnum for some optimizations. However, there are several problems/concerns need to be addressed regarding it:
|
Everyone else seems to have reached a consensus. The previous behavior is still available using the nsz flag, and Clang and Rust can and should add that flag. Using minnum/maxnum without nsz is (AFAIK) implementable across architectures, so we don't need to worry about the semantics of fmin/fmax libcalls. |
I'm talking about this point:
|
I certainly agree that we shouldn't do such optimizations until the signed-zero behavior is actually implemented, and that they should also not be applied if the I don't see why that means we shouldn't expose signed-zero ordering at all. Matt's proposed optimization is actually illustrative here. I don't know in particular what we can optimize by statically knowing the sign bit, but it applies when one operand is statically known to be +0.0. In that case, having the nsz flag is actually a good thing. Recall that the x86 implementation uses In such a scenario, we can actually refine Consider instead a scenario where Basically, if we have a version of |
At least, it should be done in order:
It's just one of the lowering pass, and specific to x86. At least, libcall doesn't guarantee it: https://godbolt.org/z/asvK5n5aM
Signaling NaN should matter to all, or none. That says, we should not optimizate any thing to
I'm cautious about it. From x86 baackend PoV, they are basically regression if not neutral. In fact, I agree with the opposite point from @ActuallyaDeviloper :
|
This is why this patch documents that the intrinsics have ordered zero behavior, but includes a warning that this behavior has not been implemented yet. This will allow people to actually start working on making that a reality without getting bogged down in the 10th discussion on this topic, while at the same time making it clear that you can't rely on this yet. Of course, if we want to actually make optimizations that involve dropping the nsz flag, we need to be careful about what the performance effect would be on targets that don't support that natively. I did find @valadaptive's point that (I really don't want this thread to get bogged down in discussions of a specific potential optimization -- it was only mentioned as an example that maxnum without nsz can have practical application even if frontends don't emit it directly.) |
I think it's more important to answer the right or wrong question before actually doing any follow up works. Discussions mean it's controversial enough. And I think it's hasty to draw a conclusion the vague optimizaiton worths the works ahead. Note, I'm not opposed to use |
|
There's no "right" or "wrong", just trade-offs. And the trade-offs will be much more clear once actual patches get written, which currently is blocked on having a coherent documented semantics to review the patches against. Is there some way to declare |
|
Deterministic SNaN behavior is way outside the scope of this PR. SNaN is consistently ignored across LLVM IR. It has been like that for many years. This PR changes nothing in SNaN treatment, so the situation at least does not get worse. Please do not block the resolution of other, non-SNaN-related min/max issues on figuring out the huge, wide-open, and mostly orthogonal design space of how to deal with SNaN. You are doing the equivalent of "before you can repaint this room, you must first redo the roof".
Given that LLVM currently generates code that actively contradicts its documentation, I cannot comprehend how you can seriously suggest this.
That's not correct. clang is a C compiler and therefore SNaN is implicitly unsupported, as per the C standard. The docs of a C compiler generally implicitly only refer to QNaNs. OTOH, those docs very clearly ask for "minnum but with signed zero ordering". How convenient of you to discard the exact evidence you asked for...
So if we ignore SNaN -- which we should, because this PR does not change anything about SNaN and therefore it makes no sense at all to criticize this PR for anything SNaN-related -- then I think this is entirely identical to what this PR does, except that the table looks the way I posted above rather than being nice and symmetric? So we are back to bikeshedding names and attributes. The argument that If it is not correct that this is equivalent to this PR except for SNaN behaviors, then please actually explain your proposal: please provide a list of all min/max operations in the LLVM IR (we don't care about instruction selection opcodes here) and their semantics. As per usual, every such LLVM IR operation should have basic legalization support on all backends -- you don't get to change such basic LLVM IR expectations. Also, the established SNaN contract ("Floating-point math operations are allowed to treat all NaNs as if they were quiet NaNs") is not up for debate in this particular PR -- that would be scope creep. What does your ideal version of that list look like?
Performance is lost all the time. When some optimization heuristic gets changed, it benefits most programs but for some things get worse. Completely hypothetical scenarios of performance loss are therefore not productive. I think you are exaggerating both how likely this is to happen, and how bad it would be. (Obviously you have tests to catch all the obvious cases, right?) And the profit isn't 0. Among other things mentioned above, the profit is a reduced chance of correctness bugs because all the min/max intrinsics treat signed zeros the same. |
So does the unordering signed zero sematics of maxnum/minnum. How does it suddenly become source of correctness bugs?
By do nothing, I mean do nothing in changing the code. At the same time, I'm actively suggesting to recover back the documentation regarding that part either in this patch or a revert of the original patch.
At the beginning, I'm talking the problem within the C scope. I extend it because you asked the freedom for other frontends. Then you are opposed to my proposals leveraging C's rules again?
You won't ask the question if you just took a look at the patch that added the builtins. Its rationality comes from #112852, which exactly I'm opposing for here.
@nikic explained the intentional part is not a problem. He didn't reject the unintentional part can happen in the wild. BTW, I just saw @arsenm mentioned the problem here recently.
I agree the raito of the unintentional discard is rather small. But the change is in the common path. That says, the effects happen on a
Just a new problem. Besides the fast math flag issue itself, all these frontends 100% suffer performance lose on X86 if they don't change to
I don't care about SNaN much either. All the proposals is around "what a minnum with +0.0>-0.0 provides but minimumnum not" and above. So let's back to my question again, what's indeed the requirement for changing to the ordering signed zero minnum/maxnum? I have explained "__builtin_elementwise_minnum" is an invalid one. Do you have new evidence?
Sorry, I still don't get a scenario that proves the necessity of the resolution. |
arsenm
left a comment
There was a problem hiding this comment.
I believe there is consensus to proceed with this. I also thing we should make sure this ends up in the release branch whenever it lands.
jcranmer-intel
left a comment
There was a problem hiding this comment.
I think, to the degree that there is a live dispute about the proper direction going further, it's not really about the semantics of the intrinsics at this point as it is about the semantics of the different ISD_OPCODES in the SelectionDAG file.
There should be some clarification as to that point, but I think that belongs as part of the codegen changes to actually adhere to the intended documentation, not as part of the documentation change here.
| If the ``nsz`` flag is specified, ``llvm.minimumnum`` with one +0.0 and one | ||
| -0.0 operand may non-deterministically return either operand. Contrary to normal | ||
| ``nsz`` semantics, if both operands have the same sign, the result must also | ||
| have the same sign. |
There was a problem hiding this comment.
I'm not entirely certain it's contrary to our actual normal nsz semantics (at least as far as we intend them to be, not as far as it's currently described in the document). The intent is that nsz isn't going to materialize a -0.0 out of absolutely nowhere, but it would take a deep dive of all of the transforms we want to enable with nsz to understand what the precise conditions are.
I think it's worth calling out the expectations of nsz here in any case, as we want nsz on the min/max algorithms to specifically indicate "we don't care about the ordering of -0.0 vis-a-vis +0.0" and we want frontends to be able to rely on that behavior, so I'm not going to object to the wording.
fhahn
left a comment
There was a problem hiding this comment.
We discussed this PR at the LLVM area team meeting today. We found that there is majority consensus that this is a step forward in the right direction and this PR should go ahead.
LGTM
…2012) This implements some clarifications for the specification of floating point min/max operations based on the discussion in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006. The key changes are: * Explicitly specify minnum and maxnum with an sNaN operand as non-deterministically either returning NaN or treating sNaN as qNaN. This was implied by our general NaN semantics, but is important to call out here due to the special behavior of sNaN. * Explicitly specify the same non-determinism for the minnum/maxnum based vector reductions as well. * Explicitly specify the meaning of nsz on float min/max ops. In particular, clarify that unlike normal nsz semantics, it does not allow introducing a zero with a different sign out of thin air. * Simplify the semantics comparison section. This now focuses only on NaN and signed zero behavior, but omits information about exceptions that is not relevant for these non-constrained intrinsics.
…2012) This implements some clarifications for the specification of floating point min/max operations based on the discussion in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006. The key changes are: * Explicitly specify minnum and maxnum with an sNaN operand as non-deterministically either returning NaN or treating sNaN as qNaN. This was implied by our general NaN semantics, but is important to call out here due to the special behavior of sNaN. * Explicitly specify the same non-determinism for the minnum/maxnum based vector reductions as well. * Explicitly specify the meaning of nsz on float min/max ops. In particular, clarify that unlike normal nsz semantics, it does not allow introducing a zero with a different sign out of thin air. * Simplify the semantics comparison section. This now focuses only on NaN and signed zero behavior, but omits information about exceptions that is not relevant for these non-constrained intrinsics.
…ven without `nsz` (#177828) This restriction was originally added in https://reviews.llvm.org/D143256, with the given justification: > Currently, in TargetLowering, if the target does not support fminnum, we lower to fminimum if neither operand could be a NaN. But this isn't quite correct because fminnum and fminimum treat +/-0 differently; so, we need to prove that one of the operands isn't a zero. As far as I can tell, this was never correct. Before #172012, `minnum` and `maxnum` were nondeterministic with regards to signed zero, so it's always been perfectly legal to lower them to operations that order signed zeroes.
…/FMAXIMUM even without `nsz` (#177828) This restriction was originally added in https://reviews.llvm.org/D143256, with the given justification: > Currently, in TargetLowering, if the target does not support fminnum, we lower to fminimum if neither operand could be a NaN. But this isn't quite correct because fminnum and fminimum treat +/-0 differently; so, we need to prove that one of the operands isn't a zero. As far as I can tell, this was never correct. Before llvm/llvm-project#172012, `minnum` and `maxnum` were nondeterministic with regards to signed zero, so it's always been perfectly legal to lower them to operations that order signed zeroes.
…ven without `nsz` (#177828) This restriction was originally added in https://reviews.llvm.org/D143256, with the given justification: > Currently, in TargetLowering, if the target does not support fminnum, we lower to fminimum if neither operand could be a NaN. But this isn't quite correct because fminnum and fminimum treat +/-0 differently; so, we need to prove that one of the operands isn't a zero. As far as I can tell, this was never correct. Before llvm/llvm-project#172012, `minnum` and `maxnum` were nondeterministic with regards to signed zero, so it's always been perfectly legal to lower them to operations that order signed zeroes.
…2012) This implements some clarifications for the specification of floating point min/max operations based on the discussion in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006. The key changes are: * Explicitly specify minnum and maxnum with an sNaN operand as non-deterministically either returning NaN or treating sNaN as qNaN. This was implied by our general NaN semantics, but is important to call out here due to the special behavior of sNaN. * Explicitly specify the same non-determinism for the minnum/maxnum based vector reductions as well. * Explicitly specify the meaning of nsz on float min/max ops. In particular, clarify that unlike normal nsz semantics, it does not allow introducing a zero with a different sign out of thin air. * Simplify the semantics comparison section. This now focuses only on NaN and signed zero behavior, but omits information about exceptions that is not relevant for these non-constrained intrinsics.
…ven without `nsz` (llvm#177828) This restriction was originally added in https://reviews.llvm.org/D143256, with the given justification: > Currently, in TargetLowering, if the target does not support fminnum, we lower to fminimum if neither operand could be a NaN. But this isn't quite correct because fminnum and fminimum treat +/-0 differently; so, we need to prove that one of the operands isn't a zero. As far as I can tell, this was never correct. Before llvm#172012, `minnum` and `maxnum` were nondeterministic with regards to signed zero, so it's always been perfectly legal to lower them to operations that order signed zeroes.
…2012) This implements some clarifications for the specification of floating point min/max operations based on the discussion in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006. The key changes are: * Explicitly specify minnum and maxnum with an sNaN operand as non-deterministically either returning NaN or treating sNaN as qNaN. This was implied by our general NaN semantics, but is important to call out here due to the special behavior of sNaN. * Explicitly specify the same non-determinism for the minnum/maxnum based vector reductions as well. * Explicitly specify the meaning of nsz on float min/max ops. In particular, clarify that unlike normal nsz semantics, it does not allow introducing a zero with a different sign out of thin air. * Simplify the semantics comparison section. This now focuses only on NaN and signed zero behavior, but omits information about exceptions that is not relevant for these non-constrained intrinsics.
…ven without `nsz` (llvm#177828) This restriction was originally added in https://reviews.llvm.org/D143256, with the given justification: > Currently, in TargetLowering, if the target does not support fminnum, we lower to fminimum if neither operand could be a NaN. But this isn't quite correct because fminnum and fminimum treat +/-0 differently; so, we need to prove that one of the operands isn't a zero. As far as I can tell, this was never correct. Before llvm#172012, `minnum` and `maxnum` were nondeterministic with regards to signed zero, so it's always been perfectly legal to lower them to operations that order signed zeroes.
This reverts commit ea3fdc5. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in #172012 .
…lvm#184125) This reverts commit ea3fdc5. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in llvm#172012 .
…lvm#184125) This reverts commit ea3fdc5. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in llvm#172012 .
…lvm#184125) This reverts commit ea3fdc5. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in llvm#172012 .
…lvm#184125) This reverts commit ea3fdc5. Re-enable const-folding for maxnum/minnum in the middle-end, GlobalISel, and SelectionDAG. Re-enable optimizations that depend on maxnum/minnum sNaN semantics in InstCombine and DAGCombiner. Now that maxnum(x, sNaN) is specified to non-deterministically produce either NaN or x, these constant-foldings and optimizations are now valid again according to the newly clarified semantics in llvm#172012 .
This implements some clarifications for the specification of floating point min/max operations based on the discussion in https://discourse.llvm.org/t/rfc-a-consistent-set-of-semantics-for-the-floating-point-minimum-and-maximum-operations/89006.
The key changes are: