@SimonHeybrock
Member

Fixes #3241.

Contributor

@jokasimr jokasimr left a comment

Scipp returns DBL_MAX for empty inputs, while NumPy returns NaN. For integer inputs, Scipp returns INT_MAX, while NumPy raises.

It is not really correct to say that NumPy returns NaN when the input is empty, or at least it is unclear. Both of the examples below raise "ValueError: zero-size array to reduction operation minimum which has no identity", independent of the dtype.

import numpy as np
np.min(np.ones(0, dtype=np.int64))
np.min(np.ones(0, dtype=np.float64))

However, np.min does return NaN if the input contains NaN.


As a side note that should probably be the topic of another issue: when looking into this I noticed that sc.nanmin and sc.min do the same thing:

# Both return -1
sc.array(dims=['x'], values=[1, float('nan'), -1]).min()
sc.array(dims=['x'], values=[1, float('nan'), -1]).nanmin()

while

# np.min returns nan; only np.nanmin returns -1
np.min([1, float('nan'), -1])
np.nanmin([1, float('nan'), -1])

I think the reason is that the implementation of sc.min uses std::min(a, b), which is equivalent to (b < a) ? b : a, so if b is NaN the comparison is false and we get a.
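To illustrate the comparison-based explanation above, here is a minimal pure-Python sketch. The helper cpp_style_min is hypothetical and only mimics the (b < a) ? b : a rule; it is not Scipp's actual code. Starting the reduction from the largest float is an assumption, suggested by the DBL_MAX result documented for empty inputs.

```python
import math
import sys
from functools import reduce


def cpp_style_min(a, b):
    # Hypothetical helper mimicking std::min(a, b): pick b only if b < a.
    # Any comparison with NaN is False, so a NaN in `b` is never selected.
    return b if b < a else a


nan = float("nan")

# NaN in the middle of the input is silently skipped.
nan_in_middle = reduce(cpp_style_min, [1.0, nan, -1.0])

# NaN as the first accumulator value is never replaced.
nan_first = reduce(cpp_style_min, [nan, 1.0, -1.0])

# Assumption: the reduction starts from the largest float (like DBL_MAX).
# Then NaN can only ever appear as `b`, so it is always skipped,
# and an empty input returns the initial value unchanged.
nan_first_with_init = reduce(cpp_style_min, [nan, 1.0, -1.0], sys.float_info.max)
empty_with_init = reduce(cpp_style_min, [], sys.float_info.max)
```

Under that assumption the result never depends on where the NaN sits, which would match min and nanmin giving the same answer.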

@jl-wynen jl-wynen mentioned this pull request Oct 12, 2023
@SimonHeybrock
Member Author

Hmm, and from numpy.ma.masked_array one gets a special value (of type numpy.ma.core.MaskedConstant):

import numpy as np
import numpy.ma as ma

ma.masked_array(np.ones(1), mask=[True]).min()  # masked
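For comparison, a small sketch of how calling code might detect that special value. The helper name masked_min is made up for illustration; the only NumPy behavior relied on is that an all-masked reduction yields the ma.masked singleton.

```python
import numpy as np
import numpy.ma as ma


def masked_min(values, mask):
    """Hypothetical helper: minimum over unmasked values, or None if all are masked."""
    result = ma.masked_array(values, mask=mask).min()
    # An all-masked reduction yields the ma.masked singleton instead of a number.
    return None if result is ma.masked else float(result)


all_masked = masked_min(np.ones(1), [True])
partly_masked = masked_min(np.array([3.0, 1.0, 2.0]), [False, True, False])
```

Here the masked 1.0 is ignored in the second call, so the minimum is taken over 3.0 and 2.0 only.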

inputs, Scipp returns INT_MAX, while NumPy raises. Note that in the case of
:py:class:`DataArray`, inputs can also be "empty" if all elements contributing
to an output element are masked.
Scipp returns DBL_MAX and INT_MAX for empty inputs of float or int dtype,
Member

Suggested change
Scipp returns DBL_MAX and INT_MAX for empty inputs of float or int dtype,
Scipp returns DBL_MAX or INT_MAX for empty inputs of float or int dtype,

Member

And in the other docstrings, too.

:py:class:`DataArray`, inputs can also be "empty" if all elements contributing
to an output element are masked.
Scipp returns DBL_MAX and INT_MAX for empty inputs of float or int dtype,
respectively, while NumPy rases. Note that in the case of :py:class:`DataArray`,
Member

Suggested change
respectively, while NumPy rases. Note that in the case of :py:class:`DataArray`,
respectively, while NumPy raises. Note that in the case of :py:class:`DataArray`,

@SimonHeybrock SimonHeybrock merged commit 2cedc60 into main Oct 19, 2023
@SimonHeybrock SimonHeybrock deleted the docs-reduction-ops branch October 19, 2023 11:55

Development

Successfully merging this pull request may close these issues.

Document behavior of min, max, ... in case of empty, masked, ... inputs

4 participants