ENH: Implement initial kwarg for ufunc.add.reduce#10635
Conversation
|
@eric-wieser Looking forward to your feedback. 😄 |
|
Hmm. I have no idea why the CircleCI build failed, I didn't touch Cython. :/ |
numpy/core/fromnumeric.py
Outdated
numpy/core/src/umath/ufunc_object.c
Outdated
There was a problem hiding this comment.
You need an else { Py_INCREF(identity); }, so that the decref at the end is safe.
There was a problem hiding this comment.
I don't think so, it only checks for None and NULL. The object is getting passed through.
identity is gotten from the PyArg_ParseTupleAndKeywords, it is passed through directly from there, does this do the Py_INCREF call automatically or not?
Edit: Looks like you're right, from the docs:
Note that any Python object references which are provided to the caller are borrowed references; do not decrement their reference count!
There was a problem hiding this comment.
And the other half of the puzzle is that _get_identity does increment the reference count - so you want to make them equivalent so the cleanup doesn't need special casing.
|
It looks like you built this patch on an out-of-date master from before we had CircleCI working. If you rebase, the CircleCI error will go away. |
6e24504 to
609f9b1
Compare
|
This is failing because we reparse the arguments for forwarding to |
|
Somewhat tangent comment: How would this fit in with the dream of turning Currently gufuncs have a fixed set of common keyword arguments, Or does the fact that reduce will have an extra Edit: Related: #8773 |
|
Maybe the way to think of it is that a So maybe ultimately, we would have both a reduce-gufunc, and a reduce method. The reduce method's job is to properly set up the out parameter for the reduce-gufunc. The method would have an "identity" argument, while the ufunc itself would read that value from the out arg. Sorry for the distraction, we can think about this later outside this PR. |
|
The explanation of these changes could be much more complete. And please explain it for all the functions changed. |
|
So, there are two things that worry me here:
|
I'll add more complete docs with examples, if that's what you mean, incorporating @eric-wieser's excellent use cases and insights. :-)
Yes, I thought of that as well, however, plenty of kwargs here are different ( However, if it WAS to be changed, I'd say it's more similar to
>>> np.add.reduce(np.ones((2, 2)), identity=10)
array([12., 12.])The name So |
There's a more surprising result than that one to be found, I think. Maybe |
|
It gives a quite reasonable result, actually. >>> np.add.reduce(np.ones((2, 2, 2)), axis=(0, 1), identity=10)
array([14., 14.]) |
That's maybe reasonable. Perhaps We'll definitely need to run this by the list before this gets merged, but may as well brainstorm here first |
|
Non-adjacent axes? |
Still works beautifully.
The slight problem I see with this is something like the following: >>> np.minimum.reduce([0.0], identity=-np.inf)
-infAs you can see, it gives a value not in the original set. I guess we could call it
|
Huh, I think my memory of the code is incorrect. I thought that multi-axis
I'm now on board with the argument being called The issue in that case is that the caller chose to pass in an initializer that was not the minimum of the set of allowed values. That looks perfectly fine to me. |
|
CI is failing because |
Exactly. We can document a use-case as being reduction of an empty axis. On that note, I'm going to work and might not be available for the next hours. If you want to fix that on my branch, feel free. |
|
On another note, if we're renaming this to Edit: I'm willing to take on the task of making |
|
Accumulate would be a better fit for another PR, especially since the use case is less strong. |
#834 should get you started on that rabbit hole. Lets try and wrap up this PR first though! |
numpy/add_newdocs.py
Outdated
There was a problem hiding this comment.
Move this second sentence to just above np.minimum.reduce so it's clear what it refers to
numpy/core/fromnumeric.py
Outdated
There was a problem hiding this comment.
max? min? This is sum
I'd argue that this isn't useful enough to include in the public API here, or in any of these functions. It's niche enough that we probably want the user calling ufunc.reduce directly
There was a problem hiding this comment.
Hmm. I went to a lot of effort to put it in. :-( Anyway, I'd argue it should stay for the NaN-skipping functions because there's no alternative there.
There was a problem hiding this comment.
Additionally, it's useful for specifying the value of an All-NaN slice reduction (although I guess you could just do this):
x_max = np.nanmax(x, ...)
x[np.isnan(x)] = valueThere was a problem hiding this comment.
Here's what I propose:
np.some_ufunc.reduce(..., initializer=...)np.sum(...)- not useful enough to expose - force people to use the more explicit API abovenp.prod(...)np.min(..., default=...)- forward to initializer, match the py3 APInp.nanmin(..., default=...)as abovenp.min(..., initializer=...)np.nanmin(..., initializer=...)
There was a problem hiding this comment.
I take it back, default and initializer mean different things. max([0], default=10) is different to np.max([0], initializer=10)
There was a problem hiding this comment.
Exactly, as in the discussion above. But to keep uniformity, we should probably either keep it or remove it everywhere. I prefer users should not have to remember/recall which reductions had this and which didn't, they should just instinctively go for either the reduction or ufunc.reduce.
I guess to cater for the nanmax, etc. case, we could ask users to reshape, add an extra slice, then undo, but that would be excessive.
There was a problem hiding this comment.
nanmax absolutely should have an initializer argument, since it needs it to deal with all-nan slices of object dtypes. I'm just not sure they're useful on sum and prod, but maybe leave them in for now, and consult the list.
Either way, my initial remark here is pointing out that this docstring is copied-and-pasted incorrectly
There was a problem hiding this comment.
That's fixed. I see you noticed it was a copypasta.
numpy/core/src/umath/ufunc_object.c
Outdated
There was a problem hiding this comment.
There was a warning in GCC, I wanted to get rid of it.
There was a problem hiding this comment.
Are the warnings/errors because of this change? Some C++ magic going on? Ideally, adding an intitializer shouldn't change anything.
There was a problem hiding this comment.
The warnings are showing up because there's a failure elsewhere - you're probably just not scrolling down far enough to see the error. I'd probably revert this change, and leave it for a future warning cleanup.
|
I think what you, the behaviour for objects is close enough to what the final state should be than what would be the case without those extra commits. So, I would suggest to merge and do a follow-up PR on the details. |
|
+1 for finally merging. |
|
Well, I don't know and I do not want to think about it. Close enough for me is close enough, if assuming that someone will use the current behaviour, we can still get to a reasonable final state without problems at some point. I hate adding the next kludge that future us gets annoyed about. Right now I am a bit confused what you are suggesting as final state for the object support.... |
|
That said, None as the "Error out" value, is maybe good even for object. Not ideal in some sense but the other solutions seem not much nicer. Plus, I am not sure it would even manoeuvre into a tricky to resolve issue for the future. |
|
@seberg Right now, the issue is that if you pass Right now, having |
|
I think the cause of our problems is how the
|
|
@eric-wieser Would you happen to know how I'd access |
|
Something like #include "npy_import.h"static PyObject *NoValue = NULL;
npy_cache_import("numpy", "_NoValue", &NoValue);ought to work |
numpy/core/src/umath/override.c
Outdated
There was a problem hiding this comment.
A comment saying this is the initial argument would be good.
There was a problem hiding this comment.
Sure. Any idea why the Windows build is failing?
|
Penultimate commit message is very wrong! |
numpy/core/src/umath/ufunc_object.c
Outdated
There was a problem hiding this comment.
Ah, this should actually be if (... < 0) { return NULL; } (wrapped as normal), in case the import fails
There was a problem hiding this comment.
npy_cache_import has a return type of void, so we can't do that.
There was a problem hiding this comment.
Ah, in that case, I mean you should check that NoValue != NULL
numpy/core/src/umath/override.c
Outdated
There was a problem hiding this comment.
C90 requires this be declared at the top of the function with the other variables (which is why the test fails)
1b5bed3 to
0fb251e
Compare
|
I think this now has the desired behavior for all cases. The |
|
Any more changes required here? If not, is this good to merge? cc @eric-wieser @mhvk @seberg |
|
Won't look through the code, at least not now. It seems it still needs release notes I guess? So I understand correctly it is now:
Seems good to me as a change. |
|
That's correct:
I'll add release notes now. |
7aca91b to
24e185a
Compare
|
Rebased and tests added for |
b3b7289 to
06d39e0
Compare
|
@hameerabbasi - I'd gladly merge this (maybe after one more day for last comments), so do go ahead and squash! |
06d39e0 to
6b728d1
Compare
|
@mhvk Ping. 😀 |
|
Merged. Thanks, @hameerabbasi! |
|
Thanks so much to @eric-wieser and @mhvk especially for babying me through my first big PR. I owe you! |
As discussed in #5032: Adds an
initialkwarg forufunc.reduce, plus changes inmin,max,sum,prod.This acts as the first item in a reduction. It's useful for empty reductions e.g.
np.min([], initial=np.inf)reduces toinf. It's also useful for starting the reduction with a different value e.g.np.sum([10], initial=5)reduces to15. (5used as starting value rather than the identity0).Explicitly passing the initial value as
Nonecauses empty reductions to err.Explicitly passing
np._NoValuealso keeps the default behaviour.min,max,sumandprodonly pass on theinitialkwarg.PyUFunc_Reduceis modified to accept theinitialkwarg. It then replaces the identity byinitial.Changes are made to
normalize_reduce_argsso that it accepts and parsesinitialcorrectly.Documentation has been added for all the changes.
Fixes #5032