Conversation
numpy/lib/function_base.py
Outdated
There was a problem hiding this comment.
I never liked lines like these much. I think the purpose is just to ensure a is float. How about
a = np.asanyarray(a, dtype=np.result_dtype(a, 0.))
This at least doesn't make a copy unless it is needed.
There was a problem hiding this comment.
You can probably skip the whole asanyarray call here, and just use the dtype argument later in the ufunc call.
There was a problem hiding this comment.
About the asanyarray. Do you think that weights is likely to have a unit or so? Because if we make it asanyarray, the weights would also have the same priority at deciding the output array type. Just wanted to point it out that sometimes we may want to not make give a secondary argument as much influence as the primary one.
There was a problem hiding this comment.
Yes, weights can definitely have units, e.g., if it is the inverse of the measurement error squared.
There was a problem hiding this comment.
Using the dtype argument seems a much better idea indeed -- but needs a bit of care in the same usage for weight below; it may be easiest to just precalculate the result dtype with result_dtype = np.result_type(a, 0.)
There was a problem hiding this comment.
What I meant was, that you can just add a 0. to the line below: scl = wgt.sum(axis=axis, dtype=np.result_type(a.dtype, wgt.dtype, 0.)) to the same effect probably
There was a problem hiding this comment.
I like a = np.asanyarray(a, dtype=np.result_dtype(a, 'f8')) because I am worried about the behavior of the np.multiply(a, wgt) call. If both a and wgt happen to be integer arrays we have to worry about overflow etc.
I like f8 because that is what np.mean explicitly casts to.
There was a problem hiding this comment.
Oh, ok, that forces it to at least double precision, having the same behaviour of upcasting as np.mean makes sense in any case I guess. Though again, i think you can also just plug the casting into that later result_type call.
There was a problem hiding this comment.
Ah, right. Upcast to double makes sense for things like short integers, which is why it upcasts those explicitly.
|
Note that this PR would supersede gh-5551. |
cef50df to
d2b0df6
Compare
|
Updated based on comments. For the casting of The behavior of |
d2b0df6 to
d4c9030
Compare
|
@ahaldane - this looks good! I did a quick try and it works with |
|
LGTM on first read through. |
|
Should add some tests for the preservation of subclasses. |
numpy/lib/function_base.py
Outdated
There was a problem hiding this comment.
Should this be an or or and with the same for wgt? Hmmm, actually, if, which one, hmmm :), I guess or?
There was a problem hiding this comment.
I think the type of wgt doesn't matter. The current code does as follows for the four combinations to think about (float means f4 or f8):
a wgt result_dtype
--- --- ---
int int f8
int float f8
float int f8
float float2 biggest(float, float2)
I suppose in the 2nd line we could cast to whatever wgt's floating type was (f4 or f8) but I'm not sure whether that's helpful or unnecessarily complicated.
(Edit: Actually the 2nd line will give back f8 no matter what we do, since int+f4 coerces to f8 anyway)
e11fefe to
6f6c40c
Compare
6f6c40c to
5ceab8f
Compare
|
Updated with tests. |
| a = np.array([[1,2],[3,4]]).view(subclass) | ||
| w = np.array([[1,2],[3,4]]).view(subclass) | ||
|
|
||
| assert_equal(type(np.average(a, weights=w)), subclass) |
There was a problem hiding this comment.
Should that not be better as assert_(type(np.average(a, weights=w)) is subclass)?
EDIT: Although that only works for new style classes. Hmm, assert_equal does seem to work properly there.
EDIT: But not for old style classes either.
There was a problem hiding this comment.
It's not clear to me what assert_equal does for types, so maybe is ok, if not obvious.
|
Thanks Allan. |
MAIN: fix to #7382, make scl in np.average writeable
These changes to
np.averagewere suggested by @mhvk in #5706.Rather than only make the changes in
np.ma.average, in this PR I'd like to first add them tonp.average, so that we end up with a situation wherenp.ma.averageandnp.averageare more or less copies.