feat: add axis=None reducer specializations #3653
Conversation
I like this
The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3653
Closing this as @ianna will take this over.
ianna
left a comment
@pfackeldey Thanks for jumping in on this one! I was planning to tackle it, but you’ve already done a great job. Everything looks good from my side—let’s go ahead and merge it. Appreciate the teamwork! 👍
Ok 👍 Let me have a close look again tomorrow, the last thing I'm not sure about is the
@pfackeldey @ianna there is still the issue of an error not being raised for a wrong axis value that I commented above. It should be addressed before merging. |
It now seems to use NumPy properly for the summation and to correctly raise an error for bad user-passed axes.
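For reference, NumPy itself rejects out-of-range axes, which is the behavior being delegated to here. A quick illustrative check in plain NumPy (not awkward's actual test code; awkward may wrap the error differently):

```python
import numpy as np

arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# a valid axis reduces normally: sums along rows are 3.0 and 12.0
print(np.sum(arr, axis=1))

# an out-of-range axis raises numpy's AxisError,
# which is a subclass of both ValueError and IndexError
try:
    np.sum(arr, axis=2)
except IndexError as err:
    print("rejected:", err)
```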
ikrommyd
left a comment
This addresses the issue for analyzers so it looks good to me!
Tests may be good to add but @ianna should decide that.
It's possible to construct an array of float32s where awkward on main currently gives inaccurate results when summing, compared to NumPy. This PR fixes that:
```python
In [17]: array = np.random.normal(loc=10000, scale=1000, size=10000).astype(np.float32)

In [18]: np.sum(array), ak.sum(array)
Out[18]: (np.float32(99882600.0), np.float32(99882670.0))
```
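The discrepancy comes from naive left-to-right accumulation in float32: each add rounds at the magnitude of the running total, so the error grows with the array length. NumPy's `np.sum` uses pairwise summation, which keeps the accumulated rounding error much smaller. A minimal sketch in pure NumPy (independent of awkward; the naive loop below is just a stand-in for a sequential reduction kernel):

```python
import numpy as np

rng = np.random.default_rng(0)
array = rng.normal(loc=10000, scale=1000, size=10000).astype(np.float32)

# naive sequential accumulation in float32, one add per element
naive = np.float32(0.0)
for x in array:
    naive += x

pairwise = np.sum(array)                   # NumPy's pairwise summation
exact = float(np.sum(array.astype(np.float64)))  # float64 reference

# the naive error is typically far larger than the pairwise error
print("naive error:   ", abs(float(naive) - exact))
print("pairwise error:", abs(float(pairwise) - exact))
```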
This PR adds `axis=None` reducer specializations. Should fix #1250. A reducer may now implement a specialization of itself that is used in the `axis=None` case (and the `axis=0` or `axis=-1` case for 1D rectangular arrays). With this PR the specialization makes use of the underlying nplike implementation for the reducer, i.e. `np.sum`, which implements a sum algorithm that compensates for floating-point precision loss. In order for this interface to work I extended the nplike interface to also implement `sum`. @ikrommyd and @valsdav could you give this a try?
@ianna what do you think about these specializations? Unfortunately this does not get rid of intermediate arrays; that would be a much more invasive change in the codebase. However, I noticed that `ak.sum(..., axis=None)` is now quite a bit faster than before (due to the NumPy kernel). (I might need some help regarding the CUDA issue here @ianna :D)