MAINT,API: ignore and NULL fasttake/fastputmask ArrFuncs slots#14942
MAINT,API: ignore and NULL fasttake/fastputmask ArrFuncs slots#14942mattip merged 3 commits intonumpy:masterfrom
Conversation
627e6a1 to
b52f33d
Compare
857b0cd to
8d7cad0
Compare
|
Think I addressed those, will squash in a bit. Maybe it is due to debug build, but now changed it to split up on chunk (because why not, it just speeds up certain cases). And it seems |
|
I want to do the same thing for |
4856e13 to
e67c1ce
Compare
e67c1ce to
6ec673a
Compare
|
The current timings are below. Basically, take is slower for 3 elements (which is too bad, since it was a strength of the old take compared to advanced indexing. That could probably be resolved by putting in the loop over the inner-size instead of the large chunk memmove. For some reason complex256 is optimized much better in take (maybe alignment? maybe just compiler fluctuations). Details |
f02ee2f to
6e32571
Compare
|
OK, I hope this is more or less converging. There is a slightly slowdown for certain takes, but I am not too concerned, and it could be adapted later if necessary. The complex256 slowdown is somewhat annoying, but I do not think it really matters much and may well just get magically fixed with other compilers or so. |
|
Hmmm, the azure error seems real. While float96 should not be optimized, something goes horribly wrong. Does |
|
Oooops, it was a stupid typo... The issue was only that none of the tests hit the default path, except on the 32bit (actually linux) platform. |
9199017 to
d8d6c36
Compare
Always NULL the fasttake slot, and instead move the optimized versions by doing itemsize specialization in the Take function itself.
This removes the use of the fastputmask ArrFuncs slot from within NumPy and always leaves it NULL.
|
From the call: It sounds like nobody is too concerned about ignoring these slots and a slight difference in speed trade-offs (and I guess if someone notices slowdowns we can optimize). Technical review awaiting of course. |
| Fasttake and fastputmask slots are deprecated and NULL'ed | ||
| --------------------------------------------------------- | ||
| The fasttake and fastputmask slots are now never used internally. | ||
| They will always set to NULL for builtin dtypes. |
There was a problem hiding this comment.
"They will always be NULL" or "They will always be set to NULL ..."
|
In |
|
Sounds like a good idea, although maybe a warning (forever) that it is just ignored is just as good? Removes any possible breaking, and at some point we have to deprecate that whole function anyway... |
Up to you. I would argue that
When we remove the functions the error will go away too. |
9769e8c to
68e8d47
Compare
| return -1; | ||
| } | ||
|
|
||
| if (test_deprecated_arrfuncs_members(f) < 0) { |
There was a problem hiding this comment.
Put into its own helper in anticiapation of more functions joining. In particular fastclip is already deprecated at execution time, and should be added here as well now (as well as get the same documentation updates). Planning to do that once this is in.
|
Testing this seems a bit annoying, so I only tested all the cases manually right now... I can try to think of a way to test it reasonably... |
|
I opted for warning, the alternative is to use the As mentioned above: I think clip should be deprecated in both places (here would be enough I suppose, but I do not feel like removing the other one), I will do that as a followup. |
|
Thanks @seberg |
Always NULL the fasttake slot, and instead move the optimized versions
by doing itemsize specialization in the Take function itself.
I still need to check that there is now speed regression.