ENH: Implement faster keyword argument parsing capable of METH_FASTCALL#15269
Conversation
numpy/core/src/common/npy_argparse.h
Not sure I understand this - do you mean must be passed positionally?
Or are you trying to describe "not keyword-only"?
The latter, not keyword only, which seems simpler than to count the number of keyword only. Will try to rephrase if this seems to go anywhere.
numpy/core/src/common/npy_argparse.c
I'd recommend

```diff
- * @return Returns 1 on success and 0 on failure.
+ * @return Returns 0 on success and -1 on failure.
```

which is more typical for error-reporting in Python – converter functions seem to be the exception, not the rule.
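The 0-on-success / -1-on-failure convention suggested here is the usual CPython error-reporting style, while `PyArg_Parse*`-style converters return 1/0. A minimal sketch of translating between the two (hypothetical helper names, not the actual NumPy code):

```c
#include <stddef.h>

/* Hypothetical converter following the PyArg_Parse*-style convention:
 * returns 1 on success and 0 on failure. */
static int
example_converter(const char *input, long *out)
{
    if (input == NULL) {
        return 0;  /* conversion failed */
    }
    *out = 42;  /* stand-in for a real conversion */
    return 1;
}

/* Hypothetical parser entry point following the CPython-style
 * convention recommended above: 0 on success, -1 on failure.
 * The converter's 1/0 result is translated at this boundary. */
static int
parse_argument(const char *input, long *out)
{
    if (!example_converter(input, out)) {
        return -1;  /* failure: callers check `< 0` */
    }
    return 0;  /* success */
}
```

The translation keeps converters compatible with `PyArg_Parse*` while the parser itself reports errors the way most of the CPython C-API does.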
I generally prefer -1; I had it like this only because it is closer to the code using `PyArg_Parse*`, but am happy to change.
Force-pushed 10e700b to 7f9cf12
It seems PyPy 3.7 now supports fastcall, so we could use that as an excuse to reactivate this PR (removing all kwarg dictionary paths). The somewhat ugly argument/parameter macros could then be removed! Basically, I do think that this was a very straightforward (and not actually super large) PR; the difficult part in this project was mostly the ufunc maintenance (which is a later PR). The only real alternative I am aware of is using argument clinic, such as here: https://github.com/pitrou/pickle5-backport/tree/master/pickle5/clinic and https://github.com/pitrou/pickle5-backport/blob/master/BACKPORT-NOTES.txt, which would require some tooling or vendoring of all clinic versions, which I am not too fond of.

About the ufuncs: that maintenance is absolutely necessary in large parts. We could push it through, but I need to touch the same code massively for ufuncs, and it will be hell to try to split out the "maintenance" from the rest, so those things can't be worked on in parallel. Either we do the argument parsing soon, or we give up on ufunc argument parsing until dtypes are finished. The first way may even be easier for me, but that can only work if we check it off at a reasonable speed. (I do remember Marten did even have a look at that old PR.)

To be clear: I mainly comment here to note that we could do this slightly nicer now. If anyone is happy to review this, I very much think this is worthwhile, and I can pick it up at almost any time if we have the review bandwidth to push it through.
There are merge conflicts. Could you run benchmarks to show this is justified?
I don't want to snipe you away from dtypes right now :). I do think it's worthwhile, even as a first step to have the infrastructure, since it deletes some ugly fast-paths, speeds up some things, and might make optimizing … Some examples for changes (it requires a dirty rebase right now; unfortunately it is not trivial, also due to the …). So is that a lot? I don't know; most real-world code probably has many other larger overheads. Basically, the reason why I did not have nice benchmarks is that: …
I would prefer to push this off to 1.21; there is already a ton of stuff in 1.20 that might bite us.
@charris sorry, I was never suggesting to rush this in before that. I just realized that we can clean up some of the uglier parts now, and I thought that was probably the main reason why it stalled.
Force-pushed 7f9cf12 to fc5cb20
Let's see what PyPy thinks. I actually updated this; speedups are the same (significant e.g. for …).

EDIT: to be clear, this PR significantly speeds up printing arrays (due to the dragon4 argparsing), but the main speedups are …

I changed it; it now looks like this (example from a later PR):

```c
NPY_PREPARE_ARGPARSER;  /* Translates to a static cache. */
if (npy_parse_arguments("empty", args, len_args, kwnames,
        "shape", &PyArray_IntpConverter, &shape,
        "|dtype", &PyArray_DescrConverter, &typecode,
        "|order", &PyArray_OrderConverter, &order,
        "$like", NULL, &like,
        NULL, NULL, NULL) < 0) {
    goto fail;
}
```

using -1 as the return value for errors. Happy to … once the merging starts.
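The `"|dtype"` and `"$like"` markers in the example above encode whether an argument is optional (`|`) or keyword-only (`$`). A minimal sketch of how such a specification string could be interpreted (a hypothetical helper for illustration, not the actual `npy_argparse.c` code):

```c
#include <string.h>

/* Classification of one argument-spec string: a leading '|' marks the
 * argument optional; a leading '$' marks it keyword-only (and, like
 * '|', not required). No marker means required and positional-or-keyword. */
typedef struct {
    const char *name;    /* bare name without the marker */
    int required;        /* 1 if it must be provided */
    int keyword_only;    /* 1 if it cannot be passed positionally */
} arg_spec;

static arg_spec
classify_argument(const char *spec)
{
    arg_spec out = {spec, 1, 0};
    if (spec[0] == '|') {
        out.name = spec + 1;
        out.required = 0;
    }
    else if (spec[0] == '$') {
        out.name = spec + 1;
        out.required = 0;
        out.keyword_only = 1;
    }
    return out;
}
```

Encoding the calling discipline in the name string keeps the variadic argument list compact: each argument contributes exactly one name, one converter, and one output slot, as in the `npy_parse_arguments` call above.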
I ran the benchmarks with the full diff (there seem to be some small merge conflicts or just refcount bugs still to straighten out there, though) – that is, the timings are with more commits (a lot of random fluctuations, e.g. those …).

(Going to update the branch today, but there is probably some fiddling to do with the commits, and then hope that PyPy likes it now :) )
Force-pushed fc5cb20 to 7a8d798
One note here is that the codecov failures are almost all related to errors that are only relevant when adding new functions using …
Later PRs, or later changesets in this PR? Coverage still claims the conversion errors are not hit.
This is a fast argument parser (an original version also supported dictionary unpacking, for the args/kwargs call style) which supports the new FASTCALL and VECTORCALL conventions of CPython. Fastcall is supported by all Python versions we support. This drastically reduces the overhead of methods when keyword arguments are passed.
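The fastcall convention avoids the overhead described above by passing positional arguments as a C array plus a count, with keyword argument names in a separate sequence, instead of materializing a `*args` tuple and a `**kwargs` dict. A rough self-contained illustration of the keyword-lookup part, using plain C strings as stand-ins for Python objects (not the actual CPython API):

```c
#include <string.h>
#include <stddef.h>

/* Stand-in for a FASTCALL-style keyword lookup: keyword values are
 * appended after the positional arguments in `args`, and `kwnames`
 * holds their names in the same order. Finding a keyword is a linear
 * scan over the (typically tiny) kwnames array -- no dict is built. */
static const char *
find_keyword(const char *const *args, size_t len_args,
             const char *const *kwnames, size_t nkw,
             const char *name)
{
    for (size_t i = 0; i < nkw; i++) {
        if (strcmp(kwnames[i], name) == 0) {
            /* keyword values start right after the positionals */
            return args[len_args + i];
        }
    }
    return NULL;  /* keyword was not passed */
}
```

Because the scan touches only a flat array, a call like `np.empty((3, 4), dtype="float64")` needs no intermediate dict allocation, which is where most of the savings come from.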
Force-pushed 7a8d798 to b83847c
I expect that most of this gets tested in the ufunc/…

EDIT: tried close/reopen, since codecov seemed to ignore the complete file now...
Array methods are the easiest target for the new parser. They do not require any larger improvements and most functions in multiarray are wrapped. Similarly the printing functions are not dispatched through `__array_function__` and thus have the most to gain.
Force-pushed b83847c to 9a45332
```python
def test_invalid_integers():
    with pytest.raises(TypeError,
            match="integer argument expected, got float"):
```
Is this test running? It should be triggering code coverage of the new functions (the error message comes from `PyArray_PythonPyIntFromInt`), but coverage claims otherwise. I wonder what is going on?
Definitely gets run. Confuses me as well; the best idea I have is that the inclusion/compilation both with `_multiarray_tests` and with `_multiarray_umath` confuses the coverage (it now seems to report no coverage at all)? Which would mean that `mem_overlap` coverage should also be off, and that sounds a bit familiar.
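The test in question exercises an integer converter that rejects a float rather than silently truncating it. A minimal stand-in for that behaviour, using a C string as a mock type name and an error buffer instead of a raised `TypeError` (hypothetical, not `PyArray_PythonPyIntFromInt` itself):

```c
#include <string.h>
#include <stdio.h>

/* Mock converter mirroring the behaviour under test: accept only an
 * "int" input and report the same kind of message that the real
 * converter raises as a TypeError. Returns 0 on success, -1 on failure,
 * matching the error convention used elsewhere in this PR. */
static int
convert_to_int(const char *type_name, char *errmsg, size_t errlen)
{
    if (strcmp(type_name, "int") != 0) {
        snprintf(errmsg, errlen,
                 "integer argument expected, got %s", type_name);
        return -1;
    }
    return 0;
}
```

Matching the error text in a test (as the `pytest.raises(..., match=...)` above does) pins down which converter produced the failure, which is exactly why its absence from the coverage report is surprising.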
Thanks @seberg
Oh wow, thanks. Let me do some of the rebasing due to this, and probably write brief docs for the developer's corner.
The first commit adds the infrastructure and macros necessary for the argument parsing. The second commit uses it for `methods.c` and functions like `ones`, `empty`, and `empty_like` that are defined in C, as fairly simple examples. Some of the function macros are not used here. This replaces gh-15099.
I am not sure what would be a better approach, unless we use argument clinic; I may have misjudged it, but it seems like using it is a larger change, and I am not sure how simple it is with support for different Python versions.
I am not quite sure that the addition in `common/npy_argparse.?` is the right approach for something like this?

There will be two follow-up PRs:

* …
* `np.array`