Skip to content

Conversation

@MaanasArora
Copy link
Contributor

Follows #29737 to actually add the registration for the new-style sorts and argsorts. This is done using a PyUFunc_AddLoopsFromSpecs convenience function that can accept multiple ufunc names and arraymethod specs using slots, special-casing the "sort" and "argsort" names. The sfloat test is updated to use this registration.

Thanks for reviewing!

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! Mostly looks good -- probably the biggest comment I have is that the sort context contains flags rather than bools... The rest is minor, with some just being extensions that could be added later too.

Curious to see what @seberg thinks - it seems a nice implementation of his idea!

Sorting and argsorting methods for dtypes can be registered using the
ArrayMethod API. This is done by adding an ArrayMethod spec with the name
``"sort"`` or ``"argsort"`` respectively. The spec must have ``nin=1``
and ``nout=0`` for sort and ``nin=1`` and ``nout=1`` for argsort. Sorting is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it has to have nout=1 - there is one output, it is just inplace (as you write below).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sorry, this is fixed, thanks!

ignored and the loops are always called with contiguous data. The spec can
also define the `resolve_descriptors` slot if needed.
The `context` passed to the loop contains the `parameters` field which is for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete "is"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

The `context` passed to the loop contains the `parameters` field which is for
these operations is a `NpySortParameters` struct. This struct contains the
boolean fields `stable` and `descending` to indicate whether the sort should be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changed? Just pass the sort flags as we had defined! (which are discussed in the C api).

Copy link
Contributor Author

@MaanasArora MaanasArora Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In hindsight, I have no idea... sorry! It seems I somehow thought we were using the older PR. This is fixed, thanks!

0x00000015 = e56b74d32a934d085e7c3414cb9999b8,
# Version 21 (NumPy 2.4.0)
# Add 'same_value' casting, header additions.
# Sort and argsort array method registration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be # General loop registration for ufuncs, sort, and argsort

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

.. c:member:: const char *ufunc_name
The name of the ufunc to add the loop to.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly vague. Where is this name found? Could one register for a ufunc defined outside numpy? I think by default it should be looked up on np, but one should be able to give a name including package. I guess one can test this using some of the private test ufuncs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do this, I think. The logic would be to allow for a :, if given, you import the part before the : and then use GetAttr for the part after (in principle should support . in the second part too).
It might be that the import mechanism can already do both in one go, but I am not sure. If it is tedious in C, could create a little Python helper to do it since it is very easy there, e.g. something like this

And if there is no : we would just go with numpy (and maybe the internal ufunc module as well to robustify it).

I am wondering if it makes sense to provision for when the object isn't found (or not a ufunc). We could for example allow for "?numpy.strings:..." to make it easier to support across versions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep thinking about nice ways to expand, but the more I do so, the more it is clear that we should not go there now!

So, just a small comment: I would call this entry plain "name" -- it is already in the "UFunc_LoopSlot" namespace, so adding ufunc_ seems superfluous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - I agree adding more detail could be a bit broad at this time maybe! I've changed the ufunc_name to name, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to not do it here, but I am curious about your reasoning :). I suppose it may be weird enough to support for non-numpy and honestly, writing this helper isn't hard (although this type of convenience seems to make more of a dent than I tend to expect).

Anyway, one thing we don't yet support, but happy to also defer is adding ufuncs that are not available in the main namespace.

}
else {
PyObject *ufunc =
PyObject_GetAttrString(PyImport_ImportModule("numpy"), slot->ufunc_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is the part where I think if it fails, one perhaps should have a more general import - surely, there is something available that can help importing a module.submodule.function string...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: I think we have some neat shortcuts with static variables to avoid having to re-import numpy every time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that could be nice. I'm looking at this, but for the static variables: I actually couldn't find a place where the ufuncs are looked up using a static module variable, instead, it seems the module numpy is passed as an argument into those functions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have npy_cache_import_runtime you can use for NumPy (if we support non-numpy can't cache those of course).
(For more general, see first comment, I think it should be module.submodule:obj.attr and if that tedious we can even do it in Python since I'll assume this doesn't matter speed wise -- there may be C helpers in the Python API, though)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was indeed thinking about npy_import and npy_cache_import_runtime, both in src/common/npy_import.h
Since nothing here is performance critical, in this particular instance, you could just replace this with

PyObject *ufunc = npy_import("numpy", slot->ufunc_name);

One might then use npy_cache_import_nuntime("numpy", "sort") (and argsort) somewhere on top and do a straight comparison of those here.

Copy link
Contributor Author

@MaanasArora MaanasArora Oct 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks! I've used npy_import on top, and then compared cached imports with the ufunc.

Edit: this means we will cache in the npy_runtime_imports struct as is done now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the windows tests seem to be failing because we haven't imported numpy before we call this function - I've reverted to using npy_import for now for the comparisons. Figuring how to import (and cache) these before call - if needed?

PyObject_GetAttrString(PyImport_ImportModule("numpy"), slot->ufunc_name);
if (ufunc == NULL) {
PyErr_Format(PyExc_TypeError,
"numpy.%s is not a ufunc!", slot->ufunc_name);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here an AttributeError has already been set, so you can just return.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{
PyUFunc_LoopSlot *slot;
for (slot = slots; slot->ufunc_name != NULL; slot++) {
if (strcmp(slot->ufunc_name, "sort") == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may not be completely crazy to do the test after getting the attribute from numpy - i.e., check that it is array_sort. This means that if someone passes in numpy.sort, it would still get found. Note that I'm not sure you have access to array_sort here, so for now this is probably simply overthinking it!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way works... that way we need to import and cache numpy.sort before this can be called (which is fine). But overall, I don't think it matters.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks! I would be happy to extend to outside NumPy as this should really be a convenience API for the biggest part.

.. c:member:: const char *ufunc_name
The name of the ufunc to add the loop to.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do this, I think. The logic would be to allow for a :, if given, you import the part before the : and then use GetAttr for the part after (in principle should support . in the second part too).
It might be that the import mechanism can already do both in one go, but I am not sure. If it is tedious in C, could create a little Python helper to do it since it is very easy there, e.g. something like this

And if there is no : we would just go with numpy (and maybe the internal ufunc module as well to robustify it).

I am wondering if it makes sense to provision for when the object isn't found (or not a ufunc). We could for example allow for "?numpy.strings:..." to make it easier to support across versions.

enforce that `data[0] == data[1]`. Argsorting returns a new array of
indices, so the output must be of ``NPY_INTP`` type.
The spec must define the `get_strided_loop` slot which returns a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, it may not have to (unless we add an extra check), but I let's lie :).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that there is a default that will just look up the loop in the other slots? I do think it would be good to treat this as much as a regular ufunc array method as possible. In that respect, the only important bits may be that for sort, inplace is expected, and that one can expect aligned, contiguous data. But if it is fine, therefore, to just define contiguous_loop, is there anything wrong with saying so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you could just a strided_loop and that will be picked and could contain the parameter logic there rather than inside get_loop.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think we probably should just delete this, then.

If we really do want to advertise that the array is going to be contiguous, one could do so at the end of l2079:

...type. For both
``sort`` and ``argsort``, the data provided is contiguous.

But do we want to advertize that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure. It could make it easier to support strides in the future if people actually write for them, so maybe not?

/*
* Add a promoter for both directions of multiply with double.
*/
int res;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: would move to first use, or init as -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

{
PyUFunc_LoopSlot *slot;
for (slot = slots; slot->ufunc_name != NULL; slot++) {
if (strcmp(slot->ufunc_name, "sort") == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way works... that way we need to import and cache numpy.sort before this can be called (which is fine). But overall, I don't think it matters.


if (!PyObject_TypeCheck(ufunc, &PyUFunc_Type)) {
PyErr_Format(PyExc_TypeError,
"numpy.%s was not a ufunc!", slot->ufunc_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"numpy.%s was not a ufunc!", slot->ufunc_name);
"%s was not a ufunc!", slot->ufunc_name);

then if we support more, it just works out ;).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done, thanks!

}
else {
PyObject *ufunc =
PyObject_GetAttrString(PyImport_ImportModule("numpy"), slot->ufunc_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have npy_cache_import_runtime you can use for NumPy (if we support non-numpy can't cache those of course).
(For more general, see first comment, I think it should be module.submodule:obj.attr and if that tedious we can even do it in Python since I'll assume this doesn't matter speed wise -- there may be C helpers in the Python API, though)

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review of just the documentation - will continue later, so perhaps wait with pushing changes...

.. c:member:: const char *ufunc_name
The name of the ufunc to add the loop to.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I keep thinking about nice ways to expand, but the more I do so, the more it is clear that we should not go there now!

So, just a small comment: I would call this entry plain "name" -- it is already in the "UFunc_LoopSlot" namespace, so adding ufunc_ seems superfluous.

PyUFunc_LoopSlot *slots)
Add multiple loops to ufuncs from ArrayMethod specs. This also
handles the registration of sort and argsort methods for dtypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this gives a bit more detail than necessary. How about shortening this to

Add multiple loops to ufuncs from ArrayMethod specs. This also
handles the registration of methods for the ufunc-like functions
``sort`` and ``argsort``. See :ref:`array-methods-sorting` for details.

And then delete the sentence at l.1916.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

handles the registration of sort and argsort methods for dtypes
from ArrayMethod specs.
To register a loop for a ufunc, the ``slots`` argument must be a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again shorten a little, perhaps as follows:

The ``slots`` argument must be a  NULL-terminated array of
`PyUFunc_LoopSlot` (see above), which give the name of the
ufunc and spec needed to create the loop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as well.

Sorting and argsorting methods for dtypes can be registered using the
ArrayMethod API. This is done by adding an ArrayMethod spec with the name
``"sort"`` or ``"argsort"`` respectively. The spec must have ``nin=1``
and ``nout=1`` for both sort and argsort. Sorting is inplace, however we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"... Sorting is inplace, hence we"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

enforce that `data[0] == data[1]`. Argsorting returns a new array of
indices, so the output must be of ``NPY_INTP`` type.
The spec must define the `get_strided_loop` slot which returns a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that there is a default that will just look up the loop in the other slots? I do think it would be good to treat this as much as a regular ufunc array method as possible. In that respect, the only important bits may be that for sort, inplace is expected, and that one can expect aligned, contiguous data. But if it is fine, therefore, to just define contiguous_loop, is there anything wrong with saying so?

these operations is a `PyArrayMethod_SortParameters` struct. This struct
contains a `flags` field which is a bitwise OR of `NPY_SORTKIND` values
indicating the kind of sort to perform (that is, whether it is a stable
and/or descending sort).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is probably the reason to define get_strided_loop, maybe it would be good to keep the above shorter and here add "If the strided loop depends on these flags, a good way to deal with that is to define get_strided_loop, and not set any of the PyArrayMethod_StridedLoop.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I removed the previous paragraph and added this note.


// This is used for the convenience function `PyUFunc_AddLoopsFromSpecs`
typedef struct {
const char *ufunc_name;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I'd suggest simple name here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as above.

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, second half of second round: only small things, I think this is working out very neatly!

My only slightly more major comment/question is about registering wrapping loops, but I ended up concluding those are probably better done in a separate function. @seberg, does that make sense? It certainly does not seem to have to hold up this, unless we think it would affect the layout of the slots.

enforce that `data[0] == data[1]`. Argsorting returns a new array of
indices, so the output must be of ``NPY_INTP`` type.
The spec must define the `get_strided_loop` slot which returns a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think we probably should just delete this, then.

If we really do want to advertise that the array is going to be contiguous, one could do so at the end of l2079:

...type. For both
``sort`` and ``argsort``, the data provided is contiguous.

But do we want to advertize that?

.casting = NPY_NO_CASTING,
};

PyArray_DTypeMeta *add_dtypes[3] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really your PR, but partially for my understanding; I think below one could just use multiply_dtypes again, since it is identical. Indeed, one could also use it for sort (but not argsort). Might it make sense to just define these as all_sfloat_dtypes[3] and use them where possible?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem to work - pushed! Actually, it seems to have been reused before when the spec was also being reused.

PyBoundArrayMethodObject *sort_meth = PyArrayMethod_FromSpec_int(&spec, 0);
if (sort_meth == NULL) {
/* N.B.: Wrapping isn't actually correct if scaling can be negative */
if (sfloat_add_wrapping_loop("hypot", multiply_dtypes) < 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, would it be possible to do this with the new registration too? It would seem something that might be used a lot. Though maybe the easiest is to then define a new PyUFunc_AddWrappingLoopsFromSpecs. Anyway, would seem absolutely fine to leave this for later! (It is nice in that it would remove the need for sfloat_get_ufunc...)

p.s. Here multiply_dtypes is re-used, so I think it indeed can be done elsewhere too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I suspect that would indeed change the structure of the slots, so should consider. In particular for wrapping loops we need the dtype (and descriptor translate functions) and not the spec as far as I can tell. So, it seems there is not much common, maybe defining a new function would be best?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was also going in that direction, but will defer to @seberg.
Anyway, I think it is basically out of scope here!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's not go there. Seeing that C-API module init PEP, I wish I had thought of just not using a struct at all and putting everything into slots, then this would be easy to do reasonably...

But I think a reasonable solution will be to just add the wrapping functions as slots. Then if those slots are found you might even ignore some entries of the original struct. (Or at least allow for example `nin = nout = 0.)

PyBoundArrayMethodObject *sort_meth = PyArrayMethod_FromSpec_int(slot->spec, 0);
if (sort_meth == NULL) {
PyErr_Format(PyExc_TypeError,
"Failed to create sort method for %R",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolute nitpick, but does this not fit on one line? (logic: error paths should take up the least space possible to minimize distraction)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixed!


int ret = PyUFunc_AddLoopFromSpec_int(ufunc, slot->spec, 0);
if (ret < 0) {
Py_DECREF(ufunc);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this before the if so that you can remove the one on l.256 as well. In fact, you could just fall through altogether, but perhaps that's too risky...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did that, thanks!

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, looks great. Two confused comments, one to do only if there is another round. Maybe @seberg can advice about the interned objects; I don't understand why the windows build failed, but don't have a windows machine, so cannot test it.

p.s. Really not too important, but I usually try to avoid too many pushes, as it bugs me that every time the tests are run 70 times...


npy_cache_import_runtime("numpy", "sort", &npy_runtime_imports._sort);
npy_cache_import_runtime("numpy", "argsort", &npy_runtime_imports._argsort);
if (npy_cache_import_runtime("numpy", "sort", &npy_runtime_imports._sort) < 0 ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird that this did not work. It looks good, though it should be outside the loop, and the object should not have an underscore (at least, that seems to be the habit, to just use the name).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's quite strange - though yes it should've been outside the loop! Could be the issue maybe, given it only failed in windows, maybe hit some corner case. But in the error, the npy_runtime_imports struct itself was not found...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I noticed that. I just cannot understand how things can work everywhere else if it was a problem here...

return -1;
}

PyObject *sort = npy_import("numpy", "sort");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, we can still do this the inelegant way (but really, runtime import should work!). Outside of the loop,

static PyObject *sort = NULL;
static PyObject *argsort = NULL;
if (sort == NULL) {
    sort = npy_import("numpy", "sort");
    if (sort == NULL) {
        return -1;
    }
if (argsort == NULL) {
    argsort = npy_import("numpy", "argsort");
    if (argsort == NULL) {
        return -1;
    }
}

But I'm really confused why what you had failed (hard to believe it had to do with being inside the loop), i.e.,

if (npy_cache_import_runtime("numpy", "sort", &npy_runtime_imports.sort) == -1
    || npy_cache_import_runtime("numpy", "argsort", &npy_runtime_imports.argsort) == -1) {
    return NULL;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used that code block, thanks! I'll try experimenting with the cached imports more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should have commented here: Yes, I noticed that the windows build failed to find it. I just cannot understand how things can work everywhere else if it was a problem here...

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I am happy with this thanks. I think adding this convenience functino should be pretty uncontroversial API wise, and it is flexible enough to deprecate, etc.

Just a few nits. I can see extending at least for linalg.<...> or numpy.strings.<...> but maybe no need to do it here and now.

PyBoundArrayMethodObject *sort_meth = PyArrayMethod_FromSpec_int(&spec, 0);
if (sort_meth == NULL) {
/* N.B.: Wrapping isn't actually correct if scaling can be negative */
if (sfloat_add_wrapping_loop("hypot", multiply_dtypes) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, let's not go there. Seeing that C-API module init PEP, I wish I had thought of just not using a struct at all and putting everything into slots, then this would be easy to do reasonably...

But I think a reasonable solution will be to just add the wrapping functions as slots. Then if those slots are found you might even ignore some entries of the original struct. (Or at least allow for example `nin = nout = 0.)

static PyObject *sort = NULL;
static PyObject *argsort = NULL;
if (sort == NULL) {
sort = npy_import("numpy", "sort");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you do it here, please use the cache_import_runtime

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this - FYI using the runtime struct was causing errors on Windows, not sure how the CI would end up, but hopefully what is there now is reasonable!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems it errors out even if we don't actually cache (in the struct)! See this commit also, @mhvk was confused about it too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the many pushes, reverted this to fix the CI until we figure this out.


PyBoundArrayMethodObject *argsort_meth = PyArrayMethod_FromSpec_int(slot->spec, 0);
if (argsort_meth == NULL) {
PyErr_Format(PyExc_TypeError, "Failed to create argsort method for %R", dtype);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't matter, but FYI: this probably eats an in-flight exception. But chaining is a bit tedious to repeat multiple times IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure I agree here, why not just let the error pass through? I.e., return -1? This will be seen only by developers for whom it may be nicer to have the original error. Anyway, would change it here and above.

Py_INCREF(sort_meth->method);
Py_DECREF(sort_meth);
}
else if (ufunc == argsort) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should decref ufunc here and above also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

.. c:member:: const char *ufunc_name
The name of the ufunc to add the loop to.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to not do it here, but I am curious about your reasoning :). I suppose it may be weird enough to support for non-numpy and honestly, writing this helper isn't hard (although this type of convenience seems to make more of a dent than I tend to expect).

Anyway, one thing we don't yet support, but happy to also defer is adding ufuncs that are not available in the main namespace.

}
}
if (argsort == NULL) {
argsort = npy_import("numpy", "argsort");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, but I can push it: Very sure the function includes the NULL check (no need to indent).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw later that you flipped back and forth here. I fixed it, the problem was that this is a cpp file and the header was missing the extern C.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense, thank you!

@seberg seberg added the 63 - C API Changes or additions to the C API. Mailing list should usually be notified. label Oct 12, 2025
@seberg
Copy link
Member

seberg commented Oct 12, 2025

I don't see this as controversial API since it adds a pretty straightforward convenience mostly. So planning to merge soon (if anyone has concers even after ee can follow up, of course!).

We do need to add a release note (or two, one for sorting), though.

MaanasArora added a commit to MaanasArora/numpy that referenced this pull request Oct 12, 2025
@MaanasArora
Copy link
Contributor Author

MaanasArora commented Oct 12, 2025

Thanks for reviewing! Yes, I suppose the main point of extension might be enabling the use of other namespaces.

I added two relatively short release notes, not sure if one should add a link to the API docs (probably not)?

@mhvk
Copy link
Contributor

mhvk commented Oct 12, 2025

@seberg - nice fix, this had me quite stymied, but I never even considered that there would be a difference with being in a c++ file...

Agreed that we should move forward! It may be good to have a new issue with leftovers. I think the main one is to broaden it to all ufuncs, i.e., allow the name to contain dots and perhaps a colon for the module (trying to understand the logic for it: is it to avoid someone having a package that is called like a numpy ufunc?). One could envision a call out to a little python helper either if the name contains : or ., or if the lookup in numpy fails.

@seberg
Copy link
Member

seberg commented Oct 12, 2025

Colon is just the precise way to represent a full Python object reference: module.submodule:obj.attribute, see for example also entry-point definitions.

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing, but I'll just push the fix up, so we can get this merged!


PyBoundArrayMethodObject *argsort_meth = PyArrayMethod_FromSpec_int(slot->spec, 0);
if (argsort_meth == NULL) {
PyErr_Format(PyExc_TypeError, "Failed to create argsort method for %R", dtype);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not sure I agree here, why not just let the error pass through? I.e., return -1? This will be seen only by developers for whom it may be nicer to have the original error. Anyway, would change it here and above.

MaanasArora and others added 4 commits October 12, 2025 10:59
…egistration function

Includes the following commits:

DOC: Document `PyUFunc_AddLoopsFromSpecs` and new sorting / argsorting arraymethods

ENH: Update field names in PyUFunc_LoopSlot struct

ENH: Update C-API hash (15, NumPy 2.4)

ENH: Add 'PyUFunc_LoopSlots' to ufunc_funcs_api with MinVersion "2.4"

ENH: Update hash for version 21 (NumPy 2.4.0)

ENH: Move PyUFunc_LoopSlot struct definition and update version hash for NumPy 2.4.0

REF: Move PyUFunc_LoopSlot struct definition to dtype_api.h

DOC: Clarify ArrayMethod API registration process in array.rst

DOC: Update sorting parameters struct in array.rst to use correct member (flags)

ENH: Clarify version 21 description in cversions

REF: Remove unnecessary error message for non-ufunc attributes in PyUFunc_AddLoopsFromSpecs

REF: Refactor ufunc initialization and consolidate sort loops in sfloat_init_ufuncs

REF: Simplify ufunc import by replacing PyObject_GetAttrString with npy_import

REF: Rename ufunc_name to name in PyUFunc_LoopSlot

REF: Initialize res variable to -1 in sfloat_init_ufuncs for clarity

STYLE: Refactor error message formatting in PyUFunc_AddLoopsFromSpecs for clarity

REF: Simplify redundant Py_DECREF call in PyUFunc_AddLoopsFromSpecs

DOC: Improve documentation for PyUFunc_AddLoopsFromSpecs and sorting

ENH: Optimize ufunc handling in PyUFunc_AddLoopsFromSpecs by using cached imports for sort and argsort

DOC: Clarify pointer type

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

REF: Optimize array definition of loop slots for sfloat

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

REF: Improve error message for missing ufunc by removing module name

BUG: Fix incorrect use of `npy_cache_import_runtime`

REF: Reuse dtypes variable in sfloat ufunc init

REF: Cache sort and argsort functions in npy_runtime_imports_struct

BUG: Add missing header file for and add error handling to `numpy_cache_import_runtime`

BUG: Improve error handling for numpy runtime imports in PyUFunc_AddLoopsFromSpecs

BUG: Replace npy_cache_import_runtime with direct npy_import calls for sort and argsort

DOC: Fix code annotations and references in sorting and argsorting docs

BUG: Optimize numpy import calls for sort and argsort in PyUFunc_AddLoopsFromSpecs

BUG: Remove unused references to _sort and _argsort in npy_runtime_imports_struct

BUG: Replace npy_import calls with npy_cache_import_runtime for sort and argsort in PyUFunc_AddLoopsFromSpecs

DOC: Add versionadded directive for PyUFunc_AddLoopsFromSpecs in array.rst

BUG: Add missing Py_DECREF for ufunc in PyUFunc_AddLoopsFromSpecs

BUG: Replace npy_cache_import calls with npy_import for sort and argsort, simplify decrefs
@mhvk mhvk force-pushed the enh/user-sort-registrations branch from 1374802 to ba50a25 Compare October 12, 2025 15:02
@mhvk
Copy link
Contributor

mhvk commented Oct 12, 2025

p.s. I took the opportunity to squash commits together, trying to leave the small contributes of @seberg and me in separate commits, so credit goes to where it belongs! Will merge once the tests have passed.

@mhvk mhvk merged commit c9b2919 into numpy:main Oct 12, 2025
77 checks passed
@mhvk
Copy link
Contributor

mhvk commented Oct 12, 2025

p.p.s. Generalization to non-numpy names forthcoming...

@MaanasArora
Copy link
Contributor Author

Thanks @mhvk and @seberg! Looking forward to the generalization and I'll start the follow-up for StringDType (and think more about the compare function idea).

MaanasArora added a commit to MaanasArora/numpy that referenced this pull request Oct 13, 2025
IndifferentArea pushed a commit to IndifferentArea/numpy that referenced this pull request Dec 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

01 - Enhancement 63 - C API Changes or additions to the C API. Mailing list should usually be notified.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants