Skip to content

Conversation

@charris
Copy link
Member

@charris charris commented Aug 30, 2025

This PR adds new sorting APIs that can be used for extending our current
sort types. The new C-APIs do not have the which argument of the
legacy C-APIs, instead they have flags request flag API. In this
transition, heapsort can no longer be called directly, introsort
(quicksort) is called instead. The sort/argsort methods get new keyword
arguments used to set the flags when kind is not given, so flags is
invisible at the Python level. The new keywords are:

  • stable -- whether the sort should stable, default is False,
  • descending -- whether to sort in descending order, default is False,
  • nanfirst -- whether NaNs sort to the last or first, default is False.

The main difference here is that the keywords select by functionality,
whereas kind selects by algorithm. Note that descending and nanfirst
are not yet implemented, so an error is raised if either is specified.

The added C-API functions are:

  • (API)PyArray_SortEx,
  • (API)PyArray_ArgSortEx,

Those are the two functions that need changes depending on how we want
to implement adding new sort algorithms. They should be pretty easy to
adapt to whatever we choose to do.

The legacy C-API functions, PyArray_Sort and PyArray_ArgSort, have been
modified to call through the new C-API functions. In order to do so
the kind function has been encoded in the flags like so:

  • "quicksort" - default (0)
  • "heapsort" - default (0)
  • "mergesort" - stable (1)

I note that the partitioning/selection functions have not been modified,
we may want to do that a some point.

@charris
Copy link
Member Author

charris commented Aug 30, 2025

Whoa, that is one of the ugliest diffs I've ever seen. The actual code just adds new functions and deletes the content of old ones. I suggest looking at the files direct.

@charris
Copy link
Member Author

charris commented Aug 30, 2025

Note that I use some static arrays inside the functions which are only written to when they are initialized. The initialization may not be thread safe, although they should be in C++11. They can be purely local without much cost. I used tables so that minimal changes would be needed if we add function slots one way or another. They contain variables that are not available until link time, but then, so do the standard dtypes. I'm not clear if function names can be considered constant at compile time, If so, I could simply pull them out of the functions and initialize them in the usual way.

I think it can be done, but I not sure.

@charris
Copy link
Member Author

charris commented Aug 30, 2025

OK, const myfunc foo[] = {func1, func2}; works.

@charris charris force-pushed the new-extended-sorts branch from 79ad3b1 to 2b45310 Compare August 30, 2025 13:24
@charris
Copy link
Member Author

charris commented Aug 30, 2025

Cleaned things up a bit. @seberg You wanted to remove the legacy PyArray_Sort and PyArray_ArgSort from the public API. This is a chance to avoid exposing the new functions in API. I suppose we can always add them later if someone wants them.

@mhvk
Copy link
Contributor

mhvk commented Aug 30, 2025

I think I'd still suggest that the new C-API does not have sort kind at all, and that the old functions just translate. Alternatively, for now, make the new functions private and have the public SortEx not have the kind argument, i.e., both old and new call into the private ones.

@charris
Copy link
Member Author

charris commented Aug 30, 2025

I think I'd still suggest that the new C-API does not have sort kind at all

Internally, I've been using the fourth flag bit to mark using "kind" as a table index. We could translate the kind values into flag bits:

kind == 0 -> 0 (default)
kind == 1 -> 8 (new bit, heapsort)
kind == 2 -> 1 (stable)

or maybe

kind == 0 -> 8 (default)
kind == 1 -> 9 (new bit, heapsort)
kind == 2 -> 10 (stable)

i.e., if the fourth bit is set, use the lower bits to hold the flag value. I chose the fourth bit to reserve eight values of the keywords, but we could pretty reserve more space. I expect we will have a minimum of 32 bits to work with, flags is an int type. So we can get rid of kind in the C-API. Getting rid of it in the public method API will take deprecation. For that matter, we might could use a different bit to mark argsorts, but maybe that is going too far.

Maybe we should just make the flags variable uint32.

* Sort an array in-place with extended parameters
*/
NPY_NO_EXPORT int
PyArray_SortEx(PyArrayObject *op, int axis, NPY_SORTKIND which, int flags)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now looking a bit more closely, I see how you use which to infer flags. My sense would still be not to have which in the new C API call sequence at all, but, say, store it in the top 4 bits of uint32 flags, so one would have

which = ((int32)flags >> 28);
if (which == NPY_USEFLAGS) {
    switch (flags & 0x0fffffff) {
        ...
    /* no case 8 */
} else {
    sort = sort_table[which];
}

And then call appropriately elsewhere.

In the header file, we should then reserve the appropriate bits.

Alternatively, it is perhaps easier to just rename the functions you have here Py_Array_Sort_int(...) and then define

NPY_NO_EXPORT int
PyArray_SortEx(op, axis, flags) {
    return PyArray_SortEX_int(op, axis, NPY_USEFLAGS, flags);
}

Yet another alternative would be to remove the which part of the logic here altogether, and move it to the old functions. But that would seem to lead to quite a lot of code duplication.

@charris
Copy link
Member Author

charris commented Aug 31, 2025

The way I see things at the moment.

Decided:

  • Methods sort and argsort keep their current APIs for back compatibility (done)
  • Remove the which argument from the new C-API functions, use only the flags. (done)
  • Make the flags argument explicitly uint32. (todo)

To be Decided:

  • Remap kind losing heapsort ,
  • Make public the new C-API,
  • Deprecate kind ,

I would also argue for reserving the lower 8 bits of of the flags for sort selection, and the remaining bits for hints.
That is just to avoid future complications in implementations. That can be handled through documentation.

@mhvk
Copy link
Contributor

mhvk commented Aug 31, 2025

Agreed on the "decided". For those to be decided, I'd say "remap, make public, and deprecate" - though the deprecation could be done in a follow-up PR.

@charris charris force-pushed the new-extended-sorts branch from 2b45310 to e953c3e Compare August 31, 2025 18:58
@charris
Copy link
Member Author

charris commented Aug 31, 2025

I went ahead and made the modifications. It isn't possible to set the enum type until C23, so to be safe I just left it an enum, which will change size if needed, but the size will be platform specific. I think that is safe.

@charris
Copy link
Member Author

charris commented Aug 31, 2025

We can postpone adding documentation and keywords at the Python Level until we actually add the new functionality.

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial comments, probably too early, since really they are mostly about making sure that the new C API gets fully exercised. (Maybe one way is to start with just the stable flag in this commit, and introduce descending and nanfirst in following ones, together with their support. But probably all premature...)

*/
typedef enum {
_NPY_SORT_UNDEFINED=-1,
NPY_SORT_USEFLAGS=-1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe want to mention in the comment that this enum is intended to be deprecated eventually?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we leave a comment here, it should be searchable. Hmm, maybe

\\ TODO: remove after sort kind deprecation expires.

// remap which to flags, note that NPY_HEAPSORT is effectively
// remapped to NPY_QUICKSORT.
switch (which) {
case NPY_STABLESORT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also have NPY_MERGESORT here, right? For the C API, we cannot be sure that it won't be used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mergesort and stablesort are the same thing in the enum. And mergesort isn't actually mergesort, it is either timsort or radix sort depending on the type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, should have seen that. Ignore this one then!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, maybe add to the comment you have already?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used NPY_MERGESORT instead, which might make more sense.

// remap which to flags, note that NPY_HEAPSORT is effectively
// remapped to NPY_QUICKSORT.
switch (which) {
case NPY_STABLESORT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add NPY_MERGESORT also here.

}
// remap sortkind to flags, NPY_HEAPSORT is effectively mapped to NPY_QUICKSORT.
switch (sortkind) {
case NPY_STABLESORT:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think here NPY_MERGESORT indeed cannot happen. Maybe add a comment just in case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just used NPY_MERGESORT instead, which might make it clearer.

if (order == Py_None) {
order = NULL;

if (sortkind == NPY_SORT_USEFLAGS) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth splitting out this sortkind checking into a separate function, since it is duplicated between sort and argsort. Something like

int get_sort_flags(NPY_SORTKIND sortkind, int stable, int descending, int nanfirst)

where the return argument is flags (or -1 if an error occurred. (Maybe better with flags passed in?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of such duplications, I thought to leave it for a later cleanup if desired. See also the legacy C-API functions. And wait until after the new slot is added, which might take over that whole bit and delete this code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough!

@mhvk
Copy link
Contributor

mhvk commented Aug 31, 2025

We can postpone adding documentation and keywords at the Python Level until we actually add the new functionality.

indeed, though whatever we expose at the C level should be correct (even if it means erroring on NPY_SORT_DESCENDING)

@charris charris force-pushed the new-extended-sorts branch from e953c3e to 80babce Compare August 31, 2025 23:45
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good now, spotted just one missing break. Mostly now seems a question of how to do follow-up: as is, we introduce C API and (undocumented) python API for options that error. Do we want that? E.g., @seberg suggested to not have nanfirst yet.

On the other hand, things are now in place to add that!

default:
case NPY_HEAPSORT:
case NPY_QUICKSORT:
flags = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a break after, no?

I'd also swap the order, so default is part of the end (also above, in item_selection.c, and below), but no big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used all the different possibilities, so the default is not longer needed. The range is already checked by the sortkind converter.

@charris charris force-pushed the new-extended-sorts branch 2 times, most recently from e2140e6 to 9200fe0 Compare September 1, 2025 02:21
@charris
Copy link
Member Author

charris commented Sep 1, 2025

All the keywords are there if you want to test them:

In [1]: np.ones(10).sort(descending=1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 1
----> 1 np.ones(10).sort(descending=1)

TypeError: no current sort function meets the requirements

They just won't be documented. I suppose we could print out the flag values to make it clearer.

@seberg
Copy link
Member

seberg commented Sep 1, 2025

Do we want that? E.g., @seberg suggested to not have nanfirst yet.

Yes, there is no reason in Python to add a kwarg early, it is just confusing to have it. On C the story may be a bit different (although, not much because NPY_TARGET_VERSION allows us to decide whether or not to expose a new flag! We could skip adding the flags even there -- although I don't care about that!).


I am curious now if we can't avoid the new function and just change the definition of which in PyArray_Sort()? (Eventually deprecated if e.g. "heapsort" is passed. Would leave some bits unused in the new sort flags, but that seems OK.)
Not sure how one would best write it, though. Maybe a union of both enums (and make sure the enums don't collide in flags)? Or just document it as "legacy flag maps to stable/unstable".

@charris
Copy link
Member Author

charris commented Sep 1, 2025

I am curious now if we can't avoid the new function

Yes, we could. But you didn't want to extend SORTKIND :) And to be fair, that would have been a kludge.

I rather like the new functions and the flags interface, I think things worked out well. The only real difference is heapsort not being public, and I've thought it unneeded since introsort went in. From the beginning it was the slowest sort, taking about 2x longer than quicksort, and quicksort has gotten faster.

The flag interface is nicely extensible in the future, we have been abusing SORTKIND for many years as new sorting functions have come along: introsort, radix sort, timsort, SIMD quicksort, etc. We don't want to make users choose among those options, let the experts do it.

All we have do to get rid of nanfirst is comment out a line

//           "$nanfirst", &PyArray_OptionalBoolConverter, &nanfirst,

The parser then raises an error:

In [1]: np.ones(10).sort(nanfirst=1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[1], line 1
----> 1 np.ones(10).sort(nanfirst=1)

TypeError: sort() got an unexpected keyword argument 'nanfirst'

@charris
Copy link
Member Author

charris commented Sep 1, 2025

Note that all user dtypes need is a compare function, they just put NULLs in the current sorting slots. We only need the current sorting slots for type specific builtin sorts. To add descending sorts, the quick and dirty solution is to add generic sorting routines and add them to the new tables which are introduced here along with the new interfaces. We only need an extended dtype if we want them to be builtin. We can start with the generic sorts and use them for timing comparison, which will tell us if it is worth going to type specific sorts.

I need to see how runtime dispatching works, I haven't figured that out yet.

@seberg
Copy link
Member

seberg commented Sep 1, 2025

the quick and dirty solution is to add generic sorting routines and add them to the new tables which are introduced here along with the new interfaces. We only need an extended dtype if we want them to be builtin.

Yeah, but I happen to care a lot about this, that if we add descending we do it in a way that can be exposed to user DTypes, because we can't focus on the API decisions.
(But this is the other PR and I think most decisions are actually already getting close -- but API is hard and it's necessary to have multiple eyes on it. The biggest API complexity is maybe not missing the chance of fixing more of the API at the same time.)

I don't see that we can expect generic sorts to have decent performance for descending, if that was the case, we wouldn't use Raghuveers or highway sorts (which we should plug in for descending as well if possible).

But you didn't want to extend SORTKIND

Maybe, I may have changed my opinion. But there were two different things here: One is keeping the actual algorithm choice (we are doing the opposite now, always removing that choice) and the other is absorbing the new sort option flags into a single larger enum.

Now that we decided to narrow the algorithm choice to stable/unstable. I think avoiding PyArray_SortEx() is probably nice (I am not 100% yet, because I am not sure if that ends up with some monster of an enum). Just having PyArray_SortEx is not that pretty and a bit confusing as well (which one should I use?).

I need to see how runtime dispatching works, I haven't figured that out yet.

Isn't that what the other PR tried to solve -- and while it should morph with recent discussion, mostly did solve?

@charris
Copy link
Member Author

charris commented Sep 1, 2025

that ends up with some monster of an enum)

The current values are orthogonal, so three enum entries gives 8 choices. I don't expect the enum to grow much, what will grow are the switch statements.

EDIT: I expect the other PR, if implemented with either a table or function, to get passed a flag, which is just a number. If implemented with a table the number is an index into the table. In short, I expect the switch statements to go away, essentially moved down to the new interface.

@charris
Copy link
Member Author

charris commented Sep 1, 2025

For instance, descending && stable becomes an option. What would grow exponentially is SORTKIND. I think the flag option is preferable and doesn't impact the other PR, except it becomes simpler IMHO.

This PR adds new sorting APIs that can be used for extending our current
sort types. The new C-APIs do not have the *which* argument of the
legacy C-APIs, instead they have *flags* request flag API. In this
transition, heapsort can no longer be called directly, introsort
(quicksort) is called instead. The sort/argsort methods get new keyword
arguments used to set the flags when *kind* is not given, so *flags* is
invisible at the Python level. The new keywords are:

- stable -- whether the sort should stable, default is False,
- descending -- whether to sort in descending order, default is False,
- nanfirst -- whether NaNs sort to the last or first, default is False.

The main difference here is that the keywords select by functionality,
whereas *kind* selects by algorithm. Note that descending and nanfirst
are not yet implemented, so an error is raised if either is specified.

The added C-API functions are:

- (API)PyArray_SortEx,
- (API)PyArray_ArgSortEx,

Those are the two functions that need changes depending on how we want
to implement adding new sort algorithms.  They should be pretty easy to
adapt to whatever we choose to do.

The legacy C-API functions, PyArray_Sort and PyArray_ArgSort, have been
modified to call through the new C-API functions. In order to do so
the *kind* function has been encoded in the *flags* like so:

- "quicksort" - default (0)
- "heapsort"  - default (0)
- "mergesort" - stable  (1)

I note that the partitioning/selection functions have not been modified,
we may want to do that a some point.
Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last set of (very) nitpicky comments -- I think it is over to @seberg with this!

* Updated with new names denoting requirements rather than the algorithm. All
* the previous values are reused in a way that should be downstream
* compatible, but the actual algorithms used may be different than before. The
* new approach should be more flexible and easier to update. The idea is that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No big deal but would delete the sentence "The idea is ..." - the implementation should not matter here (I may well end up thinking that this is the right approach, and now see how indexing is sort-of baked in to the present API, but not having seeing any implementation, I still feel the logical solution would be a single slot for a function with a flags argument...).

will use the second field and so on. To alter the sort order of a
structured array, create a new data-type with a different order of names
and construct a view of the array with that new data-type.
Equivalent to :meth:`ndarray.sort<numpy.ndarray.sort>` (*self*, *axis*,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was already not quite true any more, and will soon be even less so; I'd tend to start with what the function actually does (so, "Return an array ... " and then at the very end say something like

This function is the implementation for :meth:`ndarray.sort<numpy.ndarray.sort>`,
though with slightly different meaning of ``kind`` -- see ``NPY_SORTKIND`` below.

(and similar for argsort)

Sorting ``kind='heapsort'`` now maps to ``kind='quicksort'``
------------------------------------------------------------
It is unlikely that this change will be noticed, but if you do see a change in
execution time or unstable argsort order, this is likely the cause.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add "Please let us know if this causes a performance regression."?

PyArray_Descr *newd, *saved=NULL;
int stable = -1;
int descending = -1;
int nanfirst = -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't need the optional bool converter, since the default is well defined in both cases. The optional one is just for when None/not passed has a special meaning.
(Which is the case for stable, because it needs to distinguish not passed for the error.)

I would also be happy to just delete the NaN thing here, but either way. I have been drifting away from wanting it much. And I think if we have it then nan_placement="first" might be clearer than a boolean thing in Python.

(Anyway, it doesn't really matter of course as it is all commented out anyway.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can go further, with heapsort gone, the kind argument has the same functionality as the stable keyword, we can deprecate it! Internally, we use it to set the stable default when it is given. Then all the keywords can be treated the same way. No reason not to allow both arguments any more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I would say to hold off with deprecating it, unfortunately.
I think stable= was added in NumPy 2.0, and it would be a pain if projects like sklearn can't transition to stable= for all NumPy versions they support (i.e. avoid the need for a numpy version specific code).

(Unless stable= was actually added earlier.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we allow both 'kindand keywords,kind` would precedence in the case (kind="stable", stable=False), the only difference with current behavior would be no error message, and we could still do that but it would be different message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy to allow mixing both, at least if we raise an error when they mismatch. But I am not following what you are suggesting.
Users can't pass stable=True yet, on all versions, so I don't want to ask them to replace kind=... yet (even if we allow kind="stable" it forces double-change: kind="mergesort" -> kind="stable" -> stable=True.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind="stable" is an alias for kind="mergesort", no double-change needed, except maybe in documentation. Something like: "Use stable=True in place of kind='mergesort' or kind='stable' ".

We can leave that for a discussion in a different PR, or perhaps an issue. I think the only thing to be decided is the removal of NPY_NANFIRST from the enum. If we are unlikely to use it, it is easier to add it when needed than to remove it once it is in. But we should leave space for more requirements if we add new 'hint' keywords in the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather a follow on PR if we want to enable both kind and keywords. I think the important thing for this PR is to have the C interface defined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. FWIW, the nan parameter matters more for the DType API layer, where new hints can be added nicely (if the user expects it), but requirements cannot.

We could add the bit-mask that Marten suggested. I am starting to think: let's forget about it. I think in practice we likely only need to define that a descending sort also should return the NaNs last.
And if we have to do slightly uglier things in the case that we need more I am OK with that.

For hints, we could prepare for it, but we can also do something like:

#define NPY_SORT_LOWMEM (PyArray_RuntimeVersion > 2_4 ? FLAG : 0)

a bit ugly, but the user won't notice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think stable= was added in NumPy 2.0

Yep, #25437.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For hints, we could add something like "NPY_SORT_FIRST_HINT = 16," that would put the requirements in the first four bits of NPY_SORTKIND, with the first bit unused, e.g,, three possible requirements. You get an index by

index  = (sortkind & 0xF) >> 1. 

Since it often leads to future troubles to start with too few bits, "NPY_SORT_FIRST_HINT = 1 << 8," might be a better choice. Then the first eight bits would be reserved.

@charris
Copy link
Member Author

charris commented Sep 3, 2025

is make sure that the C-side raises an error when the new flags are used

You get a TypeError: "no current sort function meets the requirements"

@charris
Copy link
Member Author

charris commented Sep 3, 2025

What do you want done before this goes in?

@mhvk
Copy link
Contributor

mhvk commented Sep 3, 2025

I'm OK with it as is, but think given the discussion it might be best not to advertise NANFIRST in any way -- easiest would be just to remove it for now. I do think it is fine to leave descending in, even if it is not yet supported, since the array API requires that eventually we implement it. But that the way that one is added also already sets an obvious template for adding further options later.

But again, I don't feel strongly, so maybe @seberg can pipe in too.

p.s. Overall, a bit of a tortuous path, but I think a very nice result!

@charris
Copy link
Member Author

charris commented Sep 3, 2025

only need to define that a descending sort also should return the NaNs last.

If there is no nanfirst keyword, we should say that all alternatives sort NaNs to the end.
If there is a nanfirst keyword, we should say that all alternatives sort NaNs to the end unless nanfirst=1.

Not sure where that should go, maybe in the extending dtypes documentation.

@charris
Copy link
Member Author

charris commented Sep 3, 2025

I have removed all references to nanfirst.
I left the descending converter alone, as it seems a converter is wanted and the current converter looks exactly right

PyArray_BoolConverter(PyObject *object, npy_bool *val)
{
    int bool_val = PyObject_IsTrue(object);
    if (bool_val == -1) {
        return NPY_FAIL;
    }
    *val = (npy_bool)bool_val;
    return NPY_SUCCEED;
}

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks all good to me now!

Maybe left is a comment by @Serberg (#29642 (comment)):

This doesn't need the optional bool converter, since the default is well defined in both cases. The optional one is just for when None/not passed has a special meaning.

(I think OptionalBoolConverter is for the case that one can pass in None as well.)

@charris
Copy link
Member Author

charris commented Sep 4, 2025

I think OptionalBoolConverter is for the case that one can pass in None as well.)

Why do you need None in parsing a keyword only argument $stable? The parser isn't called if the argument isn't present. If we use builtin formats and parse it as 'i':

    case 'i': {/* signed int */
        int *p = va_arg(*p_va, int *);
        long ival;
        if (float_argument_error(arg))
            RETURN_ERR_OCCURRED;
        ival = PyLong_AsLong(arg);
        if (ival == -1 && PyErr_Occurred())
            RETURN_ERR_OCCURRED;
        else if (ival > INT_MAX) {
            PyErr_SetString(PyExc_OverflowError,
                            "signed integer is greater than maximum");
            RETURN_ERR_OCCURRED;
        }
        else if (ival < INT_MIN) {
            PyErr_SetString(PyExc_OverflowError,
                            "signed integer is less than minimum");
            RETURN_ERR_OCCURRED;
        }
        else
            *p = ival;
        break;
    }

which is not simpler than calling the __bool__ dunder, which doesn't need to do all those other checks. We could make a new converter that checked for an integer type first, (PyLong_Check), if you want to avoid accepting all types that define __bool__. Can't do better than that, Python doesn't have a real bool type.

@mhvk
Copy link
Contributor

mhvk commented Sep 4, 2025

I think just use PyArray_BoolConverter

@charris
Copy link
Member Author

charris commented Sep 4, 2025

I think just use PyArray_BoolConverter

Ah, it's already written, is it. Which raises the opposite question, why isn't it used for stable? Lets try that.

@charris
Copy link
Member Author

charris commented Sep 4, 2025

Well, that answers that question, the original was in fact correct, PyArray_OptionalBoolConverter does need to be used for both 'stable' and 'descending'.

@mhvk it was suggested by @seberg that I not use PyArray_OptionalBoolConverter for the new options,
by which I assume he meant PyArray_BoolConverter instead, which I had forgotten existed.

@seberg
Copy link
Member

seberg commented Sep 4, 2025

the original was in fact correct, PyArray_OptionalBoolConverter does need to be used for both 'stable' and 'descending'.

I am not following. You don't really need it for stable but it is has some sense because of the -1 default when it's not passed.
What it gives you is that kind="stable", stable=None is allowed, i.e. stable=None behaves identical to it not being passed.

But for descending= you don't need it, it's signature is not stable=None/np._NoValue, but descending=False. It has a well defined default and descending=None has the same meaning as desecnding=False and that will always be true.
There will never be later code checking for descending == -1 in the C-code here.

@charris
Copy link
Member Author

charris commented Sep 4, 2025

But for descending= you don't need it

I tested that and it didn't work. It compiles, and runs without errors, but not correctly. Which is to say, sortkind is not correctly constructed. The function signature is defined in the parser call, and both "|stable" and "|descending" are keyword only, so it makes sense that they need the same function. Note that using the alternate function to parse stable gives hundreds of test errors.

I think the vectorcall logic goes this way. All the possible variables must be processed. If it is determined that there is no corresponding value in the args list, the question is "what to do"? This is determined on our end, and I think we have decided to return None in that case, so the Optional parsing function is needed to ignore that case and do nothing.
The straight boolean parser will only work correctly for strictly positional arguments, where the wanted value is always in the args list or an error is raised if the args list is too short.

Note that the two functions also have different return types, the straight converter returns to npy_bool (unsigned char), while the optional converter returns to int. That could cause problems on big endian architectures if the return variables don't have the correct type.

@ngoldbaum I think you implemented the fastcall processing, can you confirm my understanding?

@seberg
Copy link
Member

seberg commented Sep 4, 2025

That could cause problems on big endian architectures if the return variables don't have the correct type.

I implemented fastcall, but it has nothing to do with things here, it behaves exactly the same as the builtin parser.
Maybe you forgot to change the variable to npy_bool descending = NPY_FALSE;?

This is determined on our end, and I think we have decided to return None in that case

When the argument is optional, you always have to set the default. I would still avoid OptionalBoolConverter because it converts None to False rather than "keep default unchanged".

@charris
Copy link
Member Author

charris commented Sep 4, 2025

Summary: As long as we allow kind or keywords, but not both, we need to use the optional parser so that we can tell when a keyword is used.

The problem with not using the non-optional converter is that we assume that it doesn't affect the current value of the variable (which is what online documentation suggests), which is incorrect. It sets the variable to either 0 ot 1. Setting the variable -1 before the call serves no purpose. So the non-optional version can only be used when 0 is the default value. The code can be fixed to work with the non-optional parser, but will not work as long as we want to enforce use kind, or use keywords, but not both.

I don't think we need to keep the one or the other choice now that heapsort is gone, but as long as we do, we need to use the optional parser. If we allow both kind and keywords, then we can do;

int stable = -1;  -> npy_bool stable = 0; 

and give the keyword priority over kind in the sortkind construction.

@seberg
Copy link
Member

seberg commented Sep 4, 2025

Summary: As long as we allow kind or keywords, but not both, we need to use the optional parser so that we can tell when a keyword is used.

I suppose I think there is no reason to reject sort(kind="mergesort", descending=False). But fine, if you want kind="mergesort", descending=None to work but kind="mergesort", descending=False to fail, then yes we need the Optional version.
(If I went back in time, I might actually start arguing the same direction for stable= also -- to not use the Optional one.)

EDIt: anyway, I don't care much, just do the version you prefer.

@charris
Copy link
Member Author

charris commented Sep 4, 2025

Summary: I'll stick with the current implementation.

Using a split is an easy rule to remember, as opposed to having different behaviors for different keywords. I'll go for that.

@charris charris merged commit cf5632a into numpy:main Sep 4, 2025
76 checks passed
@charris
Copy link
Member Author

charris commented Sep 4, 2025

Thanks all. And especial thanks to @mhvk for his close reading and suggestions.

@mhvk
Copy link
Contributor

mhvk commented Sep 4, 2025

Looking forward to the next steps!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants