Skip to content

Conversation

@MaanasArora
Copy link
Contributor

Supersedes #28516 after @charris defined the new C-API for PyArray_Sort in #29642.

I have kept this light so far, with a sorting table (struct) for the DType API, using the new signature with the context. The new design is untested so far because without sort_compare, using it directly in say StringDType would be difficult, but happy to move this to parity if this looks right so far considering Chuck's comments. Thank you!

@ngoldbaum
Copy link
Member

I think this needs to be documented in the C API docs for the dtype API:

https://numpy.org/devdocs/reference/c-api/array.html#slot-ids-and-api-function-typedefs

@MaanasArora
Copy link
Contributor Author

Yes thanks, missed that. This is done! Will add a release note also.

@ngoldbaum ngoldbaum added the 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes label Sep 12, 2025
@ngoldbaum
Copy link
Member

Awesome, thanks! Can you also update your PR over at numpy-user-dtypes to make use of the updated API (if any updates are needed)? numpy/numpy-user-dtypes#109

The CI over there is slow at the moment so hopefully that's not annoying. We need to add some caching for the quaddtype build.

@MaanasArora
Copy link
Contributor Author

Sure! Currently that PR relies on the sort_compare feature as well, so the changes are incompatible, but we can write the sort functions themselves. So I'm looking into specialized sorting algorithms for these dtypes which will allow for more comprehensive documentation as well (instead of just sort_compare)!

@MaanasArora
Copy link
Contributor Author

Actually, do you think we should get this PR to parity first instead? It doesn't seem those dtypes have too many considerations for overall sorting itself, and I think a lot of examples including StringDType rely on the comparison feature. I think a comparison system for using generic sorts is about non-negotiable at this point, so I might as well add it now.

@ngoldbaum
Copy link
Member

Sure, whichever you think would work best. Just wanted to make sure it's on your radar.

@charris
Copy link
Member

charris commented Sep 12, 2025

Note that sortkind has the property that sortkind & 0x7 >> 1 yields an index ranging from 0 to 3 that can be used to pick the sorting functions if they are in an array. That trick is not currently used, but things were deliberately designed so that it would work.

@mhvk
Copy link
Contributor

mhvk commented Sep 13, 2025

A question that I also raised in #29642 - is the design of having a struct table with pointers to different sort functions for different properties set in stone? To me, it feels illogical if we are having flags to determine sort function behaviour -- it is less extensible than just passing in flags to a general function, and does not allow later adding "request" flags later (like, optimize for low memory usage or for performance).

It seems to me more sensible to have just two functions (sort, argsort), both of which takes flags (perhaps via the context struct?), and have some way to expose which flags they can or cannot deal with (either as a mask exposed in a struct, or via a return code).

Implementation-wise, it would also seem easy enough for a developer to implement the table if they want to go that route, while the other way around is just annoying (I can easily imagine that ascending/descending is easily done in one function).

But it may well be that I'm missing the bigger picture here!

@charris
Copy link
Member

charris commented Sep 13, 2025

ascending/descending is easily done in one function).

Only for the generic sorts. The type specific sorts gain speed by having < built in. The idea is to avoid choice/function calls in the innermost loops. That said, it is easy to build functions with alternative definitions of <, but that would give you two functions, not one. Even for the generic loops, that might be the way to go, instead of looking for -1 returned by the compare functions, look for +1. Note the absence of 0.

Once this is settled, I plan to generate the needed descending sort functions.

EDIT: It is also easy to add entries to a private tables without causing incompatibilities. I don't see the added complexities of hints as having much use at the moment, they can be added later if needed. The only real use I can see would be for out-of-memory sorts, and that would probably qualify as a requirement rather than a hint.

@MaanasArora
Copy link
Contributor Author

MaanasArora commented Sep 13, 2025

In #28516 we actually had just two functions that mirrored ufuncs, get_(arg)sort_function that returned a pointer to the appropriate function given the parameters, but Chuck had the good point that (as I understand) we effectively have that many functions anyway if we want to be explicit about efficiency. I can see a user dtype implementing deeply nested code or such in a single function, which may not be that good for performance, so now actually I think it makes sense to be explicit that we're looking for multiple functions in practice, unless I'm missing something!

@mhvk
Copy link
Contributor

mhvk commented Sep 13, 2025

@charris - yes, my example was perhaps poorly chosen. My question is really more general: isn't it better to define the API as just having only one sort and one argsort function that take flags? Much easier to extend in the future. (Though perhaps I read too many LWN articles where people bemoan the fact that old kernel function do not have flags, making the creation of a new one necessary...)

The only downside I see is that this does not automatically provide a way to tell that a given dtype can sort descending, etc., hence the idea of a mask or specific return code. Though it probably is fine to generically just error if a sort function doesn't support it (same as function=NULL in the present proposal) and fall back to a standard sort using the comparison.

@mhvk
Copy link
Contributor

mhvk commented Sep 13, 2025

@MaanasArora - sorry, missed your answer, which appeared while I wrote mine.

I can see the wish for performance but it seems to me calling a function with flags that internally just has a switch right up front (with each option a separate code block, say in-lined sort functions) would not impact performance much. Though I also wonder if any of this is going to matter: sorting generally is not a fast process.

Anyway, I don't want to make too much of this...

@MaanasArora
Copy link
Contributor Author

MaanasArora commented Sep 13, 2025

@mhvk no problem! Yes, it would probably not impact performance terribly, and if user dtypes want to define a simple sort, even if slightly inefficient, I guess we could let them :)

Another benefit of having the older architecture that @seberg suggested would be that it mirrors ufuncs, and allows for more flexible configuration with the context as you said.

The reason (as far as I understand) we went with structs, was that I think tables are more explicit in a way — we have 4 ways of doing sorts, so declaring them outright is simpler in its own way as we have straightforward and performant routines to call without needing to consider configuration.

Personally, I think the config doesn't change the code in the routines much at all so it might be simpler to just pass it into two functions. I guess it depends on whether the performance gains are good to have too.

I'm happy to go with whichever approach everyone thinks is better for the project, I am just clarifying what I understand so far!

@mhvk
Copy link
Contributor

mhvk commented Sep 13, 2025

we have 4 ways of doing sorts

Except that @charris also introduced an option to sort nans first or last... It is currently commented out, but it would make it eight options... Which is partially why I like passing on flags... 😺

Note that I'd be absolutely fine with going with a mechanism mirroring the ufuncs -- in the end, sorting could in principle easily be done with a gufunc if that would have a flags option.

p.s. Sorry to come to this discussion late!

@seberg
Copy link
Member

seberg commented Sep 16, 2025

Right, so basically we have 4 typical implementations. With NaNs, it would be 8, although we wouldn't need all. I am now leaning towards not worrying about NaNs, though. I just doubt that sorting NaNs to the front is niche enough to make it hard.

I can see the wish for performance but it seems to me calling a function with flags that internally just has a switch right up front

I think optimization is actually a reason to expose the get_* style API. The overhead is tiny, but:

  • It makes it much easier to reason about how to build an optimized version for structured dtypes. This could have a huge effect on
  • If we add sort-hints (or other metadata to the context), it might allow future further optimized versions to select the best algorithm once. E.g. a "this is a very small sort" hint.

Not sure this matters, but I think optimization is a reason for the get_ style API. The reason for allowing to just pass 4 functions here is that it is less verbose to implement (i.e. user, including NumPy convenience).

UFuncs actually support both styles. The closest design to what ufuncs actually be having both!
But, I am happy to just do either one, we can always add a NPY_DT_sort_getter slot and one should only define one of them. We could also add some void *reserved = NULL at the end of the struct, but since NPY_DT_sort_getter and NPY_DT_sort_functions would be mutually exclusive, I actually don't think I care about that.

(Implementation wise, NPY_DT_sort_getter could be set to a default that looks up what's in NPY_DT_sort_functions.)


There was the question of making it easy to use the generic implementation by only providing a compare function. But it seems fair to defer that to a follow-up, it is also again be somewhat mutually exclusive (since the compare function could fill in all of partitioning, sorting, argsorting, and potentially even compare ufuncs!).

@charris
Copy link
Member

charris commented Sep 16, 2025

I am quite happy to leave the design choice to @seberg here. My first instinct would be to go with the simplest implementation, but agree that a getter approach offers more flexibility for future changes. That said, the future is hard to predict, so we shouldn't spend too much time on it. I think the most important thing in the short term is to make alternative selections easy, SIMD for instance. I think Sebastian is thinking of run time dispatch as a possibility.

@mhvk
Copy link
Contributor

mhvk commented Sep 16, 2025

@seberg - thanks for the details! I still should at some point look at how the ufuncs actually do it, but perhaps you can help by pointing out in the code or docs where things are defined... If ufuncs do both, then I think we should pick one of those for now, and mimic it as close as possible. (To me, the getter still seems the more scaleable solution, so I'd prefer that...)

@seberg
Copy link
Member

seberg commented Sep 16, 2025

Ah, right, I would love to move the runtime dispatching into the getter. I didn't think of it TBH, because that isn't where it is for ufuncs right now (it may be there for casts, not sure).

I think I convinced myself in the previous comment, that both is useful (for simplicity) but that it is perfectly reasonable to use two distinct slots for both (because you only need to use one of them).
Given that, I would argue to push on with the current version and then follow up, even if I would like to follow up probably.

(I do have to go through the API here, although I suspect it is pretty much settled from the last PR.)

@seberg
Copy link
Member

seberg commented Sep 16, 2025

If ufuncs do both, then I think we should pick one of those for now, and mimic it as close as possible. (To me, the getter still seems the more scaleable solution, so I'd prefer that...)

Hehe, fine, we can also start with the getter. I was considering starting with this just to make the step of changing our internals slightly easier :). Honestly, I don't care, so if you like a getter maybe that is the way to go.

@mhvk I think it's easier to look at the public API here. Not great layout, but you can see: https://numpy.org/doc/2.1/reference/c-api/array.html#c.NPY_METH_strided_loop the slots (of which most ufuncs only need the first strided-loop one, but casts need more). And further down the get_loop one with the corresponding signature.
The implementation for ufunc just fills in a default get_loop that checks what is in the other slots.
(But ArrayMethods have their own full set of slots, while here it would be nice to integrate it into the DType slots, so if we add functions I think a struct is nice.)

@mhvk
Copy link
Contributor

mhvk commented Sep 16, 2025

Thanks, that's useful!

One question before proceeding is whether it might be possible to treat the sorting code as an array method. Since effectively these are gufuncs, I think it should be, but it is not clear to me how one would pass on the sorting flags. Though perhaps the analogy is that sorting ascending and descending (e.g.) is like different gufuncs. Maybe that is actually the way to go?

@seberg
Copy link
Member

seberg commented Sep 16, 2025

Maybe that is actually the way to go?

Maybe, but that means 4 gufuncs here if we don't figure out parameters and I think that is too much for the immediate need (if interesting). And sorting may be a bit special? We simplified to contiguous loops since it sounded like supporting strided ones is unlikely to be useful (but of course one could likely wrangle that into gufuncs to focus on that).
Sorting is also naturally in-place (not argsorting of course), which is also slightly awkward for ufuncs probably.

I suppose I agree that gufuncs might be the better thing, but without figuring out some loose ends that seems hard. And figuring out the loose ends seems maybe a bit much for this.

@mhvk
Copy link
Contributor

mhvk commented Sep 16, 2025

I guess the question is whether we can get away with defining a get_loop that has the same signature?

@MaanasArora
Copy link
Contributor Author

MaanasArora commented Sep 16, 2025

If ufuncs do both, then I think we should pick one of those for now, and mimic it as close as possible. (To me, the getter still seems the more scaleable solution, so I'd prefer that...)

I do suspect sorting is a complex enough operation that some consideration for future extensions is important; the getter solution seems to target many of them well enough and it seems more natural to what this API actually does (allow users to define custom sorts). If optimization is actually easier with it, that's great and seems more reason to go that way!

There was the question of making it easy to use the generic implementation by only providing a compare function. But it seems fair to defer that to a follow-up, it is also again be somewhat mutually exclusive (since the compare function could fill in all of partitioning, sorting, argsorting, and potentially even compare ufuncs!).

Makes sense! That should make the size of this PR much more palatable :) In practice we're still accessing the compare arrayfunc, so we can probably fit the stringdtype fix into this PR without needing to move to a slot for that. Though I'm happy to do it in a small follow-up too if you prefer.

I suppose I agree that gufuncs might be the better thing, but without figuring out some loose ends that seems hard. And figuring out the loose ends seems maybe a bit much for this.

Would it be possible to transition to gufuncs using this API down the line? We could presumably allow the same API that we have now and internally insert it into a gufunc?

Edit: I mean the getter API, not the one implemented right now, since it seems we're going that way!

@seberg
Copy link
Member

seberg commented Sep 17, 2025

I guess the question is whether we can get away with defining a get_loop that has the same signature?

Would it be possible to transition to gufuncs using this API down the line?

I suppose. In the end, the missing piece of gufuncs is the fact that their context never holds parameters (and parameters here are restricted to things that are scalar and do not take part in promotion, probably -- e.g. that doesn't work for quantiles where the parameter changes promotion).
The sort context would have to for the sort options flags. So the thing to solve would be two-fold, I think:

  1. You would have to modify argument parsing to create a gufunc specific parameter context structure. (Maybe, that doesn't need to be a struct usually, because if you have a void * slot, you can cast it the flags here). That also needs cleanup (maybe as NpyAuxData *, not sure!).
    • No, I don't think any of this is particularly hard, but not sure it is easy either and there may be other design approaches. (I do agree that it would be nice to make context be identical to such a future thing here, but I am not sure I want to worry about it.)
  2. Our sort loops don't support strides, that could be solved via a small sort specific copier that wraps that calls the contiguous version in principle. Or more generic tooling.
  3. Sort loops work in-place. They effectively have zero inputs and a single output (that we need to read from). I suspect this part is largely fine, but might need a clearer "input and output" marker within the ufunc machinery.

We could presumably allow the same API that we have now and internally insert it into a gufunc?

If we do create a gufunc with the above capabilities, then yes, that gufunc could use this mechanism to implement itself for all dtypes that support it via this (while also allowing to do it directly if you want to support e.g. strided versions).

(The biggest annoyance/dance would be that if the context is structured differently, the translater/wrapper needs to modify it additionally to dropping the strides argument.)

I am not sure that all of this should matter. I like the idea of doing the above, but I don't like the idea of sitting on this change for it indefinitely.

@mhvk what does this mean to you? To me it doesn't change much right now to be honest (I wish it did, but I don't want to stall things). Unless this is an argument for doing just the explicit 4 loops for now, because a get_loop() could be done if we manage to wranlge it into a gufunc.

@mhvk
Copy link
Contributor

mhvk commented Sep 17, 2025

Yes, we do need to move forward and concreteness would be good...

But maybe first back to basics: looking again at the PR, there are two things I'd really like to change.

Use the standard array method loop signature

I'd suggest we ensure that the (arg)sort loops have the standard array method loop signature:

typedef int (PyArrayMethod_StridedLoop)(
    PyArrayMethod_Context *context,
    char *const *data,
    const npy_intp *dimensions,
    const npy_intp *strides,
    NpyAuxData *auxdata
)

This is obviously needed if we move to (arg)sort being gufuncs, but I think it is good regardless, just not to expand special cases. It is fine to state that for sort this will only be called with data[0] == data[1] (though perhaps it is not a problem to require that a sort implementation just copies the data if that is not the case; could eventually be an implementation flag).

Use loop registration, not slots on the dtype

It feels like a mistake to me to define things on the dtype, in a way moving us back to the previous API, where the dtype defined a set of "handy" functions in PyArray_ArrFuncs. I think it had become clear this was a bad idea and instead we now have array methods, where a dtype registers itself as needed. I think we should go via registration here too.

For registration, using the ufunc registration mechanism seems by far the easiest -- any custom one would mostly duplicate it. Making (arg)sort gufuncs may be more work than warranted but perhaps one could at first have "fake" ones that just work as storage and expose those via names. In that case, registration would be via,

PyUFunc_AddLoopFromSpec(NPY_[ARG]SORT_UFUNC, spec);

and inside the current sort/argsort code, the relevant loop would be looked up in those fake gufuncs.

Alternatively, one could expose a registration function specifically for (arg)sort that wraps this.

Decide on how to deal with the flags

The way the array method system is currently written, there is no straightforward way to pass on runtime information to a ufuncs other than the dtypes. Thus, by far the easiest way to deal with flags is to imagine different gufunc for the different sort types. In that case, registration would become,

PyUFunc_AddLoopFromSpec(NPY_[ARG]SORT_STABLE_ASCENDING, spec);
PyUFunc_AddLoopFromSpec(NPY_[ARG]SORT_STABLE_DESCENDING, spec);
...

(which would argue even more for wrapping this in a PyArray_RegisterSortLoopsFromSpecs(....))

That said, it would seem relatively easy to add a new void *runtime_data to PyArrayMethod_Context which in the [arg]sort main implementation could be used to pass on the flags. It may not be that long before that will be needed for other cases too...

@mhvk
Copy link
Contributor

mhvk commented Sep 17, 2025

p.s. The above is mostly because this PR introduces new public API, which I feel we need to get somewhat right. A private implementation for StringDType could be substantially hackier...

@MaanasArora
Copy link
Contributor Author

MaanasArora commented Sep 27, 2025

Thanks @mhvk for reviewing! I've handled two reference counting comments and will look at the rest (such as the flags for GIL) after @seberg's review. Glad to hear things are looking good.

Edit: Wow, that PR for ufuncs on scalars has a huge speedup indeed!

@charris
Copy link
Member

charris commented Sep 28, 2025

The test failure is unrelated.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good, if you like, you could simplify the resolve_descriptors to just bail on non-native byte-order (even just assert).
Let me know when we got the method_flags (just the GIL one) and the loop_descrs/loop_descrs are used for the new path.

If you would like me to suggest a solution, let me know (or if you are sick of iterating, I can maybe also make an iteration).

else {
loop_descrs[1] = given_descrs[0];
Py_INCREF(loop_descrs[1]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

view_offset makes little sense for sorting (a sort isn't a view, a cast may be, .imag may be a ufunc is fathomable but not supported... but a sort never is).
Thus I don't care what it is set to above, since you won't bother setting it here, nor will you check if it was set above.

In theory a dtype that can hold only a single element (and thus is always sorted) could set *view_offset = 0 here and then if we check that were it's called the sort could just return immediately, but that seems silly so...

PyArray_DTypeMeta *dtypes[2],
PyArray_Descr *given_descrs[2],
PyArray_Descr *loop_descrs[2],
npy_intp *view_offset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, it doesn't make sense, especially if sort is already in-place (but not in the sense of being a no-op).


NPY_BEGIN_THREADS_DESCR(descr);
if (!needs_api) {
NPY_BEGIN_THREADS_DESCR(descr);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NPY_BEGIN_THREADS_DESCR checks the descriptor for the flag, but that is exactly what you do above. So no need for the DESCR part.
(I think the descr isn't changed yet, but if I missed it let me know.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's an explicit flag so should use that! This is done (also caught an important bug)

Copy link
Contributor

@mhvk mhvk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this all looks great, and I have no further comments (the ones I made just now are just me realizing that I was probably wrong before...).

Might still be good if @seberg can have a last look.

Also, if we merge, we should squash the commits!

p.s. @seberg - it would be nice to expose the ability to get loops from a ufunc. Partially inspired by this PR, I wrote a little class that multiplies whatever is input by a constant, and ended up getting the legacy loop via ->types and ->functions just because it seemed impossible to get the loop directly using the regular API. Or did I miss something?

else {
odescr = descr;
Py_INCREF(odescr);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still slightly wrong should have explained more maybe. Needscopy needs odescr to decide. (Rather than swap since we replace the swap logic!.)
So we should find odescrfirst, and where we do, eithercheck equivtypes or check swap (or always check equivtypes later, it should take the ==fast path anyway).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, thanks a lot for the pointers! I was missing that odescr is still the first descriptor (as it's the intermediate value in a sense). I've done something to this effect, hopefully it's correct!

NPY_BEGIN_THREADS_DESCR(descr);
if (!needs_api) {
NPY_BEGIN_THREADS;
}
Copy link
Member

@seberg seberg Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we do also need to usethe descriptors, in this case the first (as the second we assume is default intp, I am fine with that). But the copying must use context->descrs[0].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm a bit confused, won't the NEEDS_PYAPI flag (in get_strided_loop) handle that logic, and in case it doesn't, we're checking op anyway? Or are we building in a safeguard?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forget about it, I think I just picked a pretty random place in the argsort loop to comment that the desriptor is important there also. You added that.

@seberg
Copy link
Member

seberg commented Oct 1, 2025

p.s. @seberg - it would be nice to expose the ability to get loops from a ufunc. Partially inspired by this PR, I wrote a little class that multiplies whatever is input by a constant, and ended up getting the legacy loop via ->types and ->functions just because it seemed impossible to get the loop directly using the regular API. Or did I miss something?

We need to add such public API, there is the Python side "unstable" API as np.add._resolve_dtypes_and_context (check it's docs).
Yes, I would be happy to expose this as more convenient API, it's that, because for testing/prototyping that seemed like the right level at the time.

Copy link
Member

@seberg seberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be getting blind on this, but I think we can go ahead here and then continue and see how it pans out using it for more.

(I.e. we somewhat should change our sorts to actually use this, also since it'll be the only good way to get to reverse sorts anyway.)

Honestly, I think we could just put this in as is. It doesn't have meaningful additions right now as that would come with a way to actually register the sort!

Comment on lines 3274 to 3276
if (odescr == NULL) {
return NULL;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't actually fail (unless the typenum is invalid), and we usually just use that assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, so should remove the check? Did that.

@seberg
Copy link
Member

seberg commented Oct 7, 2025

OK, I think it's best to just put this in, thanks @MaanasArora, sorry that this is so difficult with the API design and ufunc back and forth. Right now, it changes (practically) nothing public API wise anyway. The added but unused *parameters is fine, and honestly a good idea anyway.

@seberg seberg merged commit aa45f1a into numpy:main Oct 7, 2025
77 checks passed
@mhvk
Copy link
Contributor

mhvk commented Oct 7, 2025

Yay!! A really nice contribution, looking forward to the next steps, of allowing registration!

@MaanasArora
Copy link
Contributor Author

Thank you both for reviewing; I learned a lot here! Looking forward to the next steps as well.

@mhvk
Copy link
Contributor

mhvk commented Oct 7, 2025

Same here, I learned a lot (thanks @seberg! what you implemented is really nice, but it is non-trivial to understand without a simple example - which is why I like the scaled float case so much...)

IndifferentArea pushed a commit to IndifferentArea/numpy that referenced this pull request Dec 7, 2025
* ENH, API: New sorting slots for DType API

* DOC: Add sorting context structures and functions to DType API documentation

* DOC: Fix type signature syntax

* DOC: Update data pointer type in sorting function signatures

* DOC: Fix incorrect argsort function name

* DOC: Fix formatting of function signature for PyArray_SortFuncWithContext

* ENH: Initialize moving to an arraymethod table for user dtype sorts

* ENH: Initialize PyArrayMethod_Context to zero in sorting functions

* ENH: Add stable sort option to context in user sorts

* DOC: Remove deprecated references from DType API sorting documentation

* ENH: Refactor sorting parameters to use NPY_SORTKIND in PyArrayMethod_SortParameters

* ENH: Remove unnecessary blank line in NPY_DType_Slots structure

* ENH: Simplify axis check in PyArray_ArgSort

* REF: Simplify PyArray_(Arg)Sort loop discovery logic

* BUG: Add missing breaks in loop discovery in sorts

* BUG: Simplify context initialization in PyArray_Sort and PyArray_ArgSort

* BUG: Fix dangling pointer to context in PyArray_(Arg)Sort

* BUG: Fix dangling context pointer in PyArray_(Arg)Sort

* REF: Update sorting implementation to use new generalized array method system

* ENH: Use dtype slots for sort and argsort arraymethods

* STYLE: Add back blank line in methods.h

* REF: Simplify conditional logic in PyArray_Sort and PyArray_ArgSort

* BUG: Add error handling for strided loop retrieval in PyArray_(Arg)Sort

* ENH: Fill out context and strides in sort strided loop calls

* ENH: Add correct number of descriptors and strides in sorting loops

* BUG: Correct descriptors and strides in argsort implementation

* ENH: If-case parameters field in PyArrayMethod_Context

* REF: Initialize PyArrayMethod_Context structure directly for sorting functions

* ENH: Pass auxdata to dtype sorts

* ENH: Update stride calculation for sorting to use item size

* ENH: Refactor context initialization and descriptor resolution in sorting functions

* ENH: Implement sorting and argsorting methods for scaled float dtype

* DOC: Add comment to clarify that we ignore `view_offset`

* ENH: Free auxdata and decref descriptors in sorting functions

* DOC: Add comments to clarify that method flags are ignored in sorting functions

* ENH: Add scaled float tests for sorting and argsorting with scaled and non-contiguous arrays

* TST: Update scaled float argsort test to stride array

* STYLE: Fix formatting in sfloat test_sort

* BUG: Update sorting functions to check for sort method in cleanup instead of strided loop

* REF: Fix scope for sort contexts

* REF: Initialize sorting context only if taken

* DOC: Add parameters member to PyArrayMethod_Context for loop parameterization

* BUG: Move declaration of loop_descrs array in PyArray_ArgSort

* DOC: Add comment clarifying arraymethod context parameters

* DOC: Clarify description of parameters member in PyArrayMethod_Context

* REF: Use existing variable to simplify NPY_DTYPE macro

* BUG: Add missing output decref in PyArray_ArgSort

* BUG: Add assertions to validate parameters in sorting functions of scaled float dtype

* REF: Simplify argument passing by removing unnecessary dimensions and strides in casting functions

* BUG: Fix castinfo changes

* TST: assert exact stride values in sfloat sorts

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* TST: assert exact stride values in sfloat sorts

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* DOC: move versionchanged in ArrayMethod_Context

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* BUG: Fix stride initialization in _new_sortlike and ensure proper decref in PyArray_ArgSort

* BUG: Correct dimension and stride initializations in sorting

* BUG: Fix strides initialization in sorting

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* DOC: clarify function level in parameters for PyArrayMethod_Context

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* REF: initialize slots in-line for sfloat sorts

* BUG: Remove incorrect dimension assertions in sorting loops

* BUG: Correct size to sizeof

* STYLE: revert newline deletion

* STYLE: remove trailing whitespace

* STYLE: revert delete newline

* BUG: Fix incorrect stride assignment in _new_sortlike function

* STYLE: Remove unnecessary whitespace

* DOC: Improve comment

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>

* REF: Remove first branch in resolve_descriptors for sfloat sorts

* TST: Add reverse sorting test for non-aligned arrays in sfloat sort tests

* DOC: Add TODO comments for future registration method in sfloat_init_sort

* ENH: Perform descriptor handling in sorting array method for byte order

* DOC: Remove redundant versionchanged note in types-and-structures.rst

* TST: Add stable and unstable sorting tests for sfloat arrays

* ENH: Update stride assertions in sfloat_default_sort_loop for improved data validation

* ENH: Add stride assertions in sfloat_default_argsort_loop for improved data validation

* ENH: Change descriptor handling in sfloat_sort_resolve_descriptors and sfloat_argsort_resolve_descriptors for improved byte order management

* ENH: Manage reference counting for argsort method in sfloat_init_sort to prevent memory leaks

* ENH: Add assertion in sfloat_sort_resolve_descriptors to validate descriptor consistency

* ENH: Add assertion in sfloat_argsort_resolve_descriptors to validate second descriptor type

* ENH: Move error handling to sfloat_sort_get_loop and sfloat_argsort_get_loop for unsupported sort kinds

* ENH: Manage reference counting for sort method in sfloat_init_sort to prevent memory leaks

* REF: Clarify stride calculation in PyArray_Sort and PyArray_ArgSort

* ENH: Add method flag handling in sorting methods to check for GIL

* REF: Simplify descr element size retrieval in PyArray_Sort and PyArray_ArgSort

* BUG: Replace NULL return with goto fail in PyArray_ArgSort for error handling

* REF: Assert instead of handling non-native byteorder in sfloat sorts

* BUG: Initialize return value to NULL in PyArray_ArgSort for error handling

* REF: Remove descr-specific thread release in favor of pyapi flag in sorting methods

* BUG: Fix needcopy logic in _new_sortlike and _new_argsortlike functions

* BUG: Remove unnecessary NULL check for odescr in PyArray_ArgSort

---------

Co-authored-by: Sebastian Berg <sebastian@sipsolutions.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

01 - Enhancement 56 - Needs Release Note. Needs an entry in doc/release/upcoming_changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants