ENH: Add internal real/imag ufuncs and use them for the attrs#30984
Conversation
Admittedly, this has a wrinkle in that I don't feel like adding a `set_imag` ufunc as well or maybe we should? This also fixes `object` arrays `.real` and `.imag`, although fixing means trying to fetch the attribute (which is also not perfect, if better? E.g. what if that is a method...) It still has an awkward fallback: If there is *no* loop defined for real/complex we assume it is a real value. This is not good, but I think the solution here is to come from the other side and identify `ComplexDType` a bit clearer. Why a ufunc? Well, it allows fixing the object path here fairly well by allowing to write a different function. It would also work for polar coordinates.
mhvk
left a comment
There was a problem hiding this comment.
@seberg - I like the idea of the ufuncs, but two worries.
- doing
.realand.imaghappens a lot and is time-critical, did you check the performance? (e.g., most of my code dealing with complex numbers hasdef power(z): return z.real**2+z.imag**2). I fear you may have to stick with having the standard view taking as the default... - I am less enchanted by the "magic" that now sometimes outputs can be views. I think these functions should behave like all ufuncs, e.g.,
np.positive, and thus return a copy by default.
On (2), of course we do want to allow views and avoid actual work in that case, but could we introduce an actual signal that that is wanted (and that would thus work for np.positive as well)? At the risk of repurposing too much, maybe casting='view' or giving out=... an extra meaning, that a view is OK?
| value = *reinterpret_cast<real_type *>(in); | ||
| } | ||
| else { | ||
| value = *reinterpret_cast<real_type *>(in + sizeof(real_type)); |
There was a problem hiding this comment.
The addition should be done outside the loop (even if we do not exercise it!).
|
p.s. The broadcasted zero makes me wonder whether a ufunc is in fact the right option... |
This comment was marked as outdated.
This comment was marked as outdated.
|
OK, I did the iteration of moving it onto to the DType itself:
Again, still needs tests, but maybe this is the right approach. Maybe the ufuncs don't even need to exist, but it is a fairly convenient way for the new
EDIT: Still needs at least release notes (it is quite substantial with object arrays). But if we think this is a reasonable path, then I think it is far along now. |
|
I went over this very briefly and it looks quite nice! The user dtype API in particular benefits from having ufuncs anyway, I think. Sorts and argsorts are similar insofar they're inplace, so having these too on the dtype fits well! I'm still wrapping my head around this a bit, but yes, putting it on the dtype made it nicer as the ufuncs have less "responsibility" for the view path now. I think it is fine for the ufunc to return views even if atypical to start with; this is mostly internal after all. About a signal that a view is wanted, Edit: Going to take a deeper look soon! |
mhvk
left a comment
There was a problem hiding this comment.
@seberg - had another bit of a review now.
Some comments in-line, all small except perhaps that it may make sense to use the names "real" and "imag" directly, linking them to the existing np.real and np.imag functions -- after all, they will be using the ufuncs internally... With that, it would also not be necessary to change how the function comparison is done (i.e., "numpy:real" will work fine).
| entry points, ``(module ':')? (object '.')* name``, with ``numpy`` | ||
| the default module. Examples: ``sin``, ``strings.str_len``, | ||
| ``numpy.strings:str_len``. | ||
| ``"sort"``, ``"argsort"``, ``".real"``, ``".imag"`` and specific names |
There was a problem hiding this comment.
Bigger: should this just be real and imag? This since we do have np.real and np.imag functions, and presumably those are affected as well. (Indeed, I think those in principle could just be replaced by the new ufuncs, no? In which case there would be no need to mention this here...)
Smaller: did you mean to write "are" instead of "and"? Though perhaps write instead,
Note that some names do not correspond directly to ufuncs names, but
instead to functions that use ufuncs internally:
``"sort"``, ``"argsort"``, ``".real"``, ``".imag"``
| {"_core._multiarray_umath.add", &add_spec}, | ||
| {"numpy:sort", &sort_spec}, | ||
| {"numpy._core.fromnumeric:argsort", &argsort_spec}, | ||
| // These names must match exactly right now (not ufuncs) |
There was a problem hiding this comment.
Is this true? I had done the above as a sanity check on our interpretation of names... (but it is fine if an exact match is required!).
There was a problem hiding this comment.
Right, I changed it for the ones not backed by ufuncs (as such!). (Same as above, I think it's fine to be strict for these...)
| NPY_NO_EXPORT int | ||
| PyUFunc_AddLoopsFromSpecs(PyUFunc_LoopSlot *slots) | ||
| { | ||
| if (npy_cache_import_runtime( |
There was a problem hiding this comment.
If we were to link real and imag to the corresponding functions, we could keep the old scheme (which I slightly prefer, but only slightly).
| >>> a = np.array([2j, "a"], dtype=np.str_) | ||
| >>> np.isreal(a) # Warns about non-elementwise comparison | ||
| False | ||
| >>> np.isreal(a) # Compares as strings currently |
There was a problem hiding this comment.
I'm confused what "Compares as strings currently" means - what would give rise to True?
There was a problem hiding this comment.
Ah, yeah, that can't be understood. np.imag(a) is the empty string "" and it then does "" == 0 which is False.
That doens't make sense... But it didn't change here.
(Just N.B. the big change here is as the object dtype behavior of returning a copy and trying to use getattr(obj, "real", obj); getattr(obj, "imag", 0) -- which is a decent guess, but of course a guess.)
|
Maybe to summarize:
|
|
@mhvk do you have any other worries here? As I said, I have a slight preference to just keep the strict string comparison for now (this is new and we can always broaden it up). |
ngoldbaum
left a comment
There was a problem hiding this comment.
I'm experimenting with Claude Code to help me with code review. It spotted some issues below, only one of them is an outright bug (setting a struct field via a pointer before the error check for the pointer being NULL).
It also pointed out that Marten's comment about performance wasn't addressed. In particular:
calling resolve_descriptors + PyArray_NewFromDescr_int on a hot path is heavier than the old direct PyArray_NewFromDescrAndBase call. Checking whether .real/.imag result in a view (as opposed to calling the ufunc) already happens via view_offset, but the descriptor resolution overhead itself may be avoidable for the built-in complex dtypes by short-circuiting or caching.
Sorry that this was mostly me copy/pasting from Claude code. I'm pretty sure what it's saying about these topics makes sense but please let me know if this is annoying and you'd prefer I not do this.
| // Hardcode slot names for attributes and non-ufuncs stored on the DType | ||
| // (Also avoids circular imports a bit.) | ||
| if (strcmp(slot->name, "real") == 0) { | ||
| Py_XSETREF(ufunc, npy_interned_str.real); |
There was a problem hiding this comment.
I don't think this pattern here and below is correct because np_interned_str.real is a borrowed reference. The next time through the loop, you'd incorrectly decref the borrowed reference. You need to INCREF first to create an owned reference. But also on Python 3.12+ interned strings are immortal so this won't ever cause any issues.
There was a problem hiding this comment.
Yeah good point. I'll just add the NEWREF, of course others might argue it is just as well to not and rely on it being immortal, but it doesn't really matter.
There was a problem hiding this comment.
Yeah good point. I'll just add the Py_NewRef, of course others might argue it is just as well to not and rely on it being immortal, but it doesn't really matter.
| PyArray_Descr *loop_descrs[2] = {NULL, NULL}; | ||
| npy_intp view_offset = NPY_MIN_INTP; | ||
| int res = meth->method->resolve_descriptors( | ||
| meth->method, meth->dtypes, descrs, loop_descrs, &view_offset); |
There was a problem hiding this comment.
you INCREF and DECREF the objects in loop_descrs below - but couldn't they be NULL, in principle?
There was a problem hiding this comment.
No, they can only be NULL on error.
Co-authored-by: Nathan Goldbaum <nathan.goldbaum@gmail.com>
I don't mind, but it didn't read the history: The slowdown is tiny and actually overshadowed by another, probably also meaningless, slowdown for creating the new view. |
mhvk
left a comment
There was a problem hiding this comment.
@seberg - mostly (very) small comments left. One bigger one, though only a suggestion: instead of templates, might it be better to use static_data to decide between real and imag in the loops? The code is essentially the same, so that avoids doubling the memory use (for very little performance cost).
|
|
||
|
|
||
| /* We shouldn't normally use it, but define a simple loop anyway. */ | ||
| template <typename real_type, bool real_part> |
There was a problem hiding this comment.
A disadvantage of templates might be that code gets duplicated a lot. In stringdtype_ufuncs.cpp, we use the array method static data to select between very similar codes (e.g., startswith_endswith). It may be worth doing the same here for real_part (possibly also for the promotor? for the loop at least, the tiny extra amount of time needed would seem easily outweighed by the extra memory used by the template).
There was a problem hiding this comment.
I'll skip this... frankly the difference is 504 bytes, so a bloat of 0.005%. And that is ignoring the fact that the compiler should probably duplicate the loop to optimize it.
(The total binary increase is certainly more here, but just using the same wrong version for both gives that.)
| } | ||
|
|
||
|
|
||
| template <auto component> |
There was a problem hiding this comment.
Here again, I would suggest to pass in the component via static_data
|
OK, I'll put this in then, thanks for the reviews. Not too happy that I need to pass the private flag to make it fully reliable for free-threading, but it is a different issue. |
This is a work in progress, although already functional
I am trying to figure out how to restructure
.imagand.realto:objectdtype behavior (a bit?) by not returning views.finfo(the ability to fetch the corresponding real type)The approach here is to define internal
imagandrealufuncs, plus we need a few small tweaks to allow those ufuncs to return actualviews.One wrinkle is that
arr.imag = 0works and right now is a bit hacked, because I wasn't sure I like aset_imagufunc (but maybe that is the better solution?).There is also a wrinkle that if
arr.real/arr.imagstill need weird special handling for backcompat since a ufunc really can't return a broadcasted array of zeros.(The code looks wrong, but it is mainly boilerplate ufunc code+docs.)
Generally, I think this is a reasonable approach to make it a ufuncs, although there may be some details to hash out. Ping @ngoldbaum and @mhvk in case you have some thoughts already.
EDIT: Closes gh-1740
EDIT: Should have mentioned that I tried to use an agent to write some of the ufunc code, but it didn't do a job I liked, so there is practically nothing left of it.