AI-generated port of ml_dtypes to numpy 2. #360

Open
copybara-service[bot] wants to merge 1 commit into main from test_871352005

@copybara-service

AI-generated port of ml_dtypes to numpy 2.

copybara-service bot force-pushed the test_871352005 branch 4 times, most recently from 16ac2b3 to c6fbd5f, on February 24, 2026 08:00

@hawkinsp
Collaborator

@seberg FYI

I have no intention of submitting this as-is without doing a bunch of manual work on it first, but it shows that we can port ml_dtypes to numpy 2's APIs with numpy 2.4. I have not even read the changes that closely myself.

I note there were a couple of things I found that might need fixes on the numpy side:

It appears that, despite NumPy claiming in dtype_api.h that:

// Copyswap is disabled
// #define NPY_DT_PyArray_ArrFuncs_copyswapn 3 + _NPY_DT_ARRFUNCS_OFFSET
// #define NPY_DT_PyArray_ArrFuncs_copyswap 4 + _NPY_DT_ARRFUNCS_OFFSET

the copyswap functions are still called, and if you don't define them we end up crashing, at least under numpy 2.4:

* thread #1, name = 'custom_float_te', stop reason = signal SIGSEGV: address not mapped to object (fault address=0x0)
  * frame #0: 0x0000000000000000
    frame #1: 0x00007ffff5653391 libthird_Uparty_Spy_Snumpy_Slibmultiarray.so`PyArray_Byteswap(self=0x000050533c6398f0, inplace='\x01') at methods.c:546:13
    frame #2: 0x00007ffff5654021 libthird_Uparty_Spy_Snumpy_Slibmultiarray.so`PyArray_Byteswap(self=0x000050533c639ad0, inplace='\0') at methods.c:570:15
    frame #3: 0x00007ffff565b092 libthird_Uparty_Spy_Snumpy_Slibmultiarray.so`array_byteswap(self=0x000050533c639ad0, args=0x0000555556b72f00, kwds=0x000050533cfc2400) at methods.c:587:12
    frame #4: 0x0000555555db65cc custom_float_test`method_vectorcall_VARARGS_KEYWORDS(func=0x000050533d78b330, args=0x000050533fc31878, nargsf=9223372036854775809, kwnames=0x000050533ceb3970) at descrobject.c:365:14
    frame #5: 0x0000555555d934e5 custom_float_test`_PyObject_VectorcallTstate(tstate=0x0000555556c188a8, callable=0x000050533d78b330, args=0x000050533fc31878, nargsf=9223372036854775809, kwnames=0x000050533ceb3970) at pycore_call.h:92:11
    frame #6: 0x0000555555d94f6a custom_float_test`PyObject_Vectorcall(callable=0x000050533d78b330, args=0x000050533fc31878, nargsf=9223372036854775809, kwnames=0x000050533ceb3970) at call.c:325:12
    frame #7: 0x00005555561be369 custom_float_test`_PyEval_EvalFrameDefault(tstate=0x0000555556c188a8, frame=0x000050533fc317f0, throwflag=0) at bytecodes.c:2715:19
    frame #8: 0x0000555556189546 custom_float_test`_PyEval_EvalFrame(tstate=0x0000555556c188a8, frame=0x000050533fc31628, throwflag=0) at pycore_ceval.h:89:16
    frame #9: 0x000055555618940b custom_float_test`_PyEval_Vector(tstate=0x0000555556c188a8, func=0x000050533dc9f880, locals=0x0000000000000000, args=0x000050533cafa380, argcount=1, kwnames=0x000050533fccbb80) at ceval.c:1685:12
    frame #10: 0x0000555555d95578 custom_float_test`_PyFunction_Vectorcall(func=0x000050533dc9f880, stack=0x000050533cafa380, nargsf=1, kwnames=0x000050533fccbb80) at call.c:419:16
    frame #11: 0x0000555555d9c065 custom_float_test`_PyObject_VectorcallTstate(tstate=0x0000555556c188a8, callable=0x000050533dc9f880, args=0x000050533cafa380, nargsf=1, kwnames=0x000050533fccbb80) at pycore_call.h:92:11
    frame #12: 0x0000555555d9a289 custom_float_test`method_vectorcall(method=0x000050533cfc2000, args=0x000050533cafa388, nargsf=9223372036854775808, kwnames=0x000050533fccbb80) at classobject.c:61:18
    frame #13: 0x0000555555d94ee7 custom_float_test`_PyVectorcall_Call(tstate=0x0000555556c188a8, func=(custom_float_test`method_vectorcall at classobject.c:44), callable=0x000050533cfc2000, tuple=0x0000555556b72f00, kwargs=0x000050533cfc1c40) at call.c:283:24

The AI cunningly did this to get around the problem, but NumPy needs to either allow copyswap and copyswapn to be provided or not call them.

#ifndef NPY_DT_PyArray_ArrFuncs_copyswapn
#define NPY_DT_PyArray_ArrFuncs_copyswapn (3 + (1 << 11))
#endif

#ifndef NPY_DT_PyArray_ArrFuncs_copyswap
#define NPY_DT_PyArray_ArrFuncs_copyswap (4 + (1 << 11))
#endif
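
To make the workaround above concrete, here is a small Python sketch (not NumPy code). It assumes `_NPY_DT_ARRFUNCS_OFFSET` equals `1 << 11`, as the hard-coded defines imply, and it models roughly what a copyswapn-style function has to do for a contiguous buffer of 2-byte items such as bfloat16. The names `ARRFUNCS_OFFSET`, `SLOT_COPYSWAPN`, `SLOT_COPYSWAP` and `copyswapn_sketch` are hypothetical; the real C slot also deals with strides, which this sketch ignores.

```python
# Assumption: mirrors _NPY_DT_ARRFUNCS_OFFSET as implied by the defines above.
ARRFUNCS_OFFSET = 1 << 11

# Slot IDs that the workaround re-enables.
SLOT_COPYSWAPN = 3 + ARRFUNCS_OFFSET
SLOT_COPYSWAP = 4 + ARRFUNCS_OFFSET


def copyswapn_sketch(src: bytes, n: int, swap: bool, itemsize: int = 2) -> bytes:
    """Copy n contiguous items, reversing the bytes of each item if swap is set."""
    out = bytearray()
    for i in range(n):
        item = src[i * itemsize:(i + 1) * itemsize]
        out += item[::-1] if swap else item
    return bytes(out)


if __name__ == "__main__":
    print(SLOT_COPYSWAPN, SLOT_COPYSWAP)
    # Two little-endian bfloat16 values: 0x3F80 (1.0) and 0x4000 (2.0).
    little = bytes([0x80, 0x3F, 0x00, 0x40])
    print(copyswapn_sketch(little, 2, swap=True).hex())
```

A byteswap on an array of such items amounts to calling something like this with `swap=True`, which is presumably why a missing slot shows up as a call through a null pointer in the trace above.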

Separately, it seems that it's not possible to override the dot operator correctly for a new-style user dtype. Is that correct?

@seberg
Contributor

seberg commented Feb 24, 2026

> it seems that it's not possible to override the dot operator correctly for a new-style user dtype. Is that correct?

I think we may have to do the same cunning trick here, which is fine. I was a bit in the "add when needed" mode for ArrFuncs, because really we should solve all of these differently...
The one other thing that I am not sure about is that we might need the old getitem function in order to return a Python float/integer for .item().

Either way, as much as it isn't nice, I think when it comes to ArrFuncs it's very much workable to brutally monkey-patch it.

copybara-service bot force-pushed the test_871352005 branch 3 times, most recently from 10ca2f2 to 4a875a0, on February 25, 2026 10:31

@hawkinsp
Collaborator

> Either way, as much as it isn't nice, I think when it comes to ArrFuncs it's very much workable to brutally monkey-patch it.

Can this be done? The failure is:

>     result = np.dot(x, y)
E     TypeError: This function currently only supports native NumPy dtypes and old-style user dtypes, but the dtype was bcomplex32.
E     (The function may need to be updated to support arbitraryuser dtypes.)

and I think the call chain there is something like: PyArray_InnerProduct calls PyArray_ObjectType on its arguments, which promptly dies here: https://github.com/numpy/numpy/blob/10e9faf1afbecca9316ce752c8a1dc8807137edb/numpy/_core/src/multiarray/convert_datatype.c#L1907

I don't think this can be worked around from the dtype? We didn't make it even as far as calling the dtype's code.

PiperOrigin-RevId: 871352005
@seberg
Contributor

seberg commented Feb 25, 2026

:(, I had somehow missed that it failed this early, thought it was later. Let me make sure to fix this for 2.5.
But the question is still whether we can "backport" this part (I really naively thought this would be more about the ArrFuncs where I wouldn't have any squirms).

Monkeypatching away this particular failure is likely too crazy :(. I think we would have to:

  • keep a legacy num = PyArray_RegisterDType() around
  • descr = PyArray_DescrFromType(num). Then Py_SET_TYPE(descr, NewDType), and
  • NewDType.flags |= 1; NewDType.num = descr->num (tell NumPy this is a legacy dtype. That should be fine because it can be used like one -- i.e. it has a type number now).

This whole dance is basically just to create a type number, because I liked the idea of not needing type numbers. And if there was no history here, maybe all of this would be less of a deal (i.e. find a solution without assigning a type number), but porting things...

In theory one could monkey-patch the other way around, but that seems even less desirable. FWIW, I think we can add code to NumPy to do the above in a sane way (i.e. a single new flag or so, that says "my dtype is legacy compatible and should get a type number").

@hawkinsp
Collaborator

BTW, I just made one more change in this branch, which is to:

  • set the kind characters to their natural kinds (f instead of V).
  • set all the type descriptor characters to the same one ? rather than me claiming a unique one randomly for each type.

The tests all seem to pass, which is great!

I wonder however if that's the right thing to do or not. I wonder what if anything still cares about the type descriptor characters?

@seberg
Contributor

seberg commented Feb 25, 2026

> I wonder however if that's the right thing to do or not. I wonder what if anything still cares about the type descriptor characters?

Not much really. Some downstream projects could in theory use it as C-API, but I doubt it overall.

Things that might break, we should maybe open NumPy issues (I can do that):

  • You need to implement __hash__ for sure.
  • I am not sure about __reduce__
  • Hmmmm, one thing I now realize is that the __array_interface__ could be a problem. Exporting things as <f2 for bfloat16 which then round-trips incorrectly :(.
    So it might be that we have to fix the __array_interface__ repr in NumPy 2.5 before we can actually change the kind character.
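
The round-trip concern in the last bullet can be illustrated with the standard library alone. This is a sketch, not ml_dtypes or NumPy code: it relies on the fact that a bfloat16 bit pattern is the upper 16 bits of a float32, and that the typestr `<f2` denotes a little-endian IEEE half-precision float. The helper names are hypothetical.

```python
import struct


def bfloat16_bits_to_float(bits: int) -> float:
    """Decode a bfloat16 bit pattern: it is the upper 16 bits of a float32."""
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


def float16_bits_to_float(bits: int) -> float:
    """Decode the same 16 bits as IEEE half precision (what '<f2' denotes)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


if __name__ == "__main__":
    bits = 0x3F80  # bfloat16 encoding of 1.0
    print(bfloat16_bits_to_float(bits))  # 1.0
    print(float16_bits_to_float(bits))   # 1.875
```

So a consumer that trusts an exported `<f2` would read a bfloat16 1.0 as 1.875, which is the kind of incorrect round trip described above.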

I would be tempted to leave the character at \0, which is the default right now. Just don't use ? that is the character of a boolean :).

> The tests all seem to pass, which is great!

But I guess dot() is still broken, then?

@hawkinsp
Collaborator

> But I guess dot() is still broken, then?

Yes. dot is broken, I'm just skipping that for now.

@hawkinsp
Collaborator

> I would be tempted to leave the character at \0, which is the default right now. Just don't use ? that is the character of a boolean :).

I did this and after a couple of fixes it works.

numpy/numpy#30879 seems necessary now.

@MaanasArora

MaanasArora commented Mar 2, 2026

Sorry, coming into this a bit late! But for dot, backporting does seem hard, though maybe not impossible if we just special-case the common dtype code with a minimal something for user dtypes? It won't really be the right thing to do though I guess, as it's essentially introducing a rough version of a (missed) feature.

On the NumPy side, I dug through the code and think the 'decision' on the dtype is basically made just here:

https://github.com/numpy/numpy/blob/dd102ade8afbe0bf16870cca75fa391fe17cc634/numpy/_core/src/multiarray/multiarraymodule.c#L988-L997

(there is a small reference below, but there's no real logic as far as I could tell.) So there might not be as much to port, hopefully, and we could just do "if not-legacy dtype, trim op descr"? But constructing the descr might have quirks, so it may not be as minimal as it seems, especially for backporting... trying to look more into this.
