AI-generated port of ml_dtypes to numpy 2. #360
copybara-service[bot] wants to merge 1 commit into main
16ac2b3 to c6fbd5f
@seberg FYI I have no intention of submitting this as is without doing a bunch of manual work on it first, but it shows that we can port ml_dtypes to NumPy 2's APIs with NumPy 2.4. I have not even read these that closely myself. I note there were a couple of things I found that might need fixes on the NumPy side: It appears despite NumPy claiming in The copyswap functions are called, and if you don't define them we end up crashing, at least under NumPy 2.4: The AI cunningly did this to get around the problem but NumPy needs to either allow copyswap and copyswapn to be provided or not call them.
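For readers unfamiliar with the legacy slots mentioned above, here is a minimal plain-Python sketch of what a copyswap/copyswapn pair is expected to do: copy one element (or n strided elements) and reverse the bytes of each element when a swap is requested. The names mirror the ArrFuncs slots, but the signatures are simplified and hypothetical; this is neither the code from this PR nor NumPy's C API. The 2-byte item size is an assumption chosen to resemble a bfloat16-like type.

```python
def copyswap(dest: bytearray, dest_off: int, src: bytes, src_off: int,
             itemsize: int, swap: bool) -> None:
    """Copy one element from src to dest; reverse its bytes if swap is true."""
    item = src[src_off:src_off + itemsize]
    if swap:
        item = item[::-1]  # byte-order swap of a single element
    dest[dest_off:dest_off + itemsize] = item


def copyswapn(dest: bytearray, dstride: int, src: bytes, sstride: int,
              n: int, itemsize: int, swap: bool) -> None:
    """Apply copyswap to n elements laid out with the given byte strides."""
    for i in range(n):
        copyswap(dest, i * dstride, src, i * sstride, itemsize, swap)


# Two 2-byte elements in little-endian order: 0x3F80 and 0x4000
# (the bfloat16 bit patterns for 1.0 and 2.0).
src = bytes([0x80, 0x3F, 0x00, 0x40])

swapped = bytearray(4)
copyswapn(swapped, 2, src, 2, n=2, itemsize=2, swap=True)

copied = bytearray(4)
copyswapn(copied, 2, src, 2, n=2, itemsize=2, swap=False)
```

With `swap=True` each element's bytes are reversed independently (element order is preserved); with `swap=False` the call degenerates to a strided copy.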
I think we may have to do the same cunning trick here, which is fine. I was a bit in "add when needed" mode for ArrFuncs, because really we should solve all of these differently... Either way, as much as it isn't nice, I think when it comes to
10ca2f2 to 4a875a0
Can this be done? The failure is: and I think the call chain there is something like I don't think this can be worked around from the dtype. We didn't even make it as far as calling the dtype's code.
4a875a0 to 7c2ea5d
PiperOrigin-RevId: 871352005
7c2ea5d to 974cb8c
:(, I had somehow missed that it failed this early; I thought it was later. Let me make sure to fix this for 2.5. Monkeypatching away this particular is likely too crazy :(. I think we would have to:
This whole dance is basically just to create a type number, because I liked the idea of not needing type numbers. And if there was no history here, maybe all of this would be less of a deal (i.e. find a solution without assigning a type number), but porting things... In theory one could monkey-patch the other way around, but that seems even less desirable. FWIW, I think we can add code to NumPy to do the above in a sane way (i.e. a single new flag or so that says "my dtype is legacy compatible and should get a type number").
BTW, I just made one more change in this branch, which is to:
The tests all seem to pass, which is great! However, I wonder whether that's the right thing to do. What, if anything, still cares about the type descriptor characters?
Not much really. Some downstream projects could in theory use it via the C-API, but I doubt it overall. For things that might break, we should maybe open NumPy issues (I can do that):
I would be tempted to leave the character at
But I guess
Yes. dot is broken, I'm just skipping that for now.
I did this and after a couple of fixes it works. numpy/numpy#30879 seems necessary now.
Sorry, coming into this a bit late! But for dot, backporting does seem hard, though maybe not impossible if we just special-case the common dtype code with a minimal something for user dtypes? It won't really be the right thing to do though, I guess, as it's essentially introducing a rough version of a (missed) feature. On the NumPy side, I dug through the code and think the 'decision' on the dtype is basically made just here: (there is a small reference below, but there's no real logic as far as I could tell.) So there might not be as much to port, hopefully, and we could just do "if not-legacy dtype, trim op descr"? But constructing the descr might have quirks, so it may not be as minimal as it seems, especially for backporting... I'm trying to look more into this.