fix: do not force LE data-order in form.py#3629
Conversation
|
/cc: @jpivarski @tcawlfield |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files
🚀 New features to boost your workflow:
|
in that case we would need more infra to track what's the native byte order because this package interpolate with use cases (cuda) and formats (arrow/parquet/root) that care about byte order when you send chunks of data to them or store onto disks. |
|
We do track what's the native byteorder. |
|
@nileshpatra - Thanks for reporting it! I think, there are two possible fixes inside Awkward:
Instead of hardcoding
Always cast dtypes to little-endian ( |
we know this is not too bad because we have been doing this for reading TTree for decades on LE machines I guess |
I tend to agree that we want to standardize on little-endian (for cross-platform interchange), then the mapping should instead explicitly convert all dtypes to |
|
My understanding by emulating a s390x machine is that the problem is here: We In this case the array builder is used to build the array (if you manually provide the right buffers in the form of numpy arrays, it's not a problem like ak.Array(np.array([1,2,3])) for example). The array builder here reports the buffers in the native byte order:awkward/src/awkward/operations/ak_from_iter.py Lines 99 to 103 in 5ef8ad5 Yet even though from_buffers allows both byte orders as options, index_to_dtype enforces a view as little endian when the buffer may not be little endian.
If we wanna keep this behavior of the array builder reporting back buffers in native byteorder and If we wanna allow little-endian byteorder only, we need to A) ensure that the array builder gives back little endian buffers only and B) disallow big-endian byteorders everywhere and force a litttle endian byteorder here: @ianna suggested both of these solutions. I'm a bit skeptical about the second one because I think numpy creates new arrays in the native byte order. We create many intermediate arrays in the awkward code base and also for the kernels and if you force little endian in Please let me know if my analysis is wrong and where. |
|
For a concrete example, on a big endian machine when calling (Pdb) buffers["node1-data"]
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0,
0, 3], dtype=uint8)
(Pdb) buffers["node1-data"].view(">i8")
array([1, 2, 3])
(Pdb) buffers["node1-data"].view("<i8")
array([ 72057594037927936, 144115188075855872, 216172782113783808],
dtype='<i8')You can see how viewing it as (Pdb) buffers["node1-data"]
array([1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0,
0, 0], dtype=uint8)
(Pdb) buffers["node1-data"].view(">i8")
array([ 72057594037927936, 144115188075855872, 216172782113783808],
dtype='>i8')
(Pdb) buffers["node1-data"].view("<i8")
array([1, 2, 3])Viewing it as For clarification, The error being raised is because the length is calculated from the last element of the offsets with the wrong view due to (Pdb) buffers
{'node0-offsets': array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3], dtype=uint8), 'node1-data': array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0,
0, 3], dtype=uint8)}
(Pdb) buffers["node0-offsets"]
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3], dtype=uint8)
(Pdb) buffers["node0-offsets"].view("<i8")
array([ 0, 216172782113783808], dtype='<i8')And while the array has length 3 ([1,2,3]), the length is assumed to be |
|
This PR still doesn't solve Maybe I'm just stupid and the only problem is that the array builder spits back native byteorder buffers when using |
|
I'm cc'ing @jpivarski and @agoose77 for any comments here. Basically awkward to/from buffers (+round trips) and from iter do not work on big endian systems and we just didn't know. I find it weird that the array builder |
I've just pushed a fix for this. I read all the comments on this PR, and am at the end, quite confused as to what is the best path forward here :-) |
I hopefully have the right fix for this too locally, here Let me know if I should it this to the PR instead of current commits. |
Yeah not very accurate language here on my side. I just meant that you mentioned both ways in your message of A) mostly leaving it as is and fixing it and B) enforcing little endian, that’s all. |
Please see my previous post- we are going with plan B 😀 |
|
Yeah I noticed. I have one question about this plan. Do you wanna enforce it literally everywhere? i.e disable completely the I am asking because if someone uses the ak.from_buffers interface directly from the outside and provides their own buffers, you can't control their endianess. I guess you can say that awkward array expects only little-endian buffers. What's the suggestion regarding this one? Do you want this option to stay and only enforce little-endian inside the awkward code base but still let users provide big-endian buffers or go away completely?
Awkward also assumes in multiple places that the byteorder is native. See awkward/src/awkward/types/numpytype.py Lines 48 to 60 in 5ef8ad5 |
|
I'm being really rude and skimming this thread -- I've timeboxed 10m to think about this! The context of this, which I think @ianna and others have shared / intuited, is that we chose to serialise to little-endian in order to yield a predominantly zero-copy reading experience: #2067 (comment) The intention was that whilst we serialise to little endian, we should operate on arrays in the host endianness. As such, we only need to enforce endianness at the IO boundaries. The awkward It sounds like the compiled TL;DR
|
|
Thanks @agoose77! Should big endian round trips also work on little endian systems like this? Because this is not a round-trip atm In [1]: import awkward as ak
In [2]: ak._util.native_byteorder
Out[2]: '<'
In [3]: array = ak.Array([1,2,3])
In [4]: ak.from_buffers(*ak.to_buffers(array, byteorder=">"), byteorder=">")
Out[4]: <Array [72057594037927936, ..., 216172782113783808] type='3 * int64'> |
|
@ikrommyd yes, that should work (although it would be handy to add an assert that the native byteorder is We might want to add a qemu test that runs Awkward on big endian, but I don't want to unduly increase the maintenance burden. Crucially, forms describe in-memory arrays, and therefore can't have an endianness -- how would we handle LE on a BE system? We need to convert at the boundaries instead. So, forms should just be |
|
Yeah I fixed the repro 😃. This is on my macbook so it's little endian. Endianess should probably be checked here in this block too and flipped accordingly right? awkward/src/awkward/operations/ak_from_buffers.py Lines 233 to 246 in 5ef8ad5 It's only checked in the else block below which is when is_own_array is False.
|
|
I reckon that's probably it! |
|
Hi @ikrommyd would you consider to list the changes that are needed here? I'll be happy to adjust my PR accordingly (given that I did spend some cycles to write a working patch). I don't understand the codebase in great detail -- it was meant to be a drive by contribution since I saw the package failing in Debian. Alternatively, if you want to work out the details, I could as well close the PR. Let me know. |
This PR thread is now officially a choose-your-own-adventure novel. Everyone’s commenting like it’s a race, skipping chapters, and somehow we’ve ended up debating something that was resolved three scrolls ago. I vote we pause the chaos, summon the fellowship (aka a meeting), and bring some actual coherence to this saga before GitHub files for emotional distress. |
|
@all-contributors please add nileshpatra for code |
|
I've put up a pull request to add @nileshpatra! 🎉 |
ianna
left a comment
There was a problem hiding this comment.
@nileshpatra - Thanks for you contribution! The changes proposed do not affect LE workflows - all the tests pass. And as we currently do not have an IB on BE and your PR fixes it for you I will merge it. Thanks!
|
The documentation preview is ready to be viewed at http://preview.awkward-array.org.s3-website.us-east-1.amazonaws.com/PR3629 |
|
Thank you @ianna! As I said previously and on the relevant issue, this fixes like 95% of the tests on a s390x VM so I think it's a good decision! |
Hi, on big endian (s390x) a simple array create fails as follows. This seems to stem from the fact that byte order is hard-coded. The same is also seen in debian CI, see here.
It should instead use the (native) byte order of the build machine.