Skip to content

Different behavior for awkward 2.x on ak.flatten with unknown type  #2207

@raymondEhlers

Description

@raymondEhlers

Version of Awkward Array

2.0.6

Description and code to reproduce

In some of my code, I frequently check that there are entries left in the array before trying to proceed to the next step. My use case is calling out to some c++ code to do jet finding event by event, so if there are no jets found, it eventually returns a full array of events, all of which are empty.

In awkward 1.x, I used ak.flatten(array["data"].px, axis=None) to flatten out a record (px is one of the fields, not calculated by vector) to check for entries (the particular record didn't really matter - I just needed one). In awkward 2.x, if I do the same, I get an assertion error:

In [13]: ak.flatten(jets["data"].px, axis=None)
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In [13], line 1
----> 1 ak.flatten(jets["data"].px, axis=None)

File ~/software/dev/mammoth/.venv/lib/python3.10/site-packages/awkward/operations/ak_flatten.py:161, in flatten(array, axis, highlevel, behavior)
      9 """
     10 Args:
     11     array: Array-like data (anything #ak.to_layout recognizes).
   (...)
    155      999]
    156 """
    157 with ak._errors.OperationErrorContext(
    158     "ak.flatten",
    159     dict(array=array, axis=axis, highlevel=highlevel, behavior=behavior),
    160 ):
--> 161     return _impl(array, axis, highlevel, behavior)

File ~/software/dev/mammoth/.venv/lib/python3.10/site-packages/awkward/operations/ak_flatten.py:173, in _impl(array, axis, highlevel, behavior)
    168     out = ak._do.completely_flatten(layout, function_name="ak.flatten")
    169     assert isinstance(out, tuple) and all(
    170         isinstance(x, ak.contents.NumpyArray) for x in out
    171     )
--> 173     result = ak._do.mergemany(out)
    175     return ak._util.wrap(result, behavior, highlevel)
    177 elif axis == 0 or ak._util.maybe_posaxis(layout, axis, 1) == 0:

File ~/software/dev/mammoth/.venv/lib/python3.10/site-packages/awkward/_do.py:303, in mergemany(contents)
    302 def mergemany(contents: list[Content]) -> Content:
--> 303     assert len(contents) != 0
    304     return contents[0]._mergemany(contents[1:])

AssertionError:

Some more info on the array:

In [12]: jets.type.show()
81 * {
    data: var * Momentum4D[
        px: unknown,
        py: unknown,
        pz: unknown,
        E: unknown,
        area: unknown,
        rho_value: float64,
        constituents: var * Momentum4D[
            px: float64,
            py: float64,
            pz: float64,
            E: float64,
            index: int64
        ],
        unsubtracted_leading_track_pt: unknown
    ]
}

It seems to be about the "unknown" nature of the type, since if I do the same with pt, vector calculates that field, which then types it with float64, and the call works. Also note that if I flatten without axis=None, it works fine. A pickle with the jets array is attached regression_jets.pkl.zip (with .zip as the extension to allow it to be uploaded). I can in principle switch over to pt, but it would be nicer if this worked as in awkward 1.x . Thanks!

edit: I know for sure that the behavior changed, but now I'm a bit less confident it is truly a regression as opposed to the possibility that it's an intended change in behavior. If so, I suppose please let me know, and I'll have to dig through my codebase to change it

Metadata

Metadata

Assignees

Labels

bugThe problem described is something that must be fixed

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions