
Array.__getitem__ should consider array.chunk-size for concrete indexers? #6270

@TomAugspurger

Description

Reported in pydata/xarray#4112, where a call to Array.__getitem__ from xarray produced an Array with badly unbalanced chunks.

import dask.array

arr = dask.array.from_array([0, 1, 2, 3], chunks=(1,))
print(arr.chunks)  # ((1, 1, 1, 1),)
# xarray's align calls reindex, which indexes with something like this
indexer = [0, 1, 2, 3] + [-1] * 111
print(arr[indexer].chunks)  # ((1, 1, 1, 112),)

That last chunk is large, since all of the -1s are in the same input chunk.

To be explicit, the user could chunk the indexer:

# maybe something like this is a solution
lazy_indexer = dask.array.from_array(indexer, chunks=arr.chunks[0][0], name="idx")
print(arr[lazy_indexer].chunks)  # 115 chunks of size 1: ((1, 1, ..., 1),)

If the indexer isn't already chunked, perhaps dask could examine the input array and the indexer, and automatically chunk the indexer when it detects that an output chunk would otherwise exceed dask.config.get('array.chunk-size')?
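A rough sketch of how such a check might work, without depending on dask internals: map each index to the input chunk it falls in, and treat each contiguous run of indices landing in the same input chunk as one output chunk (a simplified model of dask's fancy-indexing behavior). `predicted_output_chunks` is a hypothetical helper name, not an existing dask function.

```python
import numpy as np

def predicted_output_chunks(indexer, input_chunks):
    """Predict output chunk sizes along one axis for fancy indexing.

    Simplified model: contiguous runs of indices that fall in the
    same input chunk become a single output chunk. Hypothetical
    helper for illustration only.
    """
    bounds = np.cumsum(input_chunks)          # right edges of input chunks
    n = bounds[-1]
    idx = np.asarray(indexer)
    idx = np.where(idx < 0, idx + n, idx)     # normalize negative indices
    owners = np.searchsorted(bounds, idx, side="right")  # owning chunk per index
    # Collapse contiguous runs of the same owner into output chunk sizes.
    sizes = []
    run = 1
    for prev, cur in zip(owners[:-1], owners[1:]):
        if cur == prev:
            run += 1
        else:
            sizes.append(run)
            run = 1
    sizes.append(run)
    return tuple(sizes)

# Reproduces the unbalanced result from the report above:
chunks = predicted_output_chunks([0, 1, 2, 3] + [-1] * 111, (1, 1, 1, 1))
print(chunks)  # (1, 1, 1, 112)
```

With a prediction like this in hand, __getitem__ could compare max(chunks) (times the element size) against the configured array.chunk-size limit and fall back to chunking the indexer, as in the workaround above, only when the limit would be exceeded.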
