ENH: Faster array padding

Dear devs,

As suggested in https://github.com/numpy/numpy/pull/11033#issuecomment-386039128, the current implementation of `numpy.pad` copies data more often than necessary. Currently, most pad modes use `numpy.concatenate` under the hood to build the new array, and this has to happen twice for each padded axis (once per side). I think it would be faster to pre-allocate the returned array once with the correct final shape and then just set the appropriate edge values.
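To make the difference concrete, here is a minimal sketch for constant padding only. Both function names (`pad_const_concatenate`, `pad_const_preallocate`) are made up for illustration, and the first one only imitates the concatenate-per-side pattern described above; it is not the actual `numpy.pad` code.

```python
import numpy as np


def pad_const_concatenate(arr, pad_amt, value=0):
    """Constant-pad by concatenating one block per side of every axis.

    Hypothetical illustration of the copy-heavy pattern, not numpy's code.
    """
    for axis, (before, after) in enumerate(pad_amt):
        shape_before = list(arr.shape)
        shape_before[axis] = before
        shape_after = list(arr.shape)
        shape_after[axis] = after
        # Each concatenate allocates a new array and copies all existing data.
        arr = np.concatenate(
            [np.full(shape_before, value, dtype=arr.dtype), arr], axis=axis
        )
        arr = np.concatenate(
            [arr, np.full(shape_after, value, dtype=arr.dtype)], axis=axis
        )
    return arr


def pad_const_preallocate(arr, pad_amt, value=0):
    """Constant-pad with one allocation and one copy of the original data."""
    new_shape = tuple(s + b + a for s, (b, a) in zip(arr.shape, pad_amt))
    padded = np.full(new_shape, value, dtype=arr.dtype)
    # Region of `padded` that holds the original array.
    interior = tuple(slice(b, b + s) for s, (b, _) in zip(arr.shape, pad_amt))
    padded[interior] = arr
    return padded


a = np.arange(6).reshape(2, 3)
pad_amt = [(1, 2), (0, 3)]
via_concat = pad_const_concatenate(a, pad_amt)
via_prealloc = pad_const_preallocate(a, pad_amt)
```

For an `n`-dimensional input the first variant performs `2 * n` concatenations, each copying the whole intermediate array, while the second copies the input exactly once.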

Here is a first draft of a function that pre-allocates an array with the padded shape, copies the original array into it, and leaves the content of the padded areas undefined.

```python
import numpy as np


def _pad_empty(arr, pad_amt):
    """Pad array with undefined values.

    Parameters
    ----------
    arr : ndarray
        Array to grow.
    pad_amt : sequence of tuple[int, int]
        Pad width on both sides for each dimension in `arr`.

    Returns
    -------
    padded : ndarray
        Larger array with undefined values in padded areas.
    """
    # Allocate grown array
    new_shape = tuple(s + sum(p) for s, p in zip(arr.shape, pad_amt))
    padded = np.empty(new_shape, dtype=arr.dtype)

    # Copy old array into correct space
    old_area = tuple(
        slice(None if left == 0 else left, None if right == 0 else -right)
        for left, right in pad_amt
    )
    padded[old_area] = arr

    return padded
```

These undefined pad areas could then be filled by simple value assignment, e.g. with new helpers such as `_set_const_after` or `_set_mean_before`. I expect this to be significantly faster, and I have already (informally) tested the approach with the proposed function `_fast_pad` in https://github.com/scikit-image/scikit-image/pull/3022.
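A rough sketch of what such fill helpers might look like for the constant mode. The signatures of `_set_const_before` and `_set_const_after` are assumptions made up for this example, and `_pad_empty` is repeated in compact form only so that the snippet runs on its own.

```python
import numpy as np


def _pad_empty(arr, pad_amt):
    """Compact variant of the draft: allocate padded array, copy `arr` in."""
    new_shape = tuple(s + l + r for s, (l, r) in zip(arr.shape, pad_amt))
    padded = np.empty(new_shape, dtype=arr.dtype)
    old_area = tuple(slice(l, l + s) for s, (l, _) in zip(arr.shape, pad_amt))
    padded[old_area] = arr
    return padded


def _set_const_before(padded, axis, width, value):
    """Fill the first `width` entries along `axis` in place (hypothetical)."""
    index = [slice(None)] * padded.ndim
    index[axis] = slice(0, width)
    padded[tuple(index)] = value


def _set_const_after(padded, axis, width, value):
    """Fill the last `width` entries along `axis` in place (hypothetical)."""
    index = [slice(None)] * padded.ndim
    # Computing the start explicitly avoids the `-0` pitfall for width == 0.
    index[axis] = slice(padded.shape[axis] - width, None)
    padded[tuple(index)] = value


a = np.arange(6).reshape(2, 3)
pad_amt = [(1, 2), (0, 3)]
padded = _pad_empty(a, pad_amt)
for axis, (before, after) in enumerate(pad_amt):
    _set_const_before(padded, axis, before, 0)
    _set_const_after(padded, axis, after, 0)
```

Because each helper fills a full slab along its axis, the corner regions are covered as well, and no intermediate arrays are allocated.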

If you like this idea, I'd be happy to make a PR that addresses this after #11012 is resolved one way or another. I'm looking forward to your feedback.
