Maximum array size for morphology.reconstruction? #6277

@JamesSample

Description

When used with large arrays, morphology.reconstruction crashes my Python kernel. There is no traceback or other error; I just see the Jupyter message "The kernel for xxx.ipynb appears to have died. It will restart automatically."

Is there a limit to the size of array this function can handle (beyond the memory limits of the system)? My problem is similar to this issue, except my array does not contain any NaNs. If I split the array into smaller chunks, everything works as expected, but unfortunately I need to process the whole file in one go.

I've encountered the same problem with several different datasets, so I don't think it's specific to my data. I also have plenty of memory on my machine: I'm running on Google Cloud and have tried various machines with up to 1 TB of RAM. On these, the kernel crashes when memory usage is less than 1% (as indicated by top).
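Since memory does not look like the limiting factor, one thing that may be worth ruling out is an index-width limit. The sketch below checks whether an array of a given shape has more elements than a signed 32-bit integer can count. That 32-bit indices are involved anywhere inside the function is purely an assumption here, not something confirmed; the helper name `exceeds_int32_limit` and the example shapes are hypothetical.

```python
import math

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def exceeds_int32_limit(shape):
    """Return True if an array with this shape has more elements than a
    signed 32-bit integer can index (hypothetical diagnostic helper)."""
    return math.prod(shape) > INT32_MAX


# Hypothetical shapes, for illustration only (not the real dataset)
print(exceeds_int32_limit((40_000, 40_000)))  # 1.6e9 elements -> False
print(exceeds_int32_limit((50_000, 50_000)))  # 2.5e9 elements -> True
```

With the real array loaded, the same check would be `exceeds_int32_limit(data.shape)`. If the two failing subsets are above this threshold and the 260 working ones are below it, that would at least be consistent with an indexing limit instead of a memory limit.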

Way to reproduce

An example array (saved as a .pkl file) can be downloaded here (warning: it's about 5 GB).

The following code runs for 5 to 10 minutes, then crashes the kernel:

import pickle
import numpy as np
import skimage.morphology

with open("glomma.pkl", "rb") as f:
    data = pickle.load(f)

assert not np.isnan(data).any()

# Keep perimeter cells equal to the original values and set all other cells to max
seed = data.copy()
seed[1:-1, 1:-1] = data.max()

filled = skimage.morphology.reconstruction(seed, data, method="erosion")

If I use the code below to subset the array into tiles/quadrants, I can process each tile fine (but of course the combined result is not the same as the output for the full array).

# Process just one quadrant
rows, cols = data.shape
data = data[: rows // 2, : cols // 2]
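For completeness, here is a minimal sketch of how all four quadrants could be enumerated, not just the first one. The helper name `quadrant_slices` is hypothetical. It also illustrates why tiling changes the answer: the seed keeps the perimeter of whatever array it is built from, so each tile's internal edges get pinned to the original values, which does not happen when the full array is processed.

```python
def quadrant_slices(rows, cols):
    """Return (row_slice, col_slice) pairs covering the four quadrants
    of a 2-D array with the given shape (hypothetical helper)."""
    mid_r, mid_c = rows // 2, cols // 2
    row_parts = (slice(0, mid_r), slice(mid_r, rows))
    col_parts = (slice(0, mid_c), slice(mid_c, cols))
    return [(r, c) for r in row_parts for c in col_parts]


# Example with a small hypothetical 5 x 7 shape
for r, c in quadrant_slices(5, 7):
    print(r, c)
```

Each pair can be used directly as `tile = data[r, c]`, after which the seed is built from `tile` in the same way as for the full array.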

Does anyone have any tips, please? I've already split the original image into "coherent" subsets to make each dataset as small as possible. 260 out of 262 subsets are processed OK, but the two largest crash the kernel and I haven't managed to find a workaround yet.

Thank you! :-)

Version information

3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46) 
[GCC 9.4.0]
Linux-5.4.144+-x86_64-with-glibc2.31
scikit-image version: 0.18.3
numpy version: 1.21.5
