Skip to content

32 bit dtypes => 64 bit in morphology.reconstruction#6295

Closed
JamesSample wants to merge 1 commit intoscikit-image:mainfrom
JamesSample:6277-reconstruct-issue
Closed

32 bit dtypes => 64 bit in morphology.reconstruction#6295
JamesSample wants to merge 1 commit intoscikit-image:mainfrom
JamesSample:6277-reconstruct-issue

Conversation

@JamesSample
Copy link
Copy Markdown
Contributor

Apologies for the delay. Everyone in my household came down with Covid last week, so I've mostly been at home entertaining bored kids.

This is a very limited pull request, but I likely won't get chance to look at this again until I've cleared a backlog of other stuff, and I thought it better to submit something now to keep things moving...

Description

This PR refers to #6277.

I've converted int32 dtypes to int64 dtypes in both the Cython and Python code. I've tested that this version gives the same result as the original for a selection of small arrays, and it completes without error (and produces seemingly sensible results) for two arrays that were previously crashing the kernel.

For indexing into arrays in Cython I've used Py_ssize_t for signed data types and size_t for unsigned, but I don't have enough experience of Cython to know whether this is the best choice.

Note: This code still relies on filters._rank_order.rank_order, which converts ranks to uint32. This code will therefore still fail if 2 * seed.size > np.iinfo(np.uint32).max (i.e. if the number of values in one of the input arrays is greater than about 2.1e9). This limitation could be removed by also switching rank_order to 64 bit dtypes. For now, this code just raises an error (instead of crashing the kernel).

I'm afraid I haven't had time to make the memory handling in the Python more efficient, or to think about how to switch between 32 and 64 bit versions based on the size of the input arrays.

Checklist

  • Docstrings for all functions
  • Gallery example in ./doc/examples (new features only)
  • Benchmark in ./benchmarks, if your changes aren't covered by an
    existing benchmark
  • Unit tests
  • Clean style in the spirit of PEP8
  • Descriptive commit messages (see below)

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.
  • There is a bot to help automate backporting a PR to an older branch. For
    example, to backport to v0.19.x after merging, add the following in a PR
    comment: @meeseeksdev backport to v0.19.x
  • To run benchmarks on a PR, add the run-benchmark label. To rerun, the label
    can be removed and then added again. The benchmark output can be checked in
    the "Actions" tab.

@grlee77
Copy link
Copy Markdown
Contributor

grlee77 commented Mar 24, 2022

Thanks, @JamesSample. I will take a look soon and can help out a bit with rank_order as well

@grlee77
Copy link
Copy Markdown
Contributor

grlee77 commented Apr 13, 2022

Hi @JamesSample, I am going to be taking a look at this today. One option is to used Cython's fused types to allow both int32 and int64 variants so we don't always have to have the memory overhead of int64 when it isn't needed. One issue with the fused type approach is that we would also need a second fused type for the unsigned integers as well. Given those two fused types, Cython would generate 4 variants of the function. The following are the ones of interest:

  • one using uint32, int32
  • one using uint64, int64

but two unneeded variants would also be generated: (I am not aware of a way to avoid that)

  • one using uint32, int64
  • one using uint64, int32

Another recently supported option is to use Pythran instead. As of skimage 0.19 we do support its use in scikit-image. I think this may be a good candidate for that and will allow us to just create the variants we actually want. I will try it out today and report back

@grlee77
Copy link
Copy Markdown
Contributor

grlee77 commented Jun 11, 2022

This is included within the update in #6342, so let's go ahead and close this one

@grlee77 grlee77 closed this Jun 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants