Skip to content

Use fused types in denoise, warp#3486

Merged
stefanv merged 19 commits intoscikit-image:masterfrom
jni:bugfix/float-cast
Mar 6, 2019
Merged

Use fused types in denoise, warp#3486
stefanv merged 19 commits intoscikit-image:masterfrom
jni:bugfix/float-cast

Conversation

@jni
Copy link
Copy Markdown
Member

@jni jni commented Oct 19, 2018

Description

Fixes #3449, #1287

Supersedes #3469, #3253 (I think), #1977.

All credit to @AetherUnbound and @hmaarrfk here, I have just rebased their work on the latest master after @hmaarrfk fixed our negative indexing use in Cython.

Checklist

For reviewers

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.
  • Consider backporting the PR with @meeseeksdev backport to v0.14.x

@pep8speaks
Copy link
Copy Markdown

pep8speaks commented Oct 19, 2018

Hello @jni! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 18:68: E226 missing whitespace around arithmetic operator

Comment last updated at 2019-03-05 23:40:45 UTC

@hmaarrfk
Copy link
Copy Markdown
Member

Copy link
Copy Markdown

@AetherUnbound AetherUnbound left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me - thanks for compiling this all together! And thanks @hmaarrfk for fixing the negative indexing stuff.

@hmaarrfk
Copy link
Copy Markdown
Member

@jni, I'm going to say that though I appreciate that my original PR is taken seriously, I don't know how worthwhile it is to complicate the warping code if it doesn't really give a speedup. In my experience, templated code is so much harder to read and to adapt. By including this, we are forcing future developers to also develop warping algorithms that function for all numeric types. This is a huge potential burden.

I would look into why it was that any speedup was obtained for float64 types and maybe just apply that fix. It might have been passing a parameter by reference in a tight loop which was necessary to template the functions.

I think the assumption that integer arithmetic is faster than floatingpoint is seriously flawed due to massive hardware optimizations for the floating point units in modern processors, though I'm not an expert in this field.

float32/float64 seems to be a valid assumption if you can guarantee not upcasting. float16 only recently started to exist because GPUs are running out of memory for machine learning.

I can't speak specifically for @AetherUnbound's PR, I was mostly helping with syntax.

@jni
Copy link
Copy Markdown
Member Author

jni commented Oct 19, 2018

@hmaarrfk but speed is not the only advantage, there is also RAM usage, right? I think that's very valuable. And personally I don't find the code much more complicated. And you know I'm quite nitpicky about complicated code! =D

It looks like the test failure is to do with random noise values. Same failure on that Mac build as on AppVeyor:
https://travis-ci.org/scikit-image/scikit-image/jobs/443575582#L5285-L5301

Edit: here's the failure for quick reference:

__________________________ test_denoise_bilateral_2d ___________________________
    def test_denoise_bilateral_2d():
        img = checkerboard_gray.copy()[:50, :50]
        # add some random noise
        img += 0.5 * img.std() * np.random.rand(*img.shape)
        img = np.clip(img, 0, 1)
    
        out1 = restoration.denoise_bilateral(img, sigma_color=0.1,
                                             sigma_spatial=10, multichannel=False)
        out2 = restoration.denoise_bilateral(img, sigma_color=0.2,
                                             sigma_spatial=20, multichannel=False)
    
        # make sure noise is reduced in the checkerboard cells
        assert_(img[30:45, 5:15].std() > out1[30:45, 5:15].std())
>       assert_(out1[30:45, 5:15].std() > out2[30:45, 5:15].std())
E       AssertionError
skimage/restoration/tests/test_denoise.py:183: AssertionError

Dunno why it would happen in this PR and not elsewhere, since that test appears to use float64 anyway... Any ideas?

@hmaarrfk
Copy link
Copy Markdown
Member

Probably because we added some tests, the seed is different than before. Numpy (the repo) actually prints the values of the arrays that failed.

Would be useful if we had that....

@jni
Copy link
Copy Markdown
Member Author

jni commented Oct 20, 2018

@hmaarrfk it's not that the seed is different, it's that it's not set. You can run

pytest skimage/restoration/tests/test_denoise.py::test_denoise_bilateral_2d

repeatedly to get different results each time.

@hmaarrfk
Copy link
Copy Markdown
Member

@jni really? i thought the python seed was set to some consistent value everytime. I guess this isn't matlab again. It probably needs an environment variable.

@jni
Copy link
Copy Markdown
Member Author

jni commented Oct 20, 2018

Anyway, perhaps it's not a good test, but it's a bit concerning that this PR brought that out.

@hmaarrfk
Copy link
Copy Markdown
Member

The very first time I ran it, it failed, then all other times, it passed. I even installed pytest-repeat and ran it with

➤ pytest skimage/restoration/tests/test_denoise.py::test_denoise_bilateral_2d --count 100

still passed....

@hmaarrfk
Copy link
Copy Markdown
Member

export PYTHONHASHSEED=1   
pytest skimage/restoration/tests/test_denoise.py::test_denoise_bilateral_2d 

will fail, but I can't run it with --pdb

@hmaarrfk
Copy link
Copy Markdown
Member

    def test_denoise_bilateral_2d():
        img = checkerboard_gray.copy()[:50, :50]
        # add some random noise
        noise = 0.5 * img.std() * np.random.rand(*img.shape)
        img += noise
        img = np.clip(img, 0, 1)
    
        out1 = restoration.denoise_bilateral(img, sigma_color=0.1,
                                             sigma_spatial=10, multichannel=False)
        out2 = restoration.denoise_bilateral(img, sigma_color=0.2,
                                             sigma_spatial=20, multichannel=False)
    
        # make sure noise is reduced in the checkerboard cells
>       assert img[30:45, 5:15].std() > out1[30:45, 5:15].std()
E       assert 0.07044972960234154 > nan
E        +  where 0.07044972960234154 = <built-in method std of numpy.ndarray object at 0x7fcbe9b3ce40>()
E        +    where <built-in method std of numpy.ndarray object at 0x7fcbe9b3ce40> = array([[ 0.12671524,  0.15149392,  0.21206663,  0.15423946,  0.00545805,\n         0.16724259,  0.21282153,  0.1807636 ...1418,  0.08659465,  0.017158  ,  0.18889751,\n         0.09346233,  0.08582789,  0.12415705,  0.2002917 ,  0.07256701]]).std
E        +  and   nan = <built-in method std of numpy.ndarray object at 0x7fcbe9b3cdf0>()
E        +    where <built-in method std of numpy.ndarray object at 0x7fcbe9b3cdf0> = array([[ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan],\n       [ nan,  nan,  nan,  nan,  nan,  nan,  nan,...  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan],\n       [ nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan,  nan]]).std

cdef:
double[:, :, ::1] cimage = np.ascontiguousarray(image)
double[:, :, ::1] cu = u
np_floats[:, :, ::1] cu = u
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can cu be removed? u is already a horrible name

@hmaarrfk
Copy link
Copy Markdown
Member

Do we have a benchmark? is it possible to see what happens when we change the looping order of the tight loop in _denoise_tv_bregman to C ordering?

@hmaarrfk
Copy link
Copy Markdown
Member

Finally, when all those comments are addressed, you can release the gil. You will need this line of code somewhere

                      color_lut_bin = min(
                          <Py_ssize_t>(dist * dist_scale), max_color_lut_bin)

max_value /= bins

for b in range(bins):
color_lut[b] = _gaussian_weight(sigma, b * max_value)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python code should use numpy and not cythonisms.

@hmaarrfk
Copy link
Copy Markdown
Member

@jni, thanks for helping find the bug.

Can you please close this PR? it seems there is more work to do in the denoising PR.

@jni
Copy link
Copy Markdown
Member Author

jni commented Oct 21, 2018

@hmaarrfk I don't see why the debugging needs to happens specifically in one or the other PR? I'm happy to take this on. Thanks for the detailed and specific review either way!

@hmaarrfk
Copy link
Copy Markdown
Member

whatever works for you and @AetherUnbound \o/

@jni jni force-pushed the bugfix/float-cast branch from c73f259 to 5f78d05 Compare November 29, 2018 14:00
@jni
Copy link
Copy Markdown
Member Author

jni commented Nov 29, 2018

@AetherUnbound I've given this another go. Sorry to miss your ping earlier this month. I think @hmaarrfk is on holiday in Japan atm. =D

Let's see how my latest commits fare in CI! =)

@jni
Copy link
Copy Markdown
Member Author

jni commented Nov 29, 2018

One last thing I want to do is move all of the LUT code out of Cython. But for now it's bedtime.

@jni
Copy link
Copy Markdown
Member Author

jni commented Dec 1, 2018

It may not look like it, but CI is actually happy! 😝 🎉 🎉 🎉

@scikit-image/core can we have some reviews please? =D This fixes a regression in 0.14.1 relative to 0.14.0 so it's relatively urgent (within scikit-image levels of urgency, anyway =P).

The failures are in --pre because, as far as I can tell, NumPy 1.16 changed the deprecation message for numpy.matrix, so all the work in #3242 is now obsolete. 🤦‍♂️ I'll raise a separate issue but I don't think it should hold up this PR. [edit: @sciunto already found it in #3571]

https://travis-ci.org/scikit-image/scikit-image/jobs/461638460#L3340

ValueError: Unexpected warning: the matrix subclass is not the recommended way to represent matrices or deal with linear algebra (see https://docs.scipy.org/doc/numpy/user/numpy-for-matlab-users.html). Please adjust your code to use regular ndarray.

Copy link
Copy Markdown

@AetherUnbound AetherUnbound left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Beyond that test comment change

@AetherUnbound
Copy link
Copy Markdown

Ah, I don't have permission to push to your branch 😜

@jni
Copy link
Copy Markdown
Member Author

jni commented Dec 4, 2018

@AetherUnbound you don't have permission, but for future reference, you can always make a PR to my fork/branch.

@jni
Copy link
Copy Markdown
Member Author

jni commented Dec 4, 2018

Also, you can probably use the "suggest" button when making comments on code...?

@jni
Copy link
Copy Markdown
Member Author

jni commented Mar 5, 2019

Also, thank you very much for running the benchmarks! I have a visitor this week so I have been focusing on napari. But I agree with you that it would be excellent to push both 0.14.3 and 0.15 out the door. I think it's feasible. If we get this merged I'll devote some time today for release notes!

@hmaarrfk
Copy link
Copy Markdown
Member

hmaarrfk commented Mar 5, 2019

lets just say that I used to "wait for scikit image to comile. Now I go get a coffee.

@jni
Copy link
Copy Markdown
Member Author

jni commented Mar 5, 2019

@hmaarrfk this sounds like a win-win! 😂

@stefanv
Copy link
Copy Markdown
Member

stefanv commented Mar 5, 2019

This changes the return from skimage.transform.warp substantially. I am concerned about returning ints from an interpolation operation: there is significant precision loss there. I can imagine a dtype argument to determine the output, but it seems dangerous to presume the user always wants the same dtype out as in (yes, I know ndimage does this).

Either way, this behavior change is big enough that it would require deprecation if it were to go through.

@jni
Copy link
Copy Markdown
Member Author

jni commented Mar 5, 2019

@stefanv I think you've misread the PR. The output from warp is determined here:

cdef double[:, ::1] out = np.zeros((out_r, out_c), dtype=np.double)

and this line is not touched. As far as the warps module is concerned, and as far as I can tell, only internal functions are touched — which might explain the benchmark results 😂.

@stefanv
Copy link
Copy Markdown
Member

stefanv commented Mar 5, 2019

OK, so maybe I also missed the intent / benefit of this PR then. Do you mind summarizing?

@jni
Copy link
Copy Markdown
Member Author

jni commented Mar 5, 2019

@stefanv the most immediate benefit is that it fixes #3449, a bug that was introduced in 0.14.1 that is preventing CellProfiler from upgrading past 0.14.0. The warping changes lay the groundwork for allowing float32 warps. Really there wouldn't be any changes to warp but we needed fused types to fix the denoise error and @hmaarrfk had already implemented them in #3253, so we built off that.

@stefanv
Copy link
Copy Markdown
Member

stefanv commented Mar 5, 2019

OK, I am +1; @jni @hmaarrfk do you mind confirming that all inline comments have been addressed? E.g., the pointer[0] syntax should probably be pointer*, but otherwise everything looks good to me.

... Hold on to your butts! =P

Co-Authored-By: jni <juan.nunez-iglesias@monash.edu>
@hmaarrfk
Copy link
Copy Markdown
Member

hmaarrfk commented Mar 5, 2019

I think everything is in order @stefanv

@stefanv stefanv merged commit f0d48db into scikit-image:master Mar 6, 2019
stefanv added a commit that referenced this pull request Mar 6, 2019
stefanv pushed a commit that referenced this pull request Mar 6, 2019
* MNT: Add a fused numeric type to make fused_types more constent.

* MNT: make the interpolation use fused types for any->any type interpolation of images.

* MNT: Move even more numpy function calls from Cython to Python.

* MNT: More explicit type specifying

* BUG: Use fused floats in denoise

Pull all denoise array allocation out into python
Function used in python should be def not cdef
Have the correct number of arguments

* MNT: Add additional type check for denoise, more docs

* PEP8 fixes

* Remove redundant array u and view cu

* Mutate output array in Cython and trim in Python

* Replace height and width with existing rows and cols

* Change iteration order to match array order

* Move range_lut and color_lut properly from Cy to Py

* Ravel range LUT which is expected to be 1D

* Update comment in denoise bilateral tests

* Fix relative import in interpolation.pyx

* BENCH: Benchmark for warping with many types

* Rename warp benchmark file

* Update pointer syntax

Co-Authored-By: Mark Harfouche <mark.harfouche@gmail.com>
@stefanv
Copy link
Copy Markdown
Member

stefanv commented Mar 6, 2019

Don't ask :)

@jni
Copy link
Copy Markdown
Member Author

jni commented Mar 6, 2019

I think there's very little chance that this will work but...

@meeseeksdev backport to v0.14.x

@lumberbot-app
Copy link
Copy Markdown

lumberbot-app bot commented Mar 6, 2019

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
$ git checkout v0.14.x
$ git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
$ git cherry-pick -m1 f0d48db4c246989182aa01c837d04903bc2330ae
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
$ git commit -am 'Backport PR #3486: Use fused types in denoise, warp'
  1. Push to a named branch :
git push YOURFORK v0.14.x:auto-backport-of-pr-3486-on-v0.14.x
  1. Create a PR against branch v0.14.x, I would have named this PR:

"Backport PR #3486 on branch v0.14.x"

And apply the correct labels and milestones.

Congratulation you did some good work ! Hopefully your backport PR will be tested by the continuous integration and merged soon!

If these instruction are inaccurate, feel free to suggest an improvement.

@lumberbot-app lumberbot-app bot added the Still Needs Manual Backport MrMeeseeks-managed label label Mar 6, 2019
jni added a commit to jni/scikit-image that referenced this pull request Mar 6, 2019
* MNT: Add a fused numeric type to make fused_types more constent.

* MNT: make the interpolation use fused types for any->any type interpolation of images.

* MNT: Move even more numpy function calls from Cython to Python.

* MNT: More explicit type specifying

* BUG: Use fused floats in denoise

Pull all denoise array allocation out into python
Function used in python should be def not cdef
Have the correct number of arguments

* MNT: Add additional type check for denoise, more docs

* PEP8 fixes

* Remove redundant array u and view cu

* Mutate output array in Cython and trim in Python

* Replace height and width with existing rows and cols

* Change iteration order to match array order

* Move range_lut and color_lut properly from Cy to Py

* Ravel range LUT which is expected to be 1D

* Update comment in denoise bilateral tests

* Fix relative import in interpolation.pyx

* BENCH: Benchmark for warping with many types

* Rename warp benchmark file

* Update pointer syntax

Co-Authored-By: Mark Harfouche <mark.harfouche@gmail.com>
@jni
Copy link
Copy Markdown
Member Author

jni commented Mar 6, 2019

I've submitted a manual backport.

@hmaarrfk
Copy link
Copy Markdown
Member

hmaarrfk commented May 1, 2019

@meeseeksdev backport to v0.14.x

@lumberbot-app
Copy link
Copy Markdown

lumberbot-app bot commented May 1, 2019

Something went wrong ... Please have a look at my logs.

hmaarrfk added a commit to hmaarrfk/scikit-image that referenced this pull request May 1, 2019
* MNT: Add a fused numeric type to make fused_types more constent.

* MNT: make the interpolation use fused types for any->any type interpolation of images.

* MNT: Move even more numpy function calls from Cython to Python.

* MNT: More explicit type specifying

* BUG: Use fused floats in denoise

Pull all denoise array allocation out into python
Function used in python should be def not cdef
Have the correct number of arguments

* MNT: Add additional type check for denoise, more docs

* PEP8 fixes

* Remove redundant array u and view cu

* Mutate output array in Cython and trim in Python

* Replace height and width with existing rows and cols

* Change iteration order to match array order

* Move range_lut and color_lut properly from Cy to Py

* Ravel range LUT which is expected to be 1D

* Update comment in denoise bilateral tests

* Fix relative import in interpolation.pyx

* BENCH: Benchmark for warping with many types

* Rename warp benchmark file

* Update pointer syntax

Co-Authored-By: Mark Harfouche <mark.harfouche@gmail.com>
jni pushed a commit that referenced this pull request May 12, 2019
* Backport: use fused types in denoise, warp (#3486)

* MNT: Add a fused numeric type to make fused_types more constent.

* MNT: make the interpolation use fused types for any->any type interpolation of images.

* MNT: Move even more numpy function calls from Cython to Python.

* MNT: More explicit type specifying

* BUG: Use fused floats in denoise

Pull all denoise array allocation out into python
Function used in python should be def not cdef
Have the correct number of arguments

* MNT: Add additional type check for denoise, more docs

* PEP8 fixes

* Remove redundant array u and view cu

* Mutate output array in Cython and trim in Python

* Replace height and width with existing rows and cols

* Change iteration order to match array order

* Move range_lut and color_lut properly from Cy to Py

* Ravel range LUT which is expected to be 1D

* Update comment in denoise bilateral tests

* Fix relative import in interpolation.pyx

* BENCH: Benchmark for warping with many types

* Rename warp benchmark file

* Update pointer syntax

Co-Authored-By: Mark Harfouche <mark.harfouche@gmail.com>

* forward port check_sdist
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Still Needs Manual Backport MrMeeseeks-managed label

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Denoise throws ValueError

5 participants