Enable warping with more than just doubles.#3253

Closed
hmaarrfk wants to merge 5 commits into scikit-image:master from hmaarrfk:feature_fused_warping

Conversation

@hmaarrfk
Member

@hmaarrfk hmaarrfk commented Jul 7, 2018

Fixes #1287

With complex types too! That was more difficult than it sounds; too difficult to do without help from the compiler.

This breaks things because people might have expected an implicit cast to float. Needs somebody to make a decision on how to deprecate this.

I think that calling anything like img_as_*, or doing any kind of range checking, without converting the data back to the original type should be considered a mistake.
Personally, I think that the parameter should be ripped out. Instead, people should have to think about what type they want their data to be represented as.
Edit: I just think that types and units are different. Using the img_as_* functions within algorithms is a mistake, as it assumes a very specific unit system: namely, that uint8 is in ADC counts and that floats are normalized ADC counts. That might be true for most cameras, but at the end of the day it is just one unit system and isn't universally true, especially for other data that might have ND correlations.

I probably need some help writing tests for this, but for now it passes, so that is a good thing right??? Maybe there is an implicit cast to float somewhere.

Edit: clarification on what I think should be considered a mistake.

Checklist

[It's fine to submit PRs which are a work in progress! But before they are merged, all PRs should provide:]

For reviewers

(Don't remove the checklist below.)

  • Check that the PR title is short, concise, and will make sense 1 year
    later.
  • Check that new functions are imported in corresponding __init__.py.
  • Check that new features, API changes, and deprecations are mentioned in
    doc/release/release_dev.rst.
  • Consider backporting the PR with @meeseeksdev backport to v0.14.x

@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch from afde931 to 689245e on July 7, 2018 02:36
@hmaarrfk hmaarrfk changed the title from "Enable warping with more than just doubles." to "WIP: Needs deprecation decision: Enable warping with more than just doubles." on Jul 7, 2018
@jni
Member

jni commented Jul 7, 2018

@hmaarrfk

Cool stuff!

I think that calling anything like img_as_*, or doing any kind of range checking, without converting the data back to the original type should be considered a mistake.

No, repeated uint8 -> float -> uint8 -> float... conversions result in artifacts. See one demonstration here. As you know I am not a fan of our type and range conversions, but the conversion to float is there for a good reason and converting float -> uint8 should only happen at the very end of a pipeline — and we have no way of knowing when that is. So the user has to be responsible here, one way or another.
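The information loss jni refers to can be seen with a minimal numpy sketch (plain arithmetic, not skimage's actual img_as_* converters): quantizing a float image to 256 uint8 levels is lossy, so each round trip can perturb values.

```python
import numpy as np

# Minimal numpy sketch (not skimage's img_as_* converters) of why
# float -> uint8 should only happen once, at the end of a pipeline:
# quantization to 256 levels is lossy.
x = np.linspace(0.0, 1.0, 1000)           # float image values in [0, 1]
q = np.round(x * 255).astype(np.uint8)    # quantize to uint8
back = q / 255.0                          # convert back to float
err = np.abs(back - x).max()              # up to ~0.5/255 per round trip
```

Each conversion cycle can therefore move a pixel by up to about half a gray level, and repeated cycles inside a pipeline compound the artifacts.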

We also have to think about how to combine this with #3148...

Will have a chat with @stefanv about this, as well as the deprecation path.

for now it passes, so that is a good thing right???

Yes, very good! =)

@hmaarrfk
Member Author

hmaarrfk commented Jul 8, 2018 via email

@pep8speaks

pep8speaks commented Jul 10, 2018

Hello @hmaarrfk! Thanks for updating the PR.

Line 521:34: E241 multiple spaces after ','
Line 522:22: E241 multiple spaces after ','
Line 522:39: E241 multiple spaces after ','
Line 546:34: E241 multiple spaces after ','
Line 547:22: E241 multiple spaces after ','
Line 547:39: E241 multiple spaces after ','

Comment last updated on October 25, 2018 at 02:48 Hours UTC

@codecov-io

codecov-io commented Jul 10, 2018

Codecov Report

Merging #3253 into master will decrease coverage by 0.94%.
The diff coverage is 100%.


@@            Coverage Diff             @@
##           master    #3253      +/-   ##
==========================================
- Coverage   87.77%   86.82%   -0.95%     
==========================================
  Files         323      341      +18     
  Lines       27138    27439     +301     
==========================================
+ Hits        23820    23825       +5     
- Misses       3318     3614     +296
Impacted Files Coverage Δ
skimage/transform/radon_transform.py 92.54% <100%> (ø) ⬆️
skimage/transform/_warps.py 99.48% <100%> (+0.01%) ⬆️
skimage/transform/tests/test_warps.py 100% <100%> (ø) ⬆️
skimage/io/_plugins/imread_plugin.py 69.23% <0%> (-15.39%) ⬇️
skimage/io/tests/test_fits.py 83.01% <0%> (-13.21%) ⬇️
skimage/io/_plugins/fits_plugin.py 81.13% <0%> (-9.44%) ⬇️
skimage/io/tests/test_imread.py 70.58% <0%> (-3.93%) ⬇️
skimage/viewer/tests/test_tools.py 97.34% <0%> (-2.66%) ⬇️
skimage/filters/tests/test_lpi_filter.py 87.5% <0%> (-2.09%) ⬇️
skimage/feature/texture.py 77.57% <0%> (-1.01%) ⬇️
... and 49 more

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ef1d91a...2126022. Read the comment docs.

@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch from 1cd0abe to 8262d73 on July 10, 2018 03:32
@hmaarrfk hmaarrfk changed the title from "WIP: Needs deprecation decision: Enable warping with more than just doubles." to "Enable warping with more than just doubles." on Jul 10, 2018
@hmaarrfk
Member Author

@jni: I just improved the PR taking your feedback into consideration. Honestly, it's up to you and the BDFL team with regards to how you want to deal with types; I just wanted to add flexibility to the algorithms, which I think this PR achieves without imposing my point of view on the codebase.

  1. warping now allows you to specify the dtype of the output. This enables higher level algorithms to choose the precision of this "intermediate step".
  2. The default is still to upcast everything to float64 and not to preserve the range.
  3. The intermediate computation type is determined by the homography matrix. This is a weird hack. I wish I could just let the compiler decide what the appropriate intermediate type is. For example: uint8 computation can probably benefit from potential speedups from single precision arithmetic. Though maybe that is "premature optimization". Specifically, this is commit 8262d73. I'm kinda indifferent about this, so if you want, I can take it out.
  4. I backed away from complex interpolation. To avoid code duplication, it really needs to be done with C/C++ macros so that the compiler can choose the appropriate types. I don't know how to do that in Cython. Apparently you can use Jinja or something, but maybe just writing straight C code is easier. I also don't know how to combine this with point Fix tv denoise alt #1 as you cannot have complex->float casting.
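A hedged sketch of what point 1 looks like from the caller's side. The `dtype` parameter and the `allocate_warp_output` helper name here are illustrations of this PR's proposal, not a released skimage API:

```python
import numpy as np

# Hypothetical sketch of point 1 above (this PR's proposed dtype
# parameter, not a released skimage API): the output buffer is
# allocated in a caller-chosen type, so higher-level code controls the
# precision of the intermediate step instead of always getting float64.
def allocate_warp_output(image, output_shape=None, dtype=np.float64):
    if output_shape is None:
        return np.empty_like(image, dtype=dtype)
    return np.empty(shape=tuple(output_shape), dtype=dtype)

img = np.zeros((4, 4), dtype=np.uint8)
out32 = allocate_warp_output(img, dtype=np.float32)    # float32 intermediate
out_same = allocate_warp_output(img, dtype=img.dtype)  # preserve uint8
```

The default `dtype=np.float64` mirrors point 2: existing callers still get float64 unless they opt in to something else.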

@hmaarrfk
Member Author

hmaarrfk commented Sep 7, 2018

Plot twist: this actually speeds up float64->float64 somehow, but slows down uint8->uint8 significantly. Fun times!

@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch from 7c51f55 to a0e2a5d on September 7, 2018 11:12
Member

@jni jni left a comment


@hmaarrfk I like this a lot! Thanks! Mostly just style and documentation comments.



def _local_binary_pattern(double[:, ::1] image,
def _local_binary_pattern(cnp.float64_t[:, ::1] image,
Member


Why change this?

Member Author


i don't know. gardening. Raymond Hettinger taught me not to do it. I'll revert.

texture[i] = bilinear_interpolation(&image[0, 0], rows, cols,
r + rp[i], c + cp[i],
'C', 0)
bilinear_interpolation[cnp.float64_t, double, double](
Member


Whoa, what's going on here? Do you have a reference for this syntax?

Member Author


https://cython.readthedocs.io/en/latest/src/userguide/fusedtypes.html#indexing

fused type indexing. I didn't want to make changes to this function.
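For context, the indexing syntax on that page lets you pick a specialization explicitly instead of relying on argument-type inference. A generic sketch (not this PR's actual signatures):

```cython
# Generic Cython sketch of fused-type indexing (illustration only,
# not this PR's actual function signatures).
ctypedef fused my_float:
    float
    double

cdef my_float scale(my_float x) nogil:
    return x * 2

# Explicitly select the `double` specialization by indexing, rather
# than letting Cython infer it from the argument types:
cdef double y = scale[double](1.5)
```

This is what makes it possible to call an existing fused-type helper with a fixed set of types without changing the helper itself.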

output_shape : tuple (rows, cols), optional
Shape of the output image generated (default None).
dtype :
The numric type of the output image.
Member


This docstring is pretty woefully inaccurate...

Member Author


What needs to be added?


ctypedef fused np_numeric:
np_real_numeric
np_complexes
Member


I love this file, and have wanted it for a long time. Thank you!!!

Member Author


This should really be part of Cython....

Member Author


Also, if you want this, you should take it out of this PR since I'm not sure where the performance drop is coming from here. Cython casting?

Member


I agree about Cython building this in. Perhaps that should be your next PR? ;) In the meantime I'm happy to wait for you to hunt down the errant cast... And I'm also happy to merge, since ~90% of our warping is probably float-based.

Member Author


i probably need a bit of time to polish it up. it will get there.

Member

@soupault soupault Sep 8, 2018


Perhaps that should be your next PR? ;)

I'd also suggest writing a book on Practical Cython 😅. Seems like @hmaarrfk is a rare expert in this topic.

cdef inline void nearest_neighbour_interpolation(
np_real_numeric* image, Py_ssize_t rows, Py_ssize_t cols,
np_floats r, np_floats c, char mode, np_real_numeric cval,
np_real_numeric_out* out) nogil:
Member


Can you indent this 8 spaces to differentiate from a code block?

cdef inline double quadratic_interpolation(double x, double[3] f) nogil:
top = (1 - dc) * top_left + dc * top_right
bottom = (1 - dc) * bottom_left + dc * bottom_right
out[0] = <np_real_numeric_out> ((1 - dr) * top + dr * bottom)
Member


Looking at this code now, it might not be as hard as I thought to generalise to nD... =) (Not related to this PR, I just meant to say that your code has made me wonder about this long-standing goal.)
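The quoted lines are the core of 2D bilinear interpolation. In plain Python/numpy the same blend looks like this (a sketch only: no boundary handling, and none of the `mode`/`cval` logic from the Cython version):

```python
import numpy as np

def bilinear(image, r, c):
    # Sketch of the interpolation in the quoted hunk: blend the four
    # neighbouring pixels by the fractional offsets (dr, dc).
    r0, c0 = int(np.floor(r)), int(np.floor(c))
    dr, dc = r - r0, c - c0
    top = (1 - dc) * image[r0, c0] + dc * image[r0, c0 + 1]
    bottom = (1 - dc) * image[r0 + 1, c0] + dc * image[r0 + 1, c0 + 1]
    return (1 - dr) * top + dr * bottom

img = np.array([[0.0, 1.0],
                [2.0, 3.0]])
center = bilinear(img, 0.5, 0.5)  # average of all four pixels: 1.5
```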


# To compute the variance features
cdef double sum_, var_, texture_i
cdef cnp.float64_t zero = 0
Member Author


Why do i have this? I'm going to try and take it off.

Member Author

@hmaarrfk hmaarrfk left a comment


Needs extra tests.

if output_shape is None:
out = np.empty_like(image, dtype=dtype)
else:
out = np.empty(shape=output_shape[:2], dtype=dtype)
Member Author


I'm not sure what happens if you pass a 3D image and specify output shape.
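The question can be checked empirically for the quoted allocation, assuming that branch is reached unchanged for 3D inputs: `output_shape[:2]` drops everything past the first two entries, so a multichannel image would lose its channel axis.

```python
import numpy as np

# Empirical check of the question above: with a 3D (multichannel)
# image and an explicit 2-tuple output_shape, the quoted branch
# allocates a 2D buffer, so the channel axis is silently dropped
# (assuming the allocation is reached unchanged for 3D inputs).
image = np.zeros((4, 4, 3), dtype=np.uint8)
output_shape = (8, 8)
out = np.empty(shape=output_shape[:2], dtype=np.float64)
# out.ndim == 2: no channel dimension survives.
```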

@hmaarrfk
Member Author

hmaarrfk commented Sep 7, 2018

@jni, I'm glad you like this, but the performance is about 30% worse on uint8, and 20% better on float64.

Now that I got asv running, I think I need to be a little more systematic about the development of this.

edit: specified worse

@soupault soupault added this to the 0.15 milestone Sep 7, 2018
@soupault soupault added the ⏩ type: Enhancement Improve existing features label Sep 7, 2018
Member

@stefanv stefanv left a comment


I suspect the slowdown comes from situations such as when your image is of a different dtype than your cval. Then, a cast has to happen every time that the cval is assigned to the output image. I doubt there is a way to tell Cython: the image can be of any numeric type, but then we also want the cval to be the same type, otherwise cast it to the same type before continuing.

cdef inline Py_ssize_t fmin(Py_ssize_t one, Py_ssize_t two) nogil:
return one if one < two else two

ctypedef fused np_real_numeric_out:
Member


What's happening here?

Member Author


I'm forcing cross-type compilation too: uint8->uint16, but more importantly uint8->float64.

Member Author


I added a note about this.

@hmaarrfk
Member Author

hmaarrfk commented Sep 7, 2018

@stefanv, the fact that np_real_numeric is the same fused type for cval and image means that they are necessarily the same type.

np_real_numeric_out allows the output to be of a different dtype following standard c type conversion.

However, I think you are correct in thinking that it is related to an extra cast somewhere.
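For concreteness, "standard C type conversion" on the output means truncation when narrowing, which numpy scalar casts reproduce. This is a generic illustration of the casting rule, not this PR's code path:

```python
import numpy as np

# Generic illustration (not this PR's code path) of "standard C type
# conversion" on the output: narrowing float -> integer truncates
# toward zero rather than rounding to nearest.
as_c_cast = np.uint8(np.float64(3.9))  # truncates to 3
rounded = np.uint8(np.round(3.9))      # rounding first gives 4
```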

@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch from e79e15a to 4b8a863 on September 10, 2018 00:45
@hmaarrfk
Member Author

Final closing remarks on my experience adding fused types in this:

It seems that it might actually have sped up uint8 by a little bit. A real benchmark on a dedicated computer would be necessary for this, with interpolation order=0.

Interpolation order=3 seems to be compute bound, not memory bound. To speed that one up, we need to make use of the AVX instructions. Unfortunately, I'm not a gcc guru, so I don't know if setting export CFLAGS=-march=native was enough to enable them in Cython.

It did help by about 20% in the CPU bound case of a 4096x4096 matrix.

This might hit the limitations of what Cython can do. Not too sure.

This might be a good demo for Pythran or Numba to attack.

@hmaarrfk
Member Author

I don't know why no speedup is seen with the float32 operations. It seems like you should be able to get a 2x speedup from going to float32, because SSE(2) can do 2x as many vectorized operations.

Currently, this is an O(NxM) algorithm, with memory order 1.

Maybe for vectorization, we can make this a memory order (NxM) and computational order (NxM) algorithm.

Anyway, just some thoughts.

@jni jni mentioned this pull request Oct 8, 2018
@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch 2 times, most recently from 2126022 to f88a814 on October 11, 2018 19:38
@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch 4 times, most recently from ee4ad77 to 0ca1d93 on October 15, 2018 04:08
@jni jni mentioned this pull request Oct 19, 2018
Member

@stefanv stefanv left a comment


This looks ready to go? @hmaarrfk Were there still noticeable slowdowns?

Parameters
----------
x, y : double
x, y : np_float
Member


Considering our other discussion, do you think np_float is unambiguous here?

Member Author


haha totally, i'll revert the notation!

@hmaarrfk
Member Author

@stefanv I'm really not sure how to interpret them :/

       before           after         ratio
     [6f45aaa0]       [e4864e44]
     <master>         <feature_fused_warping>
+         149±9μs        302±100μs     2.03  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint16'>, 128, 0)
+         125±9μs          193±5μs     1.55  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint8'>, 128, 0)
+        375±20μs        581±200μs     1.55  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint8'>, 128, 1)
+         122±2μs          188±4μs     1.54  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.float64'>, 128, 0)
+     1.13±0.05ms       1.73±0.2ms     1.53  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint16'>, 128, 3)
+        433±10μs        647±100μs     1.49  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint16'>, 128, 1)
+         375±9μs         535±20μs     1.43  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint16'>, 128, 1)
+         372±6μs         500±30μs     1.34  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint8'>, 128, 1)
+         376±9μs         505±10μs     1.34  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float64'>, 128, 1)
+         153±3μs         201±20μs     1.31  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint8'>, 128, 0)
+        145±10μs          188±4μs     1.29  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float32'>, 128, 0)
+     1.02±0.01ms      1.21±0.03ms     1.19  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint16'>, 128, 3)
-        561±20ms         483±10ms     0.86  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float32'>, 4096, 1)
-         266±6ms          219±7ms     0.82  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float64'>, 4096, 0)
-        267±30ms          220±3ms     0.82  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.float64'>, 4096, 0)
-        258±20ms          196±3ms     0.76  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.float32'>, 4096, 0)
-      10.2±0.3ms       7.63±0.2ms     0.75  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint16'>, 1024, 0)
-        15.4±1ms       10.5±0.7ms     0.68  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float64'>, 1024, 0)
-      10.9±0.8ms       7.36±0.2ms     0.68  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint8'>, 1024, 0)
-        264±40ms          173±4ms     0.66  benchmarks_transform_warp.WarpSuite.time_to_float64(<class 'numpy.uint8'>, 4096, 0)
-         295±8ms          176±4ms     0.60  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float32'>, 4096, 0)
-      13.2±0.7ms       7.86±0.2ms     0.60  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.float32'>, 1024, 0)
-         286±6ms         150±10ms     0.52  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint16'>, 4096, 0)
-         279±9ms          121±8ms     0.43  benchmarks_transform_warp.WarpSuite.time_same_type(<class 'numpy.uint8'>, 4096, 0)

@hmaarrfk hmaarrfk force-pushed the feature_fused_warping branch from 9fa5823 to fc4e60a on October 25, 2018 02:48
@hmaarrfk
Member Author

I think @jni is further ahead of me on this one. I just want to see the results of a command like

asv continuous

@hmaarrfk hmaarrfk closed this Feb 28, 2019
@jni
Member

jni commented Feb 28, 2019

@stefanv see #3486, which aims to supersede this one.

@jni
Member

jni commented Feb 28, 2019

@hmaarrfk those benchmarks look like small sizes might be dominated by the dispatch time, whereas big sizes benefit from type-specific computing?

What asv command did you use to run that?

@hmaarrfk
Member Author

I wish I remembered.....


Labels

⏩ type: Enhancement Improve existing features


6 participants