Improve SSIM documentation and warn about data range.#6595
Improve SSIM documentation and warn about data range.#6595stefanv merged 3 commits intoscikit-image:mainfrom mcourteaux:patch-1
Conversation
Many researchers rely on this function, but gives in case of floating-point data wrong results. I added some explanation on what might go wrong and how to accommodate for this. Personally, I'm in my fourth year of my PhD research and only find this out now. Probably, hundreds of papers have published incorrect SSIMs due to this subtlety.
mkcor
left a comment
There was a problem hiding this comment.
Thank you for this contribution! It should be highly valuable to many users.
| `dtype_range` in `skimage.util.dtype.py` has defined intervals from -1 to | ||
| +1. This yields an estimate of 2, instead of 1, which is most often |
There was a problem hiding this comment.
... for signed integers. Then the question is that of type inference.
Maybe link to https://scikit-image.org/docs/stable/user_guide/data_types.html as well.
|
Perhaps we should error out unless the data range is explicitly provided? |
Yes! It would be safer this way. These docs improvements would go into the error message then. Only error when floating-point input images? |
Implementation would be simpler to always error, but will break a bunch of code unnecessarily. Erroring on float only is therefore slightly preferable, IMO. |
|
I think erroring is extremely useful actually. As this change propagates through to people across the world, they'll have to pause for a second and think about this. I would guess many people will figure out their results were wrong. In general, this is going to be annoying for a lot of individuals, but at least now they know, and it feels like the most honest thing to do. |
Co-authored-by: Marianne Corvellec <marianne.corvellec@ens-lyon.org>
|
Sort of the only case where the estimate is going to be reliably correct is if you have uint8 data. All other cases can be vague and misleading. You very easily have a numpy array of uint8 type where in some calculation you add a |
|
All makes good sense, I agree with your assessment. |
|
@mcourteaux would you like to implement the erroring behaviour on floating-point data here, by adding new commits? @stefanv or should we merge this first (since this PR already brings an improvement) and let @mcourteaux submit a follow-up PR? |
I could give it a go. Currently on a break. Will do this in a couple of days. Feel free to also just merge it and then in some days, I submit a new PR. |
Wonderful! I have just approved this PR; it takes an approval by another maintainer before we can merge it. Your second PR will be most welcome. Thanks again, @mcourteaux! |
|
Thanks! I filed a tracking issue. |
Many researchers rely on this function, but gives in case of floating-point data wrong results. I added some explanation on what might go wrong and how to accommodate for this. Personally, I'm in my fourth year of my PhD research and only find this out now. Probably, hundreds of papers have published incorrect SSIMs due to this subtlety.