Skip to content

Conversation

@wiredfool
Copy link
Member

Fixes #452, The Pixel access object should use unsigned ints when dealing with unsigned formats.

@wiredfool
Copy link
Member Author

And upon further review -- I mode images are treated as Int32 in convert and imagemath, so I may be reverting that part.

@igg
Copy link

igg commented Dec 24, 2013

It is not clear from the documentation what the 'I' mode means (or actually what documentation counts as 'official'). According to this, "I" means "integer". Somewhere, I remember clearly seeing that without flags, 'I' is an unsigned integer, but now I can't locate it. What is clear is that other integer types ("I; 8", "I; 16", "I; 32") must have an 'S' flag in order to be considered signed, and this is how they are implemented in libImaging/Unpack.c. So a bare 'I' meaning signed would be the single exception to the convention used for all other integer types. In libImaging/Unpack.c, 'I' is not explicitly converted and uses a different conversion function than "I; 32S" or "I; 32", but if the result of the conversion ends up as Int32 like all the others, the implication is that 'I' is signed. Is there a definitive piece of documentation that says wether mode 'I' is signed or unsigned?

I suspect that if you were to poll people as to what they expect, you'll find that "imaging" people expect pixels to be unsigned unless otherwise specified (like the tiff spec, most of the integers in PIL, the output of Analog/Digital converters, etc), while general CS and math people would expect pixels to be signed unless otherwise specified, just like the various integers in C.

All integer types are converted to Int32 in libImaging/Unpack.c. The use of Int32 for the "universal" integer type means that unsigned 32-bit pixels ('I; 32') are actually not supported by this library since it will consider the upper-range of this type to be negative. Is it practical to keep track of an external sign flag and putting conditionals around all the math operations that need it? Without a flag or using 64-bit integers, its only possible to approximate correctness by scaling UInt32 pixel values to fit within the signed Int32 range, which doesn't appear like what gets done in this case. Although more correct than what is done now, scaling pixel values would be less desirable than clean support for both signed and unsigned 32-bit values.

@aclark4life
Copy link
Member

Travis fails on

running test_imagemath ...
Tests/test_imagemath.py:31: assert_equal(pixel(Imagath.eval("-A", images)), "I -1") failed:
- got 'I 4294967295', expected 'I -1'
Tests/test_imagemath.py:35: assert_equal(pixel(Imagath.eval("A-B", images)), "I -1") failed:
- got 'I 4294967295', expected 'I -1'

@wiredfool
Copy link
Member Author

Yeah don't merge as is. I think I'm going to roll back the i32 changes and go from there.

@wiredfool
Copy link
Member Author

This and the corresponding lack of documentation:

It is not clear from the documentation what the 'I' mode means"

leads to the issues here. It's not only signed/unsigned, it's endian issues as well. There was a previous patch were I started out interpreting the I16 mode as Uint16, native endian, until some parts of it convinced me that it was supposed to be little endian. Now, at least in the tiff and numpy/fromarray conversions, it's little endian. Somewhere in the searching here, I've run across a note that makes me think I was wrong, and that I16 is native. The only saving grace is that there's a note in the code saying that the support for I16 is experimental, so there whouldn't be that much code relying on it.

I agree that the I;32 mode appears signed according to all the code, and that it should more properly be called I;32S. I'm not sure that I want to go changing that now. I don't want to get into auto scaling or anything, as there's a pretty good history of just truncating or otherwise doing the least likely thing when promoting or demoting between image formats that differ only in bit width.

I think that the first thing to do is to gather the documentation in my head, the various *.c and *.h files, and in this thread and others and make a set of bits/bytes in memory layout doc. Like any sufficiently advanced system, there will be inconsistencies. I'd recommend creating explicit modes for the various signedness and endianess combinations and encouraging people to use those.

@aclark4life
Copy link
Member

Now targeting 2.4.0

@igg
Copy link

igg commented Jan 2, 2014

The I;32 mode certainly functions as signed, wether that's intended or not.
What are the chances of dumping all the pixel handling code here, and simply using numpy for it?
Is a numpy dependency really out of the question?
The alternative is a re-implementation of numpy's generic datatype handling functionality, which is what's done now using the Int32 up-casting strategy. Up-casting to a native/universal datatype is not really the best way of handling generics because this tradeoff sacrifices RAM and CPU efficiency for code-size, and it sets an inherent limit on what datatype can be handled (for e.g., the unsigned 32-bit type cannot be handled correctly by this library as written). A better strategy is to have all of the possible datatypes resolve to datatype-specific code-blocks, and hide this behind C++ templates or equivalent C macros.
Is a dependency on a C++ compiler out of the question also?

Essentially, what I'm suggesting is that before diving into fixing this, it may be worth considering pulling in primitive math and conversion functionality for generic datatypes from some other well-established library. The strategy of "up-casting" to Int32 is not a great solution, and may not be worth fixing.

@wiredfool
Copy link
Member Author

I highly doubt that we'd be pulling in numpy or c++ as a dependency. I wouldn't say never, because that's a very long time. But I just don't see it happening.

Right now, I'm not clear on what (further) specific problem needs fixing, other than the I32 modes are poorly documented and inconsistent if you assume what's there based on the naming convention. Also, there's no unsigned 32 bit mode.

FWIW, my impression of the non-8 bit modes are that they're edge cases, mainly used by people who've put stuff together on their own and are interpreting scientific imagery. They tend to get used in odd ways on non-standard machines. I'm open to changing that impression, but I'd need to hear from a survey of PIL users.

@igg
Copy link

igg commented Jan 7, 2014

I guess it depends what you do with imagery.
In my case, 8-bit images are "edge cases" - they almost certainly mean that the original images were mishandled somehow before I got them.
I have 8-bit vacation pics of course like everyone else, but I have yet to find a reason to do something with them in Python.

You are right - mainly I deal with scientific and medical images, which are consistently > 8 bits.
The imaging pieces in NumPy and SciPy all rely on PIL to deal with image file formats. NumPy and SciPy have growing popularity in the image processing field. You can Google for "Python 16-bit Tiff" to see the pages and pages of PIL woes, and the various ways of getting around the bugs that have been in PIL for a very long time.

Extended bit depths are making their way from professional/scientific fields to pro-am/consumer fields with the popularity of high dynamic range (HDR) photography, for example. The surge in popularity for Python in other fields of scientific/high-performance computing is remarkable, and image processing lags behind. The lack of proper support for > 8-bit image formats in Python is not helping.

Tiff has had support for non-8-bit images since at least the mid-80s.
Photoshop has had it for at least two decades. ImageMagick at least since the late 90s.
Preview.app and the Finder.app's "Quick Preview" (or whatever that's called) on OS X has worked with > 8-bit images for at least 5 years now. The preview functionality for > 8-bit images in the file browser even works with post-XP Windows (!).
So instead of thinking in terms of edge-cases, maybe the question should be, why should the de-facto standard Python imaging library be so far behind seemingly everything else?

Support for signed pixels is far less important than support for extended bit-depths in general, but the unsigned 32-bit mode is not as important as the unsigned 16-bit mode which is used extensively in scientific, medical and HDR imagery. So in that sense, you are right: Fixing and documenting the I32 and I16 modes (esp. unsigned) would be enough for now, and dropping support for unsigned 32bits is certainly more expedient than restructuring this library for proper support of all possible numeric types.

I would like to help testing/coding if you can provide a little guidance.

On Jan 5, 2014, at 7:26 PM, wiredfool wrote:

I highly doubt that we'd be pulling in numpy or c++ as a dependency. I wouldn't say never, because that's a very long time. But I just don't see it happening.

Right now, I'm not clear on what (further) specific problem needs fixing, other than the I32 modes are poorly documented and inconsistent if you assume what's there based on the naming convention. Also, there's no unsigned 32 bit mode.

FWIW, my impression of the non-8 bit modes are that they're edge cases, mainly used by people who've put stuff together on their own and are interpreting scientific imagery. They tend to get used in odd ways on non-standard machines. I'm open to changing that impression, but I'd need to hear from a survey of PIL users.


Reply to this email directly or view it on GitHub:
#455 (comment)

@wiredfool
Copy link
Member Author

My current position is:

  • The I16 negative values were a bug, are fixed in this pull request, and will be merged soon. If not independently, then part of my cffi-pixelaccess branch.
  • There is currently no support for unsigned I32 images. I'm not against them, assuming that they're added with proper documentation and testing. I'll need a good use story before adding them myself though.
  • There is a lack of documentation about the I* modes.
  • I'm not going to drop support for any modes
  • I'm reluctant to change existing modes in an incompatible manner without a really good reason. So, making I32 unsigned is not going to happen. But having an I32S mode, slowly deprecating I32, and making an I32U (or something) may.

Edge case is perhaps a bit harsh. Without use cases, I have no idea what people are going to cram into a tiff file. It's not called Thousands of Incompatible File Formats for nothing. I'd love to hear from the scientific/medical imaging community to hear what they're using Pil/low for, and what are the pain points.

@wiredfool wiredfool mentioned this pull request Jan 9, 2014
@wiredfool wiredfool merged commit 1dd80b2 into python-pillow:master Jan 31, 2014
@wiredfool wiredfool deleted the i16-pixelaccess branch April 4, 2014 20:35
@radarhere radarhere changed the title PixelAccess should use Unsigned ints. PixelAccess should use Unsigned ints Mar 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Reading and converting 16-bit pixels

3 participants