PixelAccess should use Unsigned ints #455
Conversation
And upon further review -- I mode images are treated as Int32 in convert and imagemath, so I may be reverting that part.
It is not clear from the documentation what the 'I' mode means (or, actually, what documentation counts as 'official'). According to this, "I" means "integer". Somewhere, I remember clearly seeing that without flags, 'I' is an unsigned integer, but now I can't locate it. What is clear is that the other integer types ("I;8", "I;16", "I;32") must have an 'S' flag in order to be considered signed, and this is how they are implemented in libImaging/Unpack.c. So a bare 'I' meaning signed would be the single exception to the convention used for all other integer types. In libImaging/Unpack.c, 'I' is not explicitly converted and uses a different conversion function than "I;32S" or "I;32", but if the result of the conversion ends up as Int32 like all the others, the implication is that 'I' is signed. Is there a definitive piece of documentation that says whether mode 'I' is signed or unsigned?

I suspect that if you were to poll people as to what they expect, you'd find that "imaging" people expect pixels to be unsigned unless otherwise specified (like the tiff spec, most of the integers in PIL, the output of analog/digital converters, etc.), while general CS and math people would expect pixels to be signed unless otherwise specified, just like the various integers in C.

All integer types are converted to Int32 in libImaging/Unpack.c. The use of Int32 as the "universal" integer type means that unsigned 32-bit pixels ("I;32") are effectively not supported by this library, since it will consider the upper half of that type's range to be negative. Is it practical to keep track of an external sign flag and put conditionals around all the math operations that need it? Without a flag or 64-bit integers, it's only possible to approximate correctness by scaling UInt32 pixel values to fit within the signed Int32 range, which doesn't appear to be what gets done here. Although more correct than what is done now, scaling pixel values would be less desirable than clean support for both signed and unsigned 32-bit values.
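To make the Int32 point concrete, here is a minimal sketch (my own illustration, not project code; it assumes a stock Pillow build, where the bare "I" rawmode copies native 32-bit words into storage):

    from PIL import Image

    # Unpack four raw bytes as one mode "I" pixel. Because libImaging
    # stores "I" pixels as signed Int32, the bit pattern 0xFFFFFFFF
    # reads back as -1 rather than the unsigned value 4294967295.
    data = b"\xff\xff\xff\xff"
    im = Image.frombytes("I", (1, 1), data, "raw", "I")
    print(im.getpixel((0, 0)))  # -1 on a typical 32-bit-int platform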
Travis fails on
Yeah, don't merge as is. I think I'm going to roll back the i32 changes and go from there.
This and the corresponding lack of documentation:
leads to the issues here. It's not only signed/unsigned; it's endian issues as well. There was a previous patch where I started out interpreting the I16 mode as UInt16, native endian, until some parts of it convinced me that it was supposed to be little endian. Now, at least in the tiff and numpy/fromarray conversions, it's little endian. Somewhere in the searching here, I've run across a note that makes me think I was wrong, and that I16 is native. The only saving grace is that there's a note in the code saying that the support for I16 is experimental, so there shouldn't be that much code relying on it.

I agree that the I;32 mode appears signed according to all the code, and that it should more properly be called I;32S. I'm not sure that I want to go changing that now. I don't want to get into auto-scaling or anything, as there's a pretty good history of just truncating or otherwise doing the least likely thing when promoting or demoting between image formats that differ only in bit width.

I think that the first thing to do is to gather the documentation that's in my head, in the various *.c and *.h files, and in this thread and others, and make a bits/bytes-in-memory layout doc. Like any sufficiently advanced system, there will be inconsistencies. I'd recommend creating explicit modes for the various signedness and endianness combinations and encouraging people to use those.
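For the endian question, a quick probe along these lines could settle it per build (a sketch; it assumes the experimental I;16 modes support getpixel, which is itself part of what this issue touches):

    from PIL import Image

    # Decode the same two bytes with the little- and big-endian 16-bit
    # rawmodes and compare the resulting pixel values.
    data = b"\x01\x02"
    le = Image.frombytes("I;16", (1, 1), data, "raw", "I;16")
    be = Image.frombytes("I;16B", (1, 1), data, "raw", "I;16B")
    print(le.getpixel((0, 0)))  # 513 (0x0201) if I;16 is little-endian
    print(be.getpixel((0, 0)))  # 258 (0x0102) for the big-endian variant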
Now targeting 2.4.0 |
The I;32 mode certainly functions as signed, whether that's intended or not. Essentially, what I'm suggesting is that before diving into fixing this, it may be worth considering pulling in primitive math and conversion functionality for generic datatypes from some other well-established library. The strategy of "up-casting" to Int32 is not a great solution, and may not be worth fixing.
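As a concrete illustration of that up-cast (a sketch, assuming the rawmode tables in libImaging/Unpack.c behave as described above):

    from PIL import Image

    # Widening 16-bit raw data into "I" storage is safe, but the same
    # strategy loses the top half of the unsigned 32-bit range.
    im16 = Image.frombytes("I", (1, 1), b"\xff\xff", "raw", "I;16")
    im32 = Image.frombytes("I", (1, 1), b"\xff\xff\xff\xff", "raw", "I;32")
    print(im16.getpixel((0, 0)))  # 65535: a UInt16 fits in signed Int32
    print(im32.getpixel((0, 0)))  # -1: UInt32 max reads back as signed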
I highly doubt that we'd be pulling in numpy or C++ as a dependency. I wouldn't say never, because that's a very long time. But I just don't see it happening.

Right now, I'm not clear on what (further) specific problem needs fixing, other than that the I32 modes are poorly documented and inconsistent if you assume what's there based on the naming convention. Also, there's no unsigned 32-bit mode.

FWIW, my impression of the non-8-bit modes is that they're edge cases, mainly used by people who've put stuff together on their own and are interpreting scientific imagery. They tend to get used in odd ways on non-standard machines. I'm open to changing that impression, but I'd need to hear from a survey of PIL users.
I guess it depends what you do with imagery. You are right - mainly I deal with scientific and medical images, which are consistently > 8 bits. Extended bit depths are making their way from professional/scientific fields to pro-am/consumer fields with the popularity of high dynamic range (HDR) photography, for example. The surge in popularity of Python in other fields of scientific/high-performance computing is remarkable, and image processing lags behind; the lack of proper support for > 8-bit image formats in Python is not helping. TIFF has had support for non-8-bit images since at least the mid-80s.

Support for signed pixels is far less important than support for extended bit depths in general, and the unsigned 32-bit mode is not as important as the unsigned 16-bit mode, which is used extensively in scientific, medical and HDR imagery. So in that sense, you are right: fixing and documenting the I32 and I16 modes (especially unsigned) would be enough for now, and dropping support for unsigned 32 bits is certainly more expedient than restructuring this library for proper support of all possible numeric types. I would like to help with testing/coding if you can provide a little guidance.
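For reference, this is the kind of > 8-bit workflow in question (a sketch; numpy and the array contents are my own illustration, and the resulting mode has varied between versions, which is part of the documentation problem discussed here):

    import numpy as np
    from PIL import Image

    # 16-bit sensor data held as uint16 and handed to PIL.
    frame = (np.arange(16, dtype=np.uint16) * 4096).reshape(4, 4)
    im = Image.fromarray(frame)
    print(im.mode)              # e.g. "I" (widened) or "I;16", per version
    print(im.getpixel((3, 3)))  # 61440 if the 16-bit values survive intact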
My current position is:
Edge case is perhaps a bit harsh. Without use cases, I have no idea what people are going to cram into a tiff file. It's not called Thousands of Incompatible File Formats for nothing. I'd love to hear from the scientific/medical imaging community about what they're using Pil/low for, and what the pain points are.
Fixes #452. The PixelAccess object should use unsigned ints when dealing with unsigned formats.
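A minimal sketch of what the fix should guarantee (the expected value assumes the fix is in place):

    from PIL import Image

    # Pixel access on an unsigned 16-bit image should return the full
    # 0..65535 range instead of a sign-extended negative value.
    im = Image.frombytes("I;16", (1, 1), b"\xff\xff", "raw", "I;16")
    px = im.load()        # the PixelAccess object named in the title
    print(px[0, 0])       # expected 65535, not -1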