imgcodecs: tiff: Reduce memory usage to read 16bit image. #22404
alalek merged 5 commits into opencv:3.4
Conversation
alalek
left a comment
Thank you for the fix!
Please take a look on comments below.
modules/imgcodecs/src/grfmt_tiff.cpp
Outdated
// buffer_size should be larger than TIFFScanlineSize().
CV_Assert((tsize_t)buffer_size >= TIFFScanlineSize(tif));

// Currently supported dst_bpp is only 16.
CV_Assert(dst_bpp == 16);
Prefer using CV_Check*() macros instead of CV_Assert().
(int and size_t integer types are supported for now - you could try casting operands to these types)
Thank you for your comment. I replaced CV_Assert() with CV_Check*() macros.
| #else | ||
| TEST(Imgcodecs_Tiff, decode_issue22388_16UC1) | ||
| #endif | ||
| { |
Please add CV_TEST_TAG_MEMORY_6GB.
(problem is not android only - 32-bit environments may fail too, e.g. here).
Also these tests take some extra time, so CV_TEST_TAG_LONG should be added too.
As a first line of tests add this:
applyTestTag(
CV_TEST_TAG_MEMORY_6GB,
CV_TEST_TAG_LONG
);
Thank you for your review.
I improved it so that a test tag matching the required amount of memory is added before the image loading test runs.
As a result, the tests are no longer affected by differences in compilation environments.
modules/imgcodecs/src/grfmt_tiff.cpp
Outdated
CV_TIFF_CHECK_CALL(TIFFSetField(tif, TIFFTAG_SAMPLEFORMAT, SAMPLEFORMAT_IEEEFP));
}

CV_Assert(((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb");
win32 build emits this - is it expected?
(however I'm not sure about stdout/stderr mess on windows system)
[ RUN ] Imgcodecs_Tiff.decode_tile_remainder
imread_('C:\build\precommit_windows32\build\_ocl_tmp\__opencv_temp.s92luevb\ocvF424.tmp.tiff'): can't read data: OpenCV(3.4.18-dev) C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\src\grfmt_tiff.cpp:529: error: (-215:Assertion failed) ((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb" in function 'cv::TiffDecoder::readData'
[ OK ] Imgcodecs_Tiff.decode_tile_remainder (85 ms)
Linux32 logs:
[ RUN ] Imgcodecs_Tiff.decode_tile16384x16384
imread_('/tmp/__opencv_temp.hjunrS/__opencv_temp.DTpYok.tiff'): can't read data: OpenCV(3.4.18-dev) /build/precommit_linux32/3.4/opencv/modules/imgcodecs/src/grfmt_tiff.cpp:529: error: (-215:Assertion failed) ((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb" in function 'readData'
[ OK ] Imgcodecs_Tiff.decode_tile16384x16384 (11528 ms)
decode_tile_remainder
The decode_tile_remainder test decodes a tiff file from opencv_extra. Looking at the source code, I believe temporary files are not used in this test, but the error log shows one being used.
https://github.com/opencv/opencv/blob/3.4.18/modules/imgcodecs/test/test_tiff.cpp#L96
This is a strange phenomenon. I have not been able to reproduce it on linux32. To investigate it, I believe more testing with the new patch is necessary.
decode_tile16384x16384
I believe the decode_tile16384x16384 test is affected by the code that fixes a potential problem.
Currently, (a) the buffer size that is validated and (b) the buffer size that is actually allocated are different. The check can therefore pass while the allocation still causes out of memory.
I think the new patch has fixed it.
3.4.18
https://github.com/opencv/opencv/blob/3.4.18/modules/imgcodecs/src/grfmt_tiff.cpp#L498
- Step 1. (Line 483) Verify that the memory size to be allocated by buffer does not exceed 1GB.
- Step 2. (Line 488-489) If dst_bpp=8, update to bpp=8, ncn=4 (to use TIFFReadRGBA* functions).
- Step 3. (Line 496) Calculate buffer_size.
- Step 4. (Line 497) Allocate buffer.
For example, consider executing imread( CV_8UC1 image file, IMREAD_COLOR ).
- Step 1: The size check is done in the 8bit GRAY color space.
- Step 2: bpp/ncn are rewritten.
- Step 3: The calculated buffer_size is based on the 32bit RGBA color space.
- Result: The actual amount of memory to allocate can exceed 1GB.
How to fix
I think it should be modified as follows.
- Step 1: If dst_bpp=8, update to bpp=8, ncn=4 (to use TIFFReadRGBA* functions).
- Step 2: Calculate buffer_size.
- Step 3: Verify that buffer_size does not exceed 1GB.
- Step 4: Allocate buffer.
modules/imgcodecs/test/test_tiff.cpp
Outdated
// See https://github.com/opencv/opencv/issues/22388
string file3 = cv::tempfile(".tiff");
Please leave a comment noting that the file size is about 2Mb.
Thank you for your review!
In order not to cause out-of-memory, I thought I should reduce the memory size to allocate. This also applies to creating test data.
Therefore, I changed the testing method from generating test data each time to referencing the data in opencv_extra.
As a result, cv::tempfile() is no longer used.
BTW,
Unpacked files in opencv_extra: ~18Mb
Patch size for opencv_extra repository is about 40Kb (good compression?)
Looks strange, but looks good to me 👍
Patch size for opencv_extra repository is about 40Kb (good compression?)
Those test tiff files contain 4 pixels in the corners; the other pixels are blank (0,0,0).
This blank area repeats (0,0,0) (0,0,0) ... so the compression rate becomes high.
kmtr@kmtr-virtual-machine:~/work/opencv_extra/testdata/highgui/readwrite/huge-tiff$ tifftopnm CV_16UC4_2147483648.tif | hexdump -C
tifftopnm: writing PPM file
00000000 50 36 0a 33 32 37 36 38 20 38 31 39 32 0a 36 35 |P6.32768 8192.65|
00000010 35 33 35 0a a0 90 a1 91 a2 92 00 00 00 00 00 00 |535.............|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00030000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0 80 |................|
00030010 b1 81 b2 82 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00030020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
5ffd0010 00 00 00 00 c0 70 c1 71 c2 72 00 00 00 00 00 00 |.....p.q.r......|
5ffd0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
60000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0 60 |...............`|
60000010 d1 61 d2 62 |.a.b|
60000014
Patch size for opencv_extra repository is about 40Kb (good compression?)
Yes, LZW-compressed data can sometimes be compressed further with Flate compression.
LZW has a limitation on the dictionary entry size (usually 12bit). If all entries in the dictionary are used, it has to re-create a new dictionary to continue compression.
So in this case, the LZW-compressed data contains the same fragments repeated many times. For example, "ff a0 1a 03 a0 5a .." sequences are repeated many times, and the Flate compressor can compress those sequences further.
kmtr@kmtr-virtual-machine:~/work/opencv_extra/testdata/highgui/readwrite/huge-tiff$ cat CV_16UC4_2147483648.tif | hexdump -C | grep -e " ff\s *a0\s *1a"
Thank you very much for your review!! I prepared a 32bit (i686) Ubuntu environment, and those problems can be reproduced there. Now I'm trying to fix them. I also found some additional issues around memory/error handling related to this issue's test. The fix will take a few days. I'm sorry.
@Kumataro do you have any progress here?
Hi, I'm sorry to keep you waiting so long for this fix. A WIP is here, but it is not completed; there may still be small changes.
Additional support for 8bpp (from comments): conversion from (8/16bpp) x (1/3/4 ch) images to (8bpp) x (1/3 ch) images needs to be supported in OpenCV instead of using the TIFFReadRGB* functions in libtiff. These conversions depend on the mat_type of the src image and the imread modes, so the number of test case combinations also increased very much. To avoid having to add test tags by hand, this test estimates the amount of memory used for the destination mat, libtiff, and work buffers, and adds a test tag automatically. (However, I'm also concerned that the fix is a little too complicated.)
So far I've only tested in a 32bit environment, but tests for 64bit environments are needed too. After that, I would like to update this MR by next Sunday.
I haven't finished testing on 64bit yet.
I am very sorry to have kept you waiting. Based on the linux32/linux64 environments, this patch has been fixed.
TIFFReadRGBA functions for dst_bpp=8 images
By using TIFFReadScanline, the required memory size is reduced from a strip/frame buffer to a line buffer. However, the TIFFReadRGB* functions do a lot of conversion within the library, such as image color space conversion and color depth conversion. Therefore, I think keeping the TIFFReadRGBA funcs for general TIFF images is better.
Impact of increased test time
Many tests have been added for some colorspace/colordepth conversions within imgcodecs. (They depend on the input tiff color space/depth and the imread modes.) If the time required for testing is operationally too long, I believe the test cases need to be reduced. I would be happy if you could comment on this.
make_tuple("CV_16UC4", CV_16UC4), // 64bit RGBA
};

TEST_P(Imgcodecs_Tiff_decode_Huge, regression)
216 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge (152481 ms total)
Currently these tests take >60% of total opencv_test_imgcodecs time (even though there are many test skips).
Default test set should check for regressions in OpenCV code.
We should not add more tests if OpenCV code coverage is not increased.
Please change:
- some small subset of tests (1-5 sec in total, enabled by default)
- another "full" set of tests, marked with the DISABLED_ prefix (GTest special prefix). They would be used for development purposes when changing the related code.
Thank you for your answer. It is exactly the answer to the part I was having trouble with.
I split the tests into basic tests and full tests.
After the fix, the basic tests run in 6s with the "mem_6gb" tag. It is a little slow,
but the following result was obtained on an old PC (i7-4770).
With a recent/general PC, I think it may be quicker.
I hope this is an acceptable result.
[----------] 2 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge
[ RUN ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/0, where GetParam() = (1073479680, ("CV_8UC1", 0), IMREAD_COLOR)
[ OK ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/0 (3317 ms)
[ RUN ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/1, where GetParam() = (2147483648, ("CV_16UC4", 26), IMREAD_COLOR)
[ OK ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/1 (3191 ms)
[----------] 2 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge (6511 ms total)
modules/imgcodecs/test/test_tiff.cpp
Outdated
if ( memory_usage_total > MEMORY_USAGE_LIMIT )
{
    throw SkipTestException( cv::format("Test is skipped ( memory_usage_total(%llu) > MEMORY_USAGE_LIMIT(%llu) )",
        memory_usage_total, MEMORY_USAGE_LIMIT ) );
}
Test tags are designed to do this internally.
BTW, amount of used memory could be verified through /usr/bin/time and --gtest_filter=exact_test_name
Thank you for your comment!
4GiB is sometimes a threshold for memory allocation, and there is no "mem_4gb" test tag. As one of the workarounds, I prototyped a method to directly verify the memory usage.
However, additional modifications limit the test cases that run by default, so this part is no longer needed and not important. I will remove it.
Last week, I thought all the tests were successful, but it seems that I was mistaken. I'm very sorry for the late response. I will re-push the commit tomorrow.
Some issues may not be related to this patch:
Update: fixed branch
Thank you very much for your comment!
compile warning C4244 on win32/64
It seems that the same problem exists on both win32 and win64.
It seems that this problem was caused by this code:
const uint64_t huge_buffer_sizes_decode[] =
{
    (uint64_t) 1 * 1024 * 1024 ,
    (uint64_t) 1024 * 1024 * 1024 - 32768 * 4 * 2,
    (uint64_t) 1024 * 1024 * 1024 , // 1GB
    (uint64_t) 2048 * 1024 * 1024 , // 2GB
};
I replaced it with:
const uint64_t huge_buffer_sizes_decode_Full[] =
{
    1048576ull,    // 1 * 1024 * 1024
    1073479680ull, // 1024 * 1024 * 1024 - 32768 * 4 * 2
    1073741824ull, // 1024 * 1024 * 1024
    2147483648ull, // 2048 * 1024 * 1024
};
Strictly, this constant suffix syntax may be C++11, but it is already used in the ts module. https://github.com/opencv/opencv/blob/3.4/modules/ts/src/ts_gtest.cpp#L1830
Thank you, and I will watch the result of it.
modules/imgcodecs/test/test_tiff.cpp
Outdated
const int width = 32768;
int ncn = CV_MAT_CN(mat_type);
int depth = ( CV_MAT_DEPTH(mat_type) == CV_16U) ? 2 : 1; // 16bit or 8bit
const int height = (uint64_t) buffer_size / width / ncn / depth;
There are build warnings from MSVS 2015:
C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
Need to add static_cast / C-like cast.
Thank you for your review!! I apologize for the many small mistakes.
This test uses the buffer_size input parameter, which is uint64_t, so the height variable should allow the range of uint64_t (because there is no range check). I also changed the width variable to uint64_t for consistency.
alalek
left a comment
LGTM 👍
Thank you for contribution!

Merge with extra: opencv/opencv_extra#1006
fix #22388
Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
Patch to opencv_extra has the same branch name.