imgcodecs: tiff: Reduce memory usage to read 16bit image. by Kumataro · Pull Request #22404 · opencv/opencv

Kumataro · 2022-08-19T08:57:26Z

Merge with extra: opencv/opencv_extra#1006
fix #22388

#22388

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

alalek

Thank you for the fix!

Please take a look on comments below.

alalek · 2022-08-24T22:02:07Z

modules/imgcodecs/src/grfmt_tiff.cpp

+                // buffer_size should be larger than TIFFScanlineSize().
+                CV_Assert((tsize_t)buffer_size >= TIFFScanlineSize(tif));
+
+                // Currently supported dst_bpp is only 16.
+                CV_Assert(dst_bpp == 16);


Prefer using CV_Check*() macros instead of CV_Assert().

(int and size_t integer types are supported for now - you could try cast operands to these types)

Thank you for your comment, I replaced CV_Assert() to CV_Check*() macros,

alalek · 2022-08-24T22:07:02Z

modules/imgcodecs/test/test_tiff.cpp

+#else
+TEST(Imgcodecs_Tiff, decode_issue22388_16UC1)
+#endif
+{


Please add CV_TEST_TAG_MEMORY_6GB.
(problem is not android only - 32-bit environments may fail too, e.g. here).

Also these tests take some extra time, so CV_TEST_TAG_LONG should be added too.

As a first line of tests add this:

applyTestTag( CV_TEST_TAG_MEMORY_6GB, CV_TEST_TAG_LONG );

Thank you for your review.

Improved so that a test tag according to the required amount of memory is added before performing an image loading test.

So that, tests are no longer impacted based on differences in compilation environments.

modules/imgcodecs/test/test_tiff.cpp

alalek · 2022-08-24T22:23:38Z

modules/imgcodecs/src/grfmt_tiff.cpp

                CV_TIFF_CHECK_CALL(TIFFSetField(tif, TIFFTAG_SAMPLEFORMAT, SAMPLEFORMAT_IEEEFP));
            }
+
+            CV_Assert(((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb");


win32 build emits this - is it expected?
(however I'm not sure about stdout/stderr mess on windows system)

[ RUN ] Imgcodecs_Tiff.decode_tile_remainder imread_('C:\build\precommit_windows32\build\_ocl_tmp\__opencv_temp.s92luevb\ocvF424.tmp.tiff'): can't read data: OpenCV(3.4.18-dev) C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\src\grfmt_tiff.cpp:529: error: (-215:Assertion failed) ((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb" in function 'cv::TiffDecoder::readData' [ OK ] Imgcodecs_Tiff.decode_tile_remainder (85 ms)

http://pullrequest.opencv.org/buildbot/builders/precommit_windows32/builds/100061/steps/test_imgcodecs/logs/stdio

Linux32 logs:

[ RUN ] Imgcodecs_Tiff.decode_tile16384x16384 imread_('/tmp/__opencv_temp.hjunrS/__opencv_temp.DTpYok.tiff'): can't read data: OpenCV(3.4.18-dev) /build/precommit_linux32/3.4/opencv/modules/imgcodecs/src/grfmt_tiff.cpp:529: error: (-215:Assertion failed) ((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb" in function 'readData' [ OK ] Imgcodecs_Tiff.decode_tile16384x16384 (11528 ms)

http://pullrequest.opencv.org/buildbot/builders/precommit_linux32/builds/100035/steps/test_imgcodecs/logs/stdio

decode_tile_reminder

decode_tile_reminder test is a test to decode tiff file in OpenCV extra. I see source code and I think temporary files are not used in tests. But error log shows used it.

https://github.com/opencv/opencv/blob/3.4.18/modules/imgcodecs/test/test_tiff.cpp#L96

I feel it's a strange phenomenon. I have not been able to reproduce it on linux32. To investigate it, I believe more testing with the new patch is necessary.

decode_tile16384x16384

I believe the decode_tile16384x16384 test is affected by code that fixes a potential problem.

Currently, (a) the buffer size that is about to be allocated and (b) the buffer size that is actually allocated are different. Therefore, it can cause out of memory.

I think the new patch has fixed it.

3.4.18

https://github.com/opencv/opencv/blob/3.4.18/modules/imgcodecs/src/grfmt_tiff.cpp#L498

Step 1. (Line 483) Verify that the memory size to be allocated by buffer does not exceed 1GB.

Step 2. (Line 488-489) If dst_bpp=8, update to bpp=8, ncn=4 (to use TIFFReadRGBA* functions).

Step 3. (Line 496) Calculate buffer_size.

Step 4. (Line 497) Allocate buffer.

For example, consider executing imread( CV_8UC1 image file, IMREAD_COLOR).

Step1: Size check is done in 8bit GRAY color space.

Step2: bpp/ncn is rewritten.

Step3: Calculated buffer_size is based on 24bit RGBA color space.

Result: The actual amount of memory to allocate exceeds 1GB.

How to fix

I think it should be modified as follows.

Step1: If dst_bpp=8, update to bpp=8, ncn=4 (to use TIFFReadRGBA* functions).

Step2: Calculate buffer_size.

Step 3: Verify if buffer_size does not exceed 1GB.

Step 4: allocate buffer.

alalek · 2022-08-24T22:26:49Z

modules/imgcodecs/test/test_tiff.cpp

+    // See https://github.com/opencv/opencv/issues/22388
+    string file3 = cv::tempfile(".tiff");


Please left a comment that file size is about 2Mb.

Thank you for your review!

In order not to cause Out of memory, I thought that I should reduce the memory size to allocate. This also applies to creating test data.

Therefore, I changed the method of testing from generating test data each time to referencing the data in the OpenCV extra.

As a result, cv::tempfile() is no longer used.

BTW,
Unpacked files in opencv_extra: ~18Mb
Patch size for opencv_extra repository is about 40Kb (good compression?)

Looks strange, but looks good to me 👍

Patch size for opencv_extra repository is about 40Kb (good compression?)

Those test tiff files contains 4 pixels in the corners and others pixels are blank (0.0.0).
This blank area is repeated (0,0,0) (0,0,0) ... So compression rate becomes high.

kmtr@kmtr-virtual-machine:~/work/opencv_extra/testdata/highgui/readwrite/huge-tiff$ tifftopnm CV_16UC4_2147483648.tif | hexdump -C tifftopnm: writing PPM file 00000000 50 36 0a 33 32 37 36 38 20 38 31 39 32 0a 36 35 |P6.32768 8192.65| 00000010 35 33 35 0a a0 90 a1 91 a2 92 00 00 00 00 00 00 |535.............| 00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00030000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 b0 80 |................| 00030010 b1 81 b2 82 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00030020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 5ffd0010 00 00 00 00 c0 70 c1 71 c2 72 00 00 00 00 00 00 |.....p.q.r......| 5ffd0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 60000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 d0 60 |...............`| 60000010 d1 61 d2 62 |.a.b| 60000014

Patch size for opencv_extra repository is about 40Kb (good compression?)

Yes, LZW-compressed data is able to compress with Flate compress more in sometimes.

LZW has limitation of entry size in dictionally.(Usually 12bit). If all entries in the dictionary are used, it requests to re-create new dictionary to continue compression.

So in this case, LZW-comressed data has repeated same fragment many times. For example, "ff a0 1a 03 a0 5a.." sequences are repeated at many times. And Flate compressor can be compress those sequences more .

kmtr@kmtr-virtual-machine:~/work/opencv_extra/testdata/highgui/readwrite/huge-tiff$ cat CV_16UC4_2147483648.tif | hexdump -C | grep -e " ff\s *a0\s *1a"

Kumataro · 2022-08-26T10:28:37Z

Thank you very much for your review!!

I ready 32bit(i686) ubuntu environment. And thoes problems are able to reproduce. Now I'm trying to fix.

I found some additional issues around (memory/error) handling related this issue's test.

To fix it takes a few days. I'm sorry.

asmorkalov · 2022-09-05T09:35:36Z

@Kumataro do you have any progress here?

Kumataro · 2022-09-05T12:59:14Z

Hi, I'm sorry to wait you long time to fix this. WIP is here, but it is not completed. maybe there will be small changes .

Addition support to 8bpp

(From comments) I think that tests as decode_tile_remainder and decode_tile16384x16384 request to support 8 bit images, not 16bit images.

But it needs to support conversion from (8/16bpp) x ( 1/3/4 ch) images to (8bpp) x ( 1/3 ch ) images in OpenCV is needed instead of TIFFReadRGB* functions in libtiff. They are related by mat_type of src image and imread modes. So that the number of combinations of test cases also increased very much.

https://github.com/Kumataro/opencv/blob/4c20742bbb6ed2e6231c3e308cdbe270fe94526d/modules/imgcodecs/src/grfmt_tiff.cpp#L622-L703

To reduce considerating to add test tag by hand, this test estimate the amount of memory used for destnation mat, libtiff, and work and add test tag automatically. ( However I'm also concerned that the fix is too complicated at little.)

https://github.com/Kumataro/opencv/blob/4c20742bbb6ed2e6231c3e308cdbe270fe94526d/modules/imgcodecs/test/test_tiff.cpp#L122-L219

And So far I've only tested in a 32bit environment. But Tests for 64bit environments is needed too. After that, I would like to update this MR by next Sunday.

Kumataro · 2022-09-05T13:12:30Z

I haven't finished testing on 64bit yet.
The committed branch is temporarily changed so that the build test by MR update does not work automatically.

Kumataro · 2022-09-11T01:07:34Z

I am very sorry to have kept you waiting.

Based on the linux32/linux64 environment, this patch has been fixed.

TIFFReadRGBA functions for dst_bpp=8 images

By using TIFFReadScanline, the required memory size is reduced from strip/frame buffer to line buffer.

However, the TIFFReadRGB* function does a lot of conversions within the library, such as image color space conversion and color depth conversion.

Therefore, I think to keep using TIFFReadRGBA funcs for general TIFF images is better.

Impact of increased test time

Many tests have been added for some colorspace/colordepth conversions within imgcodecs. ( They are related by input tiff color space / depth and imread modes.)

If the time required for testing is too long operationally, I believe the test cases need to be reduced.

I would be happy if you could comment on this.

alalek · 2022-09-11T17:59:55Z

modules/imgcodecs/test/test_tiff.cpp

+    make_tuple("CV_16UC4", CV_16UC4), // 64bit RGBA
+};
+
+TEST_P(Imgcodecs_Tiff_decode_Huge, regression)


216 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge (152481 ms total)

Currently these tests takes >60% of total opencv_test_imgcodecs time (even there are many test skips).

Default test set should check for regressions in OpenCV code.
We should not add more tests if OpenCV code coverage is not increased.

Please change:

some small subset of tests (1-5sec in total, enabled by default)

another "full" set of tests - mark them as DISABLED_ (GTest special prefix). They would be used for development purposes on changing of related code.

Thank you for your answer. Exactly the answer to the part I was having trouble with.

I split the basic tests and full tests.

After fix, basic test run at 6s with "mem_6gb" tag. It is slowly a little.

But following result is running under old PC ( i7-4770).
With recent/general pc, I think it may be more quickly.

I hope this is an acceptable result.

[----------] 2 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge [ RUN ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/0, where GetParam() = (1073479680, ("CV_8UC1", 0), IMREAD_COLOR) [ OK ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/0 (3317 ms) [ RUN ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/1, where GetParam() = (2147483648, ("CV_16UC4", 26), IMREAD_COLOR) [ OK ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/1 (3191 ms) [----------] 2 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge (6511 ms total)

alalek · 2022-09-11T18:00:02Z

modules/imgcodecs/test/test_tiff.cpp

+        if ( memory_usage_total > MEMORY_USAGE_LIMIT )
+        {
+            throw SkipTestException( cv::format("Test is skipped ( memory_usage_total(%llu) > MEMORY_USAGE_LIMIT(%llu) )",
+                memory_usage_total, MEMORY_USAGE_LIMIT ) );
+        }


Test tags are designed to do this internally.

BTW, amount of used memory could be verified through /usr/bin/time and --gtest_filter=exact_test_name

Thank you for your comment!

4GiB is a threshold for memory allocation sometimes. There is no "mem_4gb" in the test tag. As one of the workarounds, I prototyped a method to directly verify the memory usage.

However, additional modifications limit the test cases that work by default. So this part is no longer needed and not important. i will remove this part.

Kumataro · 2022-10-01T13:01:50Z

Last week, I thought all the tests were successful. but It seems that I was mistaken. I'm very sorry for the late response. I will re-push commit at tomorrow.

(1) The compile warning for Win32/64 environments will be fixed tomorrow. I'm thinking it's probably a simple problem that requires a type cast.
(2) Regarding the timeout of the dnn module test in the Linux64+opencl environment, I think it is difficult to judge whether the imgcodecs fix has an side-effect. I think it is better to commit (1) patch and watch (2) problem is reproduced or not.

alalek · 2022-10-01T18:32:09Z

Some issues may be not related to this patch:

please ignore Win64 warnings from build from 3.4 branch without patches: https://pullrequest.opencv.org/buildbot/builders/3_4-win64-vc14/builds/100005
please ignore dnn+opencl (not related to this PR)

Update: fixed branch

Kumataro · 2022-10-02T01:16:26Z

Thank you very much for your comment !

compile waring C4244 on win32/64

please ignore Win64 warnings from build from 3.4 branch without patches: https://pullrequest.opencv.org/buildbot/builders/3_4-win64-vc14/builds/100005

It seems that there are same problem both win32 and win64.

https://pullrequest.opencv.org/buildbot/builders/precommit_windows32/builds/100078/steps/compile%20release/logs/warnings%20%282%29

C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows32\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows32\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]

https://pullrequest.opencv.org/buildbot/builders/precommit_windows64/builds/100116/steps/compile%20release/logs/warnings%20%282%29

C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]

It seems that this problem has been caused by this coding.

const uint64_t huge_buffer_sizes_decode[] =
{
    (uint64_t)    1 * 1024 * 1024            ,
    (uint64_t) 1024 * 1024 * 1024 - 32768 * 4 * 2,
    (uint64_t) 1024 * 1024 * 1024            , // 1GB
    (uint64_t) 2048 * 1024 * 1024            , // 2GB
};

I replace from (uint64_t) 1 * 1024 * 1024 to 1048576ull. I think it is fixed.

const uint64_t huge_buffer_sizes_decode_Full[] =
{
    1048576ull,    // 1 * 1024 * 1024
    1073479680ull, // 1024 * 1024 * 1024 - 32768 * 4 * 2
    1073741824ull, // 1024 * 1024 * 1024
    2147483648ull, // 2048 * 1024 * 1024
};

Strictly, this constant value postfix syntax may be for C11++. But it is used in ts module.

https://github.com/opencv/opencv/blob/3.4/modules/ts/src/ts_gtest.cpp#L1830

state_ = static_cast(1103515245ULL*state_ + 12345U) % kMaxRange;

dnn+openc;

please ignore dnn+opencl (not related to this PR)

Thank you, and I will watch result of it.

alalek · 2022-10-03T05:45:03Z

modules/imgcodecs/test/test_tiff.cpp

+        const int width  = 32768;
+        int ncn          = CV_MAT_CN(mat_type);
+        int depth        = ( CV_MAT_DEPTH(mat_type) == CV_16U) ? 2 : 1; // 16bit or 8 bit
+        const int height = (uint64_t) buffer_size / width / ncn / depth;


There are build warnings from MSVS 2015:

C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj] C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]

Need to add static_cast / C-like cast.

Thank you for your review !! I apologize for the many small mistakes.

This test uses buffer_size input parameter that is uint64. So height variable should be allow range of uint64_t. (Because there are no range-check.) And also change width variable to uint64_t too for consistency.

alalek

LGTM 👍
Thank you for contribution!

imgcodecs: tiff: Reduce memory usage to read 16bit image.

a02f1c3

asmorkalov added the category: imgcodecs label Aug 22, 2022

asmorkalov requested a review from alalek August 24, 2022 07:26

alalek reviewed Aug 24, 2022

View reviewed changes

imgcodecs: tiff: Reduce memory usage to read 8bit images

ac41f62

alalek reviewed Sep 11, 2022

View reviewed changes

imgcodecs: tiff: split basic test and full test.

c657023

imgcodecs: tiff: fix to warning C4244

0949f03

alalek reviewed Oct 3, 2022

View reviewed changes

imgcodecs: tiff: fix to warning C4244

bda7e22

alalek approved these changes Oct 3, 2022

View reviewed changes

alalek merged commit 2f79b1b into opencv:3.4 Oct 3, 2022

alalek mentioned this pull request Oct 15, 2022

(4.x) Merge 3.4 #22642

Merged

Kumataro mentioned this pull request Nov 12, 2022

Cannot open BigTiff file #22795

Closed

4 tasks

alalek mentioned this pull request Jan 8, 2023

(5.x) Merge 4.x #23113

Merged

		// See https://github.com/opencv/opencv/issues/22388
		string file3 = cv::tempfile(".tiff");

Uh oh!

Conversation

Kumataro commented Aug 19, 2022 • edited by alalek Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request Readiness Checklist

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

decode_tile_reminder

Uh oh!

Choose a reason for hiding this comment

decode_tile16384x16384

3.4.18

How to fix

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Kumataro commented Aug 26, 2022

Uh oh!

asmorkalov commented Sep 5, 2022

Uh oh!

Kumataro commented Sep 5, 2022

Addition support to 8bpp

Uh oh!

Kumataro commented Sep 5, 2022

Uh oh!

Kumataro commented Sep 11, 2022

TIFFReadRGBA functions for dst_bpp=8 images

Impact of increased test time

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Kumataro commented Oct 1, 2022

Uh oh!

alalek commented Oct 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Kumataro commented Oct 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

compile waring C4244 on win32/64

dnn+openc;

Uh oh!

alalek Oct 3, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Kumataro commented Aug 19, 2022 •

edited by alalek

Loading

alalek commented Oct 1, 2022 •

edited

Loading

Kumataro commented Oct 2, 2022 •

edited

Loading

alalek Oct 3, 2022 •

edited

Loading