Skip to content

imgcodecs: tiff: Reduce memory usage to read 16bit image.#22404

Merged
alalek merged 5 commits intoopencv:3.4from
Kumataro:3.4-fix22388_2
Oct 3, 2022
Merged

imgcodecs: tiff: Reduce memory usage to read 16bit image.#22404
alalek merged 5 commits intoopencv:3.4from
Kumataro:3.4-fix22388_2

Conversation

@Kumataro
Copy link
Copy Markdown
Contributor

@Kumataro Kumataro commented Aug 19, 2022

Merge with extra: opencv/opencv_extra#1006
fix #22388

#22388

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There is accuracy test, performance test and test data in opencv_extra repository, if applicable
    Patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix!

Please take a look on comments below.

Comment on lines +538 to +542
// buffer_size should be larger than TIFFScanlineSize().
CV_Assert((tsize_t)buffer_size >= TIFFScanlineSize(tif));

// Currently supported dst_bpp is only 16.
CV_Assert(dst_bpp == 16);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer using CV_Check*() macros instead of CV_Assert().

(int and size_t integer types are supported for now - you could try cast operands to these types)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment, I replaced CV_Assert() to CV_Check*() macros,

#else
TEST(Imgcodecs_Tiff, decode_issue22388_16UC1)
#endif
{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add CV_TEST_TAG_MEMORY_6GB.
(problem is not android only - 32-bit environments may fail too, e.g. here).

Also these tests take some extra time, so CV_TEST_TAG_LONG should be added too.

As a first line of tests add this:

applyTestTag(
    CV_TEST_TAG_MEMORY_6GB,
    CV_TEST_TAG_LONG
);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review.

Improved so that a test tag according to the required amount of memory is added before performing an image loading test.

So that, tests are no longer impacted based on differences in compilation environments.

CV_TIFF_CHECK_CALL(TIFFSetField(tif, TIFFTAG_SAMPLEFORMAT, SAMPLEFORMAT_IEEEFP));
}

CV_Assert(((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

win32 build emits this - is it expected?
(however I'm not sure about stdout/stderr mess on windows system)

[ RUN      ] Imgcodecs_Tiff.decode_tile_remainder
imread_('C:\build\precommit_windows32\build\_ocl_tmp\__opencv_temp.s92luevb\ocvF424.tmp.tiff'): can't read data: OpenCV(3.4.18-dev) C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\src\grfmt_tiff.cpp:529: error: (-215:Assertion failed) ((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb" in function 'cv::TiffDecoder::readData'

[       OK ] Imgcodecs_Tiff.decode_tile_remainder (85 ms)

http://pullrequest.opencv.org/buildbot/builders/precommit_windows32/builds/100061/steps/test_imgcodecs/logs/stdio

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linux32 logs:

[ RUN      ] Imgcodecs_Tiff.decode_tile16384x16384
imread_('/tmp/__opencv_temp.hjunrS/__opencv_temp.DTpYok.tiff'): can't read data: OpenCV(3.4.18-dev) /build/precommit_linux32/3.4/opencv/modules/imgcodecs/src/grfmt_tiff.cpp:529: error: (-215:Assertion failed) ((uint64_t)tile_width0 * tile_height0 * ncn * std::max(1, (int)(bpp / bitsPerByte)) < MAX_TILE_SIZE) && "TIFF tile size is too large: >= 1Gb" in function 'readData'

[       OK ] Imgcodecs_Tiff.decode_tile16384x16384 (11528 ms)

http://pullrequest.opencv.org/buildbot/builders/precommit_linux32/builds/100035/steps/test_imgcodecs/logs/stdio

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decode_tile_reminder

decode_tile_reminder test is a test to decode tiff file in OpenCV extra. I see source code and I think temporary files are not used in tests. But error log shows used it.

https://github.com/opencv/opencv/blob/3.4.18/modules/imgcodecs/test/test_tiff.cpp#L96

I feel it's a strange phenomenon. I have not been able to reproduce it on linux32. To investigate it, I believe more testing with the new patch is necessary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decode_tile16384x16384

I believe the decode_tile16384x16384 test is affected by code that fixes a potential problem.

Currently, (a) the buffer size that is about to be allocated and (b) the buffer size that is actually allocated are different. Therefore, it can cause out of memory.

I think the new patch has fixed it.

3.4.18

https://github.com/opencv/opencv/blob/3.4.18/modules/imgcodecs/src/grfmt_tiff.cpp#L498

  • Step 1. (Line 483) Verify that the memory size to be allocated by buffer does not exceed 1GB.
  • Step 2. (Line 488-489) If dst_bpp=8, update to bpp=8, ncn=4 (to use TIFFReadRGBA* functions).
  • Step 3. (Line 496) Calculate buffer_size.
  • Step 4. (Line 497) Allocate buffer.

For example, consider executing imread( CV_8UC1 image file, IMREAD_COLOR).

  • Step1: Size check is done in 8bit GRAY color space.
  • Step2: bpp/ncn is rewritten.
  • Step3: Calculated buffer_size is based on 24bit RGBA color space.
  • Result: The actual amount of memory to allocate exceeds 1GB.

How to fix

I think it should be modified as follows.

  • Step1: If dst_bpp=8, update to bpp=8, ncn=4 (to use TIFFReadRGBA* functions).
  • Step2: Calculate buffer_size.
  • Step 3: Verify if buffer_size does not exceed 1GB.
  • Step 4: allocate buffer.

Comment on lines +57 to +58
// See https://github.com/opencv/opencv/issues/22388
string file3 = cv::tempfile(".tiff");
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please left a comment that file size is about 2Mb.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review!

In order not to cause Out of memory, I thought that I should reduce the memory size to allocate. This also applies to creating test data.

Therefore, I changed the method of testing from generating test data each time to referencing the data in the OpenCV extra.

As a result, cv::tempfile() is no longer used.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW,
Unpacked files in opencv_extra: ~18Mb
Patch size for opencv_extra repository is about 40Kb (good compression?)

Looks strange, but looks good to me 👍

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch size for opencv_extra repository is about 40Kb (good compression?)

Those test tiff files contains 4 pixels in the corners and others pixels are blank (0.0.0).
This blank area is repeated (0,0,0) (0,0,0) ... So compression rate becomes high.

kmtr@kmtr-virtual-machine:~/work/opencv_extra/testdata/highgui/readwrite/huge-tiff$ tifftopnm CV_16UC4_2147483648.tif | hexdump -C
tifftopnm: writing PPM file
00000000  50 36 0a 33 32 37 36 38  20 38 31 39 32 0a 36 35  |P6.32768 8192.65|
00000010  35 33 35 0a a0 90 a1 91  a2 92 00 00 00 00 00 00  |535.............|
00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00030000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 b0 80  |................|
00030010  b1 81 b2 82 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00030020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
5ffd0010  00 00 00 00 c0 70 c1 71  c2 72 00 00 00 00 00 00  |.....p.q.r......|
5ffd0020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
60000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 d0 60  |...............`|
60000010  d1 61 d2 62                                       |.a.b|
60000014

Patch size for opencv_extra repository is about 40Kb (good compression?)

Yes, LZW-compressed data is able to compress with Flate compress more in sometimes.

LZW has limitation of entry size in dictionally.(Usually 12bit). If all entries in the dictionary are used, it requests to re-create new dictionary to continue compression.

So in this case, LZW-comressed data has repeated same fragment many times. For example, "ff a0 1a 03 a0 5a.." sequences are repeated at many times. And Flate compressor can be compress those sequences more .

kmtr@kmtr-virtual-machine:~/work/opencv_extra/testdata/highgui/readwrite/huge-tiff$ cat CV_16UC4_2147483648.tif | hexdump -C | grep -e " ff\s *a0\s *1a"

image

@Kumataro
Copy link
Copy Markdown
Contributor Author

Thank you very much for your review!!

I ready 32bit(i686) ubuntu environment. And thoes problems are able to reproduce. Now I'm trying to fix.

I found some additional issues around (memory/error) handling related this issue's test.

To fix it takes a few days. I'm sorry.

@asmorkalov
Copy link
Copy Markdown
Contributor

@Kumataro do you have any progress here?

@Kumataro
Copy link
Copy Markdown
Contributor Author

Kumataro commented Sep 5, 2022

Hi, I'm sorry to wait you long time to fix this. WIP is here, but it is not completed. maybe there will be small changes .

Addition support to 8bpp

(From comments) I think that tests as decode_tile_remainder and decode_tile16384x16384 request to support 8 bit images, not 16bit images.

But it needs to support conversion from (8/16bpp) x ( 1/3/4 ch) images to (8bpp) x ( 1/3 ch ) images in OpenCV is needed instead of TIFFReadRGB* functions in libtiff. They are related by mat_type of src image and imread modes. So that the number of combinations of test cases also increased very much.

https://github.com/Kumataro/opencv/blob/4c20742bbb6ed2e6231c3e308cdbe270fe94526d/modules/imgcodecs/src/grfmt_tiff.cpp#L622-L703

To reduce considerating to add test tag by hand, this test estimate the amount of memory used for destnation mat, libtiff, and work and add test tag automatically. ( However I'm also concerned that the fix is too complicated at little.)

https://github.com/Kumataro/opencv/blob/4c20742bbb6ed2e6231c3e308cdbe270fe94526d/modules/imgcodecs/test/test_tiff.cpp#L122-L219

And So far I've only tested in a 32bit environment. But Tests for 64bit environments is needed too. After that, I would like to update this MR by next Sunday.

@Kumataro
Copy link
Copy Markdown
Contributor Author

Kumataro commented Sep 5, 2022

I haven't finished testing on 64bit yet.
The committed branch is temporarily changed so that the build test by MR update does not work automatically.

@Kumataro
Copy link
Copy Markdown
Contributor Author

I am very sorry to have kept you waiting.

Based on the linux32/linux64 environment, this patch has been fixed.

TIFFReadRGBA functions for dst_bpp=8 images

By using TIFFReadScanline, the required memory size is reduced from strip/frame buffer to line buffer.

However, the TIFFReadRGB* function does a lot of conversions within the library, such as image color space conversion and color depth conversion.

Therefore, I think to keep using TIFFReadRGBA funcs for general TIFF images is better.

Impact of increased test time

Many tests have been added for some colorspace/colordepth conversions within imgcodecs. ( They are related by input tiff color space / depth and imread modes.)

If the time required for testing is too long operationally, I believe the test cases need to be reduced.

I would be happy if you could comment on this.

make_tuple("CV_16UC4", CV_16UC4), // 64bit RGBA
};

TEST_P(Imgcodecs_Tiff_decode_Huge, regression)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

216 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge (152481 ms total)

Currently these tests takes >60% of total opencv_test_imgcodecs time (even there are many test skips).

Default test set should check for regressions in OpenCV code.
We should not add more tests if OpenCV code coverage is not increased.

Please change:

  • some small subset of tests (1-5sec in total, enabled by default)
  • another "full" set of tests - mark them as DISABLED_ (GTest special prefix). They would be used for development purposes on changing of related code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your answer. Exactly the answer to the part I was having trouble with.

I split the basic tests and full tests.

After fix, basic test run at 6s with "mem_6gb" tag. It is slowly a little.

But following result is running under old PC ( i7-4770).
With recent/general pc, I think it may be more quickly.

I hope this is an acceptable result.

[----------] 2 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge
[ RUN      ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/0, where GetParam() = (1073479680, ("CV_8UC1", 0), IMREAD_COLOR)
[       OK ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/0 (3317 ms)
[ RUN      ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/1, where GetParam() = (2147483648, ("CV_16UC4", 26), IMREAD_COLOR)
[       OK ] Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge.regression/1 (3191 ms)
[----------] 2 tests from Imgcodecs_Tiff/Imgcodecs_Tiff_decode_Huge (6511 ms total)

Comment on lines +222 to +226
if ( memory_usage_total > MEMORY_USAGE_LIMIT )
{
throw SkipTestException( cv::format("Test is skipped ( memory_usage_total(%llu) > MEMORY_USAGE_LIMIT(%llu) )",
memory_usage_total, MEMORY_USAGE_LIMIT ) );
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test tags are designed to do this internally.

BTW, amount of used memory could be verified through /usr/bin/time and --gtest_filter=exact_test_name

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your comment!

4GiB is a threshold for memory allocation sometimes. There is no "mem_4gb" in the test tag. As one of the workarounds, I prototyped a method to directly verify the memory usage.

However, additional modifications limit the test cases that work by default. So this part is no longer needed and not important. i will remove this part.

@Kumataro
Copy link
Copy Markdown
Contributor Author

Kumataro commented Oct 1, 2022

Last week, I thought all the tests were successful. but It seems that I was mistaken. I'm very sorry for the late response. I will re-push commit at tomorrow.

  • (1) The compile warning for Win32/64 environments will be fixed tomorrow. I'm thinking it's probably a simple problem that requires a type cast.
  • (2) Regarding the timeout of the dnn module test in the Linux64+opencl environment, I think it is difficult to judge whether the imgcodecs fix has an side-effect. I think it is better to commit (1) patch and watch (2) problem is reproduced or not.

@alalek
Copy link
Copy Markdown
Member

alalek commented Oct 1, 2022

Some issues may be not related to this patch:

Update: fixed branch

@Kumataro
Copy link
Copy Markdown
Contributor Author

Kumataro commented Oct 2, 2022

Thank you very much for your comment !

compile waring C4244 on win32/64

please ignore Win64 warnings from build from 3.4 branch without patches: https://pullrequest.opencv.org/buildbot/builders/3_4-win64-vc14/builds/100005

It seems that there are same problem both win32 and win64.

https://pullrequest.opencv.org/buildbot/builders/precommit_windows32/builds/100078/steps/compile%20release/logs/warnings%20%282%29

C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows32\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
C:\build\precommit_windows32\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows32\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]

https://pullrequest.opencv.org/buildbot/builders/precommit_windows64/builds/100116/steps/compile%20release/logs/warnings%20%282%29

C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]

It seems that this problem has been caused by this coding.

const uint64_t huge_buffer_sizes_decode[] =
{
    (uint64_t)    1 * 1024 * 1024            ,
    (uint64_t) 1024 * 1024 * 1024 - 32768 * 4 * 2,
    (uint64_t) 1024 * 1024 * 1024            , // 1GB
    (uint64_t) 2048 * 1024 * 1024            , // 2GB
};

I replace from (uint64_t) 1 * 1024 * 1024 to 1048576ull. I think it is fixed.

const uint64_t huge_buffer_sizes_decode_Full[] =
{
    1048576ull,    // 1 * 1024 * 1024
    1073479680ull, // 1024 * 1024 * 1024 - 32768 * 4 * 2
    1073741824ull, // 1024 * 1024 * 1024
    2147483648ull, // 2048 * 1024 * 1024
};

Strictly, this constant value postfix syntax may be for C11++. But it is used in ts module.

https://github.com/opencv/opencv/blob/3.4/modules/ts/src/ts_gtest.cpp#L1830

state_ = static_cast(1103515245ULL*state_ + 12345U) % kMaxRange;

dnn+openc;

please ignore dnn+opencl (not related to this PR)

Thank you, and I will watch result of it.

const int width = 32768;
int ncn = CV_MAT_CN(mat_type);
int depth = ( CV_MAT_DEPTH(mat_type) == CV_16U) ? 2 : 1; // 16bit or 8 bit
const int height = (uint64_t) buffer_size / width / ncn / depth;
Copy link
Copy Markdown
Member

@alalek alalek Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are build warnings from MSVS 2015:

C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]
C:\build\precommit_windows64\3.4\opencv\modules\imgcodecs\test\test_tiff.cpp(97): warning C4244: 'initializing': conversion from 'uint64_t' to 'const int', possible loss of data [C:\build\precommit_windows64\build\modules\imgcodecs\opencv_test_imgcodecs.vcxproj]

Need to add static_cast / C-like cast.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your review !! I apologize for the many small mistakes.

This test uses buffer_size input parameter that is uint64. So height variable should be allow range of uint64_t. (Because there are no range-check.) And also change width variable to uint64_t too for consistency.

Copy link
Copy Markdown
Member

@alalek alalek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍
Thank you for contribution!

@alalek alalek merged commit 2f79b1b into opencv:3.4 Oct 3, 2022
@alalek alalek mentioned this pull request Oct 15, 2022
@Kumataro Kumataro mentioned this pull request Nov 12, 2022
4 tasks
@alalek alalek mentioned this pull request Jan 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

decoding Tiff file issue (-215:Assertion failed)

3 participants