Optimize cv::applyColorMap() for simple case by chacha21 · Pull Request #21645 · opencv/opencv

chacha21 · 2022-02-20T20:11:36Z

proposal for #21640
For regular cv::Mat CV_8UC1 src, applying the colormap is simpler than calling the cv::LUT() mechanism.

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

PR for opencv#21640 For regular cv::Mat CV_8UC1 src, applying the colormap is simpler than calling the cv::LUT() mechanism.

src as CV_8UC3 is handled with a BGR2GRAY conversion, the same optimized code being used afterwards

alalek · 2022-02-21T16:40:41Z

modules/imgproc/src/colormap.cpp

+            srcGray.forEach<unsigned char>([&](unsigned char& pixel, const int* position) -> void {
+                const int row = position[0];
+                const int col = position[1];
+                memcpy(dstMat.data+row*dstMat.step+col*elemSize, userColorMat.data+pixel*userColorMat.step, elemSize);


memcpy(..., elemSize)

Provide performance numbers without and with your patch.

Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
Windows 7 64b

//testing cv::applyColorMap { cv::setRNGSeed(1234); cv::Mat colorMap = cv::Mat(256, 1, CV_8UC3); cv::randu(colorMap, 0, 255); std::vector<cv::Size> sizes = {{32, 32}, {64, 64}, {128, 128}, {256, 256}, {512, 512}, {1024, 1024}}; for(const auto& size : sizes) { cv::Mat src(size, CV_8UC1);//CV_8UC1//CV_8UC3 cv::Mat dst(size, CV_8UC3); __int64 totalTicks = 0; const size_t nbIters = 20000; for(size_t i = 0 ; i <nbIters ; ++i) { cv::randu(src, 0, 255); const auto ref = cv::getTickCount(); cv::applyColorMap(src, dst, colorMap); const auto now = cv::getTickCount(); totalTicks += (now-ref); } printf("%llu x cv::applyColorMap() for size(%d,%d) and type %s: %f ms\r\n", (unsigned long long)nbIters, size.width, size.height, (src.type() == CV_8UC1) ? "CV_8UC1" : (src.type() == CV_8UC3) ? "CV_8UC3" : "?", 1000.*totalTicks/cv::getTickFrequency()); } cv::waitKey(); }

results :

CV_8UC1
original code :
20000 x cv::applyColorMap() for size(32,32) and type CV_8UC1: 88.819806 ms
20000 x cv::applyColorMap() for size(64,64) and type CV_8UC1: 269.382595 ms
20000 x cv::applyColorMap() for size(128,128) and type CV_8UC1: 1035.405428 ms
20000 x cv::applyColorMap() for size(256,256) and type CV_8UC1: 3899.305621 ms
20000 x cv::applyColorMap() for size(512,512) and type CV_8UC1: 11115.692492 ms
20000 x cv::applyColorMap() for size(1024,1024) and type CV_8UC1: 39989.798904 ms

with patch :
20000 x cv::applyColorMap() for size(32,32) and type CV_8UC1: 105.990024 ms
20000 x cv::applyColorMap() for size(64,64) and type CV_8UC1: 178.348563 ms
20000 x cv::applyColorMap() for size(128,128) and type CV_8UC1: 390.924573 ms
20000 x cv::applyColorMap() for size(256,256) and type CV_8UC1: 1185.652095 ms
20000 x cv::applyColorMap() for size(512,512) and type CV_8UC1: 3715.295696 ms
20000 x cv::applyColorMap() for size(1024,1024) and type CV_8UC1: 15669.887541 ms

CV_8UC3
original code :
20000 x cv::applyColorMap() for size(32,32) and type CV_8UC3: 159.482952 ms
20000 x cv::applyColorMap() for size(64,64) and type CV_8UC3: 351.207949 ms
20000 x cv::applyColorMap() for size(128,128) and type CV_8UC3: 1232.039576 ms
20000 x cv::applyColorMap() for size(256,256) and type CV_8UC3: 7554.557066 ms
20000 x cv::applyColorMap() for size(512,512) and type CV_8UC3: 17448.084637 ms
20000 x cv::applyColorMap() for size(1024,1024) and type CV_8UC3: 69502.533269 ms

with patch :
20000 x cv::applyColorMap() for size(32,32) and type CV_8UC3: 145.020181 ms
20000 x cv::applyColorMap() for size(64,64) and type CV_8UC3: 263.634428 ms
20000 x cv::applyColorMap() for size(128,128) and type CV_8UC3: 690.046396 ms
20000 x cv::applyColorMap() for size(256,256) and type CV_8UC3: 1889.553916 ms
20000 x cv::applyColorMap() for size(512,512) and type CV_8UC3: 5617.506280 ms
20000 x cv::applyColorMap() for size(1024,1024) and type CV_8UC3: 22068.121624 ms

alalek · 2022-02-21T16:41:12Z

modules/imgproc/src/colormap.cpp

+            srcGray.forEach<unsigned char>([&](unsigned char& pixel, const int* position) -> void {
+                const int row = position[0];
+                const int col = position[1];
+                memcpy(dstMat.data+row*dstMat.step+col*elemSize, userColorMat.data+pixel*userColorMat.step, elemSize);


+row*dstMat.step+col*elemSize
+pixel*userColorMat.step

Manual pointer calculations should be avoided.

What would be a more canonical way to write that without branching one if() per possible userColorMat.type() ?
[edit]
Ok, figured it out

You have re-implemented LUT() functionality in this forEach block.

In general this not a good approach:

we have separate code for duplicated functionality.

code optimization level is not good (memcpy call with non-fixed dynamic elemSize value)

You should use keep using optimized LUT and tune colormap::UserColorMap implementation instead.

Unfortunately this function is not tested at all:

there are no accuracy tests

there are no performance tests.

You have re-implemented LUT() functionality in this forEach block.

Not really. cv::LUT() requires the same number of channels for src and dst, while in the context of applyColorMap(), src can be CV_8UC1, and dst CV_8UC3, so this is is more a "fast path" to avoid converting input on the sole purpose of making it "cv::LUT compliant"

there are no accuracy tests

I do not know how to show that, but I have run tests to check numerical identical results. (basically, not applying the patch, and comparing "gold" results to custom implementation through countNonZerot(absdiff(..))

there are no performance tests.

I do no know how to add such tests to the OpenCV build/test system, but I have already edited the current PR with a copy/paste of my timings

code optimization level is not good (memcpy call with non-fixed dynamic elemSize value)

branching fixed-size memcpy() for (elemSize == 1) and (elemSize == 3) improves the timings, but I thought it was not a good cv style

Not really.

Right. Got it. cv::LUT() doesn't support required mode.

To get PR merged please move optimization code to the real processing implementation:

void ColorMap::operator()(InputArray _src, OutputArray _dst) const

Next, There are several assumptions about src/dst/userColor mats. Need to add the right checks about these assumptions (code wont process 3D tensors properly). Reuse applicable checks from cv::LUT() implementation.
BTW, prefer to use CV_Check*() macros instead of CV_Assert().

BTW, probably it makes sense to deprecate and remove later support case with src.channels = 3.

rely on cv::Mat.ptr() to index data

Changes as suggested by reviewer

alalek

Thank you for update! Overall looks good.

alalek · 2022-02-28T13:22:43Z

modules/imgproc/src/colormap.cpp

+            srcGray.forEach<unsigned char>([&](unsigned char& pixel, const int* position) -> void {
+                const int row = position[0];
+                const int col = position[1];
+                dstMat.at<unsigned char>(row, col) = lut.at<unsigned char>(pixel, 0);


lut.at<unsigned char>(pixel, 0);

Please use lut.at<unsigned char>(pixel); as there is no difference in processing of 1D vectors as 2D matrices (both row/col-based are handled by .at<>() method)

.reshape() call above could be removed after that change.

alalek · 2022-02-28T13:25:11Z

modules/imgproc/src/colormap.cpp

+
+        const Mat lut = this->_lut.reshape(0, 256);//just in case LUT was in row format rather than col
+        const int lut_type = lut.type(), lut_depth = CV_MAT_DEPTH(lut_type), lut_cn = CV_MAT_CN(lut_type);
+        CV_CheckType(lut_type, (lut_depth == CV_8U) && (lut_cn == 1 || lut_cn == 3),


(lut_depth == CV_8U) && (lut_cn == 1 || lut_cn == 3)

this should be equal and simplified condition:

lut_type == CV_8UC1 || lut_type == CV_8UC3

alalek · 2022-02-28T13:28:59Z

modules/imgproc/src/colormap.cpp

+        if (lut_type == CV_8UC1)
+            srcGray.forEach<unsigned char>([&](unsigned char& pixel, const int* position) -> void {
+                const int row = position[0];
+                const int col = position[1];


_src.dims() == 2 "assumption" check should be added.

improvements suggsted by reviewer

chacha21 · 2022-03-01T07:32:38Z

original 4.x code
100000 x cv::applyColorMap(userColor) for size(32,32) and type CV_8UC1: 413.340877 ms
100000 x cv::applyColorMap(userColor) for size(64,64) and type CV_8UC1: 1322.934941 ms
100000 x cv::applyColorMap(userColor) for size(128,128) and type CV_8UC1: 5259.299441 ms
100000 x cv::applyColorMap(userColor) for size(256,256) and type CV_8UC1: 19481.239747 ms
100000 x cv::applyColorMap(userColor) for size(512,512) and type CV_8UC1: 53744.406390 ms
100000 x cv::applyColorMap(userColor) for size(1024,1024) and type CV_8UC1: 204328.241670 ms
100000 x cv::applyColorMap(JET) for size(32,32) and type CV_8UC1: 31242.315972 ms
100000 x cv::applyColorMap(JET) for size(64,64) and type CV_8UC1: 32304.962187 ms
100000 x cv::applyColorMap(JET) for size(128,128) and type CV_8UC1: 36312.397471 ms
100000 x cv::applyColorMap(JET) for size(256,256) and type CV_8UC1: 51336.647169 ms
100000 x cv::applyColorMap(JET) for size(512,512) and type CV_8UC1: 73202.010330 ms
100000 x cv::applyColorMap(JET) for size(1024,1024) and type CV_8UC1: 227124.843895 ms

with patch
100000 x cv::applyColorMap(userColor) for size(32,32) and type CV_8UC1: 491.504566 ms	=> +19%
100000 x cv::applyColorMap(userColor) for size(64,64) and type CV_8UC1: 748.580964 ms => -43%
100000 x cv::applyColorMap(userColor) for size(128,128) and type CV_8UC1: 1429.701348 ms	=> -73%
100000 x cv::applyColorMap(userColor) for size(256,256) and type CV_8UC1: 4100.422296 ms	=> -79%
100000 x cv::applyColorMap(userColor) for size(512,512) and type CV_8UC1: 12436.729648 ms	=> -77%
100000 x cv::applyColorMap(userColor) for size(1024,1024) and type CV_8UC1: 46005.752704 ms	=> -77%
100000 x cv::applyColorMap(JET) for size(32,32) and type CV_8UC1: 44762.778158 ms	=> +43%
100000 x cv::applyColorMap(JET) for size(64,64) and type CV_8UC1: 44510.404844 ms	=> +37%
100000 x cv::applyColorMap(JET) for size(128,128) and type CV_8UC1: 43447.038011 ms	=> +19%
100000 x cv::applyColorMap(JET) for size(256,256) and type CV_8UC1: 36111.826954 ms	=>-30%
100000 x cv::applyColorMap(JET) for size(512,512) and type CV_8UC1: 44628.305535 ms	=> -40%
100000 x cv::applyColorMap(JET) for size(1024,1024) and type CV_8UC1: 79051.147638 ms	=> -65%

It can be noted that small sizes seem not perform as well, especially for built-in colormaps. ~~I am not sure how to interpret that.~~ I have checked that it is related to the per-row granularity of for_each. Tuning the number of rows handled by each body of parallel_for_, so that each packet have minimum reasonable pixel count, improves performance (I am currently exploring to find a good compromise, taking inspiration from cv::LUT())

chacha21 · 2022-03-01T10:14:52Z

After tuning parallel work, timings are improved. But it is certainly heavily dependant on the numbers of CPU core/threads.


original 4.x code
100000 x cv::applyColorMap(userColor) for size(32,32) and type CV_8UC1: 413.340877 ms
100000 x cv::applyColorMap(userColor) for size(64,64) and type CV_8UC1: 1322.934941 ms
100000 x cv::applyColorMap(userColor) for size(128,128) and type CV_8UC1: 5259.299441 ms
100000 x cv::applyColorMap(userColor) for size(256,256) and type CV_8UC1: 19481.239747 ms
100000 x cv::applyColorMap(userColor) for size(512,512) and type CV_8UC1: 53744.406390 ms
100000 x cv::applyColorMap(userColor) for size(1024,1024) and type CV_8UC1: 204328.241670 ms
100000 x cv::applyColorMap(JET) for size(32,32) and type CV_8UC1: 31242.315972 ms
100000 x cv::applyColorMap(JET) for size(64,64) and type CV_8UC1: 32304.962187 ms
100000 x cv::applyColorMap(JET) for size(128,128) and type CV_8UC1: 36312.397471 ms
100000 x cv::applyColorMap(JET) for size(256,256) and type CV_8UC1: 51336.647169 ms
100000 x cv::applyColorMap(JET) for size(512,512) and type CV_8UC1: 73202.010330 ms
100000 x cv::applyColorMap(JET) for size(1024,1024) and type CV_8UC1: 227124.843895 ms

with patch
100000 x cv::applyColorMap(userColor) for size(32,32) and type CV_8UC1: 304.287106 ms	=> -26%
100000 x cv::applyColorMap(userColor) for size(64,64) and type CV_8UC1: 1093.173997 ms	=> -17%
100000 x cv::applyColorMap(userColor) for size(128,128) and type CV_8UC1: 2211.233782 ms	=> -58%
100000 x cv::applyColorMap(userColor) for size(256,256) and type CV_8UC1: 5893.139673 ms	=> -70%
100000 x cv::applyColorMap(userColor) for size(512,512) and type CV_8UC1: 17804.391272 ms	=> -67%
100000 x cv::applyColorMap(userColor) for size(1024,1024) and type CV_8UC1: 68944.028322 ms	=> -67%
100000 x cv::applyColorMap(JET) for size(32,32) and type CV_8UC1: 31083.159630 ms	=> -0.5%
100000 x cv::applyColorMap(JET) for size(64,64) and type CV_8UC1: 31872.934770 ms	=> -1.33%
100000 x cv::applyColorMap(JET) for size(128,128) and type CV_8UC1: 33071.244961 ms	=> -9%
100000 x cv::applyColorMap(JET) for size(256,256) and type CV_8UC1: 37666.202003 ms	=> -27%
100000 x cv::applyColorMap(JET) for size(512,512) and type CV_8UC1: 51594.348558 ms	=> -30%
100000 x cv::applyColorMap(JET) for size(1024,1024) and type CV_8UC1: 104773.154826 ms	=> -54%

alalek

Some comments about possible optimization improvements.

Please let me know when it is ready for merging.

alalek · 2022-03-01T11:07:54Z

modules/imgproc/src/colormap.cpp

+            const int minimalPixelsPerPacket = 1<<12;
+            const int rowsPerPacket = std::max(1, minimalPixelsPerPacket/srcGray.cols);
+            const int rowsPacketCount = (srcGray.rows+rowsPerPacket-1)/rowsPerPacket;
+            const int rows = srcGray.rows;
+            const int cols = srcGray.cols;
+            Range all(0, rowsPacketCount);


Looks like this prolog code is common between branches from lut_type == CV_8UC1 check

alalek · 2022-03-01T11:12:05Z

modules/imgproc/src/colormap.cpp

+            if (rowsPacketCount <= 1)
+              body(all);
+            else


We don't really need rowsPacketCount <= 1 code path - parallel_for() does the same thing internally.

BTW, there is third parameter of parallel_for_() - using of that parameter would allow to eliminate rowsPerPacket variable.

alalek · 2022-03-01T11:17:35Z

modules/imgproc/src/colormap.cpp

+            auto body = [&, rows, cols](const Range& range) -> void {
+                for(int row = range.start*rowsPerPacket ; row<std::min(rows, range.end*rowsPerPacket) ; ++row)
+                    for(int col = 0 ; col<cols ; ++col)
+                        dstMat.at<unsigned char>(row, col) = localLUT[srcGray.at<unsigned char>(row, col)];


Rows are stored as a contiguous arrays.
So we could replace .at() full computation of pointers with precomputed row pointer (col = 0, and .ptr() method) before inner loop:

uchar* dstRow = dstMat.ptr<uchar>(row, 0); const uchar* srcRow = srcMat.ptr<uchar>(row, 0); for(int col = 0 ; col<cols ; ++col) { dstMat[col] = localLUT[srcRow[col]]; }

alalek · 2022-03-01T11:20:47Z

modules/imgproc/src/colormap.cpp

+            unsigned char localLUT[256];//small performance improvement by bringing the full LUT locally first
+            _lut.copyTo(cv::Mat(256, 1, lut_type, localLUT));


Usually we don't need to copy data in case of 1D arrays. They are contiguous by default (except exotic col-based ROI cases).

CV_Assert(lut.isContinuous()); const uchar* lutPtr = lut.ptr<uchar>(0);

use nstripes parameter of parallel_for_ assume _lut is continuous to bring faster pixel indexing optimize src/dst access by contiguous rows of pixels do not locally copy the LUT any more, it is no more relevant with the new optimizations

chacha21 · 2022-03-01T13:32:10Z

original 4.x
100000 x cv::applyColorMap(userColor) for size(32,32) and type CV_8UC1: 413.340877 ms
100000 x cv::applyColorMap(userColor) for size(64,64) and type CV_8UC1: 1322.934941 ms
100000 x cv::applyColorMap(userColor) for size(128,128) and type CV_8UC1: 5259.299441 ms
100000 x cv::applyColorMap(userColor) for size(256,256) and type CV_8UC1: 19481.239747 ms
100000 x cv::applyColorMap(userColor) for size(512,512) and type CV_8UC1: 53744.406390 ms
100000 x cv::applyColorMap(userColor) for size(1024,1024) and type CV_8UC1: 204328.241670 ms
100000 x cv::applyColorMap(JET) for size(32,32) and type CV_8UC1: 31242.315972 ms
100000 x cv::applyColorMap(JET) for size(64,64) and type CV_8UC1: 32304.962187 ms
100000 x cv::applyColorMap(JET) for size(128,128) and type CV_8UC1: 36312.397471 ms
100000 x cv::applyColorMap(JET) for size(256,256) and type CV_8UC1: 51336.647169 ms
100000 x cv::applyColorMap(JET) for size(512,512) and type CV_8UC1: 73202.010330 ms
100000 x cv::applyColorMap(JET) for size(1024,1024) and type CV_8UC1: 227124.843895 ms

with patch
100000 x cv::applyColorMap(userColor) for size(32,32) and type CV_8UC1: 137.762362 ms => -67%
100000 x cv::applyColorMap(userColor) for size(64,64) and type CV_8UC1: 474.281744 ms => -64%
100000 x cv::applyColorMap(userColor) for size(128,128) and type CV_8UC1: 1154.152432 ms => -78%
100000 x cv::applyColorMap(userColor) for size(256,256) and type CV_8UC1: 3046.539612 ms => -84%
100000 x cv::applyColorMap(userColor) for size(512,512) and type CV_8UC1: 8813.734257 ms => -84%
100000 x cv::applyColorMap(userColor) for size(1024,1024) and type CV_8UC1: 32218.623244 ms => -84%
100000 x cv::applyColorMap(JET) for size(32,32) and type CV_8UC1: 30845.112720 ms => -1.3%
100000 x cv::applyColorMap(JET) for size(64,64) and type CV_8UC1: 31141.354067 ms => -3.6%
100000 x cv::applyColorMap(JET) for size(128,128) and type CV_8UC1: 31967.281620 ms => -12%
100000 x cv::applyColorMap(JET) for size(256,256) and type CV_8UC1: 35685.712188 ms => -30%
100000 x cv::applyColorMap(JET) for size(512,512) and type CV_8UC1: 40515.865259 ms => -45%
100000 x cv::applyColorMap(JET) for size(1024,1024) and type CV_8UC1: 62885.093060 ms => -72%

alalek

Well done! Thank you for contribution 👍

…mized Optimize cv::applyColorMap() for simple case * Optimize cv::applyColorMap() for simple case PR for 21640 For regular cv::Mat CV_8UC1 src, applying the colormap is simpler than calling the cv::LUT() mechanism. * add support for src as CV_8UC3 src as CV_8UC3 is handled with a BGR2GRAY conversion, the same optimized code being used afterwards * code style rely on cv::Mat.ptr() to index data * Move new implementation to ColorMap::operator() Changes as suggested by reviewer * style improvements suggsted by reviewer * typo * tune parallel work * better usage of parallel_for_ use nstripes parameter of parallel_for_ assume _lut is continuous to bring faster pixel indexing optimize src/dst access by contiguous rows of pixels do not locally copy the LUT any more, it is no more relevant with the new optimizations

chacha21 added 2 commits February 20, 2022 21:09

Optimize cv::applyColorMap() for simple case

82750bc

PR for opencv#21640 For regular cv::Mat CV_8UC1 src, applying the colormap is simpler than calling the cv::LUT() mechanism.

add support for src as CV_8UC3

018353c

src as CV_8UC3 is handled with a BGR2GRAY conversion, the same optimized code being used afterwards

alalek reviewed Feb 21, 2022

View reviewed changes

chacha21 added 2 commits February 22, 2022 11:35

code style

68668d3

rely on cv::Mat.ptr() to index data

Move new implementation to ColorMap::operator()

f569d18

Changes as suggested by reviewer

alalek reviewed Feb 28, 2022

View reviewed changes

chacha21 added 2 commits March 1, 2022 08:29

style

7baba85

improvements suggsted by reviewer

typo

8238b06

tune parallel work

e137161

alalek reviewed Mar 1, 2022

View reviewed changes

better usage of parallel_for_

61aef0a

use nstripes parameter of parallel_for_ assume _lut is continuous to bring faster pixel indexing optimize src/dst access by contiguous rows of pixels do not locally copy the LUT any more, it is no more relevant with the new optimizations

alalek approved these changes Mar 1, 2022

View reviewed changes

alalek merged commit ebb6915 into opencv:4.x Mar 1, 2022

opencv-pushbot mentioned this pull request Apr 23, 2022

(5.x) Merge 4.x #21903

Merged

		unsigned char localLUT[256];//small performance improvement by bringing the full LUT locally first
		_lut.copyTo(cv::Mat(256, 1, lut_type, localLUT));

Uh oh!

Conversation

chacha21 commented Feb 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chacha21 Feb 21, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chacha21 Feb 22, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chacha21 commented Mar 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chacha21 commented Mar 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chacha21 commented Mar 1, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alalek left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

chacha21 commented Feb 20, 2022 •

edited

Loading

chacha21 Feb 21, 2022 •

edited

Loading

chacha21 Feb 22, 2022 •

edited

Loading

chacha21 commented Mar 1, 2022 •

edited

Loading

chacha21 commented Mar 1, 2022 •

edited

Loading

chacha21 commented Mar 1, 2022 •

edited

Loading