RPP - float, int & tensor support: required for RALI-SOW3 by LokeshBonta · Pull Request #32 · ROCm/rpp

LokeshBonta · 2020-08-06T18:18:31Z

Major Work:

Added support for seven input-output type variations of the fused functions: u8-u8, u8-f32, u8-f16, u8-i8, i8-i8, f32-f32, f16-f16
Functions covered: Crop, Resize Crop, Crop Mirror Normalize, Rotate, Resize, Resize Crop Mirror
Unit testing framework and script

rrawther · 2020-08-07T18:39:48Z

src/modules/cl/kernel/crop_mirror_normalize.cl

+      dst_pixIdx += dst_inc[id_z];
+    }
+  } else {
+    for (indextmp = 0; indextmp < channel; indextmp++) {


Consider loop unrolling and vector datatypes for better performance.
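
One way to read the unrolling suggestion, shown as a host-side C++ sketch rather than OpenCL kernel code: if the channel count is known at compile time, the per-channel loop has a fixed trip count and the compiler can unroll it. The helper names `copy_pixel` and `copy_pixel_any` are hypothetical and not part of RPP, and the plain copy stands in for whatever per-channel work the kernel does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of RPP: copies one pixel's channels, where
// consecutive channels are src_inc / dst_inc elements apart.
// Channels is a compile-time constant, so the compiler can fully unroll the loop.
template <int Channels>
inline void copy_pixel(const std::uint8_t* src, std::uint8_t* dst,
                       std::size_t src_inc, std::size_t dst_inc) {
    for (int c = 0; c < Channels; ++c) {
        dst[c * dst_inc] = src[c * src_inc];
    }
}

// Runtime dispatch to the fixed-count variants for the common channel counts,
// with a generic loop as the fallback.
inline void copy_pixel_any(const std::uint8_t* src, std::uint8_t* dst,
                           std::size_t src_inc, std::size_t dst_inc, int channel) {
    assert(channel > 0);
    switch (channel) {
        case 1: copy_pixel<1>(src, dst, src_inc, dst_inc); break;
        case 3: copy_pixel<3>(src, dst, src_inc, dst_inc); break;
        default:
            for (int c = 0; c < channel; ++c) {
                dst[c * dst_inc] = src[c * src_inc];
            }
    }
}
```

Whether this helps in the actual kernel depends on the device compiler, so it would need to be measured.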

rrawther · 2020-08-07T18:41:00Z

src/modules/cl/kernel/crop_mirror_normalize.cl

+      (id_x + id_y * max_dst_width[id_z]) * out_plnpkdind;
+  if ((id_x < dst_width[id_z]) && (id_y < dst_height[id_z])) {
+    for (indextmp = 0; indextmp < channel; indextmp++) {
+      output[dst_pixIdx] = (half)((input[src_pixIdx] - local_mean) / 255.0 * local_std_dev);


Avoid division by constant divisors; multiply by the precomputed inverse instead. This applies to all kernels.
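
A minimal host-side C++ sketch of the suggestion, modeled on the normalize expression in the hunk above. The function name `normalize_pixels` is hypothetical. The constant `1 / 255` is folded into a single scale factor once, so the per-pixel work is one subtract and one multiply. The result can differ from the divided form in the last bits, because the reciprocal is itself rounded.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side analogue of the kernel's normalize step.
// (x - mean) / 255 * std_dev is rewritten as (x - mean) * scale,
// where scale = std_dev * (1 / 255) is computed once outside the loop.
inline std::vector<float> normalize_pixels(const std::vector<float>& input,
                                           float mean, float std_dev) {
    const float scale = std_dev * (1.0f / 255.0f);
    std::vector<float> out(input.size());
    for (std::size_t i = 0; i < input.size(); ++i) {
        out[i] = (input[i] - mean) * scale;
    }
    return out;
}
```

In a kernel, the same scale factor could be computed on the host and passed in as an argument.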

rrawther · 2020-08-07T18:44:12Z

src/modules/cl/kernel/resize.cl

+
+  unsigned int pixId;
+  pixId = id_x + id_y * dest_width + id_z * dest_width * dest_height;
+  A = srcPtr[x + y * source_width + id_z * source_height * source_width];


Consider doing more work per work-item by using vector datatypes.
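
A rough host-side C++ illustration of the idea, not the resize kernel itself: each loop iteration handles four adjacent pixels instead of one, with a tail loop for the remainder. `Pixel4` is a hypothetical stand-in for a 4-wide vector type such as OpenCL's `uchar4`, and the invert operation is only a placeholder for the real per-pixel work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Host-side stand-in for a 4-wide vector type such as OpenCL's uchar4.
struct Pixel4 {
    std::uint8_t s[4];
};

// Hypothetical example operation: invert every pixel in a row.
// The main loop handles four pixels per iteration; the tail loop handles
// the remaining 0 to 3 pixels when n is not a multiple of four.
inline void invert_row(const std::uint8_t* src, std::uint8_t* dst, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        Pixel4 p = {{src[i], src[i + 1], src[i + 2], src[i + 3]}};
        for (int k = 0; k < 4; ++k) {
            p.s[k] = static_cast<std::uint8_t>(255 - p.s[k]);
        }
        for (int k = 0; k < 4; ++k) {
            dst[i + k] = p.s[k];
        }
    }
    for (; i < n; ++i) {
        dst[i] = static_cast<std::uint8_t>(255 - src[i]);
    }
}
```

In OpenCL C the four-wide load and store would typically be expressed with `vload4` and `vstore4`, and the global work size in x would shrink by a factor of four.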

rrawther · 2020-08-07T18:44:53Z

src/modules/cl/kernel/resize.cl

+           const unsigned int dest_height, const unsigned int dest_width,
+           const unsigned int channel) {
+  int A, B, C, D, x, y, index, pixVal;
+  float x_ratio = ((float)(source_width - 1)) / dest_width;


It is better to pass x_ratio and y_ratio in as arguments instead of computing them every time.
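
A sketch of what this could look like on the host side, in C++. The ratios are computed once per image and handed to the per-pixel code, which then only multiplies. The `x_ratio` formula is taken from the hunk above; the `y_ratio` formula is assumed to be its analogue, and the names `ResizeRatios`, `compute_resize_ratios`, `source_x`, and `source_y` are hypothetical.

```cpp
#include <cassert>

// Hypothetical container for the two ratios; in the kernel these would
// become two extra float arguments.
struct ResizeRatios {
    float x_ratio;
    float y_ratio;
};

// Computed once per image on the host. x_ratio follows the kernel's formula;
// y_ratio is assumed to be defined the same way for the height.
inline ResizeRatios compute_resize_ratios(unsigned src_w, unsigned src_h,
                                          unsigned dst_w, unsigned dst_h) {
    assert(src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);
    ResizeRatios r;
    r.x_ratio = static_cast<float>(src_w - 1) / dst_w;
    r.y_ratio = static_cast<float>(src_h - 1) / dst_h;
    return r;
}

// Per-pixel mapping from a destination coordinate to a source coordinate:
// no division is left in the per-pixel path.
inline int source_x(const ResizeRatios& r, unsigned dst_x) {
    return static_cast<int>(r.x_ratio * dst_x);
}

inline int source_y(const ResizeRatios& r, unsigned dst_y) {
    return static_cast<int>(r.y_ratio * dst_y);
}
```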

rrawther · 2020-08-07T18:46:35Z

src/modules/cl/kernel/rotate.cl

+  int id_y = get_global_id(1);
+  int id_z = get_global_id(2);
+
+  int xc = id_x - dest_width / 2;


use >>1 instead of /2
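
A small C++ sketch of the substitution, with one caveat worth checking before applying it. For unsigned or non-negative values, `v >> 1` and `v / 2` give the same result. For negative signed values they do not: division truncates toward zero, while an arithmetic right shift rounds toward negative infinity (for example, `-3 / 2` is `-1`, but `-3 >> 1` is `-2` on platforms with arithmetic shift). The sketch assumes `dest_width` is unsigned, which the hunk above does not show; `half_shift`, `half_div`, and `centered_x` are hypothetical names.

```cpp
#include <cassert>

// Halve a non-negative size with a shift; identical to v / 2 for unsigned values.
constexpr unsigned half_shift(unsigned v) { return v >> 1; }

// Halve with division; for signed operands this truncates toward zero.
constexpr int half_div(int v) { return v / 2; }

// Hypothetical centering step modeled on the rotate kernel's xc computation,
// assuming dest_width is unsigned.
constexpr int centered_x(int id_x, unsigned dest_width) {
    return id_x - static_cast<int>(half_shift(dest_width));
}

static_assert(half_shift(7u) == 7u / 2u, "shift and divide agree for unsigned values");
static_assert(half_div(-3) == -1, "signed division truncates toward zero");
```

Many compilers already turn an unsigned divide by two into a shift, so the gain from writing it by hand may be small.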

rrawther · 2020-08-07T21:50:24Z

src/modules/cpu/host_fused_functions.hpp

-            color_twist_host(srcPtr, batch_srcSizeMax[batchCount], dstPtr, alpha, beta, hueShift, saturationFactor, chnFormat, channel);
+            color_twist_host(srcPtrImage, batch_srcSizeMax[batchCount], dstPtrImage, alpha, beta, hueShift, saturationFactor, chnFormat, channel);
+
+            if (outputFormatToggle == 1)


This looks very inefficient and needs to be revisited.

rrawther · 2020-08-07T23:39:54Z

src/modules/cpu/host_fused_functions.hpp

+            xG = _mm_loadu_ps(srcPtrTempG);
+            xB = _mm_loadu_ps(srcPtrTempB);
+
+            xR = _mm_div_ps(xR, pFactor);


Please use mulps instead. The same holds for all constant divisors.
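
A minimal x86-only C++ sketch of the change, assuming `pFactor` in the hunk above holds a constant divisor broadcast to all lanes (the hunk does not show its value). The reciprocal is broadcast once outside the loop, and `_mm_mul_ps` replaces `_mm_div_ps` inside it. The function name `scale_by_reciprocal` is hypothetical, and results can differ from true division in the last bits.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics (x86 only)

// Hypothetical helper: dst[i] = src[i] / divisor, implemented as a multiply
// by the reciprocal. Four floats are processed per iteration; a scalar tail
// loop handles the remaining 0 to 3 elements.
inline void scale_by_reciprocal(const float* src, float* dst,
                                std::size_t n, float divisor) {
    assert(divisor != 0.0f);
    const float inv_scalar = 1.0f / divisor;
    const __m128 inv = _mm_set1_ps(inv_scalar);  // broadcast once, outside the loop
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        const __m128 x = _mm_loadu_ps(src + i);
        _mm_storeu_ps(dst + i, _mm_mul_ps(x, inv));
    }
    for (; i < n; ++i) {
        dst[i] = src[i] * inv_scalar;
    }
}
```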

kiritigowda

@rrawther let me know if this is good to merge. It LGTM.

rrawther · 2020-08-14T17:27:35Z

@kiritigowda: Pavel found some issues with the GPU flow. Waiting on the status of that before merging.

kiritigowda · 2020-08-21T14:43:45Z

Here is an overview of what got changed by this pull request:

Issues
======
+ Solved 1
- Added 6
           

Complexity increasing per file
==============================
- utilities/rpp-unittests/SOW3_HOST/tensorDifference.py  1
         

Clones added
============
- utilities/rpp-unittests/OCL/BatchPD_ocl_pkd3.cpp  63
- utilities/rpp-unittests/SOW3_OCL/BatchPD_ocl_pkd3.cpp  24
- src/modules/cl/cl_declarations.hpp  1
- utilities/rpp-unittests/SOW3_HOST/BatchPD_host_pkd3.cpp  22
- utilities/rpp-unittests/HOST/BatchPD_host_pkd3.cpp  101
- utilities/rpp-unittests/HOST/Single_host.cpp  4
- utilities/rpp-unittests/SOW3_HOST/BatchPD_host_pln1.cpp  22
- src/modules/cl/cl_fused_functions.cpp  3
- utilities/rpp-unittests/SOW3_OCL/BatchPD_ocl_pln1.cpp  23
- utilities/rpp-unittests/SOW3_HOST/BatchPD_host_pln3.cpp  24
- utilities/rpp-unittests/SOW3_OCL/BatchPD_ocl_pln3.cpp  25
- utilities/rpp-unittests/HIP/Single_hip.cpp  9
- src/include/cpu/rpp_cpu_common.hpp  12
- utilities/rpp-unittests/HOST/BatchPD_host_pln1.cpp  102
- utilities/rpp-unittests/HIP/BatchPD_hip.cpp  8
- src/modules/cl/cl_color_model_conversions.cpp  1
- utilities/rpp-unittests/OCL/Single_ocl.cpp  6
- include/rppi_fused_functions.h  2
- src/modules/cl/cl_geometry_transforms.cpp  19

See the complete overview on Codacy

utilities/rpp-unittests/SOW3_OCL/testAllScript.sh

utilities/rpp-unittests/SOW3_HOST/testAllScript.sh

Resize Bilinear interpolation - Tensor support

muthukumaravel7 and others added 30 commits August 8, 2019 13:41

Changed Channel extract and channel combine function call

11d5977

updated erode dilate kernals [OCL]

272e733

Non Working [FULLY BUILD] code for min_max_loc and mean_stddev

3e30624

Merge branch 'master' into muthu-dev

bcd80fa

Updated Rain GPU kernel for multiple destination image calls [OCL]

0d0588d

Updated Median, Non Max and Histogram and added support for mean

6ffdb7f

Updated tensor [OCL]

165ce64

updated table lookup [OCL]

33198ba

small updates in mean and stddev [OCL]

43c74f0

Full functioning code for mean and standard deviation [OCL]

0872d29

Added Support to Min Max Location [OCL]

94bf428

Added support for gaussian_image_pyramid [OCL]

404fe4a

Added support for laplacian_image_pyramid [OCL]

9bd0e1b

small modification in LIP [OCL]

2cbc2ed

small modification in Min Max Location and Mean stddev [OCL]

2b35c35

box filter hisEq [OCL]

39a8631

Added support for gaussian filter

b2e483b

Added support for bin in Histogram [OCL]

43eb217

updated sobel [OCL]

6997deb

Merge branch 'muthu-dev' into main-hipcl-dev

be7216c

Merge branch 'abi-host-dev-ms4' into main-hipcl-dev

d5c9c3e

Update in Temperature [CPU]

9dba423

FIX SNP CPU half noise issue [OCL]

649e179

fin small change in Absolute difference [OCL]

69eaed3

Small changes in Custom convolution and table lookup [OCL}

e78648f

Fix regressions due to scripting [cl & CPU].

787662e

fix histogram [OCL]

860672c

Merge branch 'main-hipcl-dev'

55ee414

Updated snow [OCL]

d982740

updated snow [CPU]

6b5278c

rrawther reviewed Aug 7, 2020

Update README.MD

233d430

rrawther reviewed Aug 7, 2020

r-abishek added 3 commits August 7, 2020 15:12

Codacy issues corrections in utilities/rpp-unittests

d4688ae

Codacy issues corrections for resize kernel

927342e

Codacy issues corrections in utilities/rpp-unittests OCL/HIP

2efe2da

rrawther reviewed Aug 7, 2020

r-abishek and others added 6 commits August 7, 2020 16:46

Codacy issues corrections in utilities/rpp-unittests

e3f0529

Codacy issues corrections in utilities/rpp-unittests

985d36c

Fix some codecy issues

e3cde57

Merge branch 'master' of https://github.com/LokeshBonta/rpp

a7b9800

Remove some Codecy issues in rpp unnittests

5352c9f

Remove a few codecy issues

d721224

kiritigowda approved these changes Aug 14, 2020

Remove Print statements

a82f7af