Crop Mirror Normalize - HOST Tensor AVX2 Support and Vectorized HIP support for U8 - F32 / F16 by sampath1117 · Pull Request #92 · r-abishek/rpp

sampath1117 · 2022-08-10T10:14:00Z

* Added support for the U8 - F32 and U8 - F16 variants
* Made changes to support separate normalization for each of the R, G, and B channels

r-abishek

Please go over these changes

r-abishek · 2022-08-23T03:37:23Z

src/include/cpu/rpp_cpu_simd.hpp

+
+inline void rpp_store48_f32pln3_to_f32pkd3_avx(Rpp32f *dstPtr, __m256 *p)
+{
+    __m128 p128[8];


You aren't using 8 registers below

r-abishek · 2022-08-23T03:41:15Z

src/include/cpu/rpp_cpu_simd.hpp

+    _mm256_storeu_ps(dstPtrB + 8, p[5]);
+}
+
+inline void rpp_store48_f32pln3_to_f32pkd3_avx(Rpp32f *dstPtr, __m256 *p)


Also, could you just call rpp_store24_f32pln3_to_f32pkd3_avx() twice to process all 48?

We cannot use it because the destination R, G, B AVX registers are not stored at contiguous indices.
In p[6]:
p[0], p[2], p[4] - R,G,B registers for 0 - 7 locations
p[1], p[3], p[5] - R,G,B registers for 8 - 15 locations

@sampath1117 Could we do the shuffle while it's in u8?
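The register layout described in this thread can be illustrated with a scalar sketch. This is not the AVX code from the PR: `Vec8` stands in for one `__m256` register, and `store24_pln3_to_pkd3` and `store48_pln3_to_pkd3` are hypothetical helpers written for illustration. It assumes that a 24-float store expects R, G, B for the same 8 pixels. If such a helper read three consecutive registers from `p`, it would pick up `p[0]`, `p[1]`, `p[2]`, which in this layout are R (0 - 7), R (8 - 15), G (0 - 7). Selecting the registers by stride 2 avoids that.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Scalar stand-in for one __m256 register: 8 floats.
using Vec8 = std::array<float, 8>;

// Hypothetical scalar model of a 24-float planar-to-packed store.
// Assumes r, g, b each hold one channel for the same 8 pixels.
inline void store24_pln3_to_pkd3(float *dst, const Vec8 &r, const Vec8 &g, const Vec8 &b)
{
    assert(dst != nullptr);
    for (std::size_t i = 0; i < 8; ++i)
    {
        dst[3 * i + 0] = r[i];
        dst[3 * i + 1] = g[i];
        dst[3 * i + 2] = b[i];
    }
}

// Hypothetical scalar model of the 48-float store, using the layout from the thread:
// p[0], p[2], p[4] hold R, G, B for pixels 0 - 7
// p[1], p[3], p[5] hold R, G, B for pixels 8 - 15
inline void store48_pln3_to_pkd3(float *dst, const Vec8 (&p)[6])
{
    store24_pln3_to_pkd3(dst,      p[0], p[2], p[4]);   // pixels 0 - 7  -> dst[0..23]
    store24_pln3_to_pkd3(dst + 24, p[1], p[3], p[5]);   // pixels 8 - 15 -> dst[24..47]
}
```

In the sketch the two halves reuse the 24-float helper only because it takes the three registers as separate arguments; whether the real `rpp_store24_f32pln3_to_f32pkd3_avx()` could be adapted the same way depends on its signature, which is not shown here.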

r-abishek · 2022-08-23T03:44:48Z

src/modules/rppt_tensor_geometric_augmentations.cpp

                                              roiType,
                                              rpp::deref(rppHandle));
    }
+    else if ((srcDescPtr->dataType == RpptDataType::U8) && (dstDescPtr->dataType == RpptDataType::F32))


Keep the same order of calls for host and GPU; either order is fine.

r-abishek · 2022-08-23T03:49:54Z

src/modules/hip/kernel/crop_mirror_normalize.hpp

            dstIdx += dstStridesNCH.y;

+            cmnParams_f8.f4[0] = (float4)meanTensor[incrementPerImage + 1];          // Get mean for G channel
+            cmnParams_f8.f4[1] = (float4)(1 / stdDevTensor[incrementPerImage + 1]);  // Get (1 / stdDev) for G channel


Add the comments for the R channel too
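The indexing in the diff above suggests that each image carries three means and three standard deviations, with the G channel at `incrementPerImage + 1`. A minimal host-side sketch of that per-channel normalization follows. It is not the HIP kernel: the function name `normalize_pkd3_u8_to_f32` is hypothetical, `incrementPerImage = imageIndex * 3` is an assumption, and the formula `(src - mean) * (1 / stdDev)` is inferred from the comments in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Normalizes one packed RGB (PKD3) U8 image to F32 using per-channel mean and stdDev.
// Assumption: meanTensor and stdDevTensor store 3 values per image (R, G, B),
// so the parameters for image n start at incrementPerImage = n * 3.
inline std::vector<float> normalize_pkd3_u8_to_f32(const std::vector<std::uint8_t> &src,
                                                   const std::vector<float> &meanTensor,
                                                   const std::vector<float> &stdDevTensor,
                                                   std::size_t imageIndex)
{
    assert(src.size() % 3 == 0);
    const std::size_t incrementPerImage = imageIndex * 3;
    assert(incrementPerImage + 2 < meanTensor.size());
    assert(incrementPerImage + 2 < stdDevTensor.size());

    // Fetch the mean and (1 / stdDev) once per channel, outside the pixel loop.
    float mean[3];
    float invStdDev[3];
    for (std::size_t c = 0; c < 3; ++c)
    {
        mean[c] = meanTensor[incrementPerImage + c];                // c = 0: R, 1: G, 2: B
        invStdDev[c] = 1.0f / stdDevTensor[incrementPerImage + c];
    }

    std::vector<float> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
    {
        const std::size_t c = i % 3;                                // channel of this packed element
        dst[i] = (static_cast<float>(src[i]) - mean[c]) * invStdDev[c];
    }
    return dst;
}
```

Precomputing the reciprocal turns the per-pixel division into a multiplication, which matches the `(1 / stdDev)` values loaded in the kernel snippet.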

r-abishek

looks good

* added avx support for exposure u8 variant
* added avx support for f32, f16, i8 variants of exposure
* added exposure case in performance tests
* updated the description for exposure tensor function
* cleanup
* temporary changes to resolve merge conflicts
* code cleanup
* removed additional clock() for exposure case in test_suite
* added vectorized hip support for exposure kernel
* fixed bugs in exposure hip pkd3 variant
* fixed minor bug in pln1 case
* restructured exposure hip vectorized codes
* Add ci
* resolved merge conflicts and updated codes according to new file structure
* updated exposure hip codes according to new file structure
* minor formatting changes
* minor formatting changes
* Remove ci

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>

sampath1117 added 10 commits June 7, 2022 12:46

added support for cmn u8 to f32/f16 on HOST and HIP

0fca76f

added support for cmn hip to accept 3 mean and 3 std devs per image

4721d6b

added fix for incorrect copy of mean and stddev values for cmn hip

a9d6583

made changes to mean and stddev indices as per the 3 mean and 3 stddev support for crop mirror normalize

574a7df

Merge branch 'master' into sr/opt_cmn_u8_f

c09c0b8

fixed issues with merge

a7b3eaf

merge with master

5ceb168

fixed the normalization issue with pkd3-pkd3 no mirror case

21f2335

minor code cleanup

d05b369

updated parameters for mean and stddev for cmn in unittests and performance tests

7fc5aa6

r-abishek changed the base branch from master to ar/opt_cmn_u8_f August 23, 2022 03:25

r-abishek requested changes Aug 23, 2022

r-abishek assigned sampath1117 Aug 23, 2022

r-abishek added the enhancement New feature or request label Aug 23, 2022

r-abishek added this to the sow7ms4 milestone Aug 23, 2022

sampath1117 added 2 commits August 23, 2022 05:35

minor code cleanup

a98af04

merge with master

50e4266

sampath1117 changed the base branch from ar/opt_cmn_u8_f to master August 23, 2022 06:26

sampath1117 changed the base branch from master to ar/opt_cmn_u8_f August 23, 2022 06:26

added missing test case for cmn u8-f32, u8-f16 in performance tests

f010000

sampath1117 force-pushed the sr/opt_cmn_u8_f branch from f57d004 to f010000 Compare September 7, 2022 07:04

minor changes in test suite

21bbecf

sampath1117 force-pushed the sr/opt_cmn_u8_f branch from f85c0c4 to 21bbecf Compare September 7, 2022 07:22

sampath1117 added 4 commits September 7, 2022 08:12

made changes to process data in PKD format for cmn u8-f32/f16 PKD3-PKD3 variant

8143d47

fixed saturation issue for CMN U8-F32/F16 PKD3-PKD3 cases

3919af9

removed additional spaces

2b467a2

made changes to CMN F16 variants to avoid temporary float buffers usage

15450ab

sampath1117 force-pushed the sr/opt_cmn_u8_f branch from 3b17725 to 15450ab Compare September 16, 2022 17:41

r-abishek approved these changes Sep 20, 2022

r-abishek merged commit 5d90ad6 into r-abishek:ar/opt_cmn_u8_f Sep 20, 2022

r-abishek merged 18 commits into r-abishek:ar/opt_cmn_u8_f from sampath1117:sr/opt_cmn_u8_f
