RPP Tensor Support - Swap channels on HOST and HIP by Dineshbabu-Ravichandran · Pull Request #400 · r-abishek/rpp

Dineshbabu-Ravichandran · 2025-01-31T10:29:13Z

Extends tensor support of the Swap channels augmentation to all permutations, optimized using AVX2, on the HOST backend
Extends tensor support of the Swap channels augmentation to all permutations on the HIP backend
Adds unit and performance test support for the Swap channels augmentation to the test suite

Copilot

Pull Request Overview

This PR extends tensor support for the Swap Channels Augmentation on both HOST and HIP backends and adds corresponding unit and performance tests. Key changes include:

Adding SWAP_CHANNELS to the set of augmentation cases.
Updating test suite scripts and executor functions to handle a new permTensor parameter.
Modifying both HOST and HIP kernel invocations and API declarations to accommodate the swap channels permutation.

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.

Show a summary per file

File	Description
utilities/test_suite/rpp_test_suite_image.h	Added SWAP_CHANNELS to augmentation test cases and compare_output logic
utilities/test_suite/HOST/runImageTests.py	Added a new branch to run swap channels test variants on HOST
utilities/test_suite/HOST/Tensor_image_host.cpp	Updated swap channels handling by incorporating a permTensor parameter
utilities/test_suite/HIP/runImageTests.py	Added a new branch to run swap channels test variants on HIP
utilities/test_suite/HIP/Tensor_image_hip.cpp	Updated swap channels handling with HIP host memory allocation for permTensor
src/modules/tensor/rppt_tensor_data_exchange_operations.cpp	Modified swap channels functions to accept a permTensor parameter
src/modules/tensor/hip/kernel/swap_channels.cpp	Updated GPU kernel functions to incorporate the permTensor parameter
src/include/tensor/host_tensor_executors.hpp	Updated host function prototypes for swap channels operations
src/include/tensor/hip_tensor_executors.hpp	Updated HIP function prototypes for swap channels operations
api/rppt_tensor_data_exchange_operations.h	Updated API declarations to reflect the new swap channels parameter

Comments suppressed due to low confidence (1)

utilities/test_suite/HOST/runImageTests.py:85

[nitpick] The use of the literal "85" for the swap channels test case may reduce clarity and maintainability. Consider introducing a defined constant (e.g., SWAP_CHANNELS) to replace the magic number.

elif case == "85":
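
One way to act on this nitpick, sketched under the assumption that the scripts have no existing constant for this case. `SWAP_CHANNELS_CASE` and `needs_swap_order` are hypothetical names, not identifiers from the RPP test suite; only the value "85" comes from the diff.

```python
# Hypothetical named constant for the swap channels test case.
# The value "85" is the case number used in the diff; the name is an assumption.
SWAP_CHANNELS_CASE = "85"


def needs_swap_order(case: str) -> bool:
    """Return True when the case must be run once per channel permutation."""
    return case == SWAP_CHANNELS_CASE


print(needs_swap_order("85"))  # the swap channels case
print(needs_swap_order("84"))  # any other case
```

The branch in the script would then read `elif case == SWAP_CHANNELS_CASE:`, which keeps the case number in a single place.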

Copilot · 2025-04-17T19:23:17Z

utilities/test_suite/HIP/Tensor_image_hip.cpp

                {
                    testCaseName = "swap_channels";

+                    Rpp32u *permTensor = nullptr;


The memory allocated with hipHostMalloc for 'permTensor' is not freed after usage. Consider adding a call to hipFree(permTensor) at the end of its scope to prevent a memory leak.

r-abishek

@HazarathKumarM Please address comments to merge this PR

r-abishek · 2025-04-17T19:41:04Z

api/rppt_tensor_data_exchange_operations.h

 * \param [in] srcDescPtr source tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = 3)
 * \param [out] dstPtr destination tensor in HOST memory
 * \param [in] dstDescPtr destination tensor descriptor (Restrictions - numDims = 4, offsetInBytes >= 0, dataType = U8/F16/F32/I8, layout = NCHW/NHWC, c = same as that of srcDescPtr)
+ * \param [in] permTensor permutation tensor for swap channels operation


Just like hip, also say "in HOST memory" here

r-abishek · 2025-04-19T01:03:50Z

src/modules/tensor/cpu/kernel/swap_channels.cpp

+                    pSwap[0] = p[permTensor[0] * 2];         // channel swap
+                    pSwap[1] = p[permTensor[0] * 2 + 1];     // channel swap
+                    pSwap[2] = p[permTensor[1] * 2];         // channel swap
+                    pSwap[3] = p[permTensor[1] * 2 + 1];     // channel swap


Initialize an array of 6 indices outside the loop = {permTensor[0] * 2, permTensor[0] * 2 + 1, ...}
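
The suggestion can be illustrated with a small model, written in Python for brevity although the kernel itself is C++. It assumes, as the indexing in the diff implies, that each of the three channels occupies two consecutive registers in `p`. `build_swap_indices` and `apply_swap` are hypothetical names, and the strings stand in for the `__m256` registers.

```python
from typing import List, Sequence


def build_swap_indices(perm_tensor: Sequence[int]) -> List[int]:
    """Compute the six source-register indices once, outside the pixel loop."""
    indices = []
    for channel in perm_tensor:
        indices.append(channel * 2)      # first register of the source channel
        indices.append(channel * 2 + 1)  # second register of the source channel
    return indices


def apply_swap(p: Sequence[str], swap_indices: Sequence[int]) -> List[str]:
    """Inner-loop step: pSwap[i] = p[swap_indices[i]], with no index arithmetic."""
    return [p[i] for i in swap_indices]


# Stand-ins for the six registers: two per channel (R, G, B).
registers = ["R0", "R1", "G0", "G1", "B0", "B1"]
swap_indices = build_swap_indices([2, 0, 1])
print(swap_indices)
print(apply_swap(registers, swap_indices))
```

Hoisting the table means the multiply-and-add on `permTensor` runs once per call instead of once per loop iteration.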

r-abishek · 2025-04-19T01:08:59Z

src/modules/tensor/cpu/kernel/swap_channels.cpp

+    // Adjust permutation tensor for NHWC - NCHW layout conversion
+    if((srcDescPtr->c == 3) && (srcDescPtr->layout == RpptLayout::NHWC) && (dstDescPtr->layout == RpptLayout::NCHW))
+    {
+        Rpp32u count = 0;


Why do we have this adjustment? If this is common to all bit depths, perhaps create a helper function at the top?

r-abishek · 2025-04-19T01:14:47Z

utilities/test_suite/HIP/Tensor_image_hip.cpp

                    testCaseName = "swap_channels";

+                    Rpp32u *permTensor = nullptr;
+                    CHECK_RETURN_STATUS(hipHostMalloc(&permTensor, 3 * sizeof(Rpp32u)));


copilot says there's no free for this.

r-abishek · 2025-04-19T01:16:58Z

utilities/test_suite/HIP/runImageTests.py

+            elif case == "85":
+                swapOrderRange = 6
+                # Run all variants of swap channels functions with additional argument of perm order
+                for swapOrder in range(swapOrderRange):


This needs a little more explanation in that comment.
It's not quite clear what swapOrderRange etc. is and what's being run
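
For context on the number 6: three channels admit 3! = 6 orderings, so `range(swapOrderRange)` visits each ordering once. The sketch below enumerates them with `itertools.permutations`. The first three rows agree with the `mapping` table in `rpp_test_suite_image.h` quoted in this review; the remaining three rows and their order are an assumption (lexicographic order), not taken from the source. `CHANNEL_NAMES`, `SWAP_ORDERS` and `describe_swap_order` are hypothetical names.

```python
from itertools import permutations

CHANNEL_NAMES = "RGB"

# All orderings of channel indices 0, 1, 2 in lexicographic order.
# Assumption: the test suite's swapOrder values follow this same order.
SWAP_ORDERS = list(permutations(range(3)))


def describe_swap_order(swap_order: int) -> str:
    """Build a one-line description such as 'swapOrder 1 -> R, B, G'."""
    perm = SWAP_ORDERS[swap_order]
    names = ", ".join(CHANNEL_NAMES[c] for c in perm)
    return f"swapOrder {swap_order} -> {names}"


for swap_order in range(len(SWAP_ORDERS)):
    print(describe_swap_order(swap_order))
```

A comment in the script along the lines of "swapOrder 0..5 selects one of the 3! channel orderings" would carry the same information in one line.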

r-abishek · 2025-04-19T01:17:32Z

utilities/test_suite/HOST/runImageTests.py

+            elif case == "85":
+                swapOrderRange = 6
+                # Run all variants of swap channel functions with additional argument of swapOrder (0 - 5)
+                for swapOrder in range(swapOrderRange):


same comment

r-abishek · 2025-04-19T01:19:32Z

utilities/test_suite/rpp_test_suite_image.h

+    Rpp8u mapping[][3] = {
+        {0, 1, 2}, // axisMask 0 → R, G, B
+        {0, 2, 1}, // axisMask 1 → R, B, G
+        {1, 0, 2}, // axisMask 2 → G, R, B


Add content from this comment in one line in the python comment too. This is more clear.

Srihari-mcw · 2025-04-20T16:13:30Z

src/modules/tensor/cpu/kernel/swap_channels.cpp

+                for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrement)
+                {
+                    __m256 p[6], pSwap[6];
+                    rpp_simd_load(rpp_load48_u8pkd3_to_f32pln3_avx, srcPtrTemp, p);    // simd loads


Can't this preferably be done with u8 load and store completely, instead of conversion, as it is a load-store operation with some permutations?

Pls check and change such instances across the file
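
The point of this comment is that a channel swap only moves bytes, so no value ever needs to become a float. Below is a minimal model of that idea for a packed (interleaved) 3-channel u8 buffer, in Python rather than the kernel's C++ and without any SIMD. `swap_channels_u8_pkd3` is a hypothetical name, and the convention that output channel `c` takes source channel `perm[c]` is inferred from the `pSwap`/`p` indexing elsewhere in this PR.

```python
from typing import Sequence


def swap_channels_u8_pkd3(src: bytes, perm: Sequence[int]) -> bytes:
    """Reorder channels of an interleaved 3-channel u8 buffer using byte copies only."""
    if len(src) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    if sorted(perm) != [0, 1, 2]:
        raise ValueError("perm must be a permutation of 0, 1, 2")
    dst = bytearray(len(src))
    for c in range(3):
        # Strided slice copy: every pixel's output channel c <- source channel perm[c].
        dst[c::3] = src[perm[c]::3]
    return bytes(dst)


pixels = bytes([10, 20, 30, 11, 21, 31])  # two pixels: (10, 20, 30) and (11, 21, 31)
print(list(swap_channels_u8_pkd3(pixels, [2, 0, 1])))
```

In an AVX2 kernel the same reordering would be expressed with byte shuffles on integer registers; which intrinsics fit depends on the layout and is not shown here.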

Srihari-mcw · 2025-04-20T16:26:45Z

src/modules/tensor/cpu/kernel/swap_channels.cpp

+                for (; vectorLoopCount < alignedLength; vectorLoopCount += vectorIncrement)
+                {
+                    __m256 p[6], pSwap[6];
+                    rpp_simd_load(rpp_load48_u8pkd3_to_f32pln3_avx, srcPtrTemp, p);    // simd loads


Pls check and change across the file such instances

HazarathKumarM and others added 13 commits January 26, 2025 11:20

Initial version for random channel permute for HOST

c64c0ab

Implimented random channel permute for remaining bit depth on HOST

c00569a

Initial concat implementation for HIp

21660c2

Removed random channel permute implimentation and merged with swap channel

65c7dda

deleted random channel permute kernel

c91c23b

Golden output added for all variant of swap channel

30de3d0

Minor changes

b671433

Minor changes

0dfbfa9

Changed from simd to AVX

d575067

Small changes in comments

0910b8e

Optimized for pkd variant

c2ce79e

Merge latest changes

b86571e

Fix Build errors and QA tests

d26538d

r-abishek requested a review from Copilot April 17, 2025 19:22

Copilot AI reviewed Apr 17, 2025

r-abishek requested changes Apr 19, 2025

Srihari-mcw reviewed Apr 20, 2025

Srihari-mcw requested changes Apr 20, 2025

HazarathKumarM added 2 commits April 22, 2025 03:31

resolved review comments

3c6225e

Add F32 bin files and fix Pkd3-Pkd3 QA tests

68cc6d1

r-abishek changed the base branch from develop to ar/opt_swap_channels April 22, 2025 23:11

r-abishek approved these changes Apr 22, 2025

r-abishek assigned Dineshbabu-Ravichandran Apr 22, 2025

r-abishek added the enhancement New feature or request label Apr 22, 2025

r-abishek added this to the sow12ms3 milestone Apr 22, 2025

r-abishek merged commit acecafe into r-abishek:ar/opt_swap_channels Apr 22, 2025

ManasaDattaT pushed a commit to ManasaDattaT/rpp that referenced this pull request Dec 19, 2025

Fix header levels in changelog (r-abishek#400)

e8a290a

* Fix header levels in changelog * Wrap bare URL in changelog

5 participants