Consistent HOST and HIP/pinned buffers for respective API by r-abishek · Pull Request #628 · ROCm/rpp

r-abishek · 2025-10-08T06:40:59Z

RPP was originally also responsible for host-to-HIP buffer conversions. This was removed during the course of the tensor implementations so that every RPP HOST API takes only HOST buffers, and every GPU API takes only HIP buffers (or pinned memory for smaller argument buffers).

The functionalities listed below were still using the old-style host->HIP memcopy within RPP, and this is now being removed. After this, the RPP tensor API will no longer be responsible for any HOST -> HIP buffer copy. The user is responsible for providing HOST buffers to the HOST API, and HIP/pinned memory to the GPU API.

copy_param_float(), copy_param_uint(), etc. perform these copies and are now eliminated.
Just like all other tensor functionalities, pinned memory allocated by the test suite is used for the smaller argument buffers.

These are the changed functionalities:
exposure
blend
brightness
color cast
color twist
contrast
crop mirror normalize
gamma_correction
gaussian_filter
noise
non_linear_blend
resize_mirror_normalize
water
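
For a caller of one of the functionalities listed above, the change means that the per-image argument buffers must already live in pinned (or HIP) memory before the GPU API is invoked. The following is a minimal sketch of that caller-side pattern, not code from this PR. `alloc_param_buffer`, `free_param_buffer`, `ParamBuffer`, `make_filled_params` and the `USE_HIP` macro are hypothetical names introduced for illustration; only `hipHostMalloc` and `hipHostFree` are real HIP runtime calls. Without `USE_HIP` the sketch falls back to `std::malloc`, which is not pinned and is there only so the ownership pattern can be compiled without ROCm.

```cpp
// Sketch: caller-owned parameter buffers for a GPU tensor call.
// Build with hipcc and -DUSE_HIP for real pinned memory (assumed build setup).
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <stdexcept>

#ifdef USE_HIP
#include <hip/hip_runtime.h>
#endif

// Allocate `count` floats of host memory for a small argument buffer.
inline float* alloc_param_buffer(std::size_t count) {
    assert(count > 0);
    void* ptr = nullptr;
#ifdef USE_HIP
    // Pinned host allocation: the GPU can read it without an explicit copy.
    if (hipHostMalloc(&ptr, count * sizeof(float), hipHostMallocDefault) != hipSuccess)
        throw std::runtime_error("hipHostMalloc failed");
#else
    // Stand-in for builds without HIP; this memory is NOT pinned.
    ptr = std::malloc(count * sizeof(float));
    if (!ptr) throw std::bad_alloc();
#endif
    return static_cast<float*>(ptr);
}

// Release a buffer obtained from alloc_param_buffer().
inline void free_param_buffer(float* ptr) {
#ifdef USE_HIP
    (void)hipHostFree(ptr);
#else
    std::free(ptr);
#endif
}

// RAII owner so the matching free always runs, even on early return.
struct ParamBufferDeleter {
    void operator()(float* p) const { free_param_buffer(p); }
};
using ParamBuffer = std::unique_ptr<float[], ParamBufferDeleter>;

// Hypothetical helper: one parameter value per image in the batch.
inline ParamBuffer make_filled_params(std::size_t batchSize, float value) {
    ParamBuffer buf(alloc_param_buffer(batchSize));
    for (std::size_t i = 0; i < batchSize; ++i) buf[i] = value;
    return buf;
}
```

A caller would create, for example, `ParamBuffer alpha = make_filled_params(batchSize, 1.5f);` and hand `alpha.get()` to the GPU API. The buffer is freed when `alpha` goes out of scope, which mirrors the `hipHostFree` calls added to the test suite.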

@rrawther Please note that the equivalent changes in MIVisionX need to be merged together with this PR.
The patch version has tentatively been bumped from 2.2.0 to 2.2.1 for this change.

Copilot

Pull Request Overview

This PR removes internal host-to-HIP buffer copy functionality from RPP to ensure consistent memory management. GPU APIs now require users to provide HIP/pinned memory buffers directly, eliminating the copy_param_float(), copy_param_uint(), and similar helper functions that previously performed host-to-device copies within RPP.

Key changes include:

Memory allocation updated from stack arrays to hipHostMalloc in test suite
API function signatures updated to pass tensor pointers directly to HIP kernels
Memory management responsibility shifted entirely to the user
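
The shift described in the key changes above can be pictured with a CPU-only stand-in. This is a hedged sketch, not RPP's actual code: `Handle`, `launch_scale`, `scale_old` and `scale_new` are hypothetical names, and a plain loop takes the place of a HIP kernel launch. It only contrasts a wrapper that first copies the caller's parameters into library-owned memory with one that forwards the caller's pointer unchanged.

```cpp
// Sketch: "copy inside the library" versus "pass the pointer through".
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical library handle that owns an internal staging buffer.
struct Handle {
    std::vector<float> staging;
};

// Stand-in for a kernel launch: multiplies each input by its parameter.
inline void launch_scale(const float* param, const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = in[i] * param[i];
}

// Old style: the wrapper copies the parameters into handle-owned memory
// before launching, so the caller may pass any host pointer.
inline void scale_old(Handle& h, const float* param, const float* in, float* out, std::size_t n) {
    h.staging.resize(n);
    std::memcpy(h.staging.data(), param, n * sizeof(float));
    launch_scale(h.staging.data(), in, out, n);
}

// New style: the pointer is forwarded untouched. The caller must guarantee
// that it refers to memory the kernel can read (HIP or pinned in the GPU case).
inline void scale_new(const float* param, const float* in, float* out, std::size_t n) {
    launch_scale(param, in, out, n);
}
```

Both wrappers compute the same result; the difference is only who owns the parameter memory and whether an extra copy happens on every call.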

Reviewed Changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| utilities/test_suite/HIP/Tensor_image_hip.cpp | Updated to use hipHostMalloc for parameter buffers instead of stack arrays; added cleanup code |
| src/modules/tensor/rppt_tensor_geometric_augmentations.cpp | Removed copy_param calls; parameters now passed directly to kernels |
| src/modules/tensor/rppt_tensor_filter_augmentations.cpp | Removed copy_param calls for gaussian_filter |
| src/modules/tensor/rppt_tensor_effects_augmentations.cpp | Removed copy_param calls; added hipHostMalloc for spatter mask arrays |
| src/modules/tensor/rppt_tensor_color_augmentations.cpp | Removed copy_param calls for all color augmentations |
| src/modules/tensor/hip/kernel/*.cpp | Updated function signatures to accept tensor pointers directly |
| src/include/tensor/hip_tensor_executors.hpp | Updated function declarations with new parameters |
| CMakeLists.txt | Version bump from 2.2.0 to 2.2.1; trailing whitespace cleanup |
| CHANGELOG.md | Added entry for memory copy elimination |

LakshmiKumar23 · 2025-11-25T22:54:44Z

@r-abishek please check and resolve conflicts

LakshmiKumar23 · 2025-11-26T18:42:40Z

@Srihari-mcw @HazarathKumarM please add the doc changes as we discussed offline

* Removed memcpy and used hipHostMalloc for allocation : blend
* Removed memcpy and used hipHostMalloc for allocation : brightness
* Removed memcpy and used hipHostMalloc for allocation : color cast
* Removed memcpy and used hipHostMalloc for allocation : color twist
* Removed memcpy and used hipHostMalloc for allocation : contrast
* Removed memcpy and used hipHostMalloc for allocation : crop mirror normalize
* Removed memcpy and used hipHostMalloc for allocation : Exposure
* Removed memcpy and used hipHostMalloc for allocation : Gamma correction
* Removed memcpy and used hipHostMalloc for allocation : gaussian filter
* Removed memcpy and used hipHostMalloc for allocation : Noise
* Removed memcpy and used hipHostMalloc for allocation : Non linear blend
* Removed memcpy and used hipHostMalloc for allocation : Resize mirror normalize
* Removed memcpy and used hipHostMalloc for allocation : Water
* Added hipHostFree for all kernels in test suite
* Added hipHostFree for all kernels in test suite
* Removed memcpy and used hipHostMalloc for allocation : Flip, spatter, rcm, color temperature
* Resolved copilot review comments
* Updated version
* Removed unused parameter
* Updated version in cmakeList
* removed the host to device mem copies for warp affine and rotate
* Updated version
* Removed comment
* Updated Chnagelog file
* Update patch version from 2.2.0 to 2.2.1
* Update CHANGELOG
* Address copilot comments for HIP HOST consistent allocation
* Documentation changes for updated memcpy changes
* Update ricap outer API to use pinned memory and remove mem copy
* Fix memory allocation and deallocation for permutationTensor
* Update api/rppt_tensor_effects_augmentations.h
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Fix spelling of noiseProbability and saltProbability
* Fix deallocation

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: hmaddise <HazarathKumar.Maddisetty@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

HazarathKumarM and others added 23 commits September 17, 2025 08:29

Removed memcpy and used hipHostMalloc for allocation : blend

358b187

Removed memcpy and used hipHostMalloc for allocation : brightness

59376a0

Removed memcpy and used hipHostMalloc for allocation : color cast

d8c6b15

Removed memcpy and used hipHostMalloc for allocation : color twist

ca42f58

Removed memcpy and used hipHostMalloc for allocation : contrast

6969414

Removed memcpy and used hipHostMalloc for allocation : crop mirror normalize

29d776b

Removed memcpy and used hipHostMalloc for allocation : Exposure

cca850b

Removed memcpy and used hipHostMalloc for allocation : Gamma correction

fbc525f

Removed memcpy and used hipHostMalloc for allocation : gaussian filter

78405c2

Removed memcpy and used hipHostMalloc for allocation : Noise

9e683bb

Removed memcpy and used hipHostMalloc for allocation : Non linear blend

7d9aaef

Removed memcpy and used hipHostMalloc for allocation : Resize mirror normalize

5a34ce3

Removed memcpy and used hipHostMalloc for allocation : Water

fff9abe

Added hipHostFree for all kernels in test suite

c56182a

Merge branch 'apr/mem_cpy_rm' into apr/mem_cpy_rm_set2

859ce40

Added hipHostFree for all kernels in test suite

8bf07fa

Removed memcpy and used hipHostMalloc for allocation : Flip, spatter, rcm, color temperature

82d36fb

Merge remote-tracking branch 'origin' into apr/mem_cpy_rm

96b828c

Merge remote-tracking branch 'origin/develop' into apr/mem_cpy_rm

5a21572

Resolved copilot review comments

33d8876

Updated version

f61fdf9

Removed unused parameter

b68ee69

Merge pull request #496 from RooseweltMcW/apr/mem_cpy_rm

e5d2750

Mem copy elimination
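
The commits above move small argument buffers in the test suite to caller-owned pinned allocations (`hipHostMalloc`) with a matching `hipHostFree`. A minimal sketch of that ownership pattern follows. It is not the test suite's actual code: `pinnedAlloc`, `pinnedFree`, `makePinnedFloats`, and `fillAndSum` are hypothetical names, and the `USE_HIP` macro is an assumption of this sketch. Without `-DUSE_HIP`, plain `malloc`/`free` stand in so the pattern can be exercised on a machine without ROCm (that path is not pinned memory).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

#ifdef USE_HIP
#include <hip/hip_runtime.h>
// Page-locked host allocation; returns nullptr if the HIP call fails.
inline void* pinnedAlloc(std::size_t bytes) {
    void* p = nullptr;
    return (hipHostMalloc(&p, bytes, hipHostMallocDefault) == hipSuccess) ? p : nullptr;
}
inline void pinnedFree(void* p) { (void)hipHostFree(p); }
#else
// Stand-ins so the sketch builds without ROCm. NOT pinned memory.
inline void* pinnedAlloc(std::size_t bytes) { return std::malloc(bytes); }
inline void pinnedFree(void* p) { std::free(p); }
#endif

// Deleter that pairs every allocation with the matching free.
struct PinnedDeleter {
    void operator()(void* p) const { pinnedFree(p); }
};

using PinnedFloats = std::unique_ptr<float[], PinnedDeleter>;

// Hypothetical helper (not part of RPP): allocate `count` floats and
// release them automatically when the returned pointer goes out of scope.
inline PinnedFloats makePinnedFloats(std::size_t count) {
    return PinnedFloats(static_cast<float*>(pinnedAlloc(count * sizeof(float))));
}

// Fill one per-image parameter for a batch, as a caller would before
// handing buffer.get() to a GPU API that performs no internal copy.
inline float fillAndSum(std::size_t batchSize) {
    PinnedFloats alpha = makePinnedFloats(batchSize);
    assert(alpha && "allocation failed");
    float sum = 0.0f;
    for (std::size_t i = 0; i < batchSize; ++i) {
        alpha[i] = 1.75f;  // illustrative value only
        sum += alpha[i];
    }
    return sum;  // alpha is freed here by PinnedDeleter
}
```

Tying the free to scope exit is one way to avoid the missing-deallocation fixes that appear later in this timeline. The test suite itself may well use explicit `hipHostFree` calls instead.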

r-abishek requested a review from a team as a code owner October 8, 2025 06:41

r-abishek added the ci:precheckin label Oct 8, 2025

r-abishek changed the title ~~Ar/device memcpy removal~~ Consistent HOST and HIP/pinned buffers for respective API Oct 8, 2025

Updated version in cmakeList

abce1db

kiritigowda self-assigned this Oct 9, 2025

kiritigowda requested a review from rrawther October 9, 2025 17:56

Merge branch 'develop' into ar/device_memcpy_removal

386bed1

kiritigowda and others added 2 commits November 13, 2025 10:10

Merge branch 'develop' into ar/device_memcpy_removal

2bee3c4

Merge branch 'develop' into ar/device_memcpy_removal

26ace11

kiritigowda requested a review from Copilot November 17, 2025 18:05

Copilot started reviewing on behalf of kiritigowda November 17, 2025 18:06

Merge branch 'develop' into ar/device_memcpy_removal

b517c40

Copilot finished reviewing on behalf of kiritigowda November 17, 2025 18:07

Copilot AI reviewed Nov 17, 2025

utilities/test_suite/HIP/Tensor_image_hip.cpp (outdated, resolved)

utilities/test_suite/HIP/Tensor_image_hip.cpp (outdated, resolved)

CHANGELOG.md (resolved)

Srihari-mcw and others added 2 commits November 18, 2025 07:47

Address copilot comments for HIP HOST consistent allocation

334ef28

Merge pull request #529 from Srihari-mcw/copilot_comments_hip_host_alloc

53d5ebd

Address copilot comments for HIP HOST consistent allocation

rrawther requested a review from AryanSalmanpour November 20, 2025 03:03

Merge branch 'develop' into ar/device_memcpy_removal

8afab8e

Srihari-mcw added 2 commits November 30, 2025 17:05

Documentation changes for updated memcpy changes

c675074

Update ricap outer API to use pinned memory and remove mem copy

4a0d361

rrawther approved these changes Dec 2, 2025

Srihari-mcw and others added 5 commits December 2, 2025 13:18

Fix memory allocation and deallocation for permutationTensor

f56d3fc

Update api/rppt_tensor_effects_augmentations.h

113a8e1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Fix spelling of noiseProbability and saltProbability

e72b052

Fix deallocation

c8a9f1c

Merge pull request #537 from Srihari-mcw/ricap_documentation_memcpy_changes

25e0d09

Ricap documentation memcpy changes

LakshmiKumar23 approved these changes Dec 2, 2025

kiritigowda merged commit 0f31f4f into ROCm:develop Dec 2, 2025
8 checks passed

kiritigowda merged 53 commits into ROCm:develop from r-abishek:ar/device_memcpy_removal

LakshmiKumar23 commented Nov 25, 2025

LakshmiKumar23 commented Nov 26, 2025

8 participants