Dilate - HOST and HIP update #667

Merged
LakshmiKumar23 merged 28 commits into ROCm:develop from r-abishek:ar/opt_dilate
Feb 12, 2026

Conversation

@r-abishek
Member

Adds tensor style implementation for Dilate HOST and updates to HIP implementation.
Adds relevant QA unit and performance tests.

@r-abishek r-abishek requested a review from Copilot January 27, 2026 04:57
@r-abishek r-abishek added the enhancement New feature or request label Jan 27, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR adds HOST backend support for the Dilate morphological operation and updates the HIP implementation. It implements a tensor-style interface for the dilate operation with support for multiple data types (U8, I8, F16, F32) and includes comprehensive testing infrastructure.

Changes:

  • Added HOST backend support for dilate operation in test suite and implementation
  • Implemented SIMD-optimized dilate kernels for both integer and floating-point types
  • Updated API documentation to correct parameter description for dilate operation
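
As a point of reference for what the optimized kernels compute, here is a minimal scalar sketch of a dilate on a single-channel U8 image: each output pixel is the maximum over a square neighborhood. The function name `dilateReference`, the flat row-major layout, and the clamped (nearest-neighbor) border handling are assumptions made for illustration; this is not the PR's SIMD implementation or the RPP API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference (not the PR's code): single-channel U8 dilate
// with a square kernelSize x kernelSize window. Taps that fall outside the
// image are clamped to the nearest valid pixel (assumed border policy).
std::vector<uint8_t> dilateReference(const std::vector<uint8_t> &src, int width, int height, int kernelSize)
{
    const int pad = kernelSize / 2;
    std::vector<uint8_t> dst(src.size());
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            uint8_t maxVal = 0;
            for (int ky = -pad; ky <= pad; ky++)
            {
                // Clamp the row to [0, height - 1]
                int cy = std::max(0, std::min(y + ky, height - 1));
                for (int kx = -pad; kx <= pad; kx++)
                {
                    // Clamp the column to [0, width - 1]
                    int cx = std::max(0, std::min(x + kx, width - 1));
                    maxVal = std::max(maxVal, src[cy * width + cx]);
                }
            }
            dst[y * width + x] = maxVal;
        }
    }
    return dst;
}
```

A scalar version like this is mainly useful as a correctness baseline: a vectorized kernel can be compared against it pixel by pixel for each kernel size.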

Reviewed changes

Copilot reviewed 7 out of 17 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| utilities/test_suite/common.py | Added "HOST" to supported backends for dilate operation (case 41) |
| utilities/test_suite/HOST/runImageTests.py | Added "dilate" to the set of operations requiring kernel size iteration in unit and performance tests |
| utilities/test_suite/HOST/Tensor_image_host.cpp | Added DILATE case implementation with kernel size support for multiple bit depths |
| src/modules/tensor/rppt_tensor_morphological_operations.cpp | Implemented rppt_dilate_host function with support for U8, I8, F16, F32 data types |
| src/include/tensor/host_tensor_executors.hpp | Added function declarations for dilate_char_host_tensor and dilate_float_host_tensor templates |
| src/include/common/cpu/rpp_cpu_filter.hpp | Added SIMD helper functions for dilate operation (blend_shuffle_max and blend_permute_max variants) and morphological operation infrastructure |
| api/rppt_tensor_morphological_operations.h | Added documentation for rppt_dilate_host and corrected parameter description for rppt_dilate_gpu |

@codecov

codecov bot commented Jan 29, 2026

Codecov Report

❌ Patch coverage is 97.58065% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...es/tensor/rppt_tensor_morphological_operations.cpp | 94.92% | 3 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #667      +/-   ##
===========================================
+ Coverage    92.65%   92.70%   +0.05%     
===========================================
  Files          194      196       +2     
  Lines        82918    87559    +4641     
===========================================
+ Hits         76827    81170    +4343     
- Misses        6091     6389     +298     

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/include/common/cpu/rpp_cpu_filter.hpp | 100.00% <100.00%> (ø) |
| src/modules/tensor/cpu/kernel/dilate.cpp | 92.99% <ø> (ø) |
| src/modules/tensor/hip/kernel/dilate.cpp | 99.64% <ø> (ø) |
| ...es/tensor/rppt_tensor_morphological_operations.cpp | 95.52% <94.92%> (-0.56%) ⬇️ |

... and 4 files with indirect coverage changes


@kiritigowda kiritigowda self-assigned this Feb 3, 2026
// Nearest-neighbor padding
for (int i = 0; i < 8; i++)
{
int clampedX = roiBeginX + max(0, min(id_x_i + i, (roiWidth - 1))); int clampedIdx = (id_z * srcStridesNH.x) + (clampedY * srcStridesNH.y) + (clampedX * 3);
Member

Split the current line into two lines for readability and consistency with the 5x5 (and other) kernels.

Contributor

Done
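
For readers following the thread, the requested split might look roughly like the sketch below, with one statement per line. The variable names come from the excerpt under review; the wrapper function `clampedPkd3Index`, the `StridesNH` struct (with plain `int` members), and the reading of `* 3` as a 3-channel packed layout are assumptions for illustration, not the merged code.

```cpp
#include <algorithm>

// Hypothetical stand-in for the stride pair in the excerpt
// (x: stride between images, y: stride between rows).
struct StridesNH
{
    int x;
    int y;
};

// Hypothetical helper: source index of a pixel in an assumed 3-channel packed
// layout, with the column clamped to [0, roiWidth - 1] relative to the ROI start.
int clampedPkd3Index(int id_x_i, int i, int id_z, int clampedY, int roiBeginX, int roiWidth, StridesNH srcStridesNH)
{
    // Nearest-neighbor padding: clamp the column first ...
    int clampedX = roiBeginX + std::max(0, std::min(id_x_i + i, roiWidth - 1));
    // ... then build the flat index on its own line
    int clampedIdx = (id_z * srcStridesNH.x) + (clampedY * srcStridesNH.y) + (clampedX * 3);
    return clampedIdx;
}
```

Keeping the clamp and the index computation on separate lines makes it easier to see that only the column is clamped here, while `clampedY` is computed elsewhere.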

@LakshmiKumar23
Contributor

@r-abishek can you please resolve merge conflicts? Thanks!

r-abishek and others added 4 commits February 11, 2026 22:03
* Travis CI - key error fix

* Fix Bug in ColorTwist (#6) (#8) (#9)

* Added golden outputs and resolved HOST backend

* Updated bin files for median filter and resize crop mirror

* Updated bin files

* Updated bin files for the next set of kernel F32 QA

* Updated bin files for jpeg_compression_distortion

* Fixed resize QA failures

* Fix for Resize bilinear F32 QA HOST and HIP

* Fix for lens correction QA f32 for HOST and HIP for 1e-4 precision

* Fixed HIP rcm QA

* updates for warp Affine F32 QA

* Fix for RCM QA match for U8 and F32 updates AVX

* Fix for lens correction AVX

* Removed space

* Fixed warp affine for every other variant with the updated changes

* Add fixes to match precision in quantization

* Fix Precision mismatches

* Update default cutoff to 1e-5 and specialized cutoff to 1e-4

* F32 QA Fix

* Made Quality percentage as arg from testsuite

* Resolved copilot comments

* Resolved the copilot comments

* Resolved Codex comments

* HOST and HIP - pinned buffers for respective API (ROCm#628)

* Removed memcpy and used hipHostMalloc for allocation : blend

* Removed memcpy and used hipHostMalloc for allocation : brightness

* Removed memcpy and used hipHostMalloc for allocation : color cast

* Removed memcpy and used hipHostMalloc for allocation : color twist

* Removed memcpy and used hipHostMalloc for allocation : contrast

* Removed memcpy and used hipHostMalloc for allocation : crop mirror normalize

* Removed memcpy and used hipHostMalloc for allocation : Exposure

* Removed memcpy and used hipHostMalloc for allocation : Gamma correction

* Removed memcpy and used hipHostMalloc for allocation : gaussian filter

* Removed memcpy and used hipHostMalloc for allocation : Noise

* Removed memcpy and used hipHostMalloc for allocation : Non linear blend

* Removed memcpy and used hipHostMalloc for allocation : Resize mirror normalize

* Removed memcpy and used hipHostMalloc for allocation : Water

* Added hipHostFree for all kernels in test suite

* Added hipHostFree for all kernels in test suite

* Removed memcpy and used hipHostMalloc for allocation : Flip, spatter, rcm, color temperature

* Resolved copilot review comments

* Updated version

* Removed unused parameter

* Updated version in cmakeList

* removed the host to device mem copies for warp affine and rotate

* Updated version

* Removed comment

* Updated Changelog file

* Update patch version from 2.2.0 to 2.2.1

* Update CHANGELOG

* Address copilot comments for HIP HOST consistent allocation

* Documentation changes for updated memcpy changes

* Update ricap outer API to use pinned memory and remove mem copy

* Fix memory allocation and deallocation for permutationTensor

* Update api/rppt_tensor_effects_augmentations.h

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Fix spelling of noiseProbability and saltProbability

* Fix deallocation

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: hmaddise <HazarathKumar.Maddisetty@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* resolved review comments

* minor comment change

* Resolved copilot review comments

* Update src/modules/tensor/cpu/kernel/resize.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/modules/tensor/cpu/kernel/resize.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update src/modules/tensor/hip/kernel/jpeg_compression_distortion.cpp

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Updated test suite and resolved review comments

* Updated HIP for F32 QA reduction function cases

---------

Co-authored-by: Kiriti Gowda <kiriti.nageshgowda@amd.com>
Co-authored-by: Lokesh Bonta <lokeswara@multicorewareinc.com>
Co-authored-by: sampath117 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: ManasaDattaT <tammisetti.manasadatta@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: hmaddise <HazarathKumar.Maddisetty@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
* added initial api support for erode

* added support for U8 and I8 bitdepths for 3, 5, 7, 9 kernel sizes

* added F16 and F32 bitdepth support

* added generic kernel support

* added golden outputs

removed commented code

* fix build errors

* Fix build and test_suite errors

* revert padding changes

* updated erode HIP kernel with latest changes

* Add F32 QA

* minor formatting fixes

* minor comment fix

* resolve copilot comments

* resolve review comments

* resolved review comments

* Add unpack templating changes and fix segmentation issue

* Fix PKD to PKD kernel 9  for Pack and Unpack changes.

* Add and template signext function

* Fix min Comments

* Fix one min Comments

* Add unroll and rename of preLoadRows

* Fix rename of Loader and MorphVecLoader

* Add empty line before comment

* Fix remove empty line, rename of kernelSze & padPolicy and remove {} for single line condition

* resolved review comments

* fix build warnings

---------

Co-authored-by: sampath1117 <sampath.rachumallu@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Mukesh Jayakodi <mukesh.jayakodi@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
@LakshmiKumar23 LakshmiKumar23 merged commit 8f2d602 into ROCm:develop Feb 12, 2026
6 checks passed

Labels

ci:precheckin enhancement New feature or request

9 participants