Sobel Filter on HOST and HIP by r-abishek · Pull Request #673 · ROCm/rpp

r-abishek · 2026-02-18T23:30:38Z

Adds Sobel Filter for tensor API on HOST and HIP
Adds support for U8/F16/F32/I8 and NHWC/NCHW variants
Adds relevant unit / performance / qa tests

…ferent conditions

Sobel filter Kernel implementation

Copilot

Pull request overview

This PR adds Sobel Filter support for tensor API on both HOST and HIP backends. The implementation supports multiple data types (U8/F16/F32/I8) and layouts (NHWC/NCHW) with three kernel sizes (3x3, 5x5, 7x7) and three gradient types (X, Y, XY).
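For readers unfamiliar with the operation, the following is a minimal reference sketch of a 3x3 Sobel filter on a single-channel float image. It is not the PR's implementation: the names `SobelGradient` and `sobel3x3` are hypothetical, the replicate (clamped) border handling is an assumption, and treating the XY gradient as the magnitude of the X and Y responses is also an assumption about what "XY" means here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration only; not part of the RPP API.
enum class SobelGradient { X, Y, XY };

// Applies a 3x3 Sobel filter to a single-channel, row-major float image.
// Assumption: out-of-range neighbors are replaced by the nearest valid pixel.
std::vector<float> sobel3x3(const std::vector<float> &src, int width, int height, SobelGradient type)
{
    assert(width > 0 && height > 0);
    assert(src.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

    static const int kx[3][3] = {{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}};
    static const int ky[3][3] = {{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}};

    std::vector<float> dst(src.size());
    for (int y = 0; y < height; y++)
    {
        for (int x = 0; x < width; x++)
        {
            float gx = 0.0f, gy = 0.0f;
            for (int j = -1; j <= 1; j++)
            {
                for (int i = -1; i <= 1; i++)
                {
                    // Clamp coordinates so border pixels reuse the nearest valid neighbor.
                    int sy = std::min(std::max(y + j, 0), height - 1);
                    int sx = std::min(std::max(x + i, 0), width - 1);
                    float v = src[sy * width + sx];
                    gx += kx[j + 1][i + 1] * v;
                    gy += ky[j + 1][i + 1] * v;
                }
            }
            float out = 0.0f;
            switch (type)
            {
                case SobelGradient::X:  out = gx; break;
                case SobelGradient::Y:  out = gy; break;
                // Assumption: XY is the gradient magnitude.
                case SobelGradient::XY: out = std::sqrt(gx * gx + gy * gy); break;
            }
            dst[y * width + x] = out;
        }
    }
    return dst;
}
```

On a 3x3 image whose right column is 1 and whose other pixels are 0, the center pixel of this sketch yields an X response of 4 and a Y response of 0, which matches the intuition that a vertical edge only excites the horizontal gradient.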

Changes:

Added rppt_sobel_filter API function with comprehensive parameter validation
Implemented CPU and HIP kernels for Sobel filter with AVX2 optimizations for HOST
Added test infrastructure including unit tests, performance tests, and QA tests

Reviewed changes

Copilot reviewed 12 out of 39 changed files in this pull request and generated 10 comments.


File	Description
api/rppt_tensor_filter_augmentations.h	Added API declaration and documentation for rppt_sobel_filter
src/modules/tensor/rppt_tensor_filter_augmentations.cpp	Implemented main function with RGB to grayscale conversion support
src/modules/tensor/cpu/kernel/sobel_filter.cpp	CPU kernel implementation with AVX2 SIMD optimizations
src/include/tensor/host_tensor_executors.hpp	Added host executor function declaration
src/include/tensor/hip_tensor_executors.hpp	Added HIP executor function declaration
utilities/test_suite/rpp_test_suite_image.h	Added test infrastructure support and helper functions
utilities/test_suite/common.py	Added sobel_filter to augmentation mappings
utilities/test_suite/HOST/runImageTests.py	Added test cases for HOST backend
utilities/test_suite/HIP/runImageTests.py	Added test cases for HIP backend
utilities/test_suite/HOST/Tensor_image_host.cpp	Added HOST test executable support
utilities/test_suite/HIP/Tensor_image_hip.cpp	Added HIP test executable support
docs/data/doxygenOutputs/*.png	Added documentation images for all gradient types and kernel sizes


Copilot · 2026-02-18T23:35:49Z

utilities/test_suite/rpp_test_suite_image.h

 {
    Rpp32f *rowTemp, *rowTempRef, *outVal, *outRefVal, *outputTemp, *outputTempRef, *outputTempChn, *outputTempRefChn;
-    Rpp32f cutoff = (testCase == LENS_CORRECTION) ? 1e-4 : 1e-5;
+    Rpp32f cutoff = ((testCase == LENS_CORRECTION)|| (testCase == SOBEL_FILTER)) ? 1e-4 : 1e-5;


Inconsistent spacing around '||' operator. The convention in this file shows spacing on both sides of boolean operators (see line 1103). Add a space after ')' and before '||' for consistency.

Copilot · 2026-02-18T23:35:49Z

utilities/test_suite/rpp_test_suite_image.h

+    }
+}
+
+// returns the interpolation type used for image resizing or scaling operations.


The comment "returns the interpolation type used for image resizing or scaling operations." is incorrect for this function. This function actually returns a string combining kernel size and gradient type for the Sobel filter, not interpolation types. The comment should be updated to accurately describe the function's purpose, such as "returns the kernel size and gradient type for sobel filter operations."

Copilot · 2026-02-18T23:35:49Z

src/modules/tensor/rppt_tensor_filter_augmentations.cpp

+    }
+#ifdef GPU_SUPPORT
+    else if ((handleBackend == RppBackend::RPP_HIP_BACKEND) && (executionBackend == RppBackend::RPP_HIP_BACKEND))
+    {


Missing offset validation check for HIP backend. Similar filter operations (like median_filter and gaussian_filter) validate that srcDescPtr->offsetInBytes is sufficient for the kernel size with a check like: if (srcDescPtr->offsetInBytes < 12 * (kernelSize / 2)) return RPP_ERROR_LOW_OFFSET;. This validation should be added after line 558 to ensure the offset can accommodate the kernel padding requirements.

Suggested change

-    {
+    {
+        if (srcDescPtr->offsetInBytes < 12 * (kernelSize / 2))
+            return RPP_ERROR_LOW_OFFSET;
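To make the threshold concrete, here is a small standalone sketch of the check quoted in the comment above. The `Status` enum and both function names are hypothetical stand-ins rather than RPP types; only the expression `12 * (kernelSize / 2)` is taken from the review comment.

```cpp
#include <cstdint>

// Hypothetical stand-in for the RPP status codes, for illustration only.
enum class Status { Success, LowOffset };

// Minimum source offset in bytes, mirroring the expression quoted in the review comment.
constexpr std::uint32_t min_offset_bytes(std::uint32_t kernelSize)
{
    return 12 * (kernelSize / 2);
}

// Rejects descriptors whose offset is smaller than the minimum for the given kernel size.
Status validate_offset(std::uint32_t offsetInBytes, std::uint32_t kernelSize)
{
    if (offsetInBytes < min_offset_bytes(kernelSize))
        return Status::LowOffset;
    return Status::Success;
}
```

Under this expression, the minimum offsets for the three supported kernel sizes (3, 5, 7) work out to 12, 24, and 36 bytes.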

Copilot · 2026-02-18T23:35:50Z

utilities/test_suite/rpp_test_suite_image.h

+        Rpp32u kernelSize, gradientType;
+        get_kernel_size_and_gradient_type(additionalParam, kernelSize, gradientType);
+
+        func += kernelSizeAndGradientName; 


Trailing whitespace after the variable declaration. Remove the extra space after the semicolon.

Copilot · 2026-02-18T23:35:50Z

src/modules/tensor/rppt_tensor_filter_augmentations.cpp

+#ifdef GPU_SUPPORT
+    else if ((handleBackend == RppBackend::RPP_HIP_BACKEND) && (executionBackend == RppBackend::RPP_HIP_BACKEND))
+    {
+        RpptDescPtr inputDesc = srcDescPtr; 


Trailing whitespace after the declaration. Remove the extra space after the semicolon for consistency.

Suggested change

-        RpptDescPtr inputDesc = srcDescPtr;
+        RpptDescPtr inputDesc = srcDescPtr;

Copilot · 2026-02-18T23:35:50Z

src/modules/tensor/rppt_tensor_filter_augmentations.cpp

+            size_t dataSize = dstDescPtr->strides.nStride * dstDescPtr->n * elementSize;
+
+            CHECK_RETURN_STATUS(hipMalloc(&tempPtr, dataSize));
+


Inconsistent formatting: Extra whitespace after the variable initialization. Align formatting with the rest of the codebase (compare to line 568 which has no trailing space).

Copilot · 2026-02-18T23:35:50Z

src/modules/tensor/cpu/kernel/sobel_filter.cpp

+                        Rpp32s rowKernelLoopLimit = kernelSize;
+                        get_kernel_loop_limit(i, rowKernelLoopLimit, padLength, unpaddedHeight);
+                        RpptImageBorderEdge padVertical = i < padLength ? RpptImageBorderEdge::TOP_EDGE : RpptImageBorderEdge::BOTTOM_EDGE;
+                        process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);


Function parameter mismatch: process_left_border_columns_pln_pln is defined with 9 parameters (including filterXTensor and filterYTensor), but here it's being called with only 8 parameters (only filter). This will cause a compilation error. For non-bidirectional cases (X or Y gradient only), the function should be called with the same filter passed twice, or there should be a separate overload for single-filter cases. The calls at lines 321, 483, and 634 have this issue.

Suggested change

- process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
+ process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, filter, padVertical);
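The comment also mentions a separate overload for single-filter cases. A rough sketch of that alternative follows, using a hypothetical and heavily reduced helper named `border_response` in place of the real border function, whose full signature and filter-combination logic are not shown in this thread. The magnitude combination in the two-filter version is an assumption made purely for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical two-filter helper: combines X and Y responses into a magnitude.
// Assumption: the real helper may combine its two filters differently.
float border_response(const float *window, const float *filterX, const float *filterY, std::size_t n)
{
    float gx = 0.0f, gy = 0.0f;
    for (std::size_t i = 0; i < n; i++)
    {
        gx += window[i] * filterX[i];
        gy += window[i] * filterY[i];
    }
    return std::sqrt(gx * gx + gy * gy);
}

// Hypothetical single-filter overload: returns the signed response of one filter,
// so X-only or Y-only call sites do not need to supply a second filter.
float border_response(const float *window, const float *filter, std::size_t n)
{
    float g = 0.0f;
    for (std::size_t i = 0; i < n; i++)
        g += window[i] * filter[i];
    return g;
}
```

In this sketch, passing the same filter twice to the two-filter version would scale the result by the square root of 2 and drop its sign instead of reproducing the single-filter response. Whether the "pass `filter` twice" substitution is safe in the PR therefore depends on how the real helper combines its two filters.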

Copilot · 2026-02-18T23:35:51Z

src/modules/tensor/cpu/kernel/sobel_filter.cpp

+                        Rpp32s rowKernelLoopLimit = kernelSize;
+                        get_kernel_loop_limit(i, rowKernelLoopLimit, padLength, unpaddedHeight);
+                        RpptImageBorderEdge padVertical = i < padLength ? RpptImageBorderEdge::TOP_EDGE : RpptImageBorderEdge::BOTTOM_EDGE;
+                        process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);


Same function parameter mismatch as line 321 - process_left_border_columns_pln_pln expects 9 parameters but receives only 8.

Suggested change

- process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
+ process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, borderType, padVertical);

Copilot · 2026-02-18T23:35:51Z

src/modules/tensor/cpu/kernel/sobel_filter.cpp

+                        Rpp32s rowKernelLoopLimit = kernelSize;
+                        get_kernel_loop_limit(i, rowKernelLoopLimit, padLength, unpaddedHeight);
+                        RpptImageBorderEdge padVertical = i < padLength ? RpptImageBorderEdge::TOP_EDGE : RpptImageBorderEdge::BOTTOM_EDGE;
+                        process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);


Same function parameter mismatch as line 321 - process_left_border_columns_pln_pln expects 9 parameters but receives only 8.

Suggested change

- process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
+ process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, 1, padVertical);

Copilot · 2026-02-18T23:35:51Z

utilities/test_suite/rpp_test_suite_image.h

+}
+
+// returns the interpolation type used for image resizing or scaling operations.
+inline std::string get_kernel_size_and_gradient_type(unsigned int val, Rpp32u &kernelSize, Rpp32u &gradientType)


Inconsistent spacing: there should be no space between 'Rpp32u' and '&kernelSize' in the parameter list. The ampersand should be adjacent to the parameter name, following the convention used throughout the codebase.

codecov · 2026-02-19T01:20:46Z

Codecov Report

❌ Patch coverage is 97.96163% with 17 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...odules/tensor/rppt_tensor_filter_augmentations.cpp	91.08%	14 Missing ⚠️
src/modules/tensor/cpu/kernel/sobel_filter.cpp	99.60%	2 Missing ⚠️
src/modules/tensor/hip/kernel/sobel_filter.cpp	99.43%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop     #673      +/-   ##
===========================================
+ Coverage    92.44%   92.48%   +0.04%     
===========================================
  Files          196      198       +2     
  Lines        88305    89139     +834     
===========================================
+ Hits         81632    82437     +805     
- Misses        6673     6702      +29

Files with missing lines	Coverage Δ
src/modules/tensor/hip/kernel/sobel_filter.cpp	`99.43% <99.43%> (ø)`
src/modules/tensor/cpu/kernel/sobel_filter.cpp	`99.60% <99.60%> (ø)`
...odules/tensor/rppt_tensor_filter_augmentations.cpp	`92.94% <91.08%> (-0.79%)`	⬇️

... and 4 files with indirect coverage changes

msiddaiah

Please retest before merging, after addressing the comments from me and Copilot.

msiddaiah · 2026-02-23T22:32:48Z

src/modules/tensor/cpu/kernel/sobel_filter.cpp

+/*
+MIT License
+
+Copyright (c) 2019 - 2025 Advanced Micro Devices, Inc.


Extend the copyright date to 2026.

msiddaiah · 2026-02-23T22:53:16Z

src/modules/tensor/rppt_tensor_filter_augmentations.cpp

+{
+    if ((kernelSize != 3) && (kernelSize != 5) && (kernelSize != 7))
+        return RPP_ERROR_INVALID_ARGUMENTS;
+    if ((sobelType != 0) && (sobelType != 1) && (sobelType != 2))


Use #defines for the values 0, 1, and 2 in the sobel type check to improve code readability.
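One possible shape for this, sketched with an unscoped enum instead of `#define` (either gives the literals a name). The identifiers `SOBEL_X`, `SOBEL_Y`, `SOBEL_XY`, `Status`, and `validate_sobel_args` are hypothetical, and the mapping of 0, 1, 2 to X, Y, XY is an assumption based on the order in which the PR overview lists the gradient types.

```cpp
#include <cstdint>

// Hypothetical named values for the sobel type; the 0/1/2 to X/Y/XY order is assumed.
enum SobelType : std::uint32_t
{
    SOBEL_X = 0,
    SOBEL_Y = 1,
    SOBEL_XY = 2
};

// Hypothetical stand-in for the RPP status codes.
enum class Status { Success, InvalidArguments };

// Same validation logic as the excerpt above, with the magic numbers named.
Status validate_sobel_args(std::uint32_t kernelSize, std::uint32_t sobelType)
{
    if ((kernelSize != 3) && (kernelSize != 5) && (kernelSize != 7))
        return Status::InvalidArguments;
    if ((sobelType != SOBEL_X) && (sobelType != SOBEL_Y) && (sobelType != SOBEL_XY))
        return Status::InvalidArguments;
    return Status::Success;
}
```

An enum keeps the values grouped and visible in a debugger, while `#define` matches the wording of the request; the validation reads the same either way.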

msiddaiah · 2026-02-23T23:11:58Z

src/modules/tensor/hip/kernel/sobel_filter.cpp

+    sum_f8x.f4[1] = FLOAT4_ZERO;
+    sum_f8y.f4[0] = FLOAT4_ZERO;
+    sum_f8y.f4[1] = FLOAT4_ZERO;
+    if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))