Skip to content

Sobel Filter on HOST and HIP#673

Merged
kiritigowda merged 18 commits intoROCm:developfrom
r-abishek:ar/opt_sobel_filter
Mar 6, 2026
Merged

Sobel Filter on HOST and HIP#673
kiritigowda merged 18 commits intoROCm:developfrom
r-abishek:ar/opt_sobel_filter

Conversation

@r-abishek
Copy link
Copy Markdown
Member

  • Adds Sobel Filter for tensor API on HOST and HIP
  • Adds support for U8/FF16/F32/I8 and NHWC/NCHW variants
  • Adds relevant unit / performance / qa tests

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds Sobel Filter support for tensor API on both HOST and HIP backends. The implementation supports multiple data types (U8/F16/F32/I8) and layouts (NHWC/NCHW) with three kernel sizes (3x3, 5x5, 7x7) and three gradient types (X, Y, XY).

Changes:

  • Added rppt_sobel_filter API function with comprehensive parameter validation
  • Implemented CPU and HIP kernels for Sobel filter with AVX2 optimizations for HOST
  • Added test infrastructure including unit tests, performance tests, and QA tests

Reviewed changes

Copilot reviewed 12 out of 39 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
api/rppt_tensor_filter_augmentations.h Added API declaration and documentation for rppt_sobel_filter
src/modules/tensor/rppt_tensor_filter_augmentations.cpp Implemented main function with RGB to grayscale conversion support
src/modules/tensor/cpu/kernel/sobel_filter.cpp CPU kernel implementation with AVX2 SIMD optimizations
src/include/tensor/host_tensor_executors.hpp Added host executor function declaration
src/include/tensor/hip_tensor_executors.hpp Added HIP executor function declaration
utilities/test_suite/rpp_test_suite_image.h Added test infrastructure support and helper functions
utilities/test_suite/common.py Added sobel_filter to augmentation mappings
utilities/test_suite/HOST/runImageTests.py Added test cases for HOST backend
utilities/test_suite/HIP/runImageTests.py Added test cases for HIP backend
utilities/test_suite/HOST/Tensor_image_host.cpp Added HOST test executable support
utilities/test_suite/HIP/Tensor_image_hip.cpp Added HIP test executable support
docs/data/doxygenOutputs/*.png Added documentation images for all gradient types and kernel sizes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

{
Rpp32f *rowTemp, *rowTempRef, *outVal, *outRefVal, *outputTemp, *outputTempRef, *outputTempChn, *outputTempRefChn;
Rpp32f cutoff = (testCase == LENS_CORRECTION) ? 1e-4 : 1e-5;
Rpp32f cutoff = ((testCase == LENS_CORRECTION)|| (testCase == SOBEL_FILTER)) ? 1e-4 : 1e-5;
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent spacing around '||' operator. The convention in this file shows spacing on both sides of boolean operators (see line 1103). Add a space after ')' and before '||' for consistency.

Copilot uses AI. Check for mistakes.
}
}

// returns the interpolation type used for image resizing or scaling operations.
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "returns the interpolation type used for image resizing or scaling operations." is incorrect for this function. This function actually returns a string combining kernel size and gradient type for the Sobel filter, not interpolation types. The comment should be updated to accurately describe the function's purpose, such as "returns the kernel size and gradient type for sobel filter operations."

Copilot uses AI. Check for mistakes.
}
#ifdef GPU_SUPPORT
else if ((handleBackend == RppBackend::RPP_HIP_BACKEND) && (executionBackend == RppBackend::RPP_HIP_BACKEND))
{
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing offset validation check for HIP backend. Similar filter operations (like median_filter and gaussian_filter) validate that srcDescPtr->offsetInBytes is sufficient for the kernel size with a check like: if (srcDescPtr->offsetInBytes < 12 * (kernelSize / 2)) return RPP_ERROR_LOW_OFFSET;. This validation should be added after line 558 to ensure the offset can accommodate the kernel padding requirements.

Suggested change
{
{
if (srcDescPtr->offsetInBytes < 12 * (kernelSize / 2))
return RPP_ERROR_LOW_OFFSET;

Copilot uses AI. Check for mistakes.
Rpp32u kernelSize, gradientType;
get_kernel_size_and_gradient_type(additionalParam, kernelSize, gradientType);

func += kernelSizeAndGradientName;
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing whitespace after the variable declaration. Remove the extra space after the semicolon.

Copilot uses AI. Check for mistakes.
#ifdef GPU_SUPPORT
else if ((handleBackend == RppBackend::RPP_HIP_BACKEND) && (executionBackend == RppBackend::RPP_HIP_BACKEND))
{
RpptDescPtr inputDesc = srcDescPtr;
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing whitespace after the declaration. Remove the extra space after the semicolon for consistency.

Suggested change
RpptDescPtr inputDesc = srcDescPtr;
RpptDescPtr inputDesc = srcDescPtr;

Copilot uses AI. Check for mistakes.
size_t dataSize = dstDescPtr->strides.nStride * dstDescPtr->n * elementSize;

CHECK_RETURN_STATUS(hipMalloc(&tempPtr, dataSize));

Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent formatting: Extra whitespace after the variable initialization. Align formatting with the rest of the codebase (compare to line 568 which has no trailing space).

Suggested change

Copilot uses AI. Check for mistakes.
Rpp32s rowKernelLoopLimit = kernelSize;
get_kernel_loop_limit(i, rowKernelLoopLimit, padLength, unpaddedHeight);
RpptImageBorderEdge padVertical = i < padLength ? RpptImageBorderEdge::TOP_EDGE : RpptImageBorderEdge::BOTTOM_EDGE;
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function parameter mismatch: process_left_border_columns_pln_pln is defined with 9 parameters (including filterXTensor and filterYTensor), but here it's being called with only 8 parameters (only filter). This will cause a compilation error. For non-bidirectional cases (X or Y gradient only), the function should be called with the same filter passed twice, or there should be a separate overload for single-filter cases. The calls at lines 321, 483, and 634 have this issue.

Suggested change
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, filter, padVertical);

Copilot uses AI. Check for mistakes.
Rpp32s rowKernelLoopLimit = kernelSize;
get_kernel_loop_limit(i, rowKernelLoopLimit, padLength, unpaddedHeight);
RpptImageBorderEdge padVertical = i < padLength ? RpptImageBorderEdge::TOP_EDGE : RpptImageBorderEdge::BOTTOM_EDGE;
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same function parameter mismatch as line 321 - process_left_border_columns_pln_pln expects 9 parameters but receives only 8.

Suggested change
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, borderType, padVertical);

Copilot uses AI. Check for mistakes.
Rpp32s rowKernelLoopLimit = kernelSize;
get_kernel_loop_limit(i, rowKernelLoopLimit, padLength, unpaddedHeight);
RpptImageBorderEdge padVertical = i < padLength ? RpptImageBorderEdge::TOP_EDGE : RpptImageBorderEdge::BOTTOM_EDGE;
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same function parameter mismatch as line 321 - process_left_border_columns_pln_pln expects 9 parameters but receives only 8.

Suggested change
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, padVertical);
process_left_border_columns_pln_pln(srcPtrTemp, dstPtrTemp, kernelSize, padLength, unpaddedWidth, rowKernelLoopLimit, filter, 1, padVertical);

Copilot uses AI. Check for mistakes.
}

// returns the interpolation type used for image resizing or scaling operations.
inline std::string get_kernel_size_and_gradient_type(unsigned int val, Rpp32u &kernelSize, Rpp32u &gradientType)
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent spacing: there should be no space between 'Rpp32u' and '&kernelSize' in the parameter list. The ampersand should be adjacent to the parameter name, following the convention used throughout the codebase.

Copilot uses AI. Check for mistakes.
@codecov
Copy link
Copy Markdown

codecov bot commented Feb 19, 2026

Codecov Report

❌ Patch coverage is 97.96163% with 17 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...odules/tensor/rppt_tensor_filter_augmentations.cpp 91.08% 14 Missing ⚠️
src/modules/tensor/cpu/kernel/sobel_filter.cpp 99.60% 2 Missing ⚠️
src/modules/tensor/hip/kernel/sobel_filter.cpp 99.43% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #673      +/-   ##
===========================================
+ Coverage    92.44%   92.48%   +0.04%     
===========================================
  Files          196      198       +2     
  Lines        88305    89139     +834     
===========================================
+ Hits         81632    82437     +805     
- Misses        6673     6702      +29     
Files with missing lines Coverage Δ
src/modules/tensor/hip/kernel/sobel_filter.cpp 99.43% <99.43%> (ø)
src/modules/tensor/cpu/kernel/sobel_filter.cpp 99.60% <99.60%> (ø)
...odules/tensor/rppt_tensor_filter_augmentations.cpp 92.94% <91.08%> (-0.79%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@msiddaiah msiddaiah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please restest before merging, after addressing the comments by me and copilot.

/*
MIT License

Copyright (c) 2019 - 2025 Advanced Micro Devices, Inc.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extend the copyright date 2026

{
if ((kernelSize != 3) && (kernelSize != 5) && (kernelSize != 7))
return RPP_ERROR_INVALID_ARGUMENTS;
if ((sobelType != 0) && (sobelType != 1) && (sobelType != 2))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use #defines for 0, 1, 2 for sobel type checking for code readability

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

sum_f8x.f4[1] = FLOAT4_ZERO;
sum_f8y.f4[0] = FLOAT4_ZERO;
sum_f8y.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

sum_f8x.f4[1] = FLOAT4_ZERO;
sum_f8y.f4[0] = FLOAT4_ZERO;
sum_f8y.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

sum_f8x.f4[1] = FLOAT4_ZERO;
sum_f8y.f4[0] = FLOAT4_ZERO;
sum_f8y.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

float *filter_row5 = &filter_row1[20];
sum_f8.f4[0] = FLOAT4_ZERO;
sum_f8.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

float *filter_row7 = &filter_row1[42];
sum_f8.f4[0] = FLOAT4_ZERO;
sum_f8.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

float *filter_row3 = &filter_row1[6];
sum_f8.f4[0] = FLOAT4_ZERO;
sum_f8.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

float *filter_row5 = &filter_row1[20];
sum_f8.f4[0] = FLOAT4_ZERO;
sum_f8.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

float *filter_row7 = &filter_row1[42];
sum_f8.f4[0] = FLOAT4_ZERO;
sum_f8.f4[1] = FLOAT4_ZERO;
if ((id_x_i >= 0) && ((id_x_i + 7) < roiWidth) && (id_y_i >= 0) && (id_y_i < roiHeight))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use #define for magic number 7

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Copy Markdown

@msiddaiah msiddaiah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Test before merging once

@kiritigowda kiritigowda merged commit 187caab into ROCm:develop Mar 6, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants