Skip to content

Sobel filter Kernel implementation#552

Merged
r-abishek merged 13 commits intor-abishek:ar/opt_sobel_filterfrom
RooseweltMcW:apr/sobel
Feb 18, 2026
Merged

Sobel filter Kernel implementation#552
r-abishek merged 13 commits intor-abishek:ar/opt_sobel_filterfrom
RooseweltMcW:apr/sobel

Conversation

@RooseweltMcW
Copy link
Copy Markdown

No description provided.

ManasaDattaT pushed a commit to ManasaDattaT/rpp that referenced this pull request Dec 19, 2025
…of case numbers (r-abishek#552)

* Remove test case numbers from python scripts

* Add erode and dilate test cases

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
@HazarathKumarM HazarathKumarM changed the base branch from develop to master January 13, 2026 07:29
@HazarathKumarM HazarathKumarM changed the base branch from master to develop January 13, 2026 07:29
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only one image is added, but we have 3 types of sobel filter and different kernel sizes. check and add the missing images here

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added for all the kernel size and gradients, Done

T *srcPtrChannel, *dstPtrChannel;
srcPtrChannel = srcPtrImage + (roi.xywhROI.xy.y * srcDescPtr->strides.hStride) + roi.xywhROI.xy.x;
dstPtrChannel = dstPtrImage;
if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW) && (srcDescPtr->c == 1))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking: This if is the only branch that stores to dstPtr. For every other layout/channel combination we fall through and still return RPP_SUCCESS, so NHWC tensors (which the API docs advertise) or NCHW inputs with c==3 silently produce garbage. Please add real handling for those cases, or reject them before returning success.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

{
RpptSubpixelLayout srcSubpixelLayout = RpptSubpixelLayout::RGBtype;
tempPtr = rpp::deref(rppHandle).GetInitHandle()->mem.mcpu.scratchBufferHost;
rppt_color_to_greyscale_host(srcPtr, srcDescPtr, tempPtr, dstDescPtr, srcSubpixelLayout, rppHandle);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking: rppt_color_to_greyscale_host returns an RppStatus. If that conversion fails (for example because the destination descriptor is NHWC), we plow ahead on uninitialized scratch data and later return RPP_SUCCESS. Please capture the status and propagate failures instead of ignoring them.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


if ((srcDescPtr->dataType == RpptDataType::U8) && (dstDescPtr->dataType == RpptDataType::U8))
{
sobel_filter_host_tensor(static_cast<Rpp8u*>(tempPtr) + srcDescPtr->offsetInBytes,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking: When the source tensor is already single-channel, tempPtr still aliases the original source data, but we describe it with dstDescPtr here. Any src/dst stride/layout mismatch (e.g. NHWC source going to an NCHW output) means we read the wrong pixels yet return success. Conversely, in the RGB path you now have scratch memory in the destination layout, but you are still adding srcDescPtr->offsetInBytes. Please use the descriptor/offset that actually matches tempPtr in each case.

CHECK_RETURN_STATUS(hipMalloc(&tempPtr, dataSize));

RpptSubpixelLayout srcSubpixelLayout = RpptSubpixelLayout::RGBtype;
rppt_color_to_greyscale_gpu(srcPtr, srcDescPtr, tempPtr, dstDescPtr, srcSubpixelLayout, rppHandle);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking: Same issue on the HIP path—the status from rppt_color_to_greyscale_gpu is ignored. If the conversion fails we still launch the Sobel kernels on whatever happens to be in tempPtr and return success. Please bail out (and free the scratch allocation) when the conversion reports an error.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

RooseweltMcW pushed a commit to HazarathKumarM/rpp that referenced this pull request Jan 21, 2026
…of case numbers (r-abishek#552)

* Remove test case numbers from python scripts

* Add erode and dilate test cases

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
api/rppdefs.h Outdated
/*! \brief The specified axis is invalid or out of range. (Needs to adhere to function specification.) \ingroup group_rppdefs */
RPP_ERROR_INVALID_AXIS = -26
RPP_ERROR_INVALID_AXIS = -26,
/*! \brief Invalid src tensor layout or channels. (Needs to adhere to function specification.) \ingroup group_rppdefs */
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have INVALID_SRC_LAYOUT or INVALID_CHANNELS
Why do we need a new one?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Sobel filter, we need both channels and layout check, so we need combined them into a single check and added a new ENUM here.
if it's not needed, we can have two separate checks

#include <cstring>
#include <omp.h>
#include <half/half.hpp>
#if __has_include(<half/half.hpp>)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change is added?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved, Done.

#include <type_traits>
#include <cstdint>
#include <half/half.hpp>
#if __has_include(<half/half.hpp>)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it is added in this PR?

/*
MIT License

Copyright (c) 2019 - 2024 Advanced Micro Devices, Inc.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check copyright, if this file was introduced in 2024 only modify the copyright accordingly
Same is applicable for all new files in this PR

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified, Done.

#include "host_tensor_executors.hpp"
#include "rpp_cpu_filter.hpp"

Rpp32f sobel3x3X[9] = {-1, 0, 1,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have all thes variables as global?
Can we not have them as local variables, you can define it static const since it won't be modified

result = subprocess.Popen([buildFolderPath + "/build/Tensor_image_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth.value), str(outputFormatToggle.value), str(case), str(swapOrder), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.PIPE) # nosec
log_detected(result, errorLog, imageAugmentationMap[int(case)][0], get_bit_depth(bitDepth.value), get_image_layout_type(layout, outputFormatToggle.value, "HIP"))
elif imageAugmentationMap[int(case)][0] == "sobel_filter":
for kernelSizeAndGradient in range(9):
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using range(9) here?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have support for only 3, 5 and 7 right?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kernelSize ranges from 3 to 7 but there are 3 gradient for each kernelSize (X, Y, XY) hence 3x3 - 9 range is set as additionalParam

@HazarathKumarM HazarathKumarM changed the base branch from develop to master January 21, 2026 09:58
@HazarathKumarM HazarathKumarM changed the base branch from master to develop January 21, 2026 09:58
@r-abishek
Copy link
Copy Markdown
Owner

@HazarathKumarM ToT merge and Unified API changes to be reflected.

@r-abishek r-abishek changed the base branch from develop to ar/opt_sobel_filter February 18, 2026 23:28
@r-abishek r-abishek added the enhancement New feature or request label Feb 18, 2026
@r-abishek r-abishek merged commit 07071d4 into r-abishek:ar/opt_sobel_filter Feb 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants