Sobel filter Kernel implementation#552
Sobel filter Kernel implementation#552r-abishek merged 13 commits intor-abishek:ar/opt_sobel_filterfrom
Conversation
…of case numbers (r-abishek#552) * Remove test case numbers from python scripts * Add erode and dilate test cases --------- Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com> Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
There was a problem hiding this comment.
only one image is added, but we have 3 types of sobel filter and different kernel sizes. check and add the missing images here
There was a problem hiding this comment.
Added for all the kernel size and gradients, Done
3dfcd80 to
c38f8ad
Compare
| T *srcPtrChannel, *dstPtrChannel; | ||
| srcPtrChannel = srcPtrImage + (roi.xywhROI.xy.y * srcDescPtr->strides.hStride) + roi.xywhROI.xy.x; | ||
| dstPtrChannel = dstPtrImage; | ||
| if ((srcDescPtr->layout == RpptLayout::NCHW) && (dstDescPtr->layout == RpptLayout::NCHW) && (srcDescPtr->c == 1)) |
There was a problem hiding this comment.
Blocking: This if is the only branch that stores to dstPtr. For every other layout/channel combination we fall through and still return RPP_SUCCESS, so NHWC tensors (which the API docs advertise) or NCHW inputs with c==3 silently produce garbage. Please add real handling for those cases, or reject them before returning success.
| { | ||
| RpptSubpixelLayout srcSubpixelLayout = RpptSubpixelLayout::RGBtype; | ||
| tempPtr = rpp::deref(rppHandle).GetInitHandle()->mem.mcpu.scratchBufferHost; | ||
| rppt_color_to_greyscale_host(srcPtr, srcDescPtr, tempPtr, dstDescPtr, srcSubpixelLayout, rppHandle); |
There was a problem hiding this comment.
Blocking: rppt_color_to_greyscale_host returns an RppStatus. If that conversion fails (for example because the destination descriptor is NHWC), we plow ahead on uninitialized scratch data and later return RPP_SUCCESS. Please capture the status and propagate failures instead of ignoring them.
|
|
||
| if ((srcDescPtr->dataType == RpptDataType::U8) && (dstDescPtr->dataType == RpptDataType::U8)) | ||
| { | ||
| sobel_filter_host_tensor(static_cast<Rpp8u*>(tempPtr) + srcDescPtr->offsetInBytes, |
There was a problem hiding this comment.
Blocking: When the source tensor is already single-channel, tempPtr still aliases the original source data, but we describe it with dstDescPtr here. Any src/dst stride/layout mismatch (e.g. NHWC source going to an NCHW output) means we read the wrong pixels yet return success. Conversely, in the RGB path you now have scratch memory in the destination layout, but you are still adding srcDescPtr->offsetInBytes. Please use the descriptor/offset that actually matches tempPtr in each case.
| CHECK_RETURN_STATUS(hipMalloc(&tempPtr, dataSize)); | ||
|
|
||
| RpptSubpixelLayout srcSubpixelLayout = RpptSubpixelLayout::RGBtype; | ||
| rppt_color_to_greyscale_gpu(srcPtr, srcDescPtr, tempPtr, dstDescPtr, srcSubpixelLayout, rppHandle); |
There was a problem hiding this comment.
Blocking: Same issue on the HIP path—the status from rppt_color_to_greyscale_gpu is ignored. If the conversion fails we still launch the Sobel kernels on whatever happens to be in tempPtr and return success. Please bail out (and free the scratch allocation) when the conversion reports an error.
…of case numbers (r-abishek#552) * Remove test case numbers from python scripts * Add erode and dilate test cases --------- Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com> Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
api/rppdefs.h
Outdated
| /*! \brief The specified axis is invalid or out of range. (Needs to adhere to function specification.) \ingroup group_rppdefs */ | ||
| RPP_ERROR_INVALID_AXIS = -26 | ||
| RPP_ERROR_INVALID_AXIS = -26, | ||
| /*! \brief Invalid src tensor layout or channels. (Needs to adhere to function specification.) \ingroup group_rppdefs */ |
There was a problem hiding this comment.
We already have INVALID_SRC_LAYOUT or INVALID_CHANNELS
Why do we need a new one?
There was a problem hiding this comment.
For Sobel filter, we need both channels and layout check, so we need combined them into a single check and added a new ENUM here.
if it's not needed, we can have two separate checks
| #include <cstring> | ||
| #include <omp.h> | ||
| #include <half/half.hpp> | ||
| #if __has_include(<half/half.hpp>) |
| #include <type_traits> | ||
| #include <cstdint> | ||
| #include <half/half.hpp> | ||
| #if __has_include(<half/half.hpp>) |
| /* | ||
| MIT License | ||
|
|
||
| Copyright (c) 2019 - 2024 Advanced Micro Devices, Inc. |
There was a problem hiding this comment.
Check copyright, if this file was introduced in 2024 only modify the copyright accordingly
Same is applicable for all new files in this PR
| #include "host_tensor_executors.hpp" | ||
| #include "rpp_cpu_filter.hpp" | ||
|
|
||
| Rpp32f sobel3x3X[9] = {-1, 0, 1, |
There was a problem hiding this comment.
Why do we have all thes variables as global?
Can we not have them as local variables, you can define it static const since it won't be modified
| result = subprocess.Popen([buildFolderPath + "/build/Tensor_image_hip", srcPath1, srcPath2, dstPathTemp, str(bitDepth.value), str(outputFormatToggle.value), str(case), str(swapOrder), str(numRuns), str(testType), str(layout), "0", str(qaMode), str(decoderType), str(batchSize)] + roiList + [scriptPath], stdout=subprocess.PIPE, stderr=subprocess.PIPE) # nosec | ||
| log_detected(result, errorLog, imageAugmentationMap[int(case)][0], get_bit_depth(bitDepth.value), get_image_layout_type(layout, outputFormatToggle.value, "HIP")) | ||
| elif imageAugmentationMap[int(case)][0] == "sobel_filter": | ||
| for kernelSizeAndGradient in range(9): |
There was a problem hiding this comment.
We have support for only 3, 5 and 7 right?
There was a problem hiding this comment.
kernelSize ranges from 3 to 7 but there are 3 gradient for each kernelSize (X, Y, XY) hence 3x3 - 9 range is set as additionalParam
…ferent conditions
|
@HazarathKumarM ToT merge and Unified API changes to be reflected. |
No description provided.