Random erase dropout kernel implementation#556

Merged
r-abishek merged 67 commits into r-abishek:ar/dropout_random_erase from RooseweltMcW:apr/random_erase_dropout
Feb 18, 2026

Conversation

@RooseweltMcW

No description provided.

snehaa8 and others added 30 commits August 22, 2025 07:39
…ied QA for logic to implement random seed for non QA
#include "hip_tensor_executors.hpp"
#include <random>

__device__ inline uint generate_seed(uint x, uint y, uint z, int seed)
Collaborator


add comments for all the hardcoded numbers and specify in the comments why those numbers are needed
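Since the body of generate_seed is not shown in this hunk, here is a hypothetical plain-C++ sketch of the commenting style being requested: every hardcoded constant carries a comment explaining why it is there. The constants and mixing steps below are illustrative assumptions, not the PR's actual values.

```cpp
#include <cstdint>

// Hypothetical per-pixel seed generator (illustrative only, not the PR's
// generate_seed): combines thread coordinates with a user seed, with each
// magic number documented as the reviewer asks.
inline uint32_t generate_seed_sketch(uint32_t x, uint32_t y, uint32_t z, int seed)
{
    // 0x9E3779B9 is the 32-bit golden-ratio constant, widely used in
    // Fibonacci hashing to decorrelate nearby input values.
    uint32_t h = static_cast<uint32_t>(seed) + x * 0x9E3779B9u;
    h ^= y + 0x9E3779B9u + (h << 6) + (h >> 2);  // hash_combine-style mix of y
    h ^= z + 0x9E3779B9u + (h << 6) + (h >> 2);  // hash_combine-style mix of z
    // Wang-hash finalizer: avalanches the bits so adjacent pixel
    // coordinates still yield well-spread seeds.
    h = (h ^ 61u) ^ (h >> 16);
    h *= 9u;            // odd multiplier, invertible mod 2^32
    h ^= h >> 4;
    h *= 0x27d4eb2du;   // odd mixing constant from the Wang hash
    h ^= h >> 15;
    return h;
}
```

The point is not the particular hash but that each literal explains its own presence.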

Author


As per the new logic, the LCG method for producing random noise has been removed and the noise buffer is passed in from the test suite. Done.
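A minimal host-side sketch of that approach, assuming an 8-bit noise buffer filled with std::mt19937 (both assumptions; the test suite's actual buffer type and RNG are not shown in this thread):

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical host-side noise buffer, matching the reply above: random fill
// values are generated once on the host and handed to the kernel, instead of
// running an LCG inside the kernel.
std::vector<uint8_t> make_noise_buffer(size_t numPixels, uint32_t seed)
{
    std::mt19937 rng(seed);                           // deterministic for a fixed seed
    std::uniform_int_distribution<int> dist(0, 255);  // full 8-bit pixel range
    std::vector<uint8_t> noise(numPixels);
    for (auto &v : noise)
        v = static_cast<uint8_t>(dist(rng));
    return noise;
}
```

A fixed seed keeps QA runs reproducible while still exercising arbitrary fill values.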

@HazarathKumarM HazarathKumarM changed the base branch from develop to master January 13, 2026 08:56
@HazarathKumarM HazarathKumarM changed the base branch from master to develop January 13, 2026 08:56
Collaborator

@HazarathKumarM HazarathKumarM left a comment


@RooseweltMcW please address the comments

#endif // GPU_SUPPORT

/*! \brief Random Erase augmentation on HOST backend for a NCHW/NHWC layout tensor
* \details This function erases erases random regions from an image and fills with random noise, for a batch of RGB(3 channel) / greyscale(1 channel) images with an NHWC/NCHW tensor layout.<br>
Collaborator


The word "erases" is repeated twice. Please correct it.

Author


Modified

* \param [in] anchorBoxInfoTensor anchorBoxInfo values of type RpptRoiLtrb for each erase-region inside each image in the batch HOST memory. Restrictions -
- 0 <= anchorBoxInfo[i] < respective image width/height
- Erase-region anchor boxes on each image given by the user must not overlap
* \param [in] numBoxesTensor number of erase-regions per image, for each image in the batch (numBoxesTensor[n] >= 0)
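The non-overlap restriction documented above could be validated with a check along these lines; LtrbBox and boxes_overlap are hypothetical names for illustration, not part of the RPP API:

```cpp
// Hypothetical overlap test for the documented restriction: erase-region
// LTRB boxes within one image must not intersect.
struct LtrbBox { float l, t, r, b; };

inline bool boxes_overlap(const LtrbBox &a, const LtrbBox &b)
{
    // The boxes are disjoint exactly when one lies fully to the left of,
    // to the right of, above, or below the other; otherwise they overlap.
    return !(a.r < b.l || b.r < a.l || a.b < b.t || b.b < a.t);
}
```

Note this sketch treats boxes that merely touch on an edge as overlapping; whether that should count is a design choice for the validator.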
Collaborator


For random erase it's always one box, right? Do we even need to pass this tensor to the kernel?

Author


Removed, Done

RpptROIPtr roiPtrInput = &roiTensorPtrSrc[batchCount];
compute_roi_validation_host(roiPtrInput, &roi, &roiDefault, roiType);

Rpp32u numBoxes = numBoxesTensor[batchCount];
Collaborator


As commented above, I feel numBoxes won't be needed.

Author


Removed


for(int count = 0; count < numBoxes; count++)
{
Rpp32u x1 = static_cast<Rpp32u>(RPPPRANGECHECK(anchorBoxInfo[count].lt.x, roi.xywhROI.xy.x, roi.xywhROI.roiWidth));
Collaborator


Modify the function name here; it should be RPPRANGECHECK.

Author


This change would require renaming RPPP to RPP in every other function that uses this macro, which has to be a global change.
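For context, the clamp such a range-check macro performs can be sketched as a standalone helper; range_check_sketch is a hypothetical stand-in with the same argument order as the call above (value, ROI offset, ROI length), not the actual RPP macro:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the range-check discussed above: clamps a box
// coordinate into [roiOffset, roiOffset + roiLength - 1] so an erase region
// can never start outside the valid ROI.
inline uint32_t range_check_sketch(float value, uint32_t roiOffset, uint32_t roiLength)
{
    float lo = static_cast<float>(roiOffset);
    float hi = static_cast<float>(roiOffset + roiLength - 1);
    return static_cast<uint32_t>(std::min(std::max(value, lo), hi));
}
```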

int id_y = hipBlockIdx_y * hipBlockDim_y + hipThreadIdx_y;
int id_z = hipBlockIdx_z * hipBlockDim_z + hipThreadIdx_z;

auto &roi = roiTensorPtrSrc[id_z].xywhROI;
Collaborator


Is this required? Please modify the condition below like all the other kernels.

Author


Removed, Done.

Rpp32u numBoxes = numBoxesTensor[id_z];
uint dstIdx = id_z * dstStridesNH.x + id_y * dstStridesNH.y + id_x * 3;

for (int i = 0; i < numBoxes; i++)
Collaborator


A loop inside a HIP kernel? Can we not make use of the GPU threads and make it parallel?

Author


Removed; only the single-box logic is processed now.

// if src layout is NHWC, copy src to dst
if (srcDescPtr->layout == RpptLayout::NHWC)
{
hipMemcpyAsync(dstPtr, srcPtr, static_cast<size_t>(srcDescPtr->n * srcDescPtr->strides.nStride * sizeof(T)), hipMemcpyDeviceToDevice, handle.GetStream());
Collaborator


srcPtr is already copied to dstPtr, so why was there a for loop over the number of boxes in the HIP kernels?


int globalThreads_x = dstDescPtr->w;
int globalThreads_y = dstDescPtr->h;
int globalThreads_z = handle.GetBatchSize();
Collaborator


As discussed, launch one kernel per box. The global thread sizes have to be replaced with the actual box dimensions.
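A sketch of that per-box launch-size computation, assuming a 16x16 thread block; grid_for_box and GridDims are illustrative names, not RPP code:

```cpp
#include <cstdint>

// Hypothetical grid-size computation for a per-box kernel launch, as
// suggested above: threads cover the erase box, not the whole image.
struct GridDims { uint32_t x, y, z; };

inline GridDims grid_for_box(uint32_t boxWidth, uint32_t boxHeight,
                             uint32_t batchSize,
                             uint32_t blockX = 16, uint32_t blockY = 16)
{
    // Ceil-divide so partial blocks at the box edge are still covered;
    // the kernel must then bounds-check id_x < boxWidth, id_y < boxHeight.
    return { (boxWidth + blockX - 1) / blockX,
             (boxHeight + blockY - 1) / blockY,
             batchSize };
}
```

Launching over box extents instead of dstDescPtr->w and dstDescPtr->h avoids idling the vast majority of threads when the erase region is small.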

r-abishek and others added 5 commits January 19, 2026 04:34
RooseweltMcW pushed a commit to HazarathKumarM/rpp that referenced this pull request Jan 21, 2026
…ocs/sphinx (r-abishek#556)

Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.19.1 to 1.20.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v1.19.1...v1.20.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.20.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
@r-abishek r-abishek added the enhancement New feature or request label Feb 18, 2026
@r-abishek r-abishek changed the base branch from develop to ar/dropout_random_erase February 18, 2026 23:45
@r-abishek r-abishek merged commit e0b5d2b into r-abishek:ar/dropout_random_erase Feb 18, 2026
4 participants