Tensor Arithmetic Operations - Updated Branch #491

Open
Srihari-mcw wants to merge 131 commits into r-abishek:ar/broadcasting_tensor_arithmetic from Srihari-mcw:tensor_arithmetic_float_divide

Conversation

@Srihari-mcw
Collaborator

No description provided.

HazarathKumarM and others added 30 commits February 27, 2026 04:19
* Sobel filter implementation for HOST with QA for U8 and F32

* Fixed HIP F32 image generation and QA passed for 1e-4

* Revert test suite image file

* Updated HOST backend implementation for sobel filter

* Updated documentation and added images

* Resolved review comments

* Resolved review comments and modified error status to set for two different conditions

* updated unified api for sobel filter

* cleanup sobel filter api

* Resolve copilot review comments

* resolve review comments

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
* Optimized version of channel dropout HIP backend and working code for HOST AVX, SSE

* Modified name for dropout compute function

* Modified the AVX and SSE versions of channel dropout to avoid if statements

* Modified Channel Dropout with generic compute code reused

* Parameters and name change for channel dropout

* Modified HIP for better performance

* Modified the code and made the channel dropout templated version for all the bitdepths

* Modified Erase kernel for grid and cutout dropout version for better performance

* Modified the .h file to have the dropout to effects and added output images in the docs

* Added output image and modified the .h file to effects for channel dropout

* Removed space

* added space

* Moved grid dropout .h to effects

* Modified channel dropout for I8 variant HOST side

* Resolved all review comments and modified code to produce results for i8 variant

* Removed empty line

* Resolved review comments

* Modified HOST after merge

* Made changes after merging and QA passed for dropout

* Channel dropout make_float 4 macro changes

* Updated QA with random generator and updated BIN files

* Modified QA name changes

* Modified RandomSeed value passed as parameter to the function call

* Update rppt_tensor_effects_augmentations.cpp

indentation modified

* Initial modified HIP backend Grid dropout with better performance

* Removed space and review comments resolved

* Added I8 support for grid dropout

* Updated and modified indentation

* Modified HIP backend test suite changes

* Host side modification for dropout to use init function in API level and have a separate file for Grid dropout HOST backend

* Modified color buffer to use scratch Buffer Host

* colon removed

* Fixed linker issue and HIP backend passed

* Added random erase dropout functionality and modified the test suite

* channel dropout implementation

* Resolved all the review comments and modified the magic number to set as constant for better understanding, added required comments

* Removed other variants of dropout

* Removed rd inside kernel

* Update kernel and removed unwanted functions

* HOST modifications for randomErase

* Modified randomization in test suite

* Removed unwanted files and headers

* Resolved review comments

* Updated documentation and resolved comments

* updated random noise generation in test suite HOST implementation

* Updated kernel files for random erase to remove noise generation logic in test suite and pass buffer for random noise

* updated init dropout function

* Added break statement after merge conflicts

* ROI fixes - Box filter and Median filter (ROCm#652)

* Add Box and Median Filter ROI fixes after minor corrections

* Fix source index computation

---------

Co-authored-by: Mukesh <mukesh.jayakodi@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>

* Removed numBoxTensor parameter and resolved review comments

* Added early condition to check and return the invalid ROI region

* Updated kernel to use anchorBox info for random erase kernel

* Add unified api for random_erase

* Resolved review comments

* Updated param name for batchSize

* Resolved review comments

* Reverted modified changes

---------

Co-authored-by: sampath117 <snehaa@multicorewareinc.com>
Co-authored-by: RooseweltMcW <austin.roosewelt@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Mukesh <mukesh.jayakodi@multicorewareinc.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
Emboss kernel

---------

Co-authored-by: RooseweltMcW <austin.roosewelt@multicorewareinc.com>
Co-authored-by: root <root@ixt-sjc2-52.local.lan>
Co-authored-by: sampath117 <snehaa@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
* Add HIP kernel launch error checking

* Rename RPP_ERROR_GPU -> RPP_ERROR_HIP_LAUNCH

* Address PR comments

* Move melScalePtr deletion to before the HIP launch

* Add checks for HIP sobel filter

* Fix docstring

* Move HIP_CHECK_LAUNCH_RETURN to sobel_filter.cpp

* Bump version and update changelog

* Address PR comments

---------

Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Dropout (Grid and Cutout) on HIP and HOST

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: sampath117 <snehaa@multicorewareinc.com>
Co-authored-by: RooseweltMcW <austin.roosewelt@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: arvindcheru <90783369+arvindcheru@users.noreply.github.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Co-authored-by: Maddisetty <hmaddise@ctr2-alola-login-01.amd.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
…ocs/sphinx (ROCm#692)

Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.32.0 to 1.33.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.33.1/CHANGELOG.md)
- [Commits](ROCm/rocm-docs-core@v1.32.0...v1.33.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.33.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
Dropout Kernels

---------

Co-authored-by: sampath117 <snehaa@multicorewareinc.com>
Co-authored-by: RooseweltMcW <austin.roosewelt@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
* add yuv_To_rgb kernel, test case and 3 test images and their outputs

* clean up kernel parameters

* test suite upgrade

* fix kernel to support full and studio range

* change copyright to 2026

* add enums for color standard and range

* update version for adding this new kernel

* add hip platform to CMakeLists

* fix merge conflict errors
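
The commits above mention supporting both full and studio range in the YUV to RGB kernel. The PR does not show the kernel itself, so the following is only a minimal per-pixel sketch, assuming 8-bit samples and BT.601 coefficients. The names `ColorRange`, `Rgb8`, `clampToU8` and `yuvToRgbBt601` are hypothetical and are not taken from the RPP API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical enum; the PR adds enums for color standard and range but does not show them.
enum class ColorRange { Full, Studio };

struct Rgb8 { std::uint8_t r, g, b; };

// Round to nearest and saturate to the 8-bit range.
static std::uint8_t clampToU8(float v)
{
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, std::round(v))));
}

// Converts one 8-bit YUV sample to RGB using BT.601 coefficients (an assumption).
Rgb8 yuvToRgbBt601(std::uint8_t y, std::uint8_t u, std::uint8_t v, ColorRange range)
{
    float yf = static_cast<float>(y);
    float uf = static_cast<float>(u) - 128.0f;   // center chroma around zero
    float vf = static_cast<float>(v) - 128.0f;
    if (range == ColorRange::Studio)
    {
        // Studio range: luma spans 16..235, chroma spans 16..240; rescale to full range.
        yf = (yf - 16.0f) * (255.0f / 219.0f);
        uf *= 255.0f / 224.0f;
        vf *= 255.0f / 224.0f;
    }
    Rgb8 out;
    out.r = clampToU8(yf + 1.402f * vf);
    out.g = clampToU8(yf - 0.344136f * uf - 0.714136f * vf);
    out.b = clampToU8(yf + 1.772f * uf);
    return out;
}
```

With neutral chroma (`u == v == 128`) the output is gray, so studio-range black (`y == 16`) maps to 0 and studio-range white (`y == 235`) maps to 255.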
* Updates for tensor bitwise operations to have two inputs

* Initial addition across pipeline

* Updates for testing and fix issues

* Testing i8

* Tensor Binary Bitwise Operations extended for short and int datatypes

* Changes in test suite to allocate memory for int32 and uint32 datatypes

* Updates for templating bitwise binary operations

* Initial updates for a HIP Version

* Fixes for U8 HIP Implementation and Running

* Initial updates for 2D bitwise or

* Update the tensor implementations to fix accuracy

* Updates to not replace the pointers

* Updates for 4D or higher

* Fix compilation issues

* Updates to fix output for 4D case

* Fixes for 1D case

* Modifications in outer API and support other bit depths also

* Code addition to vectorized test of uchar type

* Updates to test U8/F32 vector versions

* Updates for fixing tensor bitwise operations vectorization with integer

* Updates to test performance

* Fix memory related bugs with test suite

* Add Strides definition for individual samples

* Updates to use newer strides and ROI for computation

* Integrated broadcast support and templated all tensor bitwise operations

* Added test suite support

* updated the incompatible dimensions check

* updated mem allocs

* Updates to check overall dims to avoid memory errors

* Fix issues found while testing different ndims

* Rename variables and remove print statements

* Allocate less memory

* Remove unnecessary files

* Remove dependency of broadcast.hpp

* Updates for CMakeLists.txt

* Remove unused function

* Remove line

* Cleanup rppt_tensor_bitwise_operations.cpp file

* Add documentation for external API functions and remove load/store functions

* Further cleanup of files

* Rename hip kernel file

* Update comments

* Initial cleanup of HIP Kernels

* Add EOF line in common.py

* Further cleanup of kernel files

* Add changes for separating broadcasting on HOST

* Update the code for HIP for broadcastMode

* Fix compilation issues

* Update vectorized implementation for bitwise OR

* Update code for 3D vectorization

* Update code for 4D vectorization

* Template implementations

* Separate broadcast and non broadcast codes into two paths

* Pass only the destination strides

* broadcastMode introduced as parameter

* Add comments

* Test Suite Fixes for HOST

* Further changes to rpp_test_suite_misc.h

* Add more fixes

* Fixes for HIP Side

* Fix issues with test suite

* Update the compare_output calls

* Minor cleanup fixes

* Updates for consistency

* Check for nullptr

* Further cleanup updates

* Cleanup fixes

* Further cleanup files

* Complete remaining cleanup

* Fix test suite compilation issues

* Fix QA issues with tensor bitwise with broadcasting

* Fix QA issues with tensor and

* Update vectorized versions for 1D, 2D and 3D cases of broadcasting

* Compile fixes with vectorized version

* Add Bitwise Operations, Function Templates Struct and Bin File Consolidation for both HOST and HIP.

* Fix rename to bitDepthTestMode, Datatype format and remove other bin files

* Fix Broadcast condition for HIP & HOST and add roi broadcast condition

* Fix the rename of bitDepthTestMode and resolve test suite error

* Fix Comments correction, Malloc Corrections and I8 datatype conversion.

* update unified API for tensor bitwise ops

* fix HOST QA issues

* revert unnecessary changes

* fix normalize issues and cleanup

* revert error code capture code

* resolved copilot review comments

* fix build error

* resolve review comments

* resolve review comments

---------

Co-authored-by: Srihari-mcw <srihari@multicorewareinc.com>
Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Mukesh <mukesh.jayakodi@multicorewareinc.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
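
The commits above refer to an incompatible-dimensions check and to broadcasting for the two-input bitwise operations. The PR page does not state the exact rule used, so this is a minimal HOST-side sketch assuming NumPy-style broadcasting: shapes are aligned from the trailing axis, and two sizes are compatible when they are equal or one of them is 1. The names `broadcastShape`, `broadcastStrides` and `bitwiseOrBroadcast` are hypothetical, not RPP functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Shape = std::vector<std::size_t>;

// Computes the broadcast output shape; returns false if the shapes are incompatible.
bool broadcastShape(const Shape &a, const Shape &b, Shape &out)
{
    std::size_t nd = std::max(a.size(), b.size());
    out.assign(nd, 1);
    for (std::size_t i = 0; i < nd; i++)
    {
        // Walk from the trailing axis; missing leading axes count as size 1.
        std::size_t da = (i < a.size()) ? a[a.size() - 1 - i] : 1;
        std::size_t db = (i < b.size()) ? b[b.size() - 1 - i] : 1;
        if (da != db && da != 1 && db != 1)
            return false;
        out[nd - 1 - i] = (da == 1) ? db : da;
    }
    return true;
}

// Row-major element strides of `in` viewed at the output rank; broadcast axes get stride 0.
Shape broadcastStrides(const Shape &in, const Shape &out)
{
    Shape strides(out.size(), 0);
    std::size_t running = 1;
    for (std::size_t i = 0; i < in.size(); i++)
    {
        std::size_t inAxis = in.size() - 1 - i;
        std::size_t outAxis = out.size() - 1 - i;
        strides[outAxis] = (in[inAxis] == 1) ? 0 : running;
        running *= in[inAxis];
    }
    return strides;
}

// Element-wise OR with broadcasting; returns an empty vector for incompatible shapes.
std::vector<std::uint8_t> bitwiseOrBroadcast(const std::vector<std::uint8_t> &a, const Shape &aShape,
                                             const std::vector<std::uint8_t> &b, const Shape &bShape)
{
    Shape outShape;
    if (!broadcastShape(aShape, bShape, outShape))
        return {};
    Shape sa = broadcastStrides(aShape, outShape);
    Shape sb = broadcastStrides(bShape, outShape);
    std::size_t total = 1;
    for (std::size_t d : outShape) total *= d;

    std::vector<std::uint8_t> out(total);
    for (std::size_t idx = 0; idx < total; idx++)
    {
        // Decompose the flat output index into per-axis coordinates, last axis fastest.
        std::size_t rem = idx, offA = 0, offB = 0;
        for (std::size_t ax = outShape.size(); ax-- > 0;)
        {
            std::size_t coord = rem % outShape[ax];
            rem /= outShape[ax];
            offA += coord * sa[ax];
            offB += coord * sb[ax];
        }
        out[idx] = a[offA] | b[offB];
    }
    return out;
}
```

A stride of 0 on a broadcast axis lets the same input element be reused along that axis without copying it, which is one common way to keep a single indexing path for broadcast and non-broadcast inputs.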
* added support for tensor add tensor and tensor mul scalar

* Add golden outputs

* update RPP version

* resolve review comments

* rename new audio function names

* rename new hip audio functions

---------

Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
* GPU support for Histogram Equalize

* HOST and HIP version of Histogram Equalise with QA for U8

* Updated build LUT function with AVX version

* Updated docs image and updated AVX logic

* Optimize Hist Equalize

* Updated Helper function

* Updated helpers

* Cleanup and clamp condition

* Clean up for histogram equalize HOST

* Cleanup

* Code cleanup

* Taking memory from scratch buffer

* Changed C style casts

* updated unified api for histogram equalize

* updated the documentation

* Resolved review comments

* Updated case list for histogram equalize

* Updated parameter names

* modified aligned length

* modified mem allocations

* resolve review comments

* resolve review comments

* resolve review comments

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: ManasaDattaT <tammisetti.manasadatta@multicorewareinc.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
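
The commits above mention a "build LUT" function and a clamp condition for histogram equalization on U8 data. The PR page does not show that function, so the block below is a minimal scalar sketch of the textbook CDF-based lookup table, `lut[i] = round((cdf[i] - cdfMin) / (N - cdfMin) * 255)`. The name `buildEqualizeLut` and the identity fallback for single-valued input are assumptions, not the RPP implementation.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds a 256-entry equalization lookup table from 8-bit pixel values.
std::array<std::uint8_t, 256> buildEqualizeLut(const std::vector<std::uint8_t> &pixels)
{
    std::array<std::size_t, 256> hist{};
    for (std::uint8_t p : pixels)
        hist[p]++;

    // cdfMin is the cumulative count at the first non-empty bin.
    std::size_t cdfMin = 0;
    for (std::size_t h : hist)
    {
        if (h != 0) { cdfMin = h; break; }
    }

    std::array<std::uint8_t, 256> lut{};
    std::size_t total = pixels.size();
    if (total == cdfMin)
    {
        // Empty or single-valued input: nothing to stretch, so use the identity map (assumption).
        for (std::size_t i = 0; i < 256; i++)
            lut[i] = static_cast<std::uint8_t>(i);
        return lut;
    }

    double scale = 255.0 / static_cast<double>(total - cdfMin);
    std::size_t cdf = 0;
    for (std::size_t i = 0; i < 256; i++)
    {
        cdf += hist[i];
        // Bins below the first occupied one map to 0; everything else is scaled and clamped.
        double mapped = (cdf < cdfMin) ? 0.0 : std::round(static_cast<double>(cdf - cdfMin) * scale);
        lut[i] = static_cast<std::uint8_t>(std::min(mapped, 255.0));
    }
    return lut;
}
```

Applying the table is then a per-pixel lookup, `dst[i] = lut[src[i]]`, which is the part that vectorized or GPU versions would typically parallelize.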
* optimized Median Filter 3 and 5 kernels

* optimizations for kernel size 7 and 9

* cleanup

* optimize HIP code and cleanup

* gaussian filter optimization

* resolve review comments

* fix build error

* resolve review comments

* resolve review comments

* resolved review comments

* resolved review comments

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com>
Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
* update images in readme

* Add voxel gifs, modify the images for 2D augmentations

* update testsuite readme

* updated original images

* updated augmentation table and added slice image

---------

Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com>
Co-authored-by: Rajy Rawther <Rajy.MeeyakhanRawther@amd.com>
* Suppress unused functions (roi_conversion)

* Add extra braces to hip load helpers in rpp_hip_load_store.hpp

* Remove unused variables in rpp_cpu_simd_load_store.hpp

* Check return status in random_erase.cpp

* Resolve uninitialized variables and fix bug in tensor_mean.cpp/jitter.cpp

* Suppress any warnings that haven't been addressed in phase 1

* Fix typo in function name

* Fix typo in function name

* Fix typos

---------

Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
Labels

enhancement New feature or request


8 participants