RPP F32 QA : Review Comments Resolution#431
RPP F32 QA : Review Comments Resolution#431r-abishek merged 5 commits intor-abishek:ar/test_suite_upgrade_9_f32qafrom
Conversation
HazarathKumarM
commented
May 7, 2025
- Resolved review comments added by Rajy
- Modified SIMD print functions to use union which help us eliminating an extra storeu_ps instruction
There was a problem hiding this comment.
Pull Request Overview
This PR resolves review comments by modifying SIMD print functions to use union-based conversions, eliminating an extra storeu_ps instruction.
- Replaced temporary arrays with unions for SIMD print functions
- Consolidated and standardized the conversions across various data types
Comments suppressed due to low confidence (1)
src/include/common/cpu/rpp_cpu_simd_load_store.hpp:295
- [nitpick] Consider using the loop variable name 'ct' instead of 'i' for consistency with the other print functions.
for (int i = 0; i < 8; ++i)
r-abishek
left a comment
There was a problem hiding this comment.
@HazarathKumarM Pls address comments
| union { | ||
| __m128i vec; | ||
| char arr[16]; | ||
| } u; |
There was a problem hiding this comment.
Please don't do the union initialization inside every function. Take it out globally.
Next, we should ideally pass pointer to __m128i (call by reference). So receive it in __m128i* vPrintArrayPtr.
A global u is already defined, so we just use that here next
u.vec = *(vPrintArrayPtr);
This way, two copies are avoided:
- The call by value copy when you call
rpp_mm_print_epi8(__m128i vPrintArray) - The copy at
u.vec = vPrintArray;
Pls check u.arr should also be accessible after that.
|
|
|
||
| // Union for handling 128-bit SIMD data (SSE). | ||
| union RppSIMD128 { | ||
| __m128i m128i_val; |
There was a problem hiding this comment.
Open braces in next line as per RPP standard everywhere
| char printArray[16]; | ||
| _mm_storeu_si128((__m128i *)printArray, vPrintArray); | ||
| inline void rpp_mm_print_epi8(__m128i *v) { | ||
| RppSIMD128 u; |
There was a problem hiding this comment.
Same comment. Open braces in next line as per RPP standard everywhere
| // Union for handling 128-bit SIMD data (SSE). | ||
| union RppSIMD128 { | ||
| __m128i m128i_val; | ||
| __m128 m128_val; |
There was a problem hiding this comment.
Please dont use snake_case here too, call it m128iVal and m128Val
r-abishek
left a comment
There was a problem hiding this comment.
@HazarathKumarM Please address comments on following similar rpp standards
There was a problem hiding this comment.
Pull Request Overview
This PR resolves review comments from Rajy and refines SIMD print functions by introducing union-based types, while removing explicit boundary check calls in several tensor processing kernels.
- Removed calls to rpp_pixel_check_0to1 in threshold, crop, crop_and_patch, and blend kernels
- Revised SIMD print helper functions to use union-based data for improved performance and clarity
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| src/modules/tensor/cpu/kernel/threshold.cpp | Removed boundary check calls; relies on compute_threshold logic |
| src/modules/tensor/cpu/kernel/crop_and_patch.cpp | Removed boundary check calls in pixel processing |
| src/modules/tensor/cpu/kernel/crop.cpp | Removed boundary check calls in pixel processing |
| src/modules/tensor/cpu/kernel/blend.cpp | Removed boundary check calls during alpha-blending |
| src/include/common/cpu/rpp_cpu_simd_load_store.hpp | Introduced union types and refactored SIMD print helpers |
Comments suppressed due to low confidence (4)
src/modules/tensor/cpu/kernel/threshold.cpp:434
- Removal of boundary checks in threshold functions should be validated; ensure that compute_threshold_* functions or another mechanism guarantees valid pixel ranges. Consider adding a clarifying comment about how boundary conditions are now enforced.
rpp_pixel_check_0to1(p, 1);
src/modules/tensor/cpu/kernel/crop_and_patch.cpp:448
- With the boundary check removed, verify that pixel values remain within valid ranges as expected downstream. A brief comment explaining this design decision might improve code clarity.
rpp_pixel_check_0to1(p, 3);
src/modules/tensor/cpu/kernel/crop.cpp:231
- Ensure that the removal of this boundary check does not lead to invalid pixel values in subsequent processing; document the alternative validation mechanism if one exists.
rpp_pixel_check_0to1(p, 3);
src/modules/tensor/cpu/kernel/blend.cpp:341
- Verify that eliminating the boundary check in the blend function is safe and that the blending operation correctly constrains the pixel values. Consider documenting this behavior for future maintainers.
rpp_pixel_check_0to1(p1, 3);
| const __m256i avx_pxMaskG = _mm256_setr_epi8(0x80, 1, 0x80, 0x80, 4, 0x80, 0x80, 7, 0x80, 0x80, 10, 0x80, 0x80, 13, 0x80, 0x80, 16, 0x80, 0x80, 19, 0x80, 0x80, 22, 0x80, 0x80, 25, 0x80, 0x80, 28, 0x80, 0x80, 0x80); | ||
| const __m256i avx_pxMaskB = _mm256_setr_epi8(0x80, 0x80, 2, 0x80, 0x80, 5, 0x80, 0x80, 8, 0x80, 0x80, 11, 0x80, 0x80, 14, 0x80, 0x80, 17, 0x80, 0x80, 20, 0x80, 0x80, 23, 0x80, 0x80, 26, 0x80, 0x80, 29, 0x80, 0x80); | ||
|
|
||
| // Union for handling 128-bit SIMD data (SSE). |
There was a problem hiding this comment.
[nitpick] It may be helpful to add a brief explanation on why the union-based SIMD structures were introduced and how they contribute to eliminating extra store instructions, to aid future maintenance.
all review comments are addressed now |
* Add F32 QA Golden outputs * modify Doxygen comments * modify range check functions * RPP F32 QA : Review Comments Resolution (#431) * Modified SIMD print functions to use union * remove redundant unions in print functions * removed pixel checks * remove pixel check in threshold * resolve review comments --------- Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com> Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com> Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com> Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
…s/sphinx (r-abishek#431) Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.7.1 to 1.7.2. - [Release notes](https://github.com/ROCm/rocm-docs-core/releases) - [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md) - [Commits](ROCm/rocm-docs-core@v1.7.1...v1.7.2) --- updated-dependencies: - dependency-name: rocm-docs-core[api_reference] dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com>
* Add F32 QA Golden outputs * modify Doxygen comments * modify range check functions * RPP F32 QA : Review Comments Resolution (r-abishek#431) * Modified SIMD print functions to use union * remove redundant unions in print functions * removed pixel checks * remove pixel check in threshold * resolve review comments --------- Co-authored-by: HazarathKumarM <hazarathkumar@multicorewareinc.com> Co-authored-by: Kiriti Gowda <kiritigowda@gmail.com> Co-authored-by: HazarathKumarM <119284987+HazarathKumarM@users.noreply.github.com> Co-authored-by: Lakshmi Kumar <lakshmi.kumar@amd.com>
