Median and Gaussian Filter optimizations #681
RPP Median Filter and Gaussian Optimizations
Pull request overview
This PR implements performance optimizations for the median and Gaussian filters, targeting AVX2-level performance comparable to OpenCV. The changes introduce sorting networks for small kernels (3×3, 5×5), histogram-based methods for large kernels with U8 types, and various SIMD optimizations.
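As an illustration of the sorting-network idea for small kernels, here is a minimal scalar sketch of a 3×3 median. It is not taken from the PR: the names `cmpSwap` and `median3x3` are hypothetical, and the 19-exchange sequence is the widely circulated median-of-9 network, which may differ from the one the PR uses. Each exchange is a branch-free min/max pair, which is why the same sequence can in principle be applied to whole vectors with AVX2 intrinsics such as `_mm256_min_epu8` and `_mm256_max_epu8`.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Compare-exchange: after the call, a <= b. Branch-free min/max form.
static inline void cmpSwap(std::uint8_t &a, std::uint8_t &b) {
    const std::uint8_t lo = std::min(a, b);
    const std::uint8_t hi = std::max(a, b);
    a = lo;
    b = hi;
}

// Median of a 3x3 window (hypothetical helper, not the PR's code).
// The window is taken by value so the caller's data is left untouched.
std::uint8_t median3x3(std::array<std::uint8_t, 9> p) {
    // Sort each of the three triples (0,1,2), (3,4,5), (6,7,8).
    cmpSwap(p[1], p[2]); cmpSwap(p[4], p[5]); cmpSwap(p[7], p[8]);
    cmpSwap(p[0], p[1]); cmpSwap(p[3], p[4]); cmpSwap(p[6], p[7]);
    cmpSwap(p[1], p[2]); cmpSwap(p[4], p[5]); cmpSwap(p[7], p[8]);
    // p[6] becomes the largest of the triple minima,
    // p[2] the smallest of the triple maxima,
    // p[4] the median of the triple medians.
    cmpSwap(p[0], p[3]); cmpSwap(p[5], p[8]); cmpSwap(p[4], p[7]);
    cmpSwap(p[3], p[6]); cmpSwap(p[1], p[4]); cmpSwap(p[2], p[5]);
    cmpSwap(p[4], p[7]);
    // The overall median is the median of those three candidates.
    cmpSwap(p[4], p[2]); cmpSwap(p[6], p[4]); cmpSwap(p[4], p[2]);
    return p[4];
}
```

Because the exchange sequence is fixed and data-independent, there are no unpredictable branches, which is the property that makes this approach attractive for both SIMD lanes and GPU threads.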
Changes:
- Median filter: Added specialized 5×5 sorting network implementations for HIP and CPU with AVX2 vectorization
- Median filter: Introduced histogram-based median calculation for U8 types with large kernels (7×7, 9×9 and larger)
- Gaussian filter: Pre-computed filter coefficients, optimized kernel generation, added prefetching and loop unrolling
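The histogram-based median listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: `histogramMedian` is a hypothetical name, the window is passed as a flat vector, and the window size is assumed to be odd (49 for 7×7, 81 for 9×9). Since U8 values fall in 256 bins, the median is found by counting values and walking the cumulative count, so the cost does not grow with a comparison sort over the window.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Median of a U8 window via a 256-bin histogram (hypothetical helper).
// Assumes an odd number of elements, as for 7x7 or 9x9 kernels.
std::uint8_t histogramMedian(const std::vector<std::uint8_t> &window) {
    assert(!window.empty());

    // Count occurrences of each possible U8 value.
    std::array<std::uint32_t, 256> hist{};  // zero-initialized
    for (std::uint8_t v : window) {
        ++hist[v];
    }

    // The median is the value at 1-based rank size/2 + 1.
    const std::size_t target = window.size() / 2 + 1;
    std::size_t cumulative = 0;
    for (int bin = 0; bin < 256; ++bin) {
        cumulative += hist[bin];
        if (cumulative >= target) {
            return static_cast<std::uint8_t>(bin);
        }
    }
    return 255;  // not reached for a non-empty window
}
```

A production version would typically update the histogram incrementally as the window slides across a row instead of rebuilding it per pixel; whether the PR does so is not stated here.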
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| src/modules/tensor/hip/kernel/median_filter.cpp | Fixed double semicolon, added optimized 5×5 median implementation with sorting network, refactored compute_median with histogram method for large U8 kernels |
| src/modules/tensor/cpu/kernel/median_filter.cpp | Added AVX2-vectorized sorting networks for 3×3 and 5×5 median filters across multiple data types (U8, I8, F32, F16), implemented histogram-based median for large U8 kernels |
| src/modules/tensor/hip/kernel/gaussian_filter.cpp | Changed expf to __expf intrinsic, optimized kernel generation with single-pass normalization |
| src/modules/tensor/cpu/kernel/gaussian_filter.cpp | Pre-allocated and broadcast filter coefficients, optimized kernel generation, added prefetching, unrolled convolution loops for 3×3 kernel |
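The table mentions pre-computed coefficients and kernel generation with single-pass normalization. One way this could look on the host side is sketched below; it is an assumption about the approach, not the PR's code, and `makeGaussianKernel` is a hypothetical name. The sketch relies on the separability of the Gaussian: the 1D weights are normalized once, so the 2D kernel built from their products already sums to 1 and needs no second normalization pass.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Builds a normalized kSize x kSize Gaussian kernel in row-major order.
// Hypothetical helper; assumes kSize is odd and stdDev > 0.
std::vector<float> makeGaussianKernel(int kSize, float stdDev) {
    const int radius = kSize / 2;
    const float invTwoSigmaSq = 1.0f / (2.0f * stdDev * stdDev);

    // 1D weights exp(-d^2 / (2 * sigma^2)), accumulating their sum as we go.
    std::vector<float> w1d(static_cast<std::size_t>(kSize));
    float sum1d = 0.0f;
    for (int i = 0; i < kSize; ++i) {
        const float d = static_cast<float>(i - radius);
        w1d[i] = std::exp(-d * d * invTwoSigmaSq);
        sum1d += w1d[i];
    }

    // Normalize the 1D weights with one reciprocal instead of k divisions.
    const float invSum = 1.0f / sum1d;
    for (float &w : w1d) {
        w *= invSum;
    }

    // The outer product of normalized 1D weights is a normalized 2D kernel.
    std::vector<float> kernel(static_cast<std::size_t>(kSize) * kSize);
    for (int r = 0; r < kSize; ++r) {
        for (int c = 0; c < kSize; ++c) {
            kernel[static_cast<std::size_t>(r) * kSize + c] = w1d[r] * w1d[c];
        }
    }
    return kernel;
}
```

Computing the kernel once per call and reusing it for every pixel is what "pre-computed filter coefficients" amounts to in this sketch. On the HIP side the table notes that `expf` was replaced by the `__expf` intrinsic, which trades some accuracy for speed.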
Codecov Report
❌ Patch coverage is
Additional details and impacted files
Coverage Diff
| | develop | #681 | +/- |
|---|---|---|---|
| Coverage | 92.45% | 92.48% | +0.03% |
| Files | 215 | 215 | |
| Lines | 94897 | 95987 | +1090 |
| Hits | 87729 | 88767 | +1038 |
| Misses | 7168 | 7220 | +52 |
rrawther left a comment
Please address the review comments.
@r-abishek @HazarathKumarM code coverage dropped by 7%. Please take a look.
@LakshmiKumar23 With the recent CI runs, we observe the code coverage of this PR at 92.48%. We are not observing any drop.

This PR contains targeted optimizations to the median and Gaussian filters that match, and marginally beat, the AVX2-level performance of OpenCV.
PERFORMANCE COMPARISONS FOR MEDIAN FILTER AND GAUSSIAN FILTER