F16 load/store updates for 4 kernels #444

Merged
r-abishek merged 1 commit into r-abishek:ar/opt_f16_loads_stores_3 from Srihari-mcw:f16_load_store_updates_4_kernels on Jul 9, 2025

Srihari-mcw (Collaborator) commented Jun 8, 2025

  • Replaced scalar load/store and conversion to FP32 with AVX2 intrinsics for the following kernels: Brightness, Contrast, Magnitude, and Vignette
  • No additions or removals to the external user API
  • Observed gains of 4%–12% over the existing version
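
The scalar-to-vector change described above can be illustrated with a minimal sketch. This is not the code from this PR: `brightness_f16_avx2` is a hypothetical helper, raw `uint16_t` bit patterns stand in for the library's half type, and the `alpha * x + beta` operation is only assumed to resemble a brightness kernel. Note that the half/float conversion intrinsics (`_mm256_cvtph_ps`, `_mm256_cvtps_ph`) belong to the F16C extension, so the sketch enables both AVX2 and F16C through a GCC/Clang `target` attribute.

```cpp
#include <immintrin.h>  // AVX and F16C intrinsics (x86-64, GCC/Clang assumed)
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not an RPP function: dst[i] = src[i] * alpha + beta.
// src and dst hold IEEE 754 half-precision values as raw 16-bit patterns.
// The target attribute enables the intrinsics without -mavx2 -mf16c flags.
__attribute__((target("avx2,f16c")))
void brightness_f16_avx2(const uint16_t *src, uint16_t *dst, size_t n,
                         float alpha, float beta)
{
    const __m256 vAlpha = _mm256_set1_ps(alpha);
    const __m256 vBeta  = _mm256_set1_ps(beta);
    size_t i = 0;
    for (; i + 8 <= n; i += 8)
    {
        // One 128-bit load fetches 8 half values; one instruction widens them to FP32.
        __m128i h = _mm_loadu_si128(reinterpret_cast<const __m128i *>(src + i));
        __m256 f = _mm256_cvtph_ps(h);
        f = _mm256_add_ps(_mm256_mul_ps(f, vAlpha), vBeta);
        // Narrow back to FP16 (round to nearest even) and store 8 values at once.
        __m128i r = _mm256_cvtps_ph(f, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
        _mm_storeu_si128(reinterpret_cast<__m128i *>(dst + i), r);
    }
    // Tail: fewer than 8 elements remain, so convert them one at a time.
    for (; i < n; ++i)
    {
        float f = _cvtsh_ss(src[i]) * alpha + beta;
        dst[i] = _cvtss_sh(f, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
    }
}
```

A caller would be expected to confirm CPU support before invoking the helper (for example with `__builtin_cpu_supports("avx2")` on GCC/Clang). The sketch replaces eight scalar load, convert, and store sequences with one vector load, one conversion each way, and one vector store per iteration, which is the kind of saving the bullets above describe.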

Srihari-mcw (Collaborator, Author) attached two images.

r-abishek changed the base branch from develop to ar/opt_f16_loads_stores_3 on July 9, 2025, 05:44
r-abishek changed the title from "Updates for 4 kernels" to "F16 load/store updates for 4 kernels" on Jul 9, 2025

r-abishek (Owner) left a comment:

lgtm

r-abishek added the enhancement label (New feature or request) on Jul 9, 2025
r-abishek merged commit 73117d0 into r-abishek:ar/opt_f16_loads_stores_3 on Jul 9, 2025
ManasaDattaT pushed a commit to ManasaDattaT/rpp that referenced this pull request Dec 19, 2025