Could you add a prefetch instruction to WebAssembly?
The syntax would be similar to a load.
A load takes its base pointer from the stack (pushed here with local.get), adds an immediate offset, and pushes the loaded value back onto the stack:
local.get 1           ;; src
f32.load offset=0     ;; load float from src
local.set 4
A prefetch would take a base and an offset in the same way, but produce no result:
local.get 1           ;; src
i32.prefetch offset=448   ;; prefetch src with an offset of 448 bytes
On arm64 this can be implemented with:
prfm pldl1keep, [src, 448]
On x64:
prefetcht0 448(src)
The performance improvement varies by CPU and function. On a Cortex-A53 a typical speedup is 20%.
On an Atom Cedarview, an SSSE3 function that converts RGB to greyscale speeds up by 9%.
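For comparison, here is a minimal C sketch of the same hint in native code, using the GCC/Clang builtin `__builtin_prefetch`. With a locality argument of 3, these compilers typically emit `prfm pldl1keep` on arm64 and `prefetcht0` on x64. The function name `rgb_to_grey`, the scalar loop, and the integer weights are assumptions made for illustration; this is not the SSSE3 function measured above, and only the 448-byte distance comes from the example in this issue.

```c
#include <stddef.h>
#include <stdint.h>

/* Prefetch distance in bytes, taken from the example in this issue. */
#define PREFETCH_OFFSET 448

/*
 * Hypothetical scalar RGB -> greyscale conversion.
 * src holds `pixels` packed RGB triples, dst receives one byte per pixel.
 * The weights 77/150/29 (sum 256) are an illustrative approximation.
 */
void rgb_to_grey(uint8_t *dst, const uint8_t *src, size_t pixels)
{
    for (size_t i = 0; i < pixels; i++) {
        const uint8_t *p = src + 3 * i;

        /*
         * Hint only: ask for the data PREFETCH_OFFSET bytes ahead of p.
         * Arguments: address, 0 = read, 3 = keep in all cache levels.
         * The address is formed with integer arithmetic so no pointer
         * past the end of the array is created; a prefetch of an
         * unmapped address does not fault.
         */
        __builtin_prefetch((const void *)((uintptr_t)p + PREFETCH_OFFSET), 0, 3);

        dst[i] = (uint8_t)((77u * p[0] + 150u * p[1] + 29u * p[2]) >> 8);
    }
}
```

A real implementation would likely issue one prefetch per cache line rather than one per pixel, and the best distance depends on the CPU, which is why the offset is an immediate in the proposed syntax.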