prefetch instruction #1364

@fbarchard

Could you add a prefetch instruction for WASM?
The syntax would be similar to a load.

The load instruction takes its base pointer from a local.get, applies an immediate offset, and then pushes the result onto the stack.

  local.get    1        # src
  f32.load     0        # load float from src
  local.set    4

A prefetch has a base and an offset, but produces no result:

  local.get    1        # src
  i32.prefetch 448      # prefetch src with offset of 448 bytes

On arm64 this would be implemented with

  prfm pldl1keep, [src, 448]

On x64, with

  prefetcht0 448(src)
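
For comparison, here is a rough sketch of how the same hint is usually written in native C today, using the GCC/Clang `__builtin_prefetch` builtin. With a read hint and locality 3, these compilers typically emit the `prfm pldl1keep` and `prefetcht0` forms shown above. The function name `sum_with_prefetch` and the summing loop are made up for illustration; only the 448-byte distance comes from this issue.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: sum n floats while hinting the cache
 * 448 bytes ahead of the element currently being read. */
static float sum_with_prefetch(const float *src, size_t n) {
    float total = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        /* Compute the hinted address with integer arithmetic so that no
         * out-of-bounds pointer is formed near the end of the array. */
        uintptr_t ahead = (uintptr_t)(src + i) + 448;
        /* rw = 0 (read), locality = 3 (keep in all cache levels).
         * This is only a hint: it returns nothing and does not fault. */
        __builtin_prefetch((const void *)ahead, 0, 3);
        total += src[i];
    }
    return total;
}
```

A tuned loop would more likely issue one prefetch per cache line rather than one per element; the per-element form is kept here only to mirror the load/prefetch pairing in the snippets above.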

The performance improvement varies by CPU and function. On Cortex-A53 a typical speedup is 20%.
On Atom Cedarview, an SSSE3 function that converts RGB to greyscale speeds up by 9%.
