Investigate SVE acceleration for RPO hash function #158
Description
Rounds of the RPO hash function have a very regular structure which should be amenable to vectorized computation. This is especially true for the inverse alpha portion, which applies ~70 identical operations to each of the 12 state elements. This portion is by far the most time-consuming part of the hash function.
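To make the "identical operations" point concrete, here is a minimal scalar sketch of the inverse alpha step written lane by lane. It assumes RPO over the Goldilocks field (p = 2^64 - 2^32 + 1) with alpha = 7 and inverse exponent 10540996611094048183. The names `mul_mod`, `mul_lanes`, `pow_lanes` and `apply_inv_sbox` are hypothetical and not taken from this crate. It uses plain square-and-multiply rather than the shorter addition chain the crate presumably uses, and it contains no SVE intrinsics. The point is only that the sequence of operations depends on the fixed exponent and never on the state, so every `mul_lanes` call is a candidate for one vectorized multiply across the state.

```rust
/// Goldilocks prime p = 2^64 - 2^32 + 1 (assumed field for RPO).
const P: u128 = 0xFFFF_FFFF_0000_0001;
/// Assumed inverse of alpha = 7 modulo p - 1: 7 * INV_ALPHA = 1 (mod p - 1).
const INV_ALPHA: u64 = 10540996611094048183;
/// Number of field elements in the RPO state.
const STATE_WIDTH: usize = 12;

/// Naive modular multiplication through u128. A real kernel would use a
/// specialized Goldilocks reduction instead of the generic `%`.
fn mul_mod(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P) as u64
}

/// Multiplies two states lane by lane. All 12 lanes perform the same
/// operation, which is what a vector unit could do in a few instructions.
fn mul_lanes(a: &[u64; STATE_WIDTH], b: &[u64; STATE_WIDTH]) -> [u64; STATE_WIDTH] {
    let mut out = [0u64; STATE_WIDTH];
    for i in 0..STATE_WIDTH {
        out[i] = mul_mod(a[i], b[i]);
    }
    out
}

/// Raises every lane to the same power with square-and-multiply.
/// Branches depend only on `exp`, never on the state values.
fn pow_lanes(state: &[u64; STATE_WIDTH], exp: u64) -> [u64; STATE_WIDTH] {
    let mut acc = [1u64; STATE_WIDTH];
    for bit in (0..64).rev() {
        acc = mul_lanes(&acc, &acc);
        if (exp >> bit) & 1 == 1 {
            acc = mul_lanes(&acc, state);
        }
    }
    acc
}

/// Inverse alpha step: x -> x^(1/7) on every lane of the state.
fn apply_inv_sbox(state: &[u64; STATE_WIDTH]) -> [u64; STATE_WIDTH] {
    pow_lanes(state, INV_ALPHA)
}
```

Applying `pow_lanes(&apply_inv_sbox(&state), 7)` to a state of reduced field elements should return the original state, which gives a simple correctness check for any vectorized replacement.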
By using vectorized instructions, it may be possible to speed up the hash function by 2x - 3x (though this needs to be confirmed). As one of our target machines for Miden VM is Graviton 3, which supports the SVE extension, it would be great to see if we can get this type of speedup there.
Ideally, we'd want to add a feature to this crate which, when enabled, would replace the current pure Rust code with an SVE-accelerated version for either the entire RPO permutation, a single RPO round, or even just the inverse alpha computation.