Proposal to add fixed-point multiplication instructions

(Branched from Issue #175)

It would be useful to have fixed-point multiplication instructions, e.g. 32x32=32 and 16x16=16, similar to ARM SQRDMULH.

Some may think that the availability of a 32x32=64 integer multiplication (Issue #175) would remove the need for that, but that would be sub-optimal: staying within 32bit means doing 4 scalar operations per 128-bit vector operation, and most applications want to use the rounding flavor (SQRDMULH not SQDMULH) which would require a few more instructions to emulate if the instruction is missing, which in practice would result in applications making compromises between accuracy and performance.

(This is critical to integer-quantized neural network applications as performed by TensorFlow Lite using the ruy matrix multiplication library, see e.g. the usage of these instructions here, https://github.com/google/ruy/blob/57e64b4c8f32e813ce46eb495d13ee301826e498/ruy/kernel_arm64.cc#L517 )

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal to add fixed-point multiplication instructions #221

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Proposal to add fixed-point multiplication instructions #221

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions