Codegen for `LoadVector128` for a field of a struct is "poor"

As per the comment here: https://github.com/dotnet/corefx/pull/31779/files#r210758497

The `Sse` implementation of `Matrix4x4.Transpose` loads each row by taking the address of a field of the struct:
```csharp
var row1 = Sse.LoadVector128(&matrix.M11);
var row2 = Sse.LoadVector128(&matrix.M21);
var row3 = Sse.LoadVector128(&matrix.M31);
var row4 = Sse.LoadVector128(&matrix.M41);
```

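This leads to the following codegen, where each field address is first computed with `lea` (or copied with `mov`) and then copied into yet another register before the load: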
```asm
mov      rax, rdx
vmovups  xmm0, xmmword ptr [rax]
lea      rax, bword ptr [rdx+16]
mov      r8, rax
vmovups  xmm1, xmmword ptr [r8]
lea      r8, bword ptr [rdx+32]
mov      r9, r8
vmovups  xmm2, xmmword ptr [r9]
lea      r9, bword ptr [rdx+48]
mov      r10, r9
vmovups  xmm3, xmmword ptr [r10]
```

Ideally, we should be generating the following instead, with the field offset folded directly into the addressing mode of each load:
```asm
vmovups  xmm0, xmmword ptr [rdx]
vmovups  xmm1, xmmword ptr [rdx+16]
vmovups  xmm2, xmmword ptr [rdx+32]
vmovups  xmm3, xmmword ptr [rdx+48]
```
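
For anyone who wants to look at the codegen locally, here is a minimal standalone sketch that uses the same load pattern. The class and method names (`LoadRepro`, `SumOfFirstColumn`) are made up for illustration and are not part of corefx. It assumes a runtime that exposes `System.Runtime.Intrinsics` and a project compiled with `AllowUnsafeBlocks` enabled. The exact instructions emitted will likely differ between JIT versions.

```csharp
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

public static class LoadRepro
{
    // Hypothetical helper: loads the four rows exactly like the snippet above
    // (one LoadVector128 per row, each taking the address of a struct field),
    // adds them, and returns lane 0, i.e. M11 + M21 + M31 + M41.
    // NoInlining keeps the method body separate so its disassembly is easy to find.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static unsafe float SumOfFirstColumn(Matrix4x4 matrix)
    {
        if (!Sse.IsSupported)
        {
            // Scalar fallback for hardware without SSE.
            return matrix.M11 + matrix.M21 + matrix.M31 + matrix.M41;
        }

        // `matrix` is a by-value parameter, so taking the address of its
        // fields does not require a `fixed` statement.
        Vector128<float> row1 = Sse.LoadVector128(&matrix.M11);
        Vector128<float> row2 = Sse.LoadVector128(&matrix.M21);
        Vector128<float> row3 = Sse.LoadVector128(&matrix.M31);
        Vector128<float> row4 = Sse.LoadVector128(&matrix.M41);

        Vector128<float> sum = Sse.Add(Sse.Add(row1, row2), Sse.Add(row3, row4));
        return sum.ToScalar();
    }
}
```

Calling `LoadRepro.SumOfFirstColumn(Matrix4x4.Identity)` from a small console app and dumping the JIT disassembly for that method should show whether the four loads use `[reg+offset]` addressing directly or go through the extra `lea`/`mov` pairs.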
