ARM64: Suboptimal codegen for addressing modes #93263

@EgorBo

Description

void Test(byte* dst, byte* src, nuint i)
{
    var vec1 = Vector128.Load(src + i);
    var vec2 = Vector128.Load(src + i + 16);
    vec1.Store(dst + i);
    vec2.Store(dst + i + 16);
}

x64 codegen:

; Assembly listing for method Benchmarks:Test(ulong,ulong,ulong):this (FullOpts)
       vzeroupper 
       vmovups  xmm0, xmmword ptr [r8+r9]
       vmovups  xmm1, xmmword ptr [r8+r9+0x10]
       vmovups  xmmword ptr [rdx+r9], xmm0
       vmovups  xmmword ptr [rdx+r9+0x10], xmm1
       ret      
; Total bytes of code 30

ARM64 codegen:

; Assembly listing for method Benchmarks:Test(ulong,ulong,ulong):this (FullOpts)
            stp     fp, lr, [sp, #-0x10]!
            mov     fp, sp

            ldr     q16, [x2, x3]
            add     x0, x2, x3
            ldr     q17, [x0, #0x10]
            str     q16, [x1, x3]
            add     x0, x1, x3
            str     q17, [x0, #0x10]

            ldp     fp, lr, [sp], #0x10
            ret     lr
; Total bytes of code 40

Expected codegen for ARM64:

; Assembly listing for method Benchmarks:Test(ulong,ulong,ulong):this (FullOpts)
            stp     fp, lr, [sp, #-0x10]!
            mov     fp, sp

            add     x0, x2, x3
            ldp     q16, q17, [x0]
            add     x0, x1, x3
            stp     q16, q17, [x0]

            ldp     fp, lr, [sp], #0x10
            ret     lr
; Total bytes of code 32

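It seems that Morph marks the address as NO_CSE for the (BASE + INDEX) + CNS pattern, even though on ARM64 a single addressing mode cannot contain both an index register and an immediate offset anyway. As a result, BASE + INDEX is not shared between the two accesses: the first load and store use [base, index] directly, and a separate add recomputes the same sum for the second access.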
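For anyone who wants to experiment with this at the source level, here is a minimal sketch that computes BASE + INDEX once per pointer by hand, so that only a constant offset remains on the second access. The names `AddressingSketch`, `TestHoisted` and `Check` are hypothetical and not part of the issue. Whether the JIT then emits a single `add` and pairs the accesses into `ldp`/`stp` is not confirmed here (the JIT is free to fold the locals back into the original expression), so treat this as something to inspect in a disassembly rather than as a guaranteed workaround. It assumes .NET 7 or later and `AllowUnsafeBlocks`.

```csharp
using System;
using System.Runtime.Intrinsics;

static unsafe class AddressingSketch
{
    // Hypothetical variant of Test from the issue: BASE + INDEX is
    // computed once per pointer, leaving only "+ 16" on the second access.
    public static void TestHoisted(byte* dst, byte* src, nuint i)
    {
        byte* s = src + i; // src + i, shared by both loads
        byte* d = dst + i; // dst + i, shared by both stores

        Vector128<byte> vec1 = Vector128.Load(s);
        Vector128<byte> vec2 = Vector128.Load(s + 16);
        vec1.Store(d);
        vec2.Store(d + 16);
    }

    // Small self-check: copies 32 bytes starting at offset 8 and verifies
    // that exactly those bytes were written to the destination.
    public static bool Check()
    {
        byte[] src = new byte[64];
        byte[] dst = new byte[64];
        for (int k = 0; k < src.Length; k++)
        {
            src[k] = (byte)(k + 1); // non-zero so untouched bytes are detectable
        }

        fixed (byte* ps = src, pd = dst)
        {
            TestHoisted(pd, ps, 8);
        }

        for (int k = 0; k < dst.Length; k++)
        {
            bool inCopiedRange = k >= 8 && k < 40;
            byte expected = inCopiedRange ? src[k] : (byte)0;
            if (dst[k] != expected)
            {
                return false;
            }
        }
        return true;
    }
}
```

The behaviour is the same as the original `Test`; only the shape of the address expressions handed to the JIT differs, which makes it a convenient side-by-side case when comparing ARM64 listings.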

Labels: arch-arm64, area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
