Feature
Currently, on the aarch64 backend, the following CLIF instructions...

```clif
; Equivalent to: int64_t *v9; int64_t v10; v4 = v9[v10];
v1 = iconst.i64 3
v2 = ishl.i64 v10, v1  ; v1 = 3
v3 = iadd v9, v2
v4 = load.i64 v3
```

...will generate assembly like this:
```asm
adrp x4, 0x780000
ldr  x4, [x4]
lsl  x5, x3, #3
ldr  x4, [x4, x5]
```
However, the assembly can be converted into a more efficient sequence, folding the shift into the load:

```asm
adrp x4, 0x780000
ldr  x4, [x4]
ldr  x4, [x4, x3, lsl #3]
```
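For clarity, both sequences compute the same effective address, `base + (index << 3)`; the scaled-register form just performs the shift inside the load. A quick sketch of that equivalence (plain Rust, not Cranelift code; the function names are made up for illustration):

```rust
// Sketch: both aarch64 sequences compute the same effective address,
// base + (index << 3); the scaled form folds the shift into the load.
fn two_insn(base: u64, index: u64) -> u64 {
    let scaled = index << 3;      // lsl x5, x3, #3
    base.wrapping_add(scaled)     // address used by: ldr x4, [x4, x5]
}

fn folded(base: u64, index: u64) -> u64 {
    base.wrapping_add(index << 3) // ldr x4, [x4, x3, lsl #3]
}

fn main() {
    for (b, i) in [(0x1000u64, 0u64), (0x1000, 5), (0xdead_0000, 123)] {
        assert_eq!(two_insn(b, i), folded(b, i));
    }
    println!("equivalent");
}
```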
Benefit
The shorter instruction sequence helps improve performance.
In fact, I found this problem while diffing the assembly generated by Cranelift and LLVM: LLVM was around 10% faster than Cranelift in my case.
Implementation
I've walked through the Cranelift codebase and figured out that this addressing mode seems to be represented as AMode::RegScaled, but I'm not sure how to teach the code generator to use RegScaled for ldr.
Is it a matter of editing ISLE rules, or something like that?
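To make the shape of the transformation concrete, here is a minimal sketch of the pattern match the lowering would need: recognize `iadd(base, ishl(index, iconst k))` and fold it into a scaled addressing mode. The `Inst` and `Amode` types below are hypothetical stand-ins, not Cranelift's real IR, AMode, or ISLE machinery:

```rust
// Hypothetical mini-IR for illustration; these are NOT Cranelift's real
// types, just a sketch of the pattern the lowering would need to match.
#[derive(Debug, PartialEq)]
enum Inst {
    Iconst(i64),
    Ishl(Box<Inst>, Box<Inst>), // value << amount
    Iadd(Box<Inst>, Box<Inst>), // base + offset
    Reg(u32),                   // stands in for an SSA value held in a register
}

#[derive(Debug, PartialEq)]
enum Amode {
    // base + (index << shift), like aarch64 `[xN, xM, lsl #shift]`
    RegScaled { base: u32, index: u32, shift: u8 },
    // fallback: assume the address was materialized into a register
    RegOnly(u32),
}

// Fold `iadd(base, ishl(index, iconst k))` into a scaled addressing mode.
// A 64-bit aarch64 `ldr` only permits `lsl #3` (log2 of the 8-byte access
// size) as the scale, so only k == 3 qualifies here.
fn lower_amode(addr: &Inst) -> Amode {
    if let Inst::Iadd(base, off) = addr {
        if let (Inst::Reg(b), Inst::Ishl(idx, amt)) = (&**base, &**off) {
            if let (Inst::Reg(i), Inst::Iconst(k)) = (&**idx, &**amt) {
                if *k == 3 {
                    return Amode::RegScaled { base: *b, index: *i, shift: 3 };
                }
            }
        }
    }
    Amode::RegOnly(0) // placeholder fallback
}

fn main() {
    // Mirrors the CLIF above: v3 = iadd v9, (ishl v10, iconst 3)
    let addr = Inst::Iadd(
        Box::new(Inst::Reg(9)),
        Box::new(Inst::Ishl(
            Box::new(Inst::Reg(10)),
            Box::new(Inst::Iconst(3)),
        )),
    );
    assert_eq!(
        lower_amode(&addr),
        Amode::RegScaled { base: 9, index: 10, shift: 3 }
    );
    println!("folded to RegScaled");
}
```

In Cranelift itself, this kind of match would presumably live in the aarch64 address-mode computation rather than in instruction-by-instruction lowering, so the shift and add never get emitted as standalone instructions.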