Manual bounds checks are less efficient

From https://github.com/dotnet/runtime/issues/80129, several of the `Vector2/3/4` and `Vector<T>` APIs involving reading from or writing to a span were changed from intrinsics to managed methods.

For the most part this change is correct and good. However, the IR that gets created leads to some notable performance differences.

Consider the following minimal, self-contained example:
```csharp
private static (int, int) Load(int[] array, int index)
{
    if ((index < 0) || ((array.Length - index) < 2))
    {
        throw new ArgumentOutOfRangeException();
    }

    return (array[index + 0], array[index + 1]);
}
```

This creates two notable trees (similarly if a throw helper is used):
```
STMT00000 ( 0x000[E-] ... ??? )
               [000003] -----------                         *  JTRUE     void  
               [000002] -----------                         \--*  LT        int   
               [000000] -----------                            +--*  LCL_VAR   int    V01 arg1         
               [000001] -----------                            \--*  CNS_INT   int    0
```

and

```
STMT00004 ( 0x004[E-] ... ??? )
               [000018] ---X-------                         *  JTRUE     void  
               [000017] ---X-------                         \--*  GE        int   
               [000015] ---X-------                            +--*  SUB       int   
               [000013] ---X-------                            |  +--*  ARR_LENGTH int   
               [000012] -----------                            |  |  \--*  LCL_VAR   ref    V00 arg0         
               [000014] -----------                            |  \--*  LCL_VAR   int    V01 arg1         
               [000016] -----------                            \--*  CNS_INT   int    2
```

This is significantly different from the intrinsic handling, which directly created `BOUNDS_CHECK` nodes:
```
               [000067] ---X-------                            +--*  COMMA     ref   
               [000061] ---X-------                            |  +--*  BOUNDS_CHECK_ArgRng void  
               [000055] -----------                            |  |  +--*  LCL_VAR   int    V11 loc5         
               [000060] ---X-------                            |  |  \--*  ARR_LENGTH int   
               [000054] -----------                            |  |     \--*  LCL_VAR   ref    V08 loc2         
               [000066] ---X-------                            |  \--*  COMMA     ref   
               [000065] ---X-------                            |     +--*  BOUNDS_CHECK_ArgRng void  
               [000063] -----------                            |     |  +--*  ADD       int   
               [000062] -----------                            |     |  |  +--*  LCL_VAR   int    V11 loc5         
               [000056] -----------                            |     |  |  \--*  CNS_INT   int    3
               [000064] ---X-------                            |     |  \--*  ARR_LENGTH int   
               [000058] -----------                            |     |     \--*  LCL_VAR   ref    V08 loc2         
               [000057] -----------                            |     \--*  LCL_VAR   ref    V08 loc2         
               [000059] -----------                            \--*  LCL_VAR   int    V11 loc5 
```

Because these aren't `BOUNDS_CHECK` nodes, the cost is not limited to JIT throughput: the optimizations that kick in are also less effective, and the result is overall worse codegen.
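For anyone wanting to compare the shapes side by side, here is a minimal sketch that puts the manual check, a throw-helper variant, and a version relying only on the implicit array bounds checks into one class. The names `BoundsCheckShapes`, `LoadManual`, `LoadWithThrowHelper`, `LoadImplicit`, and `ThrowArgumentOutOfRange` are hypothetical and chosen only for illustration. Note that `LoadImplicit` throws `IndexOutOfRangeException` rather than `ArgumentOutOfRangeException`, so it is not a drop-in replacement; it is included only as a baseline. The sketch makes no claim about the exact IR or codegen each method produces.

```csharp
using System;

public static class BoundsCheckShapes
{
    // Same shape as the example above: an explicit range check
    // followed by two element loads.
    public static (int, int) LoadManual(int[] array, int index)
    {
        if ((index < 0) || ((array.Length - index) < 2))
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }

        return (array[index + 0], array[index + 1]);
    }

    // Throw-helper variant: the throw is moved into a separate method.
    // (Hypothetical helper, not a runtime API.)
    public static (int, int) LoadWithThrowHelper(int[] array, int index)
    {
        if ((index < 0) || ((array.Length - index) < 2))
        {
            ThrowArgumentOutOfRange();
        }

        return (array[index + 0], array[index + 1]);
    }

    // Baseline: no manual check, only the implicit per-access bounds
    // checks. Out-of-range access throws IndexOutOfRangeException.
    public static (int, int) LoadImplicit(int[] array, int index)
    {
        return (array[index + 0], array[index + 1]);
    }

    private static void ThrowArgumentOutOfRange()
    {
        throw new ArgumentOutOfRangeException("index");
    }
}
```

Dumping the JIT trees for each method (for example with `DOTNET_JitDump` on a checked runtime) should make it possible to compare them against the trees shown above.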
