Description
Generic virtual methods
In .NET, generic virtual methods (GVMs) are virtual methods that have their own method instantiation, i.e. their own type arguments. For example,
IProcessor p = new MyValueProcessor();
p.Process(42);
public interface IProcessor
{
    void Process<T>(T item);
}
public class MyValueProcessor : IProcessor
{
    public void Process<T>(T item)
    {
        Console.WriteLine(item?.ToString());
    }
}
We made the method in the interface a generic method in order to avoid boxing, but unfortunately today all generic virtual methods go through a virtual function lookup followed by an indirect call:
mov rdi, rbx ; this
mov rsi, 0x7A468E5A8758 ; Program+IProcessor
mov rdx, 0x7A468E6F0F88 ; token handle
call [CORINFO_HELP_VIRTUAL_FUNC_PTR]
mov rdi, rbx
mov esi, 42
call rax
This can make the code even slower than a non-generic method would be, because we are not able to devirtualize any of these indirect calls today.
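To make the cost concrete, here is a rough C++ model of the two call shapes: a run-time lookup followed by an indirect call, versus the direct call we would like to emit. This is only a sketch. The types and functions (ModelObject, ResolveVirtualFuncPtr, CallViaLookup, CallDirect) are hypothetical and do not reflect how the runtime actually stores or resolves generic virtual methods.

```cpp
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical model: code address of one instantiated method body.
using CodePtr = void (*)(void* self, const void* arg);

// Hypothetical model object: maps (method slot, type argument) to code.
struct ModelObject {
    std::map<std::pair<int, std::type_index>, CodePtr> gvmTable;
    std::string log;  // records what the method bodies "printed"
};

constexpr int kProcessSlot = 0;  // stand-in for the method token

// Body of Process<int> for the model type.
inline void ProcessInt(void* self, const void* arg) {
    auto* obj = static_cast<ModelObject*>(self);
    obj->log += std::to_string(*static_cast<const int*>(arg));
}

// Stand-in for the helper call: finds the target at run time.
inline CodePtr ResolveVirtualFuncPtr(const ModelObject& obj, int slot,
                                     std::type_index typeArg) {
    auto it = obj.gvmTable.find(std::make_pair(slot, typeArg));
    return it == obj.gvmTable.end() ? nullptr : it->second;
}

// Today's shape: a lookup, then an indirect call through the result.
inline void CallViaLookup(ModelObject& obj, int value) {
    CodePtr target = ResolveVirtualFuncPtr(obj, kProcessSlot, typeid(int));
    if (target != nullptr) {
        target(&obj, &value);
    }
}

// Desired shape: the exact type is known, so the target is called directly
// (and becomes a candidate for inlining).
inline void CallDirect(ModelObject& obj, int value) {
    ProcessInt(&obj, &value);
}
```

Both paths run the same body. The difference is that CallViaLookup pays for the table search and hides the target from the compiler, while CallDirect does neither.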
Devirtualization story
Today we assert that the base method desc must not have a method instantiation. With today's devirtualization we would otherwise end up with a method desc that has an invalid method instantiation, because the instantiation lives on the base method desc. To fix this, once we find the exact method table we can create an associated method desc that carries the instantiation information over to the devirtualized method desc:
pDevirtMD = pDevirtMD->FindOrCreateAssociatedMethodDesc(
    /* pDefMD */ pDevirtMD,
    /* pExactMT */ pExactMT,
    /* forceBoxedEntryPoint */ pExactMT->IsValueType() && !pDevirtMD->IsStatic(),
    /* methodInst */ pBaseMD->GetMethodInstantiation(),
    /* allowInstParam */ false
);
Note that we need to handle the unboxing stub: for instance methods with struct receivers, we need to force the boxed entry point. And because the method itself is generic, allowInstParam should be false.
Then, with the devirtualized method desc, we can call the method directly with an InstParam, without going through the virtual function pointer lookup. There is one case, however, where the exact method table ends up being a canonical method table; in that case we need to bail.
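The decision described above can be sketched as a small standalone C++ model. It is an illustration under simplifying assumptions, not VM code: ModelMethodTable, ModelDevirtPlan and TryPlanDevirt are hypothetical names, and a method is represented by nothing more than a string.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a method table.
struct ModelMethodTable {
    std::string name;
    bool isValueType;
    bool isCanonical;  // true for a shared (canonical) instantiation
};

// Hypothetical result of a successful devirtualization attempt.
struct ModelDevirtPlan {
    std::string target;         // resolved method, including its instantiation
    bool forceBoxedEntryPoint;  // mirrors the argument in the snippet above
};

// Returns false when devirtualization has to bail.
inline bool TryPlanDevirt(const ModelMethodTable& exactMT,
                          const std::string& methodName,
                          const std::string& methodInst,
                          bool isStatic,
                          ModelDevirtPlan& plan) {
    // Bail when the exact method table turns out to be canonical.
    if (exactMT.isCanonical) {
        return false;
    }
    // Carry the base method's instantiation over to the resolved method.
    plan.target = exactMT.name + ":" + methodName + "[" + methodInst + "]";
    // Instance methods on value types go through the boxed entry point.
    plan.forceBoxedEntryPoint = exactMT.isValueType && !isStatic;
    return true;
}
```

For a class receiver such as MyValueProcessor this yields a plan with forceBoxedEntryPoint set to false. For a struct receiver calling an instance method it is true, and for a canonical method table the function reports that it bailed.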
But if we take a deeper look:
STMT00002 ( 0x005[--] ... ??? )
[000012] DACXG------ * STORE_LCL_VAR long V02 tmp2
[000011] --CXG------ \--* CALL help long CORINFO_HELP_VIRTUAL_FUNC_PTR
[000005] ----------- arg0 +--* LCL_VAR ref V01 tmp1
[000009] H---------- arg1 +--* CNS_INT(h) long 0x7ffe88447668 class IProcessor
[000010] H---------- arg2 \--* CNS_INT(h) long 0x7ffe884479c0 token
STMT00003 ( ??? ... ??? )
[000007] --CXG------ * CALL ind void
[000008] ----------- this +--* LCL_VAR ref V01 tmp1
[000006] ----------- arg1 +--* CNS_INT int 42
[000013] ----------- calli tgt \--* LCL_VAR long V02 tmp2
We can see that, although we have all the necessary information, we spilled the ldvirtftn, so that information is lost by the time we reach the indirect call, and we don't have the method desc we need when we do the devirtualization.
Furthermore, even if we have all the information we need to devirtualize the call, the devirtualized method may be an instantiating stub that requires a runtime lookup. In this case we cannot use the instantiating stub from WrappedMethodDesc, which we created with FindOrCreateAssociatedMethodDesc earlier, as the InstParam, so we still need to pass the runtime lookup node as the InstParam arg.
The solution to this is to not spill it early, so that we end up with trees like
STMT00002 ( 0x005[--] ... ??? )
[000007] --CXG------ * CALL ind void
[000008] ----------- this +--* LCL_VAR ref V01 tmp1
[000006] ----------- arg1 +--* CNS_INT int 42
[000011] --CXG------ calli tgt \--* CALL help long CORINFO_HELP_VIRTUAL_FUNC_PTR
[000005] ----------- arg0 +--* LCL_VAR ref V01 tmp1
[000009] H---------- arg1 +--* CNS_INT(h) long 0x7ffed0cd7478 class IProcessor
[000010] H---------- arg2 \--* CNS_INT(h) long 0x7ffed0cd79c0 token
Then we will have all the necessary information for devirtualization. After devirtualization, we can push the necessary method InstParam to the call. In the above case we don't need a method InstParam, so it ends up as
[000007] --CXG------ * CALL nullcheck void PrintProcessor:Process[int](int):this
[000008] ----------- this +--* LCL_VAR ref V01 tmp1
[000006] ----------- arg1 \--* CNS_INT int 42
For cases where we need a method InstParam, it may end up as:
[000007] --CXG------ * CALL nullcheck void PrintProcessor:Process[System.__Canon](System.__Canon):this
[000008] ----------- this +--* LCL_VAR ref V01 tmp1
[000012] H---------- gctx +--* CNS_INT(h) long 0x7ffe9e4e7cf8 method PrintProcessor:Process[System.String](System.String):this
[000006] ----------- arg2 \--* CNS_STR ref <string constant>
or when a runtime lookup is required (a real-world example):
[000051] --CXG------ * CALL ind ref
[000052] ----------- this +--* LCL_VAR ref V08 tmp5
[000050] ----------- arg1 +--* LCL_VAR ref V06 tmp3
[000060] --CXG------ calli tgt \--* CALL help long CORINFO_HELP_VIRTUAL_FUNC_PTR
[000049] ----------- arg0 +--* LCL_VAR ref V08 tmp5
[000053] H---------- arg1 +--* CNS_INT(h) long 0x7ffe8942a520 class Microsoft.Extensions.Options.OptionsBuilder`1[Microsoft.Extensions.Options.StartupValidatorOptions]
[000059] ----------- arg2 \--* RUNTIMELOOKUP long 0x7ffe8942ba50 method
[000058] ----------- \--* LCL_VAR long V09 tmp6
we can devirt it into
[000051] --CXG------ * CALL nullcheck ref Microsoft.Extensions.Options.OptionsBuilder`1[System.__Canon]:Configure[System.__Canon](System.Action`2[System.__Canon,System.__Canon]):Microsoft.Extensions.Options.OptionsBuilder`1[System.__Canon]:this
[000052] ----------- this +--* LCL_VAR ref V08 tmp5
[000059] ----------- gctx +--* RUNTIMELOOKUP long 0x7ffe8942ba50 method
[000058] ----------- | \--* LCL_VAR long V09 tmp6
[000050] ----------- arg2 \--* LCL_VAR ref V06 tmp3
With this, we manage to devirtualize generic virtual methods, which in turn unblocks inlining, even for cases where a runtime lookup is needed for the method instantiation.
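The gctx argument in the trees above can be pictured with a small C++ model of shared code. One shared body serves several instantiations and receives a hidden context argument that tells it which instantiation it is running as. An instantiating stub supplies that context on the caller's behalf, whereas a devirtualized call passes it directly. All names here (ModelMethodContext, ProcessShared, ProcessStringStub and so on) are hypothetical and the model is deliberately simplistic.

```cpp
#include <string>

// Hypothetical stand-in for the hidden generic context (the InstParam).
struct ModelMethodContext {
    std::string typeArgName;                  // e.g. "System.String"
    std::string (*format)(const void* item);  // how to print a value of T
};

struct ModelReceiver {
    std::string log;  // records what the method "printed"
};

// Shared body: one copy of code, parameterized by the hidden context.
inline void ProcessShared(ModelReceiver& self, const ModelMethodContext& gctx,
                          const void* item) {
    self.log += gctx.typeArgName + ":" + gctx.format(item);
}

inline std::string FormatString(const void* item) {
    return *static_cast<const std::string*>(item);
}

// Context describing the Process<string> instantiation in this model.
inline const ModelMethodContext& StringContext() {
    static const ModelMethodContext ctx{"System.String", &FormatString};
    return ctx;
}

// Instantiating stub: keeps the ordinary signature and supplies the
// context itself before forwarding to the shared body.
inline void ProcessStringStub(ModelReceiver& self, const void* item) {
    ProcessShared(self, StringContext(), item);
}
```

Calling ProcessStringStub(receiver, &text) and calling ProcessShared(receiver, StringContext(), &text) have the same effect. The second form corresponds to the devirtualized call with an explicit gctx argument, which skips the stub and exposes the shared body to the inliner.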
However, the JIT backend doesn't cope well with an indirect call whose call address is itself a CALL. That is, if we fail to devirtualize a generic virtual call, we end up with
STMT00002 ( 0x005[--] ... ??? )
[000007] --CXG------ * CALL ind void
[000008] ----------- this +--* LCL_VAR ref V01 tmp1
[000006] ----------- arg1 +--* CNS_INT int 42
[000011] --CXG------ calli tgt \--* CALL help long CORINFO_HELP_VIRTUAL_FUNC_PTR
[000005] ----------- arg0 +--* LCL_VAR ref V01 tmp1
[000009] H---------- arg1 +--* CNS_INT(h) long 0x7ffed0cd7478 class IProcessor
[000010] H---------- arg2 \--* CNS_INT(h) long 0x7ffed0cd79c0 token
which the backend does not handle well.
The prototype was done in #112353, and it shows many interesting optimization opportunities across logging, JSON parsing, LINQ/PLINQ, collections, hosting, and dependency injection, all of which are used extensively by all kinds of apps today; see MihuBot/runtime-utils#1004 (the code size regressions are due to more inlining). This work is also a prerequisite for devirtualizing delegates that require a closure (i.e. that capture locals).
NativeAOT uses a fat pointer for this, so it needs to be handled separately.
Taking the above code as an example,
before:
G_M24375_IG01: ;; offset=0x0000
push rbp
push rbx
push rax
lea rbp, [rsp+0x10]
;; size=8 bbWeight=1 PerfScore 3.50
G_M24375_IG02: ;; offset=0x0008
mov rdi, 0x7F1A079C8828 ; Program+MyValueProcessor
call CORINFO_HELP_NEWSFAST
mov rbx, rax
mov rdi, rbx
mov rsi, 0x7F1A079C8758 ; Program+IProcessor
mov rdx, 0x7F1A07B10F88 ; token handle
call [CORINFO_HELP_VIRTUAL_FUNC_PTR]
mov rdi, rbx
mov esi, 42
call rax
xor eax, eax
;; size=59 bbWeight=1 PerfScore 9.00
G_M24375_IG03: ;; offset=0x0043
add rsp, 8
pop rbx
pop rbp
ret
;; size=7 bbWeight=1 PerfScore 2.25
after:
G_M27646_IG01: ;; offset=0x0000
sub rsp, 40
;; size=4 bbWeight=1 PerfScore 0.25
G_M27646_IG02: ;; offset=0x0004
mov ecx, 42
call [System.Number:Int32ToDecStr(int):System.String]
mov rcx, rax
call [System.Console:WriteLine(System.String)]
nop
;; size=21 bbWeight=1 PerfScore 6.75
G_M27646_IG03: ;; offset=0x0019
add rsp, 40
ret
;; size=5 bbWeight=1 PerfScore 1.25
Plans
- 0. (Prerequisite) Refactor the devirtualizer to stop relying on IsVirtual when selecting candidate methods to devirtualize - JIT: Refactor around impDevirtualizeCall for GVM devirt #112610
- 1. Stop spilling ldvirtftn - Stop spilling ldvirtftn #120866
- 2. Handle CALL as the call address of an indirect call in the backend - Stop spilling ldvirtftn #120866
- 3. Make changes to the VM to unblock devirtualization for non-shared virtual generics - JIT: Devirtualize non-shared generic virtual methods #122023
- 4. Enable late devirt for virtual generics - JIT: Devirtualize non-shared generic virtual methods #122023
- 5. Enable inlining for devirted virtual generics
- 6. Support NativeAOT and R2R
- R2R for non-shared virtual generics - JIT: Devirtualize non-shared GVMs in R2R #123183
- AOT for non-shared virtual generics
- 7. Figure out what we need to unblock shared GVM devirt
- Shared GVM that doesn't require runtime lookup - JIT: Devirtualize shared GVM that doesn't need a runtime lookup #123323
- Shared GVM that requires runtime lookup
- 8. MethodDesc probing for virtual generics in PGO (guarded devirt)
cc @dotnet/jit-contrib
cc @jkotas @MichalStrehovsky for review and suggestions on the VM part