[x86/Linux] Enable FEATURE_EH_FUNCLETS#8889
Conversation
|
\CC @seanshpark |
| // Add space for personality routine, it must be 4-byte aligned. | ||
| // Everything in the UNWIND_INFO up to the variable-sized UnwindCodes | ||
| // array has already had its size included in unwindSize by the caller. | ||
| unwindSize += sizeof(ULONG); |
There was a problem hiding this comment.
We should not need the personality routine on x86. Also, the x64 unwind info codes referred to in next comment do not make sense on x86.
| @@ -10914,27 +10939,7 @@ void CEEJitInfo::reserveUnwindInfo(BOOL isFunclet, BOOL isColdCode, ULONG unwind | |||
|
|
|||
There was a problem hiding this comment.
Will the JIT ever pass non-zero unwindSize here for x86 Linux? If the unwindSize is always zero on x86 Linux (that I think should be the case), this method should do nothing for x86.
There was a problem hiding this comment.
@jkotas I'm not sure. Compiler::unwindReserveFuncHelper seems to pass sizeof(func->unwindCodes) - func->unwindCodeSlot as unwindSize into CEEJitInfo::reserveUnwindInfo.
Is there any documented standard related with x86? I read https://github.com/dotnet/coreclr/blob/master/Documentation/botr/clr-abi.md , but some x86-related stuffs are missing.
There was a problem hiding this comment.
There is no standard unwind information on Windows x86. In CoreCLR, we just store the unwind information as part of the GCInfo on x86.
There was a problem hiding this comment.
increaseUnwindInfoSize now does nothing for x86.
There was a problem hiding this comment.
@janvorli Is there any restriction on THUMB code generation? It seems that the direct encoding is related with THUMB code.
There was a problem hiding this comment.
@janvorli While reading ARM implementation, I found the following code from UnwindFragmentInfo::Finalize:
#if defined(_TARGET_ARM_)
DWORD header = headerFunctionLength | (headerVers << 18) | (headerXBit << 20) | (headerEBit << 21) |
(headerFBit << 22) | (headerEpilogCount << 23) | (headerCodeWords << 28);
#elif defined(_TARGET_ARM64_)
DWORD header = headerFunctionLength | (headerVers << 18) | (headerXBit << 20) | (headerEBit << 21) |
(headerEpilogCount << 22) | (headerCodeWords << 27);
#endif // defined(_TARGET_ARM64_)
ufiPrologCodes.AddHeaderWord(header);
It seems that ARM also stores the length of functions as unwind info as x86 does. IMO, it would be better to introduce a compact encoding as a separate PR.
There was a problem hiding this comment.
@janvorli Oh! Thank for confirm (but still I'm wondering why direct encoding works for ARM/ARM64).
There was a problem hiding this comment.
It is not used for ARM / ARM64 in CoreCLR, the information is never encoded using the compact encoding.
|
@dotnet-bot test Ubuntu x64 Checked Build and Test please |
src/vm/jitinterface.cpp
Outdated
| EE_TO_JIT_TRANSITION(); | ||
| } | ||
|
|
||
| void increaseUnwindInfoSize(ULONG &unwindSize) |
There was a problem hiding this comment.
This function adds space for the personality routine only, so it would be better to name it to reflect that.
There was a problem hiding this comment.
Something like reservePresonalityRoutineInfo?
| // a PSPSym for functions with any EH. | ||
| bool ehNeedsPSPSym() const | ||
| { | ||
| #ifdef _TARGET_X86_ |
There was a problem hiding this comment.
Why is this change needed? We use the PSPSym for for ARM32, ARM64 and amd64, so why should it be different for x86 Linux?
There was a problem hiding this comment.
It depends on the intricate implementation details. The current Win x86 exception handling or CoreRT exception handling on all platforms do not need PSPSym, so it is certainly possible to get away without having it ... I do not know whether it is a good idea for Linux x86.
There was a problem hiding this comment.
@janvorli According to CLR ABI, all the variables are accessed via frame pointer (EBP) and the managed code always stores frame pointer in its frame. That is one reason that I disable this.
In addition, I encountered various assertions failures that I could't understand when I tried to enable PSPSym-related code.
There was a problem hiding this comment.
The problem is that the CLR ABI doesn't consider the FEATURE_EH_FUNCLETS to be enabled for x86 and the funclets are actually the thing that uses the PSPSym.
But since this is just a one line change, I won't block you on this. Once we start handling exceptions in complex nested try / catch scenarios, it will quickly show up whether it is better to enable the PSPSym or to fix some place in the stack walker instead.
There was a problem hiding this comment.
@janvorli I agree. Later we may need to revise this again, but I currently have no idea.
There was a problem hiding this comment.
The main question is whether the unwinder on Linux/x86 is guaranteed to restore the EBP of the parent frame, such that it doesn't need to be restored by the code itself. That is the main reason for the PSPSym: we can't trust that EBP (or the frame pointer, on other architectures) is restored, especially in nested EH scenarios. CoreRT does this properly, and thus doesn't need the PSPSym.
In reply to: 95909398 [](ancestors = 95909398)
There was a problem hiding this comment.
I currently have no evidence whether Linux/x86 unwider can guarantee the correctness of EBP. We need more investigation.
In addition, I encountered the following assertion failure while codegen when I tried to enable PSPSym:
Assert failure(PID 2080 [0x00000820], Thread: 2080 [0x0820]): Assertion failed 'u2.emitArgTrackTop <= u2.emitArgTrackTab + emitMaxStackDepth' in 'System.AppDomain:SetupDomain(bool,ref,ref,ref,ref):this' (IL size 72)
File: /home/parjong/projects/dotnet/coreclr/src/jit/emit.cpp Line: 6819
Image: /root/dotnet-test/overlay/current/corerun
It would be better to enable PSPSym-related features in a separate PR.
src/jit/compiler.h
Outdated
|
|
||
| #if defined(_TARGET_AMD64_) | ||
| #if FEATURE_EH_FUNCLETS | ||
| #if defined(_TARGET_XARCH_) |
There was a problem hiding this comment.
Why is this needed? It seems to be AMD64 specific.
There was a problem hiding this comment.
As x86/Linux uses RUNTIME_FUNCTION with EndAddress, we need to update its value inside CEEJitInfo::allocUnwindInfo.
For that purpose, we need to store and deliver various emitLocation variables from JIT to EE.
src/jit/codegencommon.cpp
Outdated
|
|
||
| #if FEATURE_EH_FUNCLETS | ||
|
|
||
| #ifndef _TARGET_X86_ |
There was a problem hiding this comment.
I think we need it for x86 with EH funclets too. What was the reason for excluding it?
src/inc/clrnt.h
Outdated
| // | ||
| #ifdef _TARGET_AMD64_ | ||
|
|
||
| #define COMPACT_RUNTIME_FUNCTION 0 |
There was a problem hiding this comment.
The name is confusing to me. And it also seems we don't need the EndAddress for x86, or do we? If not, we can just revert the ifdef around the EndAddress to TARGET_AMD64 and remove this define.
There was a problem hiding this comment.
As I understand, we need EndAddress.
The implementation of RtlpGetFunctionEndAddress for ARM/ARM64 computes the end address via multiplying the length of function (which seems to be the number of instructions) and the size of each instruction, which means that this logic is valid only when the instruction size is fixed.
There was a problem hiding this comment.
Ah, that makes sense then. So can you please just modify the COMPACT_RUNTIME_FUNCTION to something like RUNTIME_FUNCTION_HAS_ENDADDRESS or RUNTIME_FUNCTION_NEEDS_ENDADDRESS or something like that?
There was a problem hiding this comment.
You do not need the EndAddress on x86. Having it for x86 is just wasting space - encoding the length in xdata is more compact. EndAddress on Win x64 was a mistake. It would not be there if we can go back in time.
It has nothing to do with instruction with. E.g. ARM32 has variable instruction width as well (the instruction is either 2 or 4 bytes long) and it is able to run without EndAddress just fine.
The multiplier has to do with instruction size granularity. For x86, the granularity is 1 byte so you should make the multiplier 1.
There was a problem hiding this comment.
@jkotas Could you let me know where UNWIND_INFO is generated? I tested with the following implementation of RtlpGetFunctionEndAddress, but it shows some strange behavior:
// This function returns the length of a function using the new unwind info on arm.
// Taken from minkernel\ntos\rtl\arm\ntrtlarm.h.
FORCEINLINE
ULONG
RtlpGetFunctionEndAddress (
__in PT_RUNTIME_FUNCTION FunctionEntry,
__in ULONG ImageBase
)
{
ULONG FunctionLength;
FunctionLength = FunctionEntry->UnwindData;
if ((FunctionLength & 3) != 0) {
FunctionLength = (FunctionLength >> 2) & 0x7ff;
} else {
FunctionLength = *(PTR_ULONG)(ImageBase + FunctionLength) & 0x3ffff;
}
return FunctionEntry->BeginAddress + FunctionLength;
}
FunctionLength is 0 at the return state. It seems that relevant-information is currently not emiited.
There was a problem hiding this comment.
The unwind info format is different for each platform. The Windows ABI documents linked at the start of https://github.com/dotnet/coreclr/blob/master/Documentation/botr/clr-abi.md have the specs of unwind info format for x64 and ARM.
There is no standalone unwind info defined for Windows x86. Windows x86 ABI was defined before the concept of standalone unwind info came to Windows and nobody bothered to define it retroactively.
On Windows x86, the unwind info is encoded as part of GCInfo in CoreCLR. The unwinder for it is at https://github.com/dotnet/coreclr/blob/master/src/vm/eetwain.cpp#L3897. For Linux 86, you have basically two options:
- Keep using the existing scheme where the unwind info is encoded as part of GCInfo, maybe with some minimal modifications. These modifications have to be written down so that everybody is on the same page.
or - Define your own new x86 unwind info format that is similar to the unwind info formats used on non-x86 platforms. The spec for this format has to be written down so that everybody is on the same page.
I would recommend going with option 1, at least for now.
There was a problem hiding this comment.
I tried with the following implementation of RtlpGetFunctionEndAddress
This implementation of RtlpGetFunctionEndAddress specific to the current ARM unwind info format. That's why it does not just work on x86.
|
Can you please update |
|
@BruceForstall I currently attempt not to change the existing ABI as much as possible, but I'll update CLR ABI when any changes on ABI are required. |
src/jit/unwindxarch.cpp
Outdated
| UNATIVE_OFFSET startOffset, | ||
| UNATIVE_OFFSET endOffset, | ||
| const UNWIND_INFO* const pHeader) | ||
| { |
There was a problem hiding this comment.
It seems strange to me to move the x64 unwind code-related functions (including DumpUnwindInfo) to a common place to compile for x86, since (I assume) they won't be used for x86.
| else | ||
| { | ||
| #ifndef JIT32_GCENCODER | ||
| // Because of the way the flowgraph is connected, the liveness info for this one instruction |
There was a problem hiding this comment.
This will get it to compile, but is there a TODO issue here related to GC handling of this nop on x86?
There was a problem hiding this comment.
Please add a "TODO" comment to remind us to look at this, e.g.:
// TODO-Linux-x86: Do we need to handle the GC information for this NOP or JMP specially, as is done for other architectures?
In reply to: 95926754 [](ancestors = 95926754)
|
|
||
| #if defined(_TARGET_X86_) | ||
| #if defined(_TARGET_X86_) && !FEATURE_EH_FUNCLETS | ||
|
|
There was a problem hiding this comment.
Why do you need to change legacy backend? Isn't Linux/x86 using RyuJIT/x86?
There was a problem hiding this comment.
This change is introduced to fix build failure. lvaShadowSPslotsVar is not available when FEATURE_EH_FUNCLETS is set.
There was a problem hiding this comment.
Yes, but why is this file getting built with FEATURE_EH_FUNCLETS defined on x86?
There was a problem hiding this comment.
Do you mean that FEATURE_EH_FUNCLETS should be enabled only for RyuJIT?
There was a problem hiding this comment.
For x86, yes. (FEATURE_EH_FUNCLETS is defined for legacy JIT for ARM32.)
| * | ||
| * Generates code for an EH funclet prolog. | ||
| */ | ||
|
|
There was a problem hiding this comment.
Can you add a TODO here to say what needs to be done for Linux/x86?
There was a problem hiding this comment.
I currently have no idea on which action will be required for prolog/epilog. Could you let me know if you have any idea?
There was a problem hiding this comment.
typically, it would be (1) set up PSPSym (if required), (2) save/restore callee-save registers, (3) handle incoming args/return value. I assume you'll at least need to do (2).
There was a problem hiding this comment.
936ebb7 adds TODO (I'm not sure about (1) and (3), and thus it adds only (2) as TODO).
There is no current spec for Linux/x86 (that I am aware of), so any decisions made here constitute an ABI definition. E.g., the clr-abi.md document in the "Funclets" section states, "For non-x86 platforms", and then later, "For x86". These statements are no longer strictly true given that this change creates some amount of funclet functionality for Linux/x86. |
|
@dotnet-bot test Windows_NT x64 Debug Build and Test please |
src/pal/inc/pal.h
Outdated
| typedef struct _RUNTIME_FUNCTION { | ||
| DWORD BeginAddress; | ||
| #ifdef _AMD64_ | ||
| #if RUNTIME_FUNCTION_HAS_ENDADDRESS |
There was a problem hiding this comment.
Now that we have the EndAddress on amd64 only again, could you please revert the change here and get rid of the RUNTIME_FUNCTION_HAS_ENDADDRESS?
There was a problem hiding this comment.
@janvorli As I understand, the following code in CEEJitInfo::allocUnwindInfo should be aligned with the above _RUNTIME_FUNCTION definition:
#if RUNTIME_FUNCTION_HAS_ENDADDRESS
pRuntimeFunction->EndAddress = currentCodeOffset + endOffset;
#endif
For me, it was hard to find this relationship when the above definition is written using _TARGET_AMD64_.
I would like not to revert this change if there is no further issue. Could you let me know your opinion?
There was a problem hiding this comment.
Based on what @jkotas said, this is really AMD64 specific and we don't ever want to use it for anything else. That's why I would really prefer keeping it as it was to not to make false expectations that it might be enabled for something else than AMD64.
src/jit/compiler.h
Outdated
| void unwindEmitFunc(FuncInfoDsc* func, void* pHotCode, void* pColdCode); | ||
|
|
||
| #if defined(_TARGET_AMD64_) | ||
| #ifdef _TARGET_XARCH_ |
There was a problem hiding this comment.
I wonder if this should rather be #if defined(_TARGET_AMD64_) || (defined(_TARGET_X86_) && FEATURE_EH_FUNCLETS)
|
@jkotas @janvorli @BruceForstall I revised PR per feedback. Please take a look. |
| // EH-related funclet: FUNC_HANDLER or FUNC_FILTER | ||
|
|
||
| #if FEATURE_EH_FUNCLETS | ||
| #if defined(_TARGET_AMD64_) |
There was a problem hiding this comment.
It looks like you've added #if FEATURE_EH_FUNCLETS, but x86 won't get any of the enclosed declarations since there is no _TARGET_X86_ section. So can you remove the (unnecessary) added #if FEATURE_EH_FUNCLETS?
| #if FEATURE_EH_FUNCLETS | ||
|
|
||
| #ifndef _TARGET_X86_ | ||
| genSetPSPSym(initReg, &initRegZeroed); |
There was a problem hiding this comment.
This #ifdef is unnecessary: genSetPSPSym() will bail out early because lvaPSPSym == BAD_VAR_NUM which is true because ehNeedsPSPSym() will return false (so lvaMarkLocalVars() won't create one).
There was a problem hiding this comment.
Thank you for comment! fe513b3 also removes this #if statement.
|
LGTM |
|
@dotnet-bot test Ubuntu x64 Checked Build and Test please |
This commit defines RUNTIME_FUNCTION_HAS_ENDADDRESS via clrdefinition.cmake as RUNTIME_FUNCTION_HAS_ENDADDRESS affects both Runtime and PAL.
cce306c to
1fbb811
Compare
|
@dotnet-bot test Linux ARM Emulator Cross Debug Build please |
|
@janvorli PR is updated. Please take a look. |
|
@parjong we still have to make one change here which @jkotas has just mentioned to me. At the moment, the GCInfo is attached only to the main function. So it is possible to get GCInfo for any address in the main function. But if you try to get it for one of the funclets, you won't get anything. |
|
@janvorli Could you let me know related methods? It would be very helpful if I could know the starting point. |
|
@janvorli Could you let me know some detail? It seems that JIT encodes GCInfo using Is it required to add a field to UNWIND_INFO structure that points to GCInfo of main function, and update this field while emitting unwind info? |
|
@parjong I need to figure out the exact details. I will get back to you later today. But as for the pointer to the GC info of the main function, it should not be part of the UNWIND_INFO, since it has nothing to do with unwind. It should rather be just stored in memory after the UNWIND_INFO for funclets. If you put it into the UNWIND_INFO, you would unnecessarily increase memory space used by each function while only funclets need it. |
|
@parjong I have found I got somehow confused. The truth is that we don't need to do anything special for the GC info for the jitted code. We will need to solve the problem for the crossgen-ed binaries where the GC info is stored differently. The EEJitManager is used only for jitted code. For the crossgen-ed code, the NativeImageJitManager (or ReadyToRunJitManager for code crossgen-ed with the readytorun switch) is used instead. |
|
@parjong the title of this issue says (Partially) enable ... Is there other work that you want to do here now or do you want to merge it? |
|
@janvorli I currently have no further work related with this issue. Please merge this PR if it has no problem. |
* [x86/Linux] (Partially) Enable FEATURE_EH_FUNCLETS * Update CLR ABI Document * Add TODO (for Funclet Prolog/Epilog Gen) Commit migrated from dotnet/coreclr@347243f
This commit revises JIT to generate code based on EH FUNCLETS for x86/Linux that could call
DispatchManagedException.The main changes are as follows:
This commit (partially) fixes #8887