Capture AVX context when redirecting threads for suspension#47212
Capture AVX context when redirecting threads for suspension#47212VSadov merged 10 commits intodotnet:masterfrom
Conversation
src/coreclr/vm/threadsuspend.cpp
Outdated
| _ASSERTE(!success); | ||
|
|
||
| // So now allocate a buffer of that size and call IntitializeContext again | ||
| LPVOID buffer = malloc(contextSize); |
There was a problem hiding this comment.
Nit: In VM, the typically pattern is to new (nothrow) BYTE[....] instead of calling malloc directly
There was a problem hiding this comment.
I had in fact tried this, but I observed an assert saying new [] isnt allowed here. Wonder what was different then.
|
About the context size savings due to compaction: On my machine:
I see 128 bytes savings from compaction, which is ~10%. Not a lot. Also - somehow we do not have |
|
I think this is ready to be reviewed. |
| { | ||
| success = pfnInitializeContext2 ? | ||
| pfnInitializeContext2(buffer, context, &pOSContext, &contextSize, xStateCompactionMask): | ||
| InitializeContext(buffer, context, &pOSContext, &contextSize); |
There was a problem hiding this comment.
does our CI validate on non-AVX machine? Would be good to find one to ensure this change is not breaking on those.
There was a problem hiding this comment.
It would be interesting to try on AVX512 machine as well. I doubt the CI machines/VMs have that.
There was a problem hiding this comment.
Nearly everything has AVX now and AVX512 is fairly rare.
There was a problem hiding this comment.
VMs in Azure should support some 512 variant:
GenuineIntel
Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
3DNOW not supported
3DNOWEXT not supported
ABM not supported
ADX supported
AES supported
AVX supported
AVX2 supported
AVX512CD supported
AVX512ER not supported
AVX512F supported
AVX512PF not supported
src/coreclr/vm/threadsuspend.cpp
Outdated
|
|
||
| if (!success) | ||
| { | ||
| free(buffer); |
src/coreclr/vm/threadsuspend.cpp
Outdated
| *contextBuffer = buffer; | ||
|
|
||
| #else | ||
| pOSContext = new (nothrow) CONTEXT(); |
There was a problem hiding this comment.
| pOSContext = new (nothrow) CONTEXT(); | |
| pOSContext = new (nothrow) CONTEXT; |
I believe that new (nothrow) CONTEXT() will zero-initialize the context that is not necessary. This change should omit the zero-initialization.
There was a problem hiding this comment.
That's true - verified with the compiler explorer
src/coreclr/vm/threadsuspend.cpp
Outdated
|
|
||
| _ASSERTE(!success); | ||
|
|
||
| // So now allocate a buffer of that size and call IntitializeContext again |
There was a problem hiding this comment.
| // So now allocate a buffer of that size and call IntitializeContext again | |
| // So now allocate a buffer of that size and call InitializeContext again |
src/coreclr/vm/threadsuspend.cpp
Outdated
| pfnInitializeContext2(NULL, context, NULL, &contextSize, xStateCompactionMask) : | ||
| InitializeContext(NULL, context, NULL, &contextSize); | ||
|
|
||
| _ASSERTE(!success); |
There was a problem hiding this comment.
This should be _ASSERTE(success), right?
There was a problem hiding this comment.
No, InitializeContext fails with ERROR_INSUFFICIENT_BUFFER but returns contextSize. Different than other win32 apis I guess.
There was a problem hiding this comment.
Ah, ok. Then it would be nice to add checking the GetLastError()==ERROR_INSUFFICIENT_BUFFER to the _ASSERTE condition.
src/coreclr/vm/threadsuspend.cpp
Outdated
| InitializeContext(buffer, context, &pOSContext, &contextSize); | ||
|
|
||
| // if AVX is supported set the appropriate features mask in the context | ||
| if (success == TRUE && supportsAVX) |
There was a problem hiding this comment.
A nit - why success==TRUE and not just success?
src/coreclr/vm/threadsuspend.cpp
Outdated
| *contextBuffer = buffer; | ||
|
|
||
| #else | ||
| pOSContext = new (nothrow) CONTEXT(); |
There was a problem hiding this comment.
That's true - verified with the compiler explorer
| PINITIALIZECONTEXT2 pfnInitializeContext2 = NULL; | ||
|
|
||
| #ifdef TARGET_X86 | ||
| #define CONTEXT_COMPLETE (CONTEXT_FULL | CONTEXT_FLOATING_POINT | \ |
There was a problem hiding this comment.
A nit - you can use CONTEXT_ALL | CONTEXT_EXCEPTION_REQUEST for all targets
There was a problem hiding this comment.
Turns out this is not a case on ARM64.
On ARM64 CONTEXT_ALL also includes x18, which always contains TEB pointer in user mode. We are not getting it in RedirectCurrentThreadAtHandledJITCase case and that causes assertions, since we expect more.
Ill make this as it was originally.
There was a problem hiding this comment.
Hmm, interesting. On Unix, x18 is not handled in a special way and is always part of the context.
| // This should not normally fail. | ||
| // The system silently ignores any feature specified in the FeatureMask | ||
| // which is not enabled on the processor. | ||
| success = SetXStateFeaturesMask(pOSContext, XSTATE_MASK_AVX); |
There was a problem hiding this comment.
Won't the context initialized by the InitializeContext already have this set?
There was a problem hiding this comment.
Samples in the docs make this call. (https://docs.microsoft.com/en-us/windows/win32/debug/working-with-xstate-context)
I think InitializeContext only knows that we want XSTATE, but more exact feature mask must be set separately.
|
|
||
| if (!pCtx) | ||
| { | ||
| // Even when our caller is YieldTask, we can find a NULL |
There was a problem hiding this comment.
Isn't this case possible anymore?
There was a problem hiding this comment.
YieldTask scenario no longer exists. Looks like it has been mostly removed, but some pieces remained.
It is tempting to do more clean up, but this change might need to be back-ported, so I tried to limit the size.
There was a problem hiding this comment.
Ok, thank you for the explanation.
src/coreclr/vm/threadsuspend.cpp
Outdated
| // Get and save the thread's context | ||
|
|
||
| #if defined(TARGET_X86) || defined(TARGET_AMD64) | ||
| _ASSERTE((pCtx->ContextFlags & CONTEXT_XSTATE) == |
There was a problem hiding this comment.
Cannot we use the CopyContext API?
There was a problem hiding this comment.
I do not know if it would be always safe to use CopyContext. CopyContext doc mentions that the context must be initialized with InitializeContext. We fall back to regular CONTEXT when we do not have AVX.
There was a problem hiding this comment.
Actually I noticed that we still use InitializeContext even if AVX is disabled and it does not cost us much extra. Therefore I will use CopyContext. It will make this part a simpler.
| // If the thread is at a GC safe point, push a RedirectedThreadFrame with | ||
| // the interrupted context and pulse the GC mode so that GC can proceed. | ||
| FrameWithCookie<RedirectedThreadFrame> frame(interruptedContext); | ||
| pThread->SetSavedRedirectContext(NULL); |
There was a problem hiding this comment.
Don't we need to mark the redirect context as in use here?
There was a problem hiding this comment.
No, in this case interruptedContext is stack-allocated in the signal handler. It is not the redirect instance, which we are guarding for under/over use. Marking "in use" here will detect a different instance and trigger an assert.
The original SetSavedRedirectContext(NULL) was a bit suspicious since it was not balanced with a "returning" set. If we ever had an actual redirect instance allocated, it might have leaked here.
|
it looks like AVX in windows can be disabled via
|
|
I think I have addressed all the concerns on this. Please let me know if I missed something. |
|
Should we validate with avx512? I have an Azure VM so can do a sanity check if you have a self-contained repro available. |
|
@mangod9 - Thanks! I wanted to run tests and do some debugging, so I just acquired a similar VM Here are my observations:
Hardware: |
|
Ok, nice. Good to go then :). Thx! |
|
Thanks!!! |
|
/backport to release/5.0 |
|
Started backporting to release/5.0: https://github.com/dotnet/runtime/actions/runs/504558594 |
|
@VSadov backporting to release/5.0 failed, the patch most likely resulted in conflicts: $ git am --3way --ignore-whitespace --keep-non-patch changes.patch
Applying: Save avx regs during SuspendThread
error: sha1 information is lacking or useless (src/coreclr/vm/threadsuspend.cpp).
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Save avx regs during SuspendThread
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".
Error: The process '/usr/bin/git' failed with exit code 128Please backport manually! |
Resolves:#47016