Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Fix GH Issue 16892 - GC hole due to GT_INDEX_ADDR#16928

Merged
briansull merged 1 commit intodotnet:masterfrom
briansull:fix-16892
Mar 15, 2018
Merged

Fix GH Issue 16892 - GC hole due to GT_INDEX_ADDR#16928
briansull merged 1 commit intodotnet:masterfrom
briansull:fix-16892

Conversation

@briansull
Copy link

Fixes #16892

@briansull
Copy link
Author

@AndyAyersMS PTAL

Also review #13245 for more info

@briansull
Copy link
Author

I will try to obtain a test case:
The method must be compiled for Debug,
Have GC args in the outgoing stack area, (i.e as the 5 or 6 argument)
The first four args should be side-effect free (of calls and assignments) and one of the first four args should use an array element (with a bounds check operation).

@briansull
Copy link
Author

@dotnet-bot test Windows_NT x64 Checked gcstress0xc


case GT_INDEX:
case GT_INDEX_ADDR:
// These two call CORINFO_HELP_RNGCHKFAIL for Debug code
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually is the only line necessary to correct this bug.

@AndyAyersMS
Copy link
Member

Kind of scary that this has been in the code so long without causing trouble.

Do we commonly run GC stress on debug codegen? If not, perhaps we should.

Change LGTM but let's see what Carol / Bruce have to say.

@danmoseley
Copy link
Member

danmoseley commented Mar 14, 2018

I have asked vendors to try running CoreFX tests with GCStress=3 on chk builds as well as the usual ret (at least chk of CoreFX, we don't test chk of CoreCLR over in CoreFX)

Do you guys run GCStress using CoreFX tests at all? Eg in CI?

@briansull
Copy link
Author

briansull commented Mar 14, 2018

Yes we can/do run gc stress, but not AFAIK on CoreFX tests.
I ran GCStress on this PR with the text: (see above)
AtSign test Windows_NT x64 Checked gcstress0xc

Note that you need to run gcstress 4 or C to get coverange on every instruction in the JIT code.
GCStress 1 or 3 isn't really going to catch this kind of a GC issue.

Note that using gcstress 4 or C apparently also requires access to a Disassembler library for x64 instructions which wasn't available automatically for me when I tried running it by hand.

@jkotas
Copy link
Member

jkotas commented Mar 14, 2018

I have asked vendors to try running CoreFX tests with GCStress=3 on chk builds as well as the usual ret (at least chk of CoreFX, we don't test chk of CoreCLR over in CoreFX)

@danmosemsft I think it is a lot more valuable to run GCStress on CoreCLR checked builds. Checked CoreCLR builds have a lot of important asserts that catch problems sooner, and even in situations that do not lead to crashes.

I have traced down this GC hole using CoreCLR checked build. It resulted into assert on checked CoreCLR build that was hit pretty close to the place where the problem happened. If I was trying to trace it down using CoreCLR release build, I would be probably still scratching my head where it might be coming from.

@danmoseley
Copy link
Member

@jkotas is that just better for diagnosing or also for finding? If former then I would rather test on retail so it doesn't take a life age to complete then repeat the run using Chk on whatever set of tests failed. Right now is really slow on ret and found nothing so far.

@jkotas
Copy link
Member

jkotas commented Mar 14, 2018

I think it is also better for finding. I recommend you chat with Scott Wadsworth about it. He has a lot more insights and opinions on how this kind of testing should be done.

@danmoseley
Copy link
Member

OK. Also will talk to @valenis. Probably a few random vendor VMs is not going to cut it.

@jkotas
Copy link
Member

jkotas commented Mar 14, 2018

Right now is really slow on ret and found nothing so far.

It is slow because of the default xunit driver is not suitable for runtime testing like this. The default xunit driver is using a ton of reflection, and so you are basically GC stress testing the heck out of the one reflection path again and again, and a little bit of the actual tests.

Runtime testing like this needs a different xunit driver that does the reflection at "build time" and transforms the test into a bunch of static calls without reflection. (ProjectN tests that live in TFS use driver like that.)

@AndyAyersMS
Copy link
Member

I get that you can run gc stress in CI like you did above. Just wondering how much gc stress testing we get for the case where the jit is generating debug code.

I know some JIT tests have _d variants so we get a some coverage that way, but the core library and framework assemblies we use in CI are probably not requesting debug codegen.

Looks like we can sort of simulate this via COMPlus_JitDebuggable (like we do for minopts) and then running stress on a CHK build. It won't quite have the same IL though.

@briansull
Copy link
Author

Shall I check this in now?
@AndyAyersMS @CarolEidt @BruceForstall

@BruceForstall
Copy link

FWIW, CoreCLR runs lots of CoreFX test jobs, and runs lots of GCStress test jobs, but has no CoreFX + GCStress jobs (e.g., see https://ci.dot.net/job/dotnet_coreclr/job/master/job/jitstress/ for all "stress" jobs). Partially this is because CoreFX jobs are not clean, and GCStress jobs are not clean, and the combination is probably even worse (less likely to run clean).

@BruceForstall
Copy link

@AndyAyersMS Looks like we currently set debuggable via:

    opts.compDbgCode = jitFlags->IsSet(JitFlags::JIT_FLAG_DEBUG_CODE);

Maybe we need to add a debuggable stress mode, e.g.

STRESS_MODE(DEBUGGABLE)

and then use:

    opts.compDbgCode = jitFlags->IsSet(JitFlags::JIT_FLAG_DEBUG_CODE) || compStressCompile(STRESS_DEBUGGABLE, 10));

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM

@briansull
Copy link
Author

briansull commented Mar 14, 2018

I added a test case that fails without this fix, when run under GCStress=4

@briansull
Copy link
Author

@dotnet-bot test Windows_NT x64 Checked gcstress0xc

</CodeAnalysisDependentAssemblyPaths>
</ItemGroup>
<PropertyGroup>
<DebugType>Full</DebugType>
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build this test with Debug==Full

@briansull
Copy link
Author

@dotnet-bot test Windows_NT x64 Checked gcstress0xc

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants