Stop using LIST nodes for CALL arg lists#26392
Conversation
|
This is pretty large but most changes are rather simple:
There are few more notable changes:
Another notable change is that this (and all similar |
|
@dotnet/jit-contrib |
|
@jashook @BruceForstall Do you have any idea what exactly failed in the CoreFX Linux check? Apparently all jobs have failed but I can't find much about what happened. In a Helix log (https://helix.dot.net/api/2019-06-17/jobs/2d58505d-56e6-46a6-a07b-4bc8754fc017/workitems/System.Threading.Channels.Tests/console) I see entries like: Test run started and failed immediately without a trace? |
|
The real failure looks to be higher up: Looks like for some reason the dotnet CLI didn't get copied/installed properly. I set the failures to rerun |
|
@mikedn is this PR ready for review or it is wip? |
|
It should be good for review, I was just hoping that the CI will get green but I see that it failed again. Still. it looks like those tests didn't even run so it's probably a CI issue rather than something wrong with my change. |
|
Yeah, I'm not 100% sure how it all works but it looks to me that the normal and CoreFX test builds have conflicting artifact names so they overwrite each over. On a successful PR build I see in the run tests jobs: but here it's so failed CoreFX run probably ended up downloading the wrong artifact. |
|
The VN memory usage improvement in the "largest method" reported in mem stats is more than 2x. I was expecting improvements due to VNs for LIST nodes no longer being generated but this seemed just amazing so I tried to figure which method is this. It looks like it's coreclr/src/System.Private.CoreLib/shared/System/Globalization/CultureData.cs Lines 153 to 285 in 4aa65f0 So yeah, it's basically a long list of calls. And the memory stats are collected from a checked corelib (the rel VM doesn't dump the stats even if the rel JIT is compiled with mem stats enabled) so there are even more calls in that method thanks to some debug stuff. --- D:\Projects\jit\foo.asm 2019-08-31 19:14:14.000000000 +-0300
+++ D:\Projects\jit\boo.asm 2019-08-31 19:19:30.000000000 +-0300
@@ -171826,21 +169650,14 @@
-N003 [007204] ADD => $1c5 {norm=$1c4 {ADD($47, $181)}, exc=$100 {HelperMultipleExc()}}
- fgCurMemoryVN[GcHeap] assigned for GTF_IND_VOLATILE - read at [007205] to VN: $2083.
-N004 [007205] IND => $2006 {norm=$2084 {2084}, exc=$10c( {NullPtrExc($1c4)}, {HelperMultipleExc()})}
-N005 [010024] LCL_VAR V515 tmp515 d:1 => $2084 {2084}
-N006 [010025] ASG => $2006 {norm=$2084 {2084}, exc=$10c( {NullPtrExc($1c4)}, {HelperMultipleExc()})}
-N007 [010028] ARGPLACE => $2086 {2086}
-N008 [010030] ARGPLACE => $2087 {2087}
-N009 [007207] LIST => $1fbe {LIST($2087, $0)}
-N010 [007208] LIST => $1fbf {LIST($2086, $1fbe)}
-N011 [010026] LCL_VAR V515 tmp515 u:1 (last use) => $2084 {2084}
-N012 [007192] LCL_VAR V386 tmp386 u:1 (last use) => <l:$1c3f {ByrefExposedLoad($43, $109, $1f7e)}, c:$1f7f {1f7f}>
-N013 [007193] LCL_VAR V387 tmp387 u:1 (last use) => <l:$1c3f {ByrefExposedLoad($43, $109, $1f7e)}, c:$2081 {2081}>
-N014 [010031] LIST => <l:$2141 {LIST($1c3f, $0)}, c:$2140 {LIST($2081, $0)}>
-N015 [010029] LIST => <l:$2143 {LIST($1c3f, $2141)}, c:$2142 {LIST($1f7f, $2140)}>
-N016 [010027] LIST => <l:$2145 {LIST($2084, $2143)}, c:$2144 {LIST($2084, $2142)}>
-VN of ARGPLACE tree [010028] updated to $2084 {2084}
-VN of ARGPLACE tree [010030] updated to <l:$1c3f {ByrefExposedLoad($43, $109, $1f7e)}, c:$1f7f {1f7f}>
-N009 [007207] LIST => <l:$2141 {LIST($1c3f, $0)}, c:$2146 {LIST($1f7f, $0)}>
-N010 [007208] LIST => <l:$2148 {LIST($2084, $2141)}, c:$2147 {LIST($2084, $2146)}>
- fgCurMemoryVN[GcHeap] assigned for CALL at [007206] to VN: $2088.
-N017 [007206] CALLV ind => $VN.Void
+N003 [005539] ADD => $1c5 {norm=$1c4 {ADD($47, $181)}, exc=$100 {HelperMultipleExc()}}
+ fgCurMemoryVN[GcHeap] assigned for GTF_IND_VOLATILE - read at [005540] to VN: $1483.
+N004 [005540] IND => $1406 {norm=$1484 {1484}, exc=$10c( {NullPtrExc($1c4)}, {HelperMultipleExc()})}
+N005 [007461] LCL_VAR V515 tmp515 d:1 => $1484 {1484}
+N006 [007462] ASG => $1406 {norm=$1484 {1484}, exc=$10c( {NullPtrExc($1c4)}, {HelperMultipleExc()})}
+N007 [007464] ARGPLACE => $1486 {1486}
+N008 [007465] ARGPLACE => $1487 {1487}
+N009 [007463] LCL_VAR V515 tmp515 u:1 (last use) => $1484 {1484}
+N010 [005529] LCL_VAR V386 tmp386 u:1 (last use) => <l:$11ff {ByrefExposedLoad($43, $109, $13be)}, c:$13bf {13bf}>
+N011 [005530] LCL_VAR V387 tmp387 u:1 (last use) => <l:$11ff {ByrefExposedLoad($43, $109, $13be)}, c:$1481 {1481}>
+VN of ARGPLACE tree [007464] updated to $1484 {1484}
+VN of ARGPLACE tree [007465] updated to <l:$11ff {ByrefExposedLoad($43, $109, $13be)}, c:$13bf {13bf}>
+ fgCurMemoryVN[GcHeap] assigned for CALL at [005541] to VN: $1488.
+N012 [005541] CALLV ind => $VN.Void |
| newArgs = &newArgObjp; | ||
| oldArgObjp.Current() = oldCall->gtCallObjp; | ||
| oldArgs = &oldArgObjp; | ||
| newArgObjp.SetNode(newCall->gtCallObjp); |
There was a problem hiding this comment.
Old code was broken, it was setting Current to gtCallArgs instead of gtCallObjp which made no sense. I don't think this code is ever hit, it's only used when calls are cloned and that's not exactly common. The only case I know about is loop hoisting but only a few helper calls can be considered invariant and hoisted. And those helper calls do not have this...
Speaking of this, I find it extremely odd that it's stored separately from rest of the args, it gtCallObjp. It complicates things and as seen here this can lead to potential bugs.
seems odd. maybe all output is suppressed for some reason? |
It looks like I was (not) seeing things, works fine now. With a release corelib memory usage improvement is a bit lower but similar - 3.05%. Anyway, I tried with other assemblies (System.Windows.Form, Microsoft.CodeAnalysis.CSharp) to be sure that corelib isn't somehow special and the number are similar or even higher - 4.55% on Microsoft.CodeAnalysis.CSharp. It's normal that smaller IR results in less memory and CPU usage but I wasn't expecting to be over 3%. Oh well, simply removing the |
sandreenko
left a comment
There was a problem hiding this comment.
LGTM, TP and memory win is amazing, thank you!
| }; | ||
|
|
||
| GenTree* gtCallObjp; // The instance argument ('this' pointer) | ||
| Use* gtCallArgs; // The list of arguments in original evaluation order |
There was a problem hiding this comment.
I was a bit confused by:
Use* gtCallArgs; // The list of arguments
and
UseList Args()
they are both lists but they are used differently, there are few places where we access gtCallArgs and it takes me a few secs to understand why.
We could think of how to separate them better.
There was a problem hiding this comment.
Well, I would very much prefer to make gtCallArgs private but there are a lot more places that need to be changed to achieve that. I also don't like the "manual" linked list manipulation that's all over the place but that again requires a lot more changes.
I hope I'll be able to make other improvements around GenTreeCall data structures (get rid of GTF_LATE_ARG, ReplaceCallOperand, GT_ARGPLACE and gtCallLateArgs) and then perhaps do some more polishing as well because some things would just be easier.
Here it's probably also worth pointing out that keeping the args in an array rather than a linked list would make parts of the code cleaner. Unfortunately we have a few places were we need to insert "hidden" args so keeping the existing linked list approach was simpler and safer.
| } | ||
| } | ||
|
|
||
| #ifdef DEBUG |
There was a problem hiding this comment.
Nice catch, looks like the code here is old and the comment is not valid anymore, because we are not counting anything. The same for lsraxarch.
// Return Value:
// The number of sources consumed by this node.
//
int LinearScan::BuildCall(GenTreeCall* call)
Is it the number of registers consumed by this node?
@CarolEidt ?
There was a problem hiding this comment.
Right, I forgot to mention this part and the similar one in lsraxarch.cpp. It does look like at least the comment is out of date and maybe the code is no longer necessary as well. But it doesn't seem like a good idea to remove asserts while doing a large refactoring so for now I made only minimal changes.
| } | ||
| break; | ||
|
|
||
| { |
There was a problem hiding this comment.
Hmm, don't know. I usually don't like to add else when the if branch returns. Besides, the subsequent scoping is only required because we're in a switch. Perhaps I should just pulled out all the comapre code in a GenTreeCall::Equals function like I did for PHIs.
There was a problem hiding this comment.
I agree that this needn't be an else, and it might be nice to pull it out, but I don't feel that's necessary.
| #endif | ||
| { | ||
| args = gtNewArgList(op1, op2); | ||
| GenTreeCall::Use* args = gtNewCallArgs(op1, op2); |
There was a problem hiding this comment.
GitHub doesn't really support reviewing changes in a file with more than 12000 lines, especially at the end of such files. It shows me that args is unused here and you have to check the raw file to see the use.
There was a problem hiding this comment.
Ha ha, didn't noticed that an entire section of the file is actually missing from the diff. GitHub is great in many ways but there's still room for improvement :)
I should have reverted this variable declaration change. Initially I did more cleanup but then I saw the size of the PR I reverted a bunch of stuff but missed a few cases. Oh well, I absolutely hate variables declared on the other side of the planet. This is probably one the worst "features" of the old C versions that required variables to be declared upfront.
There was a problem hiding this comment.
GitHub doesn't really support reviewing changes in a file with more than 12000 lines
Maybe we need to figure out how to break this file up a bit!
CarolEidt
left a comment
There was a problem hiding this comment.
My suggestions are all requested comment updates. If you're willing to make those I think it would be great (and assuming they're the only changes made we wouldn't need to wait for the full round of CI tests to complete before merging).
| * | ||
| */ | ||
| AssertionIndex Compiler::optCreateAssertion(GenTree* op1, GenTree* op2, optAssertionKind assertionKind) | ||
| AssertionIndex Compiler::optCreateAssertion(GenTree* op1, |
There was a problem hiding this comment.
While this method didn't have a header to begin with, it would be good to add one, especially since you've added an arg here.
There was a problem hiding this comment.
I added comment headers to all affected assertionprop functions. However, I also cleaned up things a bit because I don't want to waste my time adding comments about useless overloads and parameters - assertion parameter isn't needed because everyone just passes the address of a local variable created solely for this purpose.
Of course, it would be even better to split optCreateAssertion into smaller functions such that helperCallArgs parameter isn't needed but that doesn't belong in this PR.
| } | ||
| break; | ||
|
|
||
| { |
There was a problem hiding this comment.
I agree that this needn't be an else, and it might be nice to pull it out, but I don't feel that's necessary.
| #endif | ||
| { | ||
| args = gtNewArgList(op1, op2); | ||
| GenTreeCall::Use* args = gtNewCallArgs(op1, op2); |
There was a problem hiding this comment.
GitHub doesn't really support reviewing changes in a file with more than 12000 lines
Maybe we need to figure out how to break this file up a bit!
src/jit/importer.cpp
Outdated
| BOOL Compiler::impInlineIsGuaranteedThisDerefBeforeAnySideEffects(GenTree* additionalTreesToBeEvaluatedBefore, | ||
| GenTree* variableBeingDereferenced, | ||
| InlArgInfo* inlArgInfo) | ||
| BOOL Compiler::impInlineIsGuaranteedThisDerefBeforeAnySideEffects(GenTree* additionalTreeToBeEvaluatedBefore, |
There was a problem hiding this comment.
It might be worth updating the function header of this method to clarify that both additionalTreeToBeEvaluatedBefore as well as the argList will be examined for side-effects. (Also the comment needs to be changed to use additionalTreeToBeEvaluatedBefore (singular 'Tree').
There was a problem hiding this comment.
Added comment header and renamed some parameters in the process.
additionalTreeToBeEvaluatedBeforeis unnecessarily long as "before" isn't really relevant - the order in which various trees are searched for side effects doesn't matter.variableBeingDereferencedwas misleading, there was no guarantee that it is actually a variable. That's checked inimpInlineIsThis.
| @@ -428,19 +428,19 @@ class IndirectCallTransformer | |||
| // | |||
| void AddHiddenArgument(GenTreeCall* fatCall, GenTree* hiddenArgument) | |||
There was a problem hiding this comment.
Pre-existing, but it would be worthwhile to mention in the function header that the hidden argument is always add just after the return buffer, if there is one, and otherwise prepended to the list.
There was a problem hiding this comment.
It can also be appended to the list depending on USER_ARGS_COME_LAST. I believe that the function would be clear enough and did not require a comment, if it weren't for the "manual" linked list manipulation that happens in the ret buffer arg case. So I cleaned up the "manual" linked list manipulation instead of adding a comment. Anyway it was messed up - instead of simply inserting a new use for the hidden argument it was also recreating the use of the ret buffer arg.
| GenTreeArgList** insertionPoint = &call->gtCallArgs; | ||
| for (; *insertionPoint != nullptr; insertionPoint = &(*insertionPoint)->Rest()) | ||
| GenTreeCall::Use** insertionPoint = &call->gtCallArgs; | ||
| for (; *insertionPoint != nullptr; insertionPoint = &((*insertionPoint)->NextRef())) |
There was a problem hiding this comment.
It took a second look to determine that, by construction, insertionPoint cannot be null. It would definitely be worth a comment to that effect (at first I thought it merited a noway_assert until I realized that couldn't happen.
There was a problem hiding this comment.
I agree that this kind of code isn't easy to read but it's only 3 lines and this is not the only place where it occurs so I don't think adding a comment is a solution. The solution would be to add a function that does this operation. I considered adding something like gtAppendNewCallArg but I don't think it's a good idea due to its potential for misuse - it has to search for the end of the list every time it is called so if it ends up being used to append multiple arguments it's not going to be good for perf.
I hope to make something better once I also change GT_FIELD_LIST to not be a list. It's a bit of a problem to abstract linked list operations in this case (single linked list with only the head being stored) but there are some ways to do this that are better than having this kind of low level list manipulation code all over the place (something similar to a back_inserter STL iterator could help for example).
* Stop using GT_LIST in GenTreeCall * Clarify optAssertionGenJtrue code * Cleanup optCreateAssertion * Cleanup AddHiddenArgument * Cleanup impInlineIsGuaranteedThisDerefBeforeAnySideEffects Commit migrated from dotnet/coreclr@667222e
Similar to #20266 but for
GenTreeCallnodes.This saves 3.4% used memory and 3.2% instructions:
Mem diff: https://gist.github.com/mikedn/3d34aebb9fc3da218cf23f16ab821d01
PIN data: https://1drv.ms/x/s!Av4baJYSo5pjgs4637g1JfAO3jOlIQ?e=FhGJ15
Contributes to #19876
No x64/x86/arm32(altjit)/arm64(altjit) FX diffs.