Reduce allocations in CSharp command line parsing#54675
Reduce allocations in CSharp command line parsing#54675jaredpar merged 11 commits intodotnet:mainfrom
Conversation
Context dotnet#53570 The C# parser is used by the project system in scenarios like solution open. That means the allocations in command line parsing can contribute significantly to Visual Studio performance. This PR reduces the allocations significantly. Note: in the below explanations when I refer to the "99% case" I am referring to how command lines are structured when created by MSBuild based builds. That is the **overwhelming** case for the compiler and is advantageous because it normalizes many items like lower casing all option names, having one reference per option, etc ... The remaining 1% are hand authored build files and these should not impact Visual Studio scenarios. The first fix is to simply avoid iterators in the parse hot paths. Most of the iterators in the parser return a single element in the 99% case hence the iterator is wasted allocations. These were switched to take pooled builder arguments. The next, and more siginificant fix, is to use `ReadOnlyMemory<char>` instead of `string` on our hot parsing paths. This allows us to parse the command line arguments without actually allocating memory for the values until actually needed by lower level APIs. This allowed for APIs like `RemoveQuotesAndSlashes` to become allocation free in the 99% case (it's just a slicing operation now, not a string allocation). The hot paths in our parsing were updated to employ these techniques. The one downside of the change is that I had to touch virtually every `case "someOption"` where the option value was used. The value is now held in a `ReadOnlyMemory<char>` until it's actually needed as a `string`. Hence every one of these cases either needed to use the new `ReadOnlyMemory<char>` APIs or force the allocation of the `string`. That made the change longer than I would've liked but it's also fairly mechanical. To test out the changes I created a benchmark which parsed a couple of command lines: - Command line for building Microsoft.CodeAnalysis.CSharp - Command line for a simple razor app The results are below and represent a significant win in both allocations and in simple performance. There are a few more places we could get some wins but getting into the realm of diminishing returns at this point. Before | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | |------------------------------- |-----------:|---------:|---------:|---------:|--------:|------:|----------:| | ParseCSharpCompilerCommandLine | 1,127.3 us | 20.52 us | 24.42 us | 250.0000 | 80.0781 | - | 876 KB | | ParseRazorCompilerCommandLine | 486.1 us | 7.19 us | 6.38 us | 350.0977 | 56.6406 | - | 571 KB | After | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Gen 2 | Allocated | |------------------------------- |---------:|--------:|--------:|---------:|--------:|------:|----------:| | ParseCSharpCompilerCommandLine | 630.3 us | 8.54 us | 7.57 us | 147.4609 | 14.6484 | - | 223 KB | | ParseRazorCompilerCommandLine | 232.4 us | 0.73 us | 0.61 us | 86.1816 | 14.1602 | - | 149 KB | closses dotnet#53570
|
@davkean FYI |
|
@dotnet/roslyn-compiler PTAL |
src/Compilers/CSharp/Portable/CommandLine/CSharpCommandLineParser.cs
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| foreach (string path in ParseSeparatedPaths(switchValue)) | ||
| foreach (string path in ParseSeparatedPaths(switchValue.Value.ToString())) |
There was a problem hiding this comment.
Do we want to get rid of the iterator allocation here also?
There was a problem hiding this comment.
We certainly could but this is not really a hot path, or even a common path, for C# compilation. To keep the churn down I'd been focusing on areas of the argument parsing that hit hot paths. This particular argument doesn't get hit at all in modern MSBuild driven scenarios that I'm aware of.
I'm certainly happy to push further here if you think it's worth the code churn.
src/Compilers/CSharp/Portable/CommandLine/CSharpCommandLineParser.cs
Outdated
Show resolved
Hide resolved
src/Compilers/Core/CodeAnalysisTest/CommonCommandLineParserTests.cs
Outdated
Show resolved
Hide resolved
|
@chsienki thanks! Responded to the feedback. |
src/Compilers/CSharp/Portable/CommandLine/CSharpCommandLineParser.cs
Outdated
Show resolved
Hide resolved
| @@ -192,7 +217,7 @@ internal sealed override CommandLineArguments CommonParse(IEnumerable<string> ar | |||
| continue; | |||
|
|
|||
| case "langversion": | |||
There was a problem hiding this comment.
IMO, it feels like langversion could be common enough to special case as well. Did you have a reason for not doing this one?
There was a problem hiding this comment.
Mostly I was going for option names that were likely to appear multiple times in standard compilation requests. For example /reference was an easy win because there are hundreds of those per compilation. The next one after that was /analyzer. After that though pretty much every other option is used only once. Wasn't making big impact.
Don't have any objections to pushing more up to the fast path though.
There was a problem hiding this comment.
That makes sense and seems like a reasonable cutoff.
| checkOverflow = false; | ||
| continue; | ||
|
|
||
| case "nullable": |
There was a problem hiding this comment.
Nullable also feels like it would be a common option. It's a bit more complex, but did you consider doing this one ahead of time as well?
There was a problem hiding this comment.
Similar to "langversion".
|
|
||
| case "nowarn": | ||
| if (value == null) | ||
| if (valueMemory is null) |
There was a problem hiding this comment.
nowarn will also be super common, as the SDK sets this for every C# program in existance.
|
After reading through a few of the comments I did some more profiling and found I could save a lot more allocations by avoiding the
|
Co-authored-by: Fred Silberberg <fred@silberberg.xyz>
|
@chsienki any other feedback here that I missed? |
Context #53570
The C# parser is used by the project system in scenarios like solution
open. That means the allocations in command line parsing can contribute
significantly to Visual Studio performance. This PR reduces the
allocations significantly.
Note: in the below explanations when I refer to the "99% case" I am
referring to how command lines are structured when created by
MSBuild based builds. That is the overwhelming case for the compiler
and is advantageous because it normalizes many items like lower casing
all option names, having one reference per option, etc ... The remaining
1% are hand authored build files and these should not impact Visual
Studio scenarios.
The first fix is to simply avoid iterators in the parse hot paths. Most
of the iterators in the parser return a single element in the 99% case
hence the iterator is wasted allocations. These were switched to take
pooled builder arguments.
The next, and more siginificant fix, is to use
ReadOnlyMemory<char>instead of
stringon our hot parsing paths. This allows us to parsethe command line arguments without actually allocating memory for the
values until actually needed by lower level APIs. This allowed for APIs
like
RemoveQuotesAndSlashesto become allocation free in the 99% case(it's just a slicing operation now, not a string allocation).
The hot paths in our parsing were updated to employ these techniques.
The one downside of the change is that I had to touch virtually every
case "someOption"where the option value was used. The value is nowheld in a
ReadOnlyMemory<char>until it's actually needed as astring. Hence every one of these cases either needed to use the newReadOnlyMemory<char>APIs or force the allocation of thestring.That made the change longer than I would've liked but it's also fairly
mechanical.
To test out the changes I created a benchmark which parsed a couple of
command lines:
The results are below and represent a significant win in both
allocations and in simple performance. There are a few more places we
could get some wins but getting into the realm of diminishing returns at
this point.
Before
After
There is a Gen2 increase here but I believe that is just the new usage of pooled builders.
Closes #53570