Skip to content

Improve performance of SourceText.IsBinary on .NET Core#61148

Merged
jaredpar merged 1 commit intodotnet:mainfrom
Neme12:faster-is-binary
May 5, 2022
Merged

Improve performance of SourceText.IsBinary on .NET Core#61148
jaredpar merged 1 commit intodotnet:mainfrom
Neme12:faster-is-binary

Conversation

@Neme12
Copy link
Contributor

@Neme12 Neme12 commented May 5, 2022

The code is self explanatory 🙃

Tiny benchmark testing just the IsBinary method on a string from a 74 kB source file on .NET 6, where IsBinary is the old implementation and IsBinary2 is the new one:

|    Method |      Mean |     Error |    StdDev |    Median |
|---------- |----------:|----------:|----------:|----------:|
|  IsBinary | 45.144 us | 0.8997 us | 2.4780 us | 44.511 us |
| IsBinary2 |  4.814 us | 0.2391 us | 0.7049 us |  4.492 us |

Benchmark testing SourceText.From(fileStream, throwIfBinaryDetected: true) for BoundNodes.xml.Generated.cs,
before the change:

|         Method |                  Job |              Runtime |     Mean |     Error |    StdDev |
|--------------- |--------------------- |--------------------- |---------:|----------:|----------:|
| SourceTextFrom |             .NET 6.0 |             .NET 6.0 | 5.295 ms | 0.1426 ms | 0.4068 ms |
| SourceTextFrom |        .NET Core 3.1 |        .NET Core 3.1 | 5.412 ms | 0.1082 ms | 0.2962 ms |
| SourceTextFrom | .NET Framework 4.7.2 | .NET Framework 4.7.2 | 5.712 ms | 0.1141 ms | 0.2480 ms |

after the change:

|         Method |                  Job |              Runtime |     Mean |     Error |    StdDev |
|--------------- |--------------------- |--------------------- |---------:|----------:|----------:|
| SourceTextFrom |             .NET 6.0 |             .NET 6.0 | 3.975 ms | 0.1002 ms | 0.2906 ms |
| SourceTextFrom |        .NET Core 3.1 |        .NET Core 3.1 | 4.046 ms | 0.0827 ms | 0.2424 ms |
| SourceTextFrom | .NET Framework 4.7.2 | .NET Framework 4.7.2 | 5.481 ms | 0.1149 ms | 0.3389 ms |

EDIT: It seems the difference is exaggerated in the SourceTextFrom benchmarks, I guess because it's completely different runs. The actual difference shouldn't be more than 500ms because that seems to be the difference between using throwIfBinaryDetected true vs false with the original code.

@Neme12 Neme12 requested a review from a team as a code owner May 5, 2022 17:07
@ghost ghost added Community The pull request was submitted by a contributor who is not a Microsoft employee. Area-Compilers labels May 5, 2022
@Neme12 Neme12 force-pushed the faster-is-binary branch from 399a330 to efdabec Compare May 5, 2022 17:11
@Neme12 Neme12 force-pushed the faster-is-binary branch from efdabec to f59bb5e Compare May 5, 2022 17:12
@CyrusNajmabadi
Copy link
Contributor

That's awesome!

@jaredpar
Copy link
Member

jaredpar commented May 5, 2022

Can you share out the benchmark you wrote here? Mostly for my own education.

@Neme12
Copy link
Contributor Author

Neme12 commented May 5, 2022

@jaredpar

This is the second benchmark:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFrameworks>net6.0;netcoreapp3.1;net472</TargetFrameworks>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.13.1" />
  </ItemGroup>

  <ItemGroup Condition="'$(TargetFramework)' != 'net472'">
    <PackageReference Include="System.Collections" Version="4.3.0" />
    <PackageReference Include="System.IO.FileSystem.Primitives" Version="4.3.0" />
    <PackageReference Include="System.Runtime.Extensions" Version="4.3.0" />
    <PackageReference Include="System.Runtime.Handles" Version="4.3.0" />
    <PackageReference Include="System.Runtime.InteropServices" Version="4.3.0" />
  </ItemGroup>

  <ItemGroup>
    <ProjectReference Include="..\src\Compilers\CSharp\Portable\Microsoft.CodeAnalysis.CSharp.csproj" />
  </ItemGroup>

</Project>
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;
using Microsoft.CodeAnalysis.Text;

BenchmarkRunner.Run<SourceTextBenchmark>();

[SimpleJob(RuntimeMoniker.Net472)]
[SimpleJob(RuntimeMoniker.NetCoreApp31)]
[SimpleJob(RuntimeMoniker.Net60)]
public class SourceTextBenchmark
{
    private readonly string _largerTextPath =
        @"C:\Users\{username}\source\repos\dotnet\roslyn\src\Compilers\CSharp\Portable\Generated\BoundNodes.xml.Generated.cs"; // 902 KB

    private FileStream? _fileStream;

    [IterationSetup]
    public void Setup()
    {
        _fileStream = File.OpenRead(_largerTextPath);
    }

    [IterationCleanup]
    public void Cleanup()
    {
        _fileStream!.Dispose();
        _fileStream = null;
    }

    [Benchmark]
    public SourceText SourceTextFrom()
    {
        return SourceText.From(_fileStream!, throwIfBinaryDetected: true);
    }
}

I don't have the source for the first benchmark anymore. First, I coped the IsBinary method to check the new implementation and the benchmark just had 2 methods, one called the original IsBinary method and the other one called IsBinary2 with the same string.

@jaredpar
Copy link
Member

jaredpar commented May 5, 2022

@Neme12 thanks for sharing!

@jaredpar jaredpar merged commit bd35cdc into dotnet:main May 5, 2022
@ghost ghost added this to the Next milestone May 5, 2022
@Neme12 Neme12 deleted the faster-is-binary branch May 5, 2022 19:18
@Cosifne Cosifne modified the milestones: Next, 17.3 P2 May 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area-Compilers Community The pull request was submitted by a contributor who is not a Microsoft employee.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants