@stephentoub
Member

Recreating #109 after dotnet/runtime was made public.

| Method | Toolchain | Options | Mean | Error | StdDev | Ratio | Allocated |
|---|---|---|---:|---:|---:|---:|---:|
| Email | \new\corerun.exe | Compiled | 166.0 ns | 0.29 ns | 0.26 ns | 0.43 | - |
| Email | \old\corerun.exe | Compiled | 389.7 ns | 0.81 ns | 0.72 ns | 1.00 | - |
| Date | \new\corerun.exe | Compiled | 105.2 ns | 0.26 ns | 0.23 ns | 0.28 | - |
| Date | \old\corerun.exe | Compiled | 380.1 ns | 2.16 ns | 2.02 ns | 1.00 | - |
| IP | \new\corerun.exe | Compiled | 404.6 ns | 1.05 ns | 0.93 ns | 0.34 | - |
| IP | \old\corerun.exe | Compiled | 1,180.7 ns | 2.36 ns | 2.20 ns | 1.00 | - |
| Uri | \new\corerun.exe | Compiled | 101.3 ns | 0.44 ns | 0.41 ns | 0.27 | - |
| Uri | \old\corerun.exe | Compiled | 380.9 ns | 2.11 ns | 1.77 ns | 1.00 | - |
| EmailStatic | \new\corerun.exe | Compiled | 214.4 ns | 0.94 ns | 0.78 ns | 0.49 | 104 B |
| EmailStatic | \old\corerun.exe | Compiled | 437.1 ns | 1.15 ns | 1.07 ns | 1.00 | 104 B |
| DateStatic | \new\corerun.exe | Compiled | 160.5 ns | 1.18 ns | 1.10 ns | 0.38 | 104 B |
| DateStatic | \old\corerun.exe | Compiled | 422.6 ns | 1.11 ns | 1.04 ns | 1.00 | 104 B |
| IPStatic | \new\corerun.exe | Compiled | 444.3 ns | 1.36 ns | 1.27 ns | 0.37 | 104 B |
| IPStatic | \old\corerun.exe | Compiled | 1,198.5 ns | 7.72 ns | 7.22 ns | 1.00 | 104 B |
| UriStatic | \new\corerun.exe | Compiled | 153.1 ns | 0.61 ns | 0.57 ns | 0.36 | 104 B |
| UriStatic | \old\corerun.exe | Compiled | 425.6 ns | 3.39 ns | 3.17 ns | 1.00 | 104 B |
| Ctor | \new\corerun.exe | Compiled | 45,213.4 ns | 263.58 ns | 233.66 ns | 1.42 | 34184 B |
| Ctor | \old\corerun.exe | Compiled | 31,899.8 ns | 71.09 ns | 63.02 ns | 1.00 | 36264 B |

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Running;
using System.Text.RegularExpressions;

public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssemblies(new[] { typeof(Program).Assembly }).Run(args);
}

[MemoryDiagnoser]
public class Regexes
{
    [Params(RegexOptions.None, RegexOptions.Compiled)]
    public RegexOptions Options { get; set; }

    private Regex _email, _date, _ip, _uri;

    private const string EmailPattern = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,12}|[0-9]{1,3})(\]?)$";
    private const string DatePattern = @"\b(?<month>\d{1,2})/(?<day>\d{1,2})/(?<year>\d{2,4})\b";
    private const string IPPattern = @"(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9])\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9])";
    private const string UriPattern = @"[\w]+://[^/\s?#]+[^\s?#]+(?:\?[^\s#]*)?(?:#[^\s]*)?";

    [GlobalSetup]
    public void Setup()
    {
        _email = new Regex(EmailPattern, Options);
        _date = new Regex(DatePattern, Options);
        _ip = new Regex(IPPattern, Options);
        _uri = new Regex(UriPattern, Options);
    }

    [Benchmark] public void Email() => _email.IsMatch("yay.performance@dot.net");
    [Benchmark] public void Date() => _date.IsMatch("Today is 11/18/2019");
    [Benchmark] public void IP() => _ip.IsMatch("012.345.678.910");
    [Benchmark] public void Uri() => _uri.IsMatch("http://example.org");

    [Benchmark] public void EmailStatic() => Regex.IsMatch("yay.performance@dot.net", EmailPattern, Options);
    [Benchmark] public void DateStatic() => Regex.IsMatch("Today is 11/18/2019", DatePattern, Options);
    [Benchmark] public void IPStatic() => Regex.IsMatch("012.345.678.910", IPPattern, Options);
    [Benchmark] public void UriStatic() => Regex.IsMatch("http://example.org", UriPattern, Options);

    [Benchmark] public void Ctor() => new Regex(@"(^(.*)(\(([0-9]+),([0-9]+)\)): )(error|warning) ([A-Z]+[0-9]+) ?: (.*)", Options);
}

@Tornhoof
Contributor

Why is the ctor benchmark quite a bit slower now?

@stephentoub
Member Author

> Why is the ctor benchmark quite a bit slower now?

Compiling a Regex gets slower because we're doing more work during compilation, specifically calling CharInClass 128 times to create the ASCII bitmap. In exchange, we avoid calling CharInClass for every single character we try to match against a character set, every time the Regex is used. Compiling a Regex is already an order of magnitude more expensive than not compiling it, so you only do it when you're going to use the Regex over and over, in which case the trade-off is well worth it.
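
To illustrate the trade-off described above, here is a minimal sketch of the general technique: pay for 128 membership checks once, then answer ASCII lookups with a bit test. It is not the code this PR emits. `AsciiBitmapSketch`, `SlowInSet`, `BuildBitmap`, and `Contains` are hypothetical names, and `SlowInSet` only stands in for the role `CharInClass` plays.

```csharp
using System;

public static class AsciiBitmapSketch
{
    // Hypothetical stand-in for the expensive per-character set check
    // (the role CharInClass plays in the comment above).
    private static bool SlowInSet(char c) =>
        (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9') || c == '_';

    // One-time cost at construction: 128 predicate calls,
    // packed into two 64-bit words (one bit per ASCII character).
    public static (ulong Lo, ulong Hi) BuildBitmap(Func<char, bool> inSet)
    {
        ulong lo = 0, hi = 0;
        for (int i = 0; i < 128; i++)
        {
            if (!inSet((char)i)) continue;
            if (i < 64) lo |= 1UL << i;
            else hi |= 1UL << (i - 64);
        }
        return (lo, hi);
    }

    // Per-character cost at match time: a bit test for ASCII input,
    // falling back to the slow check only for non-ASCII characters.
    public static bool Contains((ulong Lo, ulong Hi) bitmap, char c, Func<char, bool> fallback)
    {
        if (c < 128)
        {
            return c < 64
                ? (bitmap.Lo & (1UL << c)) != 0
                : (bitmap.Hi & (1UL << (c - 64))) != 0;
        }
        return fallback(c);
    }

    public static void Main()
    {
        var bitmap = BuildBitmap(SlowInSet);
        Console.WriteLine(Contains(bitmap, 'q', SlowInSet)); // True
        Console.WriteLine(Contains(bitmap, '@', SlowInSet)); // False
    }
}
```

The shape matches the benchmark numbers: the up-front loop is the kind of work that makes `Ctor` slower, and the per-character bit test is the kind of work that makes repeated `IsMatch` calls faster.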

@Tornhoof
Contributor

Thank you for the explanation.

@stephentoub stephentoub merged commit 335eeb4 into dotnet:master Nov 26, 2019
@stephentoub stephentoub deleted the regexperf branch November 26, 2019 11:19
@stephentoub stephentoub mentioned this pull request Jan 7, 2020
41 tasks
@stephentoub stephentoub added the tenet-performance Performance related issue label Jan 12, 2020
@stephentoub stephentoub added this to the 5.0 milestone Jan 12, 2020
  // to avoid lock:
  CachedCodeEntry? first = s_cacheFirst;
- if (first?.Key == key)
+ if (first != null && first.Key.Equals(key))

@stephentoub - this will result in an NRE if first.Key is null

Member Author

@udlose, Key is a struct; it can't be null.
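
A minimal sketch of why the check is safe, using hypothetical stand-in types (`CacheKey`, `Entry`, `IsHit`), not the actual runtime types: a struct-typed property always holds a value, so only the reference `first` needs a null check before calling `Equals`.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for the struct key discussed above.
public readonly struct CacheKey : IEquatable<CacheKey>
{
    public CacheKey(string pattern, int options) { Pattern = pattern; Options = options; }
    public string Pattern { get; }
    public int Options { get; }

    public bool Equals(CacheKey other) => Pattern == other.Pattern && Options == other.Options;
    public override bool Equals(object? obj) => obj is CacheKey other && Equals(other);
    public override int GetHashCode() => HashCode.Combine(Pattern, Options);
}

// Hypothetical stand-in for a cache entry: a class, so a reference to it can be null.
public sealed class Entry
{
    public Entry(CacheKey key) => Key = key;
    public CacheKey Key { get; }
}

public static class StructKeySketch
{
    // Same shape as the condition under review: null-check the reference,
    // then call Equals on the struct-typed Key.
    public static bool IsHit(Entry? first, CacheKey key) =>
        first != null && first.Key.Equals(key);

    public static void Main()
    {
        var key = new CacheKey("abc", 0);
        Console.WriteLine(IsHit(null, key));               // False: short-circuits on the null entry
        Console.WriteLine(IsHit(new Entry(key), key));     // True
        Console.WriteLine(IsHit(new Entry(default), key)); // False: default(CacheKey) is still a value, no exception
    }
}
```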

@udlose udlose Jun 27, 2020

That's what I get for not looking at the code in its full context. Sorry for the wasted bits.

@dotnet dotnet deleted a comment from stephentoub Jun 27, 2020
MichalStrehovsky pushed a commit to MichalStrehovsky/runtime that referenced this pull request Nov 4, 2020
This is an experimental feature that adds about 1.7 kB of binary footprint per method returning ValueTask.

Ifdef it out for native AOT until we figure out what to do with it. See dotnet#13633.
@ghost ghost locked as resolved and limited conversation to collaborators Dec 11, 2020