
Redesign benchmarks for culture-specific string operations #892

Merged
adamsitnik merged 7 commits into dotnet:master from adamsitnik:stringBenchmarksRedesign
Sep 23, 2019


@adamsitnik
Member

The culture-specific string benchmarks we had so far used a very small, simple input and tested very few cultures and CompareOptions values. This is why we missed issues like https://github.com/dotnet/corefx/issues/40674

This is my proposal for fixing it.

Instead of using some made-up text, we are using part of the book "Alice's Adventures in Wonderland". It contains mostly simple ASCII characters, but also some "high" chars that get special treatment and hit the slow path.
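
To make the "mostly ASCII, some high chars" point concrete, here is a minimal sketch that counts the characters outside the ASCII range in an input text. The `InputStats` class and the `CountNonAscii` name are hypothetical, and treating "high" as "above 0x7F" is an assumption; the runtime's actual fast-path check may use different criteria.

```csharp
using System;

public static class InputStats
{
    // Hypothetical helper: counts UTF-16 code units above the ASCII range.
    // Assumption: "high" chars are approximated here as anything > 0x7F.
    public static int CountNonAscii(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        int count = 0;
        foreach (char c in text)
        {
            if (c > 0x7F)
            {
                count++;
            }
        }
        return count;
    }
}
```

A small non-zero count on an otherwise ASCII text is the kind of input that exercises both the fast and the slow path.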

The tested matrix is now:

(new CultureInfo("en-US"), CompareOptions.Ordinal)
(new CultureInfo("en-US"), CompareOptions.OrdinalIgnoreCase)
(new CultureInfo("en-US"), CompareOptions.None)
(new CultureInfo("en-US"), CompareOptions.IgnoreCase)
(new CultureInfo("en-US"), CompareOptions.IgnoreSymbols)
(CultureInfo.InvariantCulture, CompareOptions.None)
(CultureInfo.InvariantCulture, CompareOptions.IgnoreCase)
(new CultureInfo("pl-PL"), CompareOptions.None) // as an example of complex language hitting the slow path on Unix
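
As a rough illustration of how such a matrix can drive comparisons, the sketch below enumerates the same (culture, options) pairs and forwards them to `CompareInfo.Compare`. The `CompareMatrix` class and its members are hypothetical names, not the benchmark code from this PR, and creating the named cultures assumes globalization data is available on the machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CompareMatrix
{
    // Hypothetical enumeration of the pairs listed above.
    // Cultures are created lazily, one per yielded element.
    public static IEnumerable<(CultureInfo Culture, CompareOptions Options)> Cases()
    {
        yield return (new CultureInfo("en-US"), CompareOptions.Ordinal);
        yield return (new CultureInfo("en-US"), CompareOptions.OrdinalIgnoreCase);
        yield return (new CultureInfo("en-US"), CompareOptions.None);
        yield return (new CultureInfo("en-US"), CompareOptions.IgnoreCase);
        yield return (new CultureInfo("en-US"), CompareOptions.IgnoreSymbols);
        yield return (CultureInfo.InvariantCulture, CompareOptions.None);
        yield return (CultureInfo.InvariantCulture, CompareOptions.IgnoreCase);
        yield return (new CultureInfo("pl-PL"), CompareOptions.None);
    }

    // Compares two strings using the given culture's CompareInfo and options.
    public static int Compare(string a, string b, CultureInfo culture, CompareOptions options)
    {
        return culture.CompareInfo.Compare(a, b, options);
    }
}
```

Keeping the culture and the options together in one tuple makes each benchmark case self-describing, which is one way to avoid testing a combination by accident.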

I've removed the old benchmarks that would now be duplicated.

Moreover, I've realized that Perf_CompareInfo had a serious bug: the strings were always identical because the source argument was never used.

private static string GenerateInputString(char source, int count, char replaceChar, int replacePos)
{
    char[] str = new char[count];
    for (int i = 0; i < count; i++)
    {
        str[i] = replaceChar;
    }
    str[replacePos] = replaceChar;
    return new string(str);
}
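
For comparison, a minimal sketch of what the helper presumably intended: fill the buffer with `source` and overwrite a single position with `replaceChar`. This is an illustration only, not code from this PR (which removes the old benchmarks); the `InputStrings` class name and the argument checks are additions for the example.

```csharp
using System;

public static class InputStrings
{
    // Hypothetical corrected variant: every position holds `source`,
    // except `replacePos`, which holds `replaceChar`.
    public static string GenerateInputString(char source, int count, char replaceChar, int replacePos)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        if (replacePos < 0 || replacePos >= count) throw new ArgumentOutOfRangeException(nameof(replacePos));

        char[] str = new char[count];
        for (int i = 0; i < count; i++)
        {
            str[i] = source; // the original wrote replaceChar here
        }
        str[replacePos] = replaceChar;
        return new string(str);
    }
}
```

With this variant, two inputs generated with different `replaceChar` or `replacePos` values differ in at least one position, so a comparison benchmark no longer compares identical strings.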

Fixes #885

Member

@tarekgh tarekgh left a comment


Modulo @jkotas comments, LGTM

@adamsitnik adamsitnik merged commit cc73a01 into dotnet:master Sep 23, 2019
@adamsitnik adamsitnik deleted the stringBenchmarksRedesign branch September 23, 2019 15:30
Development

Successfully merging this pull request may close these issues.

Implement more realistic string benchmarks

4 participants