Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Uri.UnescapeDataString and Uri.UnescapeString reuse the original string when possible#41684

Merged
stephentoub merged 19 commits intodotnet:masterfrom
alnikola:alnikola/16040-pooled-arr-in-unescape-string
Oct 24, 2019
Merged

Uri.UnescapeDataString and Uri.UnescapeString reuse the original string when possible#41684
stephentoub merged 19 commits intodotnet:masterfrom
alnikola:alnikola/16040-pooled-arr-in-unescape-string

Conversation

@alnikola
Copy link

@alnikola alnikola commented Oct 9, 2019

Uri.UnespaceDataString and Uri.UnescapeString retrieve an array from the pool to use as a temporary data storage for unescaping algorithm instead of a new string or array allocation on each call. After the processing completes, they compare the result stored in the buffer against the original string and return the original object without extra allocation if no changes have been made.
Fixes #16040

Method Toolchain Mean Error StdDev Median Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
UnescapeRandomStrings master 290.9 ms 5.810 ms 12.75 ms 288.9 ms 1.00 0.00 44000.0000 - - 178.54 MB
UnescapeRandomStrings This PR 263.3 ms 5.260 ms 14.40 ms 271.5 ms 0.89 0.07 15000.0000 - - 63.82 MB

Benchmark code

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Text;

namespace UriEscapeUnescapeTest
{
    public class Program
    {
        static void Main(string[] args) => BenchmarkSwitcher.FromTypes(new[] { typeof(Test) }).Run(args);

        [MemoryDiagnoser]
        public class Test
        {
            private string[] escapedStrings = new string[10000];

            [GlobalSetup]
            public void Setup()
            {
                for(var i = 0; i < escapedStrings.Length; i++)
                {
                    escapedStrings[i] = GenenerateRandomEscapedString();
                }
            }

            public static string GenenerateRandomEscapedString()
            {
                StringBuilder sb = new StringBuilder();
                Random r = new Random();
                for (int i = 0; i < 1000; i++)
                {
                    sb.Append("aa");
                    int val = r.Next(0, 3);
                    switch (val)
                    {
                        case 0:
                            sb.Append("%2F%2F");
                            break;
                        case 1:
                            sb.Append("%3A");
                            break;
                        case 2:
                            sb.Append("%21");
                            break;
                        default:
                            break;
                    }
                }
                return sb.ToString();
            }

            [Benchmark]
            public void UnescapeRandomStrings()
            {
                for(var i = 0; i < escapedStrings.Length; i++)
                {
                    Uri.UnescapeDataString(escapedStrings[i]);
                }
            }
        }
    }
}

This PR is based on the original idea of @safern

@davidsh davidsh requested a review from a team October 9, 2019 16:41
@davidsh davidsh added this to the 5.0 milestone Oct 9, 2019
@davidsh
Copy link
Contributor

davidsh commented Oct 9, 2019

Can we also see some benchmarks with the before/after with this PR change?

See PR #41640 as an example of how we write little benchmarks to help validate a PR and perf gains.

@alnikola
Copy link
Author

The current perf results. It became ~10% slower after the moving logic out of ValueStringBuilder.

Method Toolchain Mean Error StdDev Median Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
UnescapeRandomStrings Before VSB change 252.5 ms 5.041 ms 14.7038 ms 260.5 ms 0.88 0.05 15000.0000 - - 63.82 MB
UnescapeRandomStrings After VSB change 287.2 ms 1.115 ms 0.8704 ms 287.4 ms 1.00 0.00 15000.0000 - - 63.81 MB

@stephentoub
Copy link
Member

It became ~10% slower after the moving logic out of ValueStringBuilder.

What exactly are you comparing? The previous commit you published was buggy, e.g. it wasn't clearing memory it needed to be clearing. If you're now clearing that memory and you weren't before, you're comparing apples to oranges.

@alnikola
Copy link
Author

alnikola commented Oct 11, 2019

The last perf measurement didn't include pooled array cleaning. I just wanted to see if extracting methods out VSB brought in some perf change.
Now, I have pushed the cleaning logic and here are the final results including it. I agree we cannot really compare them with the previous change because those were buggy. So, I include comparison to master. The memory consumption is 2x less, but the performance is about 5.7% slower.

Method Toolchain Mean Error StdDev Median Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
UnescapeRandomStrings master 296.4 ms 5.919 ms 14.180 ms 301.6 ms 1.00 0.00 44000.0000 - - 178.52 MB
UnescapeRandomStrings This PR 313.5 ms 3.465 ms 3.072 ms 314.3 ms 1.12 0.09 15000.0000 - - 63.82 MB

@stephentoub
Copy link
Member

I agree we cannot really compare them with the previous change because those were buggy. So, I include comparison to master. The memory consumption is 2x less, but the performance is about 5.7% slower.

Ok, thanks for following up on it. I will take another look on Monday.

- Explicit pooled array cleaning is removed
@alnikola
Copy link
Author

alnikola commented Oct 18, 2019

After I replaced indexer with Append method and remove the explicit pooled array cleaning as unnecessary (Append guarantees that data is always written before it can be read), the performance improved and got slightly better than in my master version. (Note. My master doesn't include the latest @stephentoub changes, I will merge with them in the next commit)

Method Toolchain Mean Error StdDev Ratio RatioSD Gen 0 Gen 1 Gen 2 Allocated
UnescapeRandomStrings master 271.5 ms 5.389 ms 12.163 ms 1.00 0.00 44000.0000 - - 178.52 MB
UnescapeRandomStrings This PR 262.4 ms 1.584 ms 1.482 ms 0.98 0.07 15000.0000 - - 63.82 MB

@alnikola
Copy link
Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@stephentoub stephentoub dismissed their stale review October 23, 2019 17:08

Feedback addressed

@alnikola
Copy link
Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@alnikola
Copy link
Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@alnikola
Copy link
Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 4 pipeline(s).

@stephentoub stephentoub merged commit dacc255 into dotnet:master Oct 24, 2019
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* UriExt.UnescapeDataString avoids a new string allocation for an unmodified unescaped string

* Original dest array reuse fixed

* ValueStringBuilder is used instead of PooledCharArray

* Remove unused type

* Changes to ValueStringBuilder reverted

* IsStringSame replaced with SequenceEqual

* Array returned from buffer is cleared

* - Indexer-based assignment is replaced with Append method
- Explicit pooled array cleaning is removed

* .sln file changes reverted

* Build fixed

* UnescapeString always copies the result to dest array

* Unused code removed

* Original string reuse fixed

* Heap validation fixed

* - Small optimizations
- Code style fixes

* By-char append is replaced with span-based one

* SequenceEqual is invoked against the original string

* Revert "By-char append is replaced with span-based one"

This reverts commit dotnet/corefx@e5919e3.


Commit migrated from dotnet/corefx@dacc255
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve Uri.UnescapeDataString codepath -> UriHelper.UnescapeString

6 participants