Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Optimise HtmlEncode and HtmlDecode to produce less garbage#27250

Merged
ViktorHofer merged 2 commits intodotnet:masterfrom
benjamin-hodgson:HtmlEncode-optimisation
Jun 4, 2018
Merged

Optimise HtmlEncode and HtmlDecode to produce less garbage#27250
ViktorHofer merged 2 commits intodotnet:masterfrom
benjamin-hodgson:HtmlEncode-optimisation

Conversation

@benjamin-hodgson
Copy link

@benjamin-hodgson benjamin-hodgson commented Feb 19, 2018

This fixes https://github.com/dotnet/corefx/issues/13893. I'd noticed in a project of mine that the TextWriter overload of WebUtility.HtmlEncode was unexpectedly generating garbage.

I added a System.Buffers project reference to the System.Runtime.Extensions csproj and wrote a simple implementation of StringBuilder based on ArrayPool, as suggested by @jkotas in #13875.

I benchmarked this implementation and while it appears to produce no garbage (except for the string that it returns) it does seem to run around 10-20% slower. I'm keen to get some input from you guys on what might be going on here.

Benchmark code (using BenchmarkDotNet):

[MemoryDiagnoser]
public class Bench
{
    public static readonly string _toEncode = string.Concat(Enumerable.Repeat("<>\"'", 100));

    [Benchmark(Baseline = true)]
    public void String_Old()
    {
        WebUtility.HtmlEncode(_toEncode);
    }

    [Benchmark]
    public void String_New()
    {
        WebUtility.HtmlEncodeNew(_toEncode);
    }

    [Benchmark]
    public void TextWriter_Old()
    {
        using (var writer = TextWriter.Null)
        {
            WebUtility.HtmlEncode(_toEncode, writer);
        }
    }

    [Benchmark]
    public void TextWriter_New()
    {
        using (var writer = TextWriter.Null)
        {
            WebUtility.HtmlEncodeNew(_toEncode, writer);
        }
    }
}

And the results:

         Method |     Mean |     Error |    StdDev | Scaled | ScaledSD |  Gen 0 |  Gen 1 | Allocated |
--------------- |---------:|----------:|----------:|-------:|---------:|-------:|-------:|----------:|
     String_Old | 5.394 us | 0.0987 us | 0.0924 us |   1.00 |     0.00 | 2.0828 | 0.0229 |   13120 B |
     String_New | 6.571 us | 0.1101 us | 0.1030 us |   1.22 |     0.03 | 0.7629 |      - |    4832 B |
 TextWriter_Old | 5.431 us | 0.1078 us | 0.1153 us |   1.01 |     0.03 | 2.0828 | 0.0229 |   13120 B |
 TextWriter_New | 6.080 us | 0.1041 us | 0.0974 us |   1.13 |     0.03 |      - |      - |       0 B |

HtmlEncode(value, index, sb);
return StringBuilderCache.GetStringAndRelease(sb);
var sb = new PooledStringBuilder(value.Length);
try
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not need try/catch. It is ok to turn the pooled into garbage on exceptions.


#endregion

private struct PooledStringBuilder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a helper class like this here: https://github.com/dotnet/corefx/blob/master/src/Common/src/CoreLib/System/Text/ValueStringBuilder.cs that has a few more optimizations. Try using that instead.

Copy link
Author

@benjamin-hodgson benjamin-hodgson Feb 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like ValueStringBuilder is backed by a Span - what's the best way of writing a Span<char> to a TextWriter? I don't want to just loop over the span and call Write because that'll be slow, and I don't want to copy the span into an array because that'll generate garbage.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

void Write(System.ReadOnlySpan<char> buffer) method.

return value;
}

StringBuilder sb = StringBuilderCache.Acquire(value.Length);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the project still using StringBuilderCache after this change? If not, it can be removed from the .csproj. If yes, should those places be changed to ValueStringBuilder as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used elsewhere (eg). I'd be happy to fix those other usages too, in a separate PR, if you like.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi! Other usages have been removed in #30072

{
output.Write(value);
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How valuable is this check? In cases where we do have something to encode, we end up paying for the extra checking.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied this from the existing string overload of HtmlEncode above. My gut tells me that the case when value doesn't have any encoding characters is quite common, probably more common than when it does. Plus index is used to copy the prefix of the string quickly, see line 106

@davidfowl
Copy link
Member

}
}

public void WriteTo(TextWriter writer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is very specialized. It should rather be .AsReadOnlySpan() property. It will handle all future uses that need to get the string as Span.

Copy link
Author

@benjamin-hodgson benjamin-hodgson Feb 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? That sounds like it could be a bit risky: when the underlying array gets resized the old one gets returned to the pool but the ReadOnlySpan will still have a reference to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is internal type, not a public API. There are many ways you can use it in a wrong way. For example, you can accidentally pass it by value and create a second copy of it to introduce the same problem as if you use AsReadOnlySpan incorrectly by accident.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benjamin-hodgson
Copy link
Author

@davidfowl Thanks, I didn't know about HtmlEncoder.

@benjamin-hodgson benjamin-hodgson changed the title Optimise HtmlEncode to produce less garbage Optimise HtmlEncode and HtmlDecode to produce less garbage Feb 19, 2018

[MethodImpl(MethodImplOptions.AggressiveInlining)]
private void Clear()
public void Clear()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: It may be better to call it Dispose now that it is public.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume going further and adding IDisposable would be too heavyweight for this -- especially since "leaking" is not a big deal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Also, interfaces do not really make sense for ref types - you cannot cast the ref type to the interface.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A tangent: I haven't tried this, but I imagine a ref struct with an interface might have some limited use in the context of constrained generics. That is, if you had

ref struct Foo : IBar {}
void Baz<T>(T baz) where T : IBar

you should be able to write

Baz(new Foo());

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although: I guess you're probably not allowed to use a ref struct as a type argument, because you don't know whether the method will use the T in a way that you can't use a ref struct (eg casting it to object).

<ProjectReference Include="..\..\System.Diagnostics.Debug\src\System.Diagnostics.Debug.csproj" />
<ProjectReference Include="..\..\System.Security.Principal\src\System.Security.Principal.csproj" />
<ProjectReference Include="..\..\System.Runtime\src\System.Runtime.csproj" />
<ProjectReference Include="..\..\System.Buffers\src\System.Buffers.csproj" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you actually need to add this reference? Span is in System.Runtime for netcoreapp - you do not need to reference System.Buffers to get it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good catch, thanks! I needed in an earlier commit, because I was using ArrayPool directly, but that's been removed now.

_chars = initialBuffer;
_pos = 0;
}
public ValueStringBuilder(int initialCapacity)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add some test(s) to ValueStringBuilderTests for your new functionality?


if (string.IsNullOrEmpty(value))
{
output.Write(value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will presumably throw if value is null. But, I see previously it would have also thrown doing value.Length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems we should probably throw ANE if value is null.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot, will do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now throwing an ArgumentNullException, but I'm concerned that this is technically a breaking change (see my comment). What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are generally OK with changing NRE->ANE.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see you changed back. That's fine too.

return;
}

if (!StringRequiresHtmlDecoding(value))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code was not previously doing this check. If we have to do it, it might be nice for StringRequiresHtmlDecoding to return the index of the first character to decode, so HtmlDecode can copy up to that point verbatim, the same optimization happening with IndexOfHtmlEncodingChars.
Without that, I am not sure whether this will make it slower or not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was doing this check before, in the HtmlDecode(string value) method (which was being called by this one).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right. Nevertheless it seems that HtmlDecode could have the same ability to start at an index returned from StringRequiresHtmlDecoding. Future change maybe.

Copy link
Member

@danmoseley danmoseley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted. Thanks!

vsb.Append(s);
}

var resultSpan = vsb.AsReadOnlySpan();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We renamed AsReadOnlySpan to AsSpan.

Also, nit: we don't use resultSpan anywhere. Change to:
var resultString = new string(vsb.AsSpan());

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

_chars = initialBuffer;
_pos = 0;
}
public ValueStringBuilder(int initialCapacity)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add new line between the ctors

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks!

Copy link

@ahsonkhan ahsonkhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, lgtm


using System.Buffers;
using System.Diagnostics;
using System.IO;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This using should not be needed anymore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks

sb.Dispose();
}

[SuppressMessage("Microsoft.Usage", "CA1806:DoNotIgnoreMethodResults", MessageId = "System.UInt16.TryParse(System.String,System.Globalization.NumberStyles,System.IFormatProvider,System.UInt16@)", Justification = "UInt16.TryParse guarantees that result is zero if the parse fails.")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you are optimizing allocations, you can also replace uint.TryParse(value.Substring below with uint.TryParse(value.AsSpan below; and delete this SuppressMessage - it does not seem to be relevant anymore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, good spot, thanks!

@benjamin-hodgson
Copy link
Author

benjamin-hodgson commented Mar 7, 2018

Rebased. Apologies for the delay in getting this back up to speed.

The tests were failing because I changed the code to throw ArgumentNullException when the input string was null, per @danmosemsft's suggestion. Previously HtmlEncode/HtmlDecode would return null (or write nothing to the TextWriter, depending on which overload you called) if the input was null. For now I've updated the tests to match the new behaviour, but this is technically a breaking change; it's likely that client code is relying on being able to pass null to that method. Do you think I should keep the new behaviour or put it back to how it was?

using System.IO;
using System.Runtime.CompilerServices;
using System.Text;
using System.Buffers;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: sort in alphabetical order

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this import isn't used any more actually. Good catch, thanks!

@ahsonkhan
Copy link

@benjamin-hodgson, the System.Web.Tests.HttpUtilityTest/HtmlEncode_TextWriter(decoded: null, encoded: "") test is failing in CI

https://mc.dot.net/#/user/benjamin-hodgson/pr~2Fjenkins~2Fdotnet~2Fcorefx~2Fmaster~2F/test~2Ffunctional~2Fcli~2F/127d2ab60cced49647237cd9c6774f5102fb1095/workItem/System.Web.HttpUtility.Tests/analysis/xunit/System.Web.Tests.HttpUtilityTest~2FHtmlEncode_TextWriter(decoded:%20null,%20encoded:%20%5C%22%5C%22)

Message :
System.ArgumentNullException : Value cannot be null.
Parameter name: value
Stack Trace :
   at System.Net.WebUtility.HtmlEncode(String value) in D:\j\workspace\windows-TGrou---74aa877a\src\System.Runtime.Extensions\src\System\Net\WebUtility.cs:line 42
   at System.Web.Util.HttpEncoder.HtmlEncode(String value, TextWriter output) in D:\j\workspace\windows-TGrou---74aa877a\src\System.Web.HttpUtility\src\System\Web\Util\HttpEncoder.cs:line 139
   at System.Web.HttpUtility.HtmlEncode(String s, TextWriter output) in D:\j\workspace\windows-TGrou---74aa877a\src\System.Web.HttpUtility\src\System\Web\HttpUtility.cs:line 140
   at System.Web.Tests.HttpUtilityTest.HtmlEncode_TextWriter(String decoded, String encoded) in D:\j\workspace\windows-TGrou---74aa877a\src\System.Web.HttpUtility\tests\HttpUtility\HttpUtilityTest.cs:line 273

@benjamin-hodgson
Copy link
Author

@ahsonkhan Yep I know. I think I'm going to undo the ArgumentNullException change because it's a breaking change (see my earlier comment). I haven't had the time yet though, will try and get to it over the weekend.

@ViktorHofer
Copy link
Member

I'm looking into further optimizations in the WebUtility class. Please stand by.

@benjamin-hodgson
Copy link
Author

@ViktorHofer Ah thanks very much for looking into that. Those results look really good - is the improvement due to https://github.com/dotnet/coreclr/issues/18005 ? If so should #29921 be closed now?

Copy link
Member

@danmoseley danmoseley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe open an issue for htmldecode to be passed the index found by StringRequiresHtmlDecodin, as an optimization

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 3, 2018

The latest commits remove another string allocation and also unnecessary unsafe code where boundary checks are eliminated by the CLR. I benchmarked with my commits and without them and they improve throughput by about 10%.

I noticed that the test string wasn't really very expressive and changed it to something which hits all the different paths in the implementation: Hello! '\"<&>\u2665\u00E7 World \u0021\u0023\u003D\u003F <>\"\\&. Unfortunately with this test string I see the big regression @benjamin-hodgson talked about. See below...

Before

Method Mean Error StdDev Gen 0 Allocated
EncodeString 49.16 ms 0.9494 ms 1.1659 ms 19062.5000 38.15 MB
EncodeStringAndTextWriter 53.59 ms 0.3883 ms 0.3442 ms 19062.5000 38.15 MB

After

Method Mean Error StdDev Gen 0 Allocated
EncodeString 75.99 ms 0.3655 ms 0.3052 ms 16000.0000 33600000 B
EncodeStringAndTextWriter 97.22 ms 1.0154 ms 0.8479 ms - 0 B

Copy link
Member

@ViktorHofer ViktorHofer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big regression (see results)

output.Append("&#");

// Use the buffer directly and reserve a conservative estimate of 10 chars.
Span<char> s = output.AppendSpan(10);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: avoid single character variable names.

{
parsedSuccessfully = uint.TryParse(value.AsSpan(entityOffset + 1, entityLength - 1), NumberStyles.Integer, CultureInfo.InvariantCulture, out parsedValue);
}
bool parsedSuccessfully = value[entityOffset + 1] == 'x' || value[entityOffset + 1] == 'X'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is difficult to parse and read. Can we cache some of these in a local? Like value[entityOffset + 1]?

var sb = new ValueStringBuilder(value.Length);

// Prepend non encoding part
ReadOnlySpan<char> prefixSpan = value.AsSpan(0, index);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think it would look better without an extra local:

// Prepend non encoding part
sb.Append(value.AsSpan(0, index));

}

private static unsafe int IndexOfHtmlEncodingChars(string s, int startPos)
private static int IndexOfHtmlEncodingChars(string input)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ReadOnlySpan<char> here instead of string? Especially since we already call value.AsSpan multiple times at the call site.

@jkotas
Copy link
Member

jkotas commented Jun 3, 2018

big regression

What do the numbers look like for Decoding?

ValueStringBuilder that is created for encoding is guaranteed to have insufficient size (modulo bucketing done by ArrayPool) because of encoding always grows. Try changing the line that creates the ValueStringBuilder to var sb = new ValueStringBuilder(value.Length + 200) and see whether it makes any difference.

@ViktorHofer
Copy link
Member

ValueStringBuilder that is created for encoding is guaranteed to have insufficient size (modulo bucketing done by ArrayPool) because of encoding always grows. Try changing the line that creates the ValueStringBuilder to var sb = new ValueStringBuilder(value.Length + 200) and see whether it makes any difference.

15ms average improvement! Nice!

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 3, 2018

With @jkotas suggestion of increasing the buffer size (15ms) and switching to stackalloc for small inputs (10ms) I was able to make it faster than it currently is in master.

Before (master)

Method Mean Error StdDev Gen 0 Allocated
EncodeString 48.99 ms 0.9706 ms 1.1178 ms 19062.5000 38.15 MB
EncodeStringAndTextWriter 50.45 ms 0.9785 ms 1.0469 ms 19062.5000 38.15 MB
DecodeString 81.27 ms 0.4768 ms 0.4227 ms 5812.5000 11.75 MB
DecodeStringAndTextWriter 83.12 ms 1.1897 ms 1.1129 ms 5812.5000 11.75 MB

After (update: latest commit now)

Method Mean Error StdDev Gen 0 Allocated
EncodeString 45.90 ms 0.1646 ms 0.1374 ms 16000.0000 33600000 B
EncodeStringAndTextWriter 51.24 ms 0.6761 ms 0.6324 ms - 0 B
DecodeString 74.11 ms 0.4565 ms 0.4270 ms 5812.5000 12320000 B
DecodeStringAndTextWriter 81.72 ms 1.0488 ms 0.9811 ms - 0 B

HtmlDecode(value, sb);
return StringBuilderCache.GetStringAndRelease(sb);
char[] rawBuffer = null;
// We expect the encoded pattern to be at least twice as large as the decoded one.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is wrong assumption. I think that there are going to be just a very few escaped characters in the typical case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, I will trim the check down to 256. do you think the 80 for the encode check is conservative enough or should we maybe even increase it to 100 / 128?

HtmlDecode(value, sb);
return StringBuilderCache.GetStringAndRelease(sb);
char[] rawBuffer = null;
Span<char> buffer = value.Length < 256 ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be value.Length <= 256

Copy link
Member

@ViktorHofer ViktorHofer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's not much point in approving my own changes but here I go :octocat:

@ViktorHofer
Copy link
Member

ViktorHofer commented Jun 3, 2018

Should we remove the ValueStringBuilder ctor overload with its tests now that we are not using it anymore or keep it as it might bring value for future scenarios?

@jkotas
Copy link
Member

jkotas commented Jun 4, 2018

Should we remove the ValueStringBuilder ctor overload with its tests now that we are not using it anymore

I think so.

@ViktorHofer
Copy link
Member

Thanks @benjamin-hodgson for your work, I'm done my changes here now. Both throughput and allocation figures are now better than on master therefore I believe this PR is now in a mergeable state. Thanks @jkotas and others for hints.

@stephentoub it would be great if you could glance over the changes before we merge this. thanks!

char[] rawBuffer = null;
Span<char> buffer = value.Length < 80 ?
stackalloc char[256] :
(rawBuffer = ArrayPool<char>.Shared.Rent(value.Length + 200));
Copy link
Member

@stephentoub stephentoub Jun 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment on how these constants were chosen would be helpful. In particular, I don't understand the relationship between 80 <-> 256 and value.Length <-> value.Length + 200. Do we have any corpus of inputs we can use to determine good estimates here, e.g. that on average 1% of characters need to be encoded, or something similar, and then do the math on value.Length?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added comments to explain why we chose these values. The fixed 200 additional characters could/should be transformed into a relative value but for that we first need more information on the distribution of input strings for HtmlEncode & HtmlDecode. See offline thread with Lee.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to merge this in its current state as getting data might take some time. We can refine the buffer sizes afterwards.

if (rawBuffer != null)
{
ArrayPool<char>.Shared.Return(rawBuffer);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing it this way rather than the previous approach of passing the initial capacity into VSB and having it do the ArrayPool access? With this approach, we're going to hold onto this ArrayPool buffer no matter how many times the VSB grows, whereas if the VSB did it, it would return the original buffer to the pool when it first grew out of it.

Span<char> buffer = value.Length < 80 ?
stackalloc char[256] :
(rawBuffer = ArrayPool<char>.Shared.Rent(value.Length + 200));
var sb = new ValueStringBuilder(buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments/questions as above.

// Use the buffer directly and reserve a conservative estimate of 10 chars.
Span<char> encodingBuffer = output.AppendSpan(10);
valueToEncode.TryFormat(encodingBuffer, out int charsWritten, provider: CultureInfo.InvariantCulture);
output.Length -= (10 - charsWritten);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be helpful to put the 10 that's used multiple times into a const that describes how it was picked, e.g. const int MaxInt32Digits = 10 or something like that. I assume valueToEncode can't be negative at this point? If it could, max digits would need to be 11.


// Use the buffer directly and reserve a conservative estimate of 10 chars.
Span<char> encodingBuffer = output.AppendSpan(10);
valueToEncode.TryFormat(encodingBuffer, out int charsWritten, provider: CultureInfo.InvariantCulture);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be necessary to specify the culture here, and doing so will make it slower.

Benjamin Hodgson and others added 2 commits June 4, 2018 14:57
do the same optimisation for HtmlDecode

remove now-unneeded reference

Add tests for new ValueStringBuilder methods
Reduce allocation and remove unsafe code

Use bigger buffer size and use stackallocation for small inputs

Optimize decode code path and output writing
@ViktorHofer
Copy link
Member

@benjamin-hodgson I squashed yours commits together so that the PR is ready to be merged, not squashed.

@ViktorHofer ViktorHofer added the * NO SQUASH * The PR should not be squashed label Jun 4, 2018
@benjamin-hodgson
Copy link
Author

@ViktorHofer OK, no problem. Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

area-System.Net * NO SQUASH * The PR should not be squashed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize WebUtility.HtmlEncode and WebUtility.HtmlDecode