Optimize percent-encoded UTF8 processing in Uri#32552
Optimize percent-encoded UTF8 processing in Uri#32552MihaZupan merged 17 commits intodotnet:masterfrom
Conversation
src/libraries/System.Private.Uri/src/System/PercentEncodingHelper.cs
Outdated
Show resolved
Hide resolved
| "http://con.tosoco.ntosoc.com/abcdefghi/jk" + "%C8%F3%B7%A2%B7%BF%B2%FA", | ||
| resultUri.ToString()); | ||
| resultUri.ToString(), | ||
| StringComparer.OrdinalIgnoreCase); |
There was a problem hiding this comment.
This is an example of where the output would be different - the casing from the input would be preserved instead of upper-casing the hex chars
src/libraries/System.Private.Uri/tests/UnitTests/IriEscapeUnescapeTest.cs
Show resolved
Hide resolved
| { | ||
| value = (value | 0x20) - 'a' + 10; | ||
| } | ||
| else if ((value - '8') <= ('9' - '8')) |
There was a problem hiding this comment.
You could use your helpers in the other file for this.
There was a problem hiding this comment.
I manually inlined it here, as we are only interested in non-ascii ones
| { | ||
| value = ((value << 4) + second) - '0'; | ||
| } | ||
| else if ((uint)((second - 'A') & ~0x20) <= ('F' - 'A')) |
There was a problem hiding this comment.
And other places in here.
src/libraries/System.Private.Uri/src/System/PercentEncodingHelper.cs
Outdated
Show resolved
Hide resolved
PR generally looks good, thanks! :) Waiting for the re-introduction of the removed tests in the meantime. |
|
I should note that the current behavior of upper-casing hex is only done for non-ascii characters. If we only have Ascii, the input-casing is preserved, so the behavior is the same for Ascii and non-ascii after this change. |
| dest.Append(pInput[i++]); | ||
| dest.Append(pInput[i++]); | ||
| dest.Append(pInput[i]); |
There was a problem hiding this comment.
Each Append will perform a bounds check. Can you use the Append(char *, int) overload here?
There was a problem hiding this comment.
That overload will do a Span slice as well. I was thinking of adding Append(char, char) and Append(char, char, char) overloads as they can have a measurable perf impact (that would be part of a separate PR adressing #22903).
There was a problem hiding this comment.
Append(char, char) and Append(char, char, char) seem quite awkward API choices, IMHO. Maybe make the Append(char *, int) overload not create a Span slice?
src/libraries/System.Private.Uri/tests/FunctionalTests/PercentEncodingHelperTests.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.Uri/tests/FunctionalTests/PercentEncodingHelperTests.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.Uri/tests/FunctionalTests/PercentEncodingHelperTests.cs
Outdated
Show resolved
Hide resolved
|
@stephentoub @GrabYourPitchforks @dotnet/ncl |
src/libraries/System.Private.Uri/src/System/PercentEncodingHelper.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.Uri/src/System/PercentEncodingHelper.cs
Outdated
Show resolved
Hide resolved
|
|
||
| if (Rune.DecodeFromUtf8(fourByteSpan.Slice(0, bytesLeftInBuffer), out Rune rune, out bytesConsumed) == OperationStatus.Done) | ||
| { | ||
| Debug.Assert(bytesConsumed >= 2); |
There was a problem hiding this comment.
It is theoretically possible to violate this assertion. I'm having trouble following the logic of this method so I don't know if an assertion violation will end up possibly buffer overrunning or otherwise having a negative effect.
Entry to method:
char* input = "%FA%FB%00"
fourByteBuffer = 0
bytesLeftInBuffer = 0
totalCharsConsumed = 0
charsToCopy = 0
bytesConsumed = 0
<after a few rounds of ReadByteFromInput>
fourByteBuffer = 0xFBFA0000 (buffer = [ 00 00 FA FB ])
bytesLeftInBuffer = 2
<go to NoMoreOrInvalidInput>
fourByteBuffer = 0x0000FBFA (buffer = [ FA FB 00 00 ])
bytesLeftInBuffer = 2
<go to DecodeRune>
DecodeFromUtf8 = InvalidData
bytesConsumed = 1
charsToCopy = 3
<go to AfterDecodeRune>
bytesLeftInBuffer = 1
totalCharsConsumed = 3
## meanwhile, another thread changes 'input' to read "%FA%FB%FC" ##
<go to RefillBuffer>
i = 3 + (1 * 3) = 6
<go to ReadByteFromInput>
fourByteBuffer = 0xFC0000FB (buffer = [ FB 00 00 FC ])
bytesLeftInBuffer = 2
<go to NoMoreOrInvalidInput>
## recall: bytesConsumed is still 1 from earlier DecodeRune call
fourByteBuffer = 0x00FC0000 (buffer = [ 00 00 FC 00 ])
bytesLeftInBuffer = 2
<go to DecodeRune>
DecodeFromUtf8 = Done
bytesConsumed = 1
Debug.Assert(bytesConsumed >= 2); // assertion would fail but not present in release branch
dest.Append(input + 3 - 3, 3); // copy 3 chars
charsToCopy = 0
charsToCopy = 3
<go to AfterDecodeRune>
...There was a problem hiding this comment.
I believe there would be no negative side-effects in this case. While the assert would fail if present, we rely on Rune.DecodeFromUtf8 to always consume at least 1 byte in each iteration, so the loop will eventually exit. bytesConsumed being 1 instead of >= 2 will not impact memory accesses into the input.
I added a comment to this assert indicating the scenario under which it may fail.
In general, is the scenario of a string instance being modified after creation something that we should be/are considering? I would be surprised if there aren't other APIs across runtime that make the assumption of string immutability with worse side-effects.
|
The recent changes improve the throughput by another 10-20% depending on input.
|
|
@MihaZupan, what's the next step here? Are you waiting for reviews? Is it ready to merge? |
|
@GrabYourPitchforks, I assume no response means you're good with this now? Thanks. |
|
@MihaZupan, can you rebase this to resolve the conflicts? Thanks. |
70606d0 to
2a78dc8
Compare
|
Noticed this is the oldest Microsoft PR in the repo now. @GrabYourPitchforks any remaining feedback or is this ready to go? |
|
@scalablecory is your feedback addressed? Trying to get our 90th percentile PR age down.. |
|
Test failures are unrelated. Sent an email to @GrabYourPitchforks about finalizing the review here, otherwise I will be closing the PR until we can get someone to take a look at it. |
When unescaping percent-encoded non-ascii we currently:
This PR changes it into performing a single pass, writing to the ValueStringBuilder without temporary buffers.
Currently there is a behavioral change where before all hex characters would be upper-cased, now their input-casing is preserved. Keeping the old behavior is a trivial change with a bit of a perf penalty.
I should note that the current behavior of upper-casing hex is only done for non-ascii characters. If we only have Ascii, the input-casing is preserved, so the behavior is the same for Ascii and non-ascii after this change.
Perf goes up significantly whenever this unescaping path is hit
(The allocation win is hit whenever there is a single non-ascii char in the input)
Updated benchmarks: #32552 (comment)