Address misc feedback and issues from recent perf changes#41414
Address misc feedback and issues from recent perf changes#41414steveharter merged 4 commits intodotnet:masterfrom
Conversation
src/System.Text.Json/src/System/Text/Json/Serialization/WriteStackFrame.cs
Outdated
Show resolved
Hide resolved
| { | ||
| int len = JsonClassInfo.PropertyCacheArray.Length; | ||
| if (PropertyEnumeratorIndex < len) | ||
| if (PropertyEnumeratorIndex > len) |
There was a problem hiding this comment.
I am not fully understanding this logic. Can you explain what we are trying to do here?
There was a problem hiding this comment.
The extension property is always the last property in PropertyCacheArray so this is checking to see if the extension property is writing or whether it is finished.
The extension property which is different than other properties because although it is a dictionary property we want to serialize each element in the dictionary as a property of the POCO instead of a dictionary element thus we need some extra state to track whether it is being serialized.
src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs
Outdated
Show resolved
Hide resolved
src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs
Outdated
Show resolved
Hide resolved
src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs
Show resolved
Hide resolved
| if (maxBytes > options.DefaultBufferSize) | ||
| { | ||
| // Get the actual byte count in order to handle large input. | ||
| maxBytes = JsonReaderHelper.GetUtf8ByteCount(json.AsSpan()); |
There was a problem hiding this comment.
nit: Given this throws for certain large string lengths (when the return count won't fit int.MaxValue), should this be documented as an exception on the method?
There was a problem hiding this comment.
If so we should also document every other Deserialize method as well. Perhaps create a doc issue?
There was a problem hiding this comment.
Do the other deserialize methods have this transcoding-based exception though? If so, then yes, we should. Will create an issue on https://github.com/dotnet/dotnet-api-docs.
...ystem.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.HandlePropertyName.cs
Show resolved
Hide resolved
src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs
Outdated
Show resolved
Hide resolved
src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs
Outdated
Show resolved
Hide resolved
src/System.Text.Json/src/System/Text/Json/Serialization/JsonPropertyInfo.cs
Show resolved
Hide resolved
src/System.Text.Json/src/System/Text/Json/Serialization/JsonSerializer.Read.String.cs
Outdated
Show resolved
Hide resolved
| // higher than the threshold which is options.DefaultBufferSize. | ||
| Span<byte> utf8 = json.Length <= (ArrayPoolMaxSizeBeforeUsingNormalAlloc / JsonConstants.MaxExpansionFactorWhileTranscoding) ? | ||
| tempArray = ArrayPool<byte>.Shared.Rent(json.Length * JsonConstants.MaxExpansionFactorWhileTranscoding) : | ||
| new byte[JsonReaderHelper.GetUtf8ByteCount(json.AsSpan())]; |
There was a problem hiding this comment.
Why not just always use Rent? If it's too large for the pool, it'll just allocate it itself.
There was a problem hiding this comment.
From the previous commit (which removed the comment):
https://github.com/dotnet/corefx/pull/41414/files/a8cd02e62d6bd2d4686ff86124fdc31e233ca42d..218e1c0d306dde4441927fdcecacd071295a145c
// and because we can avoid calling Clear().
And from above:
// For performance, avoid obtaining the actual byte count unless the memory usage may be
// higher than the threshold
There was a problem hiding this comment.
avoid obtaining the actual byte count unless the memory usage may be higher than the threshold
Computing an accurate byte count doesn't prevent you from still using ArrayPool.
and because we can avoid calling Clear()
The benefit analysis of clearing all of these arrays is still not obvious to me.
There was a problem hiding this comment.
Computing an accurate byte count doesn't prevent you from still using ArrayPool.
I believe the intention here is to avoid doing the extra work to get an accurate byte count when we are using the arraypool anyway and save that cost (asking for 4k from the pool vs 12k isn't that much different, so asking for the worst case is good enough). However, when allocating a regular array, over-asking by up to 3x is probably too expensive, so asking for the exact amount needed by incurring the cost of getting the exact count is worth the trade-off.
With your suggestion, we'd always have to always get the accurate byte count, regardless of whether the pool was allocating itself (because it got exhausted/we need over 1 MB), or not. Maybe that approach is good enough but there are certainly some trade-offs here. @steveharter, what are your thoughts on this?
There was a problem hiding this comment.
Yes there are two things going on here:
- Avoid asking for a transcoded byte count when a threshold is surpassed
- The threshold selected happens to be the same as ArrayPool's current implementation where it just does a normal alloc, so a call to Clear() can be avoided
Given that, yes the call to Clear() is insignificant so we could just use ArrayPool always.
|
It is hard to review just the latest changes. Can you avoid force pushing and keep the individual commits going forward (otherwise, I end up re-reviewing the exact same lines of code)? |
src/System.Text.Json/src/System/Text/Json/Serialization/WriteStackFrame.cs
Show resolved
Hide resolved
ahsonkhan
left a comment
There was a problem hiding this comment.
Other than the threshold typo in tests and some test nits/questions, looks good.
|
Test failures not related: |
Address late feedback from previous perf PR #41098 and other misc perf-related deserialization changes.
Results in deserialization perf increase of ~1% - ~2% for simple object.