Reduce allocations in {Concat,SelectMany,Append,Prepend}.ToArray by close to 50%#14675
Reduce allocations in {Concat,SelectMany,Append,Prepend}.ToArray by close to 50%#14675VSadov merged 3 commits intodotnet:masterfrom
Conversation
There was a problem hiding this comment.
By using a for-loop we save 1 MoveNext call in Release, since we know exactly when the sequence will end. e.g. For 0-length enumerables we would normally call MoveNext once, but here we will skip the loop entirely.
There was a problem hiding this comment.
Since this requires that MoveNext() will return true exactly count times anyway, we could have a conventional foreach, leave count out of the call and ++arrayIndex rather than ++i and doing the addition.
Which of these micro-opts is the greater is hard to guess at from first principles rather than profiling though (and would depend on MoveNext() complexity).
Losing the ability to test count in debug would make some flaws harder to spot, though they should still be obvious enough to the tests.
There was a problem hiding this comment.
@JonHanna While it is true that count is only of debug value, I don't think think this function semantically makes sense without it. If you're saying it's safe to copy this enumerable to this array, then you would have to know the count of the enumerable (or at least an upper bound thereof) in advance. I doubt there will ever be a callsite for this function that could not provide a count.
There was a problem hiding this comment.
I have decided to remove this optimization and just go with the regular foreach, however. I don't think calling side-effecting code like MoveNext only in debug is a good idea, and who knows-- maybe some iterators/enumerators will not work the same if disposed too early.
There was a problem hiding this comment.
Self-note: This doesn't really need to be a mutable struct. Maybe it would be better to make it immutable. (edit 1 mo. later: done)
There was a problem hiding this comment.
Maybe profile that. The effect either way can be surprising sometimes (at least I find so).
There was a problem hiding this comment.
@JonHanna 👍 I doubt it will make much of a difference since this is an 8-byte struct that fits in an x64 register, and it's only being copied once per CopyTo call. Regardless, I will take your advice.
(edit 1 mo. later: I have had trouble building the repo recently and so am unable to run perf tests. I very much doubt making this struct immutable will have a noticeable impact because an extra copy is only made once per method call. Besides, the struct is only 8 bytes.)
There was a problem hiding this comment.
Just a bit of cleanup-- added a TryMove function that attempts to get this builder as a single array without copying.
There was a problem hiding this comment.
Some cleanup-- I stumbled across this snippet of code, which looks like it has the exact same semantics as the Copy function I've introduced.
There was a problem hiding this comment.
The general pattern is
var builder = new SparseArrayBuilder<T>(initialize: true);
...
if (CanPredetermineSizeOfSomeItems)
{
// Semantically equivalent to builder.AddRange(
// Enumerable.Repeat(default(T), PredeterminedSize)),
// except we don't add anything to the buffer.
builder.Reserve(PredeterminedSize);
}
...
// Creates an array with a bunch of zero-initialized
// gaps in it where `Reserve` has been called
T[] array = builder.ToArray();
// Get the list of locations where gaps were added.
ArrayBuilder<Marker> markers = builder.Markers;
for (int i = 0; i < markers.Count; i++)
{
// Each Marker tells us the index of the gap in
// the array, as well as how wide the gap is.
Marker marker = markers[i];
// Copy the corresponding collection/nodes/etc.
// to this gap
...
}This is not the best example because we don't have to use the markers (we know exactly where the gaps are here, at the start & at the end), but you can see the above code structure in Concat & SelectMany.
There was a problem hiding this comment.
I haven't looked into it extensively yet, but I'm conjecturing that some of the speed-wise regressions in the tests for small collections could be abated if we could switch to the : this()-style pattern here and in LargeArrayBuilder. The JIT does not handle large structs very well yet, and additionally will not be able to inline either of these methods. (edit: Made the changes)
|
It certainly seems soundly reasoned. I think passing the |
|
Test Innerloop Ubuntu14.04 Debug Build and Test |
I posted some benchmarks for |
|
@VSadov @OmarTawfik can you please review the change when you get a chance? |
|
LGTM. @VSadov? |
e59cae1 to
f838d32
Compare
e193592 to
c410a22
Compare
|
@VSadov PTAL when you have time-- I have tidied up this PR and posted perf results at #14675 (comment) |
|
@VSadov @OmarTawfik ping? |
|
LGTM |
Reduce allocations in {Concat,SelectMany,Append,Prepend}.ToArray by close to 50%
Commit migrated from dotnet/corefx@a75d133
Background: Some
ToArrayimplementations employ a double-buffering strategy if we don't know the exact length of the output array. First the items are copied into some sort of buffer, then they are transferred into an array. For example, in the following snippetIn
ToArray, we know the size of the lhs enumerable but not how many elements are in the rhs one. So we allocate a buffer with 100+ elements to hold everything, calculate the final size of the array, and re-copy those elements.Description: Instead of copying everything twice, we can buffer only the enumerables we can't get the count of. If we reach an
ICollectionor anIIListProviderand it has N items, we queue it to be copied later, and tell the builder to leave a gap of N items in the output array. Finally, whenToArrayis called, we revisit all of those queued collections and copy them directly to the output.Here is how the above example works now:
Performance results: An initial test is showing almost 50% decrease in GCs in
SelectMany(e => e).ToArray(), assumingeis always anICollection. results / testMethods this affects:
Concat().ToArray(),SelectMany().ToArray(),Append/Prepend().ToArray()I had to pull in the changes from #13942 because this affects
SelectMany, so the LOC in the diffs is exaggerated. Once/if that's merged, I'll rebase.cc @stephentoub @VSadov @JonHanna