Conversation
|
cc @vancem |
|
@jkotas @stephentoub I managed to workaround the issue I was having. Here are the performance results I got: https://gist.github.com/jamesqo/5ac880d3e46d719c99ec388128ac50a6 Test code: https://github.com/jamesqo/Dotnet/blob/57417b93ed0290444b461b8e7dea2082b441b149/Program.cs#L49 They are not universal, but there is generally about a 10% improvement. I think all of the allocations involved may be dominating the results, and that's why the results are not always consistent. |
|
Won't this introduce a behavior change? Consider an enumerator that reads lines from a file or from the network that may throw from Or a contrived example: public static IEnumerable<int> GetItems()
{
yield return 1;
yield return 2;
yield return 3;
yield return 4;
throw new Exception();
}Before this change, each item successfully enumerated before After this change, those initial items will be inserted in the array, but then |
|
@jkotas Sorry, can you please review when you have time? Thanks. |
|
@jamesqo I am leaving tweaks of convenience methods like this one to area maintainers to deal with. |
| { | ||
| // Make a final update to _size and _version after we've finished adding. | ||
| _size = size; | ||
| _version++; |
There was a problem hiding this comment.
These attempted optimizations around _size, _items, _version, etc. break code like:
var list = new List<int>();
list.AddRange(Enumerable.Range(0, 5).Select(i => { list.Add(i); return i; }));There was a problem hiding this comment.
@stephentoub Interesting. I don't think it's going to be very common to modify the list in a selector, but seeing it in an invalid state/having items overwritten is pushing it.
I can remove the optimizations around _size, and write to _version before we do any copying. AFAICS caching _items is harmless though, because the field is always up-to-date, it's just the local variable we're updating.
There was a problem hiding this comment.
AFAICS caching _items is harmless though
How so? If the iterator adds an element that causes the list to grow, or otherwise increases its capacity, _items will end up pointing to a different array than items points to, and AddRange will start storing the values into a garbage array.
There was a problem hiding this comment.
@stephentoub EnsureCapacity always updates _items, and we always update the local variable via items = _items after that. Unless an exception of some sort is thrown between those lines, in which case we stop copying anyways, the two are always equal.
There was a problem hiding this comment.
EnsureCapacity always updates _items, and we always update the local variable via items = _items after that
But those lines aren't relevant to the case I'm talking about. If the call to MoveNext increases the capacity of the list, items will now be different from _items, we won't enter that if block (as we will already have enough capacity since it was just resized), and we'll store the items into the wrong array.
There was a problem hiding this comment.
@stephentoub Add is not inlined. That is why the numbers show improvements.
There was a problem hiding this comment.
cc @AndyAyersMS
Manual inlining is double edged sword. You can get a small improvement in a lot of places by manually inlining and always show there is a performance improvement in microbenchmark. However if you have done manual inlining enough times, the real world workloads would likely get slower because of the code size increase from manual inlining would overwhelm the benefits.
We may want to be more data driven for manual inlining - only do it if there is real-world workload where it makes a dent, and leave profile guided optimizations to take care of the rest.
There was a problem hiding this comment.
However if you have done manual inlining enough times, the real world workloads would likely get slower because of the code size increase from manual inlining would overwhelm the benefits.
I think it makes sense in this case. This in a loop and it's part of the bottleneck. It's also frequently used (directly, or indirectly by Linq).
We may want to be more data driven for manual inlining - only do it if there is real-world workload where it makes a dent,
It just came up in corefx in a frequently-used method. See the PR I linked to: dotnet/corefx#13942 (comment)
There was a problem hiding this comment.
Ok, looks reasonable in this case.
BTW: I have checked traces from full .NET Framework, collected on mix of real world workloads (ASP.NET, WFP, WinForms, ...), for the List constructor that you are changing:
- ~8500 out of 11200 samples is called on empty ICollection
- ~2600 out of 11200 samples is called on non-empty ICollection
- ~100 out of 11200 samples is the rest (IEnumerable that is not ICollection), the average length is 9
This data suggests that the most impactful part of this change on real world workloads is not what you think it is. The path that you are changing is not executed that frequently. However, since you are moving the enumerator out into separate method, the prolog/epilog of the frequently hit path gets tighter and it is where the most improvement for the real world workloads will likely come from.
There was a problem hiding this comment.
@jamesqo Did you happen look into why Add is not getting inlined? It would be worth understanding that since at least from source it looks small enough that it should have been considered.
Also would be nice to see before/after disassembly in case the jit is just missing out on some obvious optimization.
|
Test OSX x64 Checked Build and Test |
|
|
||
| using (IEnumerator<T> en = enumerable.GetEnumerator()) | ||
| { | ||
| _version++; // Even if the enumerable has no items, we can update _version. |
There was a problem hiding this comment.
InsertRange already increments _version. Can we remove the increment here?
The other caller of this method is the .ctor, which shouldn't need the _version increment.
There was a problem hiding this comment.
@justinvp It is just in case if this is called by another method later on, or e.g. the increment to _version is accidentally removed in InsertRange. It is just one write anyway, so I don't think it should cost much relative to all the other stuff going on in the method.
|
@ianhays, @Priya91, @stephentoub Sorry, do you think you could merge these changes when you have time? Thanks. edit: Also, @vancem |
Better bulk adds for List. Commit migrated from dotnet/coreclr@78d2ea0
…clr#20471) Commit migrated from dotnet/coreclr@4b98caf
This PR optimizes the
List(IEnumerable<T>).ctor andAddRangemethod for lazy enumerables.For the constructor, we were previously looping through the sequence and calling
Addon each item. This is not very efficient sinceAddis not inlined. Instead, we can have a tighter loop where we essentially writeAddinline and only update fields when we run out of space.For
InsertRange, we check if we are adding at the end of the list first. This is a common case, sinceAddRangejust forwards toInsertRange. If we are, then we callAddEnumerableinstead ofInsertin a loop.I ran tests on corefx's System.Collections and everything passes.
Note: I can't test the perf impact these changes have at the moment because of #8266. If/when I am able to, however, I will update this PR.
cc @stephentoub @jkotas
Related: dotnet/corefx#13942 (comment)