This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Better bulk adds for List<T>. #8306

Merged
ianhays merged 6 commits into dotnet:master from jamesqo:better-bulk-add on Dec 5, 2016

Conversation


@jamesqo jamesqo commented Nov 25, 2016

This PR optimizes the List(IEnumerable<T>) .ctor and AddRange method for lazy enumerables.

  • For the constructor, we were previously looping through the sequence and calling Add on each item. This is not very efficient since Add is not inlined. Instead, we can have a tighter loop where we essentially write Add inline and only update fields when we run out of space.

  • For InsertRange, we check if we are adding at the end of the list first. This is a common case, since AddRange just forwards to InsertRange. If we are, then we call AddEnumerable instead of Insert in a loop.
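
The tighter loop for the non-ICollection path might look roughly like the following. This is a sketch of the approach described above, not the exact committed code; the field names (_items, _size, _version) and the EnsureCapacity helper follow List<T>'s internals as discussed in this thread.

```csharp
// Sketch: add from a lazy enumerable without paying for a (non-inlined)
// Add call per element. Fields are read into locals once, and _size is
// written back a single time after the loop finishes.
private void AddEnumerable(IEnumerable<T> enumerable)
{
    using (IEnumerator<T> en = enumerable.GetEnumerator())
    {
        _version++; // invalidate outstanding enumerators up front

        T[] items = _items;
        int size = _size;
        while (en.MoveNext())
        {
            if (size == items.Length)
            {
                // Out of space: grow the backing array, then refresh the local.
                EnsureCapacity(size + 1);
                items = _items;
            }
            items[size++] = en.Current;
        }

        // One field write after the loop instead of one per Add call.
        _size = size;
    }
}
```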

I ran tests on corefx's System.Collections and everything passes.

Note: I can't test the perf impact these changes have at the moment because of #8266. If/when I am able to, however, I will update this PR.

cc @stephentoub @jkotas

Related: dotnet/corefx#13942 (comment)


jamesqo commented Nov 25, 2016

cc @vancem


jamesqo commented Nov 25, 2016

@jkotas @stephentoub I managed to workaround the issue I was having. Here are the performance results I got: https://gist.github.com/jamesqo/5ac880d3e46d719c99ec388128ac50a6 Test code: https://github.com/jamesqo/Dotnet/blob/57417b93ed0290444b461b8e7dea2082b441b149/Program.cs#L49

The gains are not universal, but there is generally about a 10% improvement. I suspect the allocations involved are dominating the measurements, which is why the results are not always consistent.

@justinvp

Won't this introduce a behavior change?

Consider an enumerator that reads lines from a file or from the network that may throw from MoveNext().

Or a contrived example:

public static IEnumerable<int> GetItems()
{
    yield return 1;
    yield return 2;
    yield return 3;
    yield return 4;
    throw new Exception();
}

Before this change, each item successfully enumerated before MoveNext() throws will be added to the list and _size will reflect the number of items successfully added.

After this change, those initial items will be inserted in the array, but then MoveNext() throws, so _size won't be updated to reflect the items that were successfully inserted.
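
Assuming the contrived GetItems above, the observable difference would be something like this (a sketch following the reasoning in this comment, not verified output):

```csharp
var list = new List<int>();
try
{
    list.AddRange(GetItems());
}
catch (Exception)
{
    // Before the change: list.Count == 4 -- items 1..4 were committed
    // one Add at a time before MoveNext() threw.
    // After the change: list.Count == 0 -- the four values sit in the
    // backing array, but _size was never updated to include them.
}
```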


jamesqo commented Nov 26, 2016

@jkotas Sorry, can you please review when you have time? Thanks.


jkotas commented Nov 26, 2016

@jamesqo I am leaving tweaks of convenience methods like this one to area maintainers to deal with.

@Priya91 @ianhays PTAL

{
    // Make a final update to _size and _version after we've finished adding.
    _size = size;
    _version++;
@stephentoub (Member)

These attempted optimizations around _size, _items, _version, etc. break code like:

var list = new List<int>();
list.AddRange(Enumerable.Range(0, 5).Select(i => { list.Add(i); return i; }));


@jamesqo jamesqo Nov 27, 2016


@stephentoub Interesting. I don't think it's going to be very common to modify the list in a selector, but seeing it in an invalid state/having items overwritten is pushing it.

I can remove the optimizations around _size, and write to _version before we do any copying. AFAICS caching _items is harmless though, because the field is always up-to-date, it's just the local variable we're updating.


@stephentoub stephentoub Nov 27, 2016


AFAICS caching _items is harmless though

How so? If the iterator adds an element that causes the list to grow, or otherwise increases its capacity, _items will end up pointing to a different array than items points to, and AddRange will start storing the values into a garbage array.

@jamesqo (Author)

@stephentoub EnsureCapacity always updates _items, and we always update the local variable via items = _items after that. Unless an exception of some sort is thrown between those lines, in which case we stop copying anyways, the two are always equal.

@stephentoub (Member)

EnsureCapacity always updates _items, and we always update the local variable via items = _items after that

But those lines aren't relevant to the case I'm talking about. If the call to MoveNext increases the capacity of the list, items will now be different from _items, we won't enter that if block (as we will already have enough capacity since it was just resized), and we'll store the items into the wrong array.
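
The failure mode being described, as a sketch (loop shape per the discussion above; `en` is the enumerator, whose MoveNext may call back into the list):

```csharp
T[] items = _items;   // cached local
int size = _size;
while (en.MoveNext()) // MoveNext may call back into, say, list.Add(...)
{
    // If that callback grew the list, _items now points to a new, larger
    // array, but the local 'items' still references the old one...
    if (size == items.Length)
    {
        // ...and because the list was just resized, this branch may not
        // even be taken, so 'items' is never refreshed from _items.
        EnsureCapacity(size + 1);
        items = _items;
    }
    items[size++] = en.Current; // writes land in the abandoned array
}
```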

@jamesqo (Author)

@stephentoub Add is not inlined. That is why the numbers show improvements.

@jkotas (Member)

cc @AndyAyersMS

Manual inlining is a double-edged sword. You can get a small improvement in a lot of places by manually inlining, and a microbenchmark will always show a performance improvement. However, if you do manual inlining enough times, real-world workloads will likely get slower, because the code size increase from manual inlining overwhelms the benefits.

We may want to be more data-driven about manual inlining: only do it when there is a real-world workload where it makes a dent, and leave profile-guided optimizations to take care of the rest.

@jamesqo (Author)

However, if you do manual inlining enough times, real-world workloads will likely get slower, because the code size increase from manual inlining overwhelms the benefits.

I think it makes sense in this case. This is in a loop and it's part of the bottleneck. It's also frequently used (directly, or indirectly via LINQ).

We may want to be more data-driven about manual inlining: only do it when there is a real-world workload where it makes a dent,

It just came up in corefx in a frequently-used method. See the PR I linked to: dotnet/corefx#13942 (comment)


@jkotas jkotas Nov 29, 2016


Ok, looks reasonable in this case.

BTW: I have checked traces from the full .NET Framework, collected on a mix of real-world workloads (ASP.NET, WPF, WinForms, ...), for the List constructor that you are changing:

  • ~8500 out of 11200 samples are calls on an empty ICollection
  • ~2600 out of 11200 samples are calls on a non-empty ICollection
  • ~100 out of 11200 samples are the rest (an IEnumerable that is not an ICollection), with an average length of 9

This data suggests that the most impactful part of this change on real-world workloads is not what you think it is. The path that you are changing is not executed that frequently. However, since you are moving the enumerator loop out into a separate method, the prolog/epilog of the frequently hit path gets tighter, and that is where most of the improvement for real-world workloads will likely come from.
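
The shape this implies for the constructor, as a rough sketch (argument validation omitted; s_emptyArray and the AddEnumerable helper per List<T>'s internals as discussed in this PR):

```csharp
public List(IEnumerable<T> collection)
{
    if (collection is ICollection<T> c)
    {
        int count = c.Count;
        if (count == 0)
        {
            _items = s_emptyArray; // the dominant (~8500-sample) path stays tiny
        }
        else
        {
            _items = new T[count];
            c.CopyTo(_items, 0);
            _size = count;
        }
    }
    else
    {
        // Rare path (~100 of 11200 samples): keeping this loop in a
        // separate method keeps the constructor's prolog/epilog lean.
        _items = s_emptyArray;
        AddEnumerable(collection);
    }
}
```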

Member


@jamesqo Did you happen to look into why Add is not getting inlined? It would be worth understanding, since at least from the source it looks small enough that it should have been considered.

It would also be nice to see before/after disassembly, in case the JIT is just missing out on some obvious optimization.


jamesqo commented Nov 29, 2016

Test OSX x64 Checked Build and Test


using (IEnumerator<T> en = enumerable.GetEnumerator())
{
    _version++; // Even if the enumerable has no items, we can update _version.
@justinvp justinvp Dec 1, 2016


InsertRange already increments _version. Can we remove the increment here?

The other caller of this method is the .ctor, which shouldn't need the _version increment.

@jamesqo (Author)

@justinvp It is just in case this gets called by another method later on, or the increment to _version is accidentally removed from InsertRange. It is just one write anyway, so it shouldn't cost much relative to everything else going on in the method.


jamesqo commented Dec 3, 2016

@ianhays, @Priya91, @stephentoub Sorry, do you think you could merge these changes when you have time? Thanks.

edit: Also, @vancem

@ianhays ianhays merged commit 78d2ea0 into dotnet:master Dec 5, 2016
@jamesqo jamesqo deleted the better-bulk-add branch December 5, 2016 22:02
@karelz karelz modified the milestone: 2.0.0 Aug 28, 2017
stephentoub added a commit to stephentoub/coreclr that referenced this pull request Oct 18, 2018
stephentoub added a commit to stephentoub/coreclr that referenced this pull request Oct 18, 2018
jkotas pushed a commit that referenced this pull request Oct 18, 2018
A-And pushed a commit to A-And/coreclr that referenced this pull request Nov 20, 2018
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022


9 participants