System.Linq performance improvement suggestions #14366

@ikopylov

With Linq-to-Objects it is quite common to perform a series of transformations and then materialize the sequence into a concrete collection type by calling ToArray(), ToList(), or ToDictionary(). These operations could run much faster if they knew the number of elements in the sequence up front, because the result could be allocated once at its final size.

Currently System.Linq.Enumerable has special treatment for the ICollection interface only.
I suppose that additional support for IReadOnlyCollection could improve performance in some cases, because it also exposes the total number of elements.
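
A minimal sketch of what such a count probe could look like. CountProbe, TryGetKnownCount and ToArrayPresized are hypothetical names chosen for illustration; they are not part of System.Linq. The check order (ICollection<T> first, IReadOnlyCollection<T> second) is an assumption, and this issue itself warns that the second check has a cost on small collections.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of System.Linq.
public static class CountProbe
{
    // Returns true when the element count is available without enumerating.
    public static bool TryGetKnownCount<T>(IEnumerable<T> source, out int count)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // The interface that Enumerable already special-cases.
        if (source is ICollection<T> collection)
        {
            count = collection.Count;
            return true;
        }

        // The additional check proposed here, e.g. for Queue<T> or Stack<T>.
        if (source is IReadOnlyCollection<T> readOnly)
        {
            count = readOnly.Count;
            return true;
        }

        count = 0;
        return false;
    }

    // Allocates the result once when the count is known,
    // otherwise falls back to a growing buffer.
    public static T[] ToArrayPresized<T>(IEnumerable<T> source)
    {
        if (TryGetKnownCount(source, out int count))
        {
            var result = new T[count];
            int i = 0;
            foreach (T item in source) result[i++] = item;
            return result;
        }

        return new List<T>(source).ToArray();
    }
}
```

Queue<T> is a convenient test case: it implements IReadOnlyCollection<T> but not ICollection<T>, so only the second branch can report its count.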

Another problem with System.Linq is that in many cases the information about the number of elements is lost. One of the most common examples:

List<int> source = ...;
List<int> transformed = source.Select(o => o + 1).ToList();

Obviously Select does not change the number of elements in the sequence, but the ToList method cannot take advantage of that: the count is lost.
In this particular scenario, it would be great for Select to return some SelectIterator instance that implements IReadOnlyCollection and thereby passes the number of elements on to subsequent methods.
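
As a rough sketch of that idea, assuming a List<T> source: SelectListIterator is a hypothetical type invented for this illustration, not the actual System.Linq implementation. It forwards Count from the source list, so a consumer can size its result before enumerating.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical iterator for illustration; not the actual System.Linq type.
public sealed class SelectListIterator<TSource, TResult> : IReadOnlyCollection<TResult>
{
    private readonly List<TSource> _source;
    private readonly Func<TSource, TResult> _selector;

    public SelectListIterator(List<TSource> source, Func<TSource, TResult> selector)
    {
        _source = source ?? throw new ArgumentNullException(nameof(source));
        _selector = selector ?? throw new ArgumentNullException(nameof(selector));
    }

    // A projection never changes the number of elements,
    // so the source count can be forwarded as-is.
    public int Count => _source.Count;

    public IEnumerator<TResult> GetEnumerator()
    {
        foreach (TSource item in _source)
            yield return _selector(item);
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // Materializes into a list whose capacity is set once from Count.
    public List<TResult> ToList()
    {
        var result = new List<TResult>(Count);
        foreach (TSource item in _source)
            result.Add(_selector(item));
        return result;
    }
}
```

With this shape, new SelectListIterator<int, int>(source, o => o + 1).ToList() allocates the backing array of the result exactly once. The selector still runs lazily, once per element per enumeration.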

Steps to measure performance gain:

  1. Find in System.Linq.Tests\Performance the test for the method you've changed;
  2. Uncomment [Fact] attribute above that method;
  3. Build the test project in Visual Studio in Release mode;
  4. Go to the folder with the test binaries: bin\tests\Windows_NT.AnyCPU.Release\System.Linq.Tests\aspnetcore50\;
  5. Open command prompt (cmd.exe or PowerShell);
  6. Run command: CoreRun.exe xunit.console.netcore.exe System.Linq.Tests.dll -parallel none -trait "Perf=true";
  7. Wait for results that will be printed right in the command window;
  8. Don't forget to run the tests with different collection sizes. This can be done by varying the elementCount argument of the Measure method.
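
The Measure helper from System.Linq.Tests\Performance is not reproduced in this issue, so the following is only a stand-alone stand-in that shows the idea from step 8: time the same operation at several collection sizes. PerfSketch, its Measure signature and the chosen sizes are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical stand-in; the real Measure helper may look different.
public static class PerfSketch
{
    // Times `iterations` runs of `body` over a list of `elementCount` integers.
    public static TimeSpan Measure(int elementCount, int iterations, Action<List<int>> body)
    {
        List<int> source = Enumerable.Range(0, elementCount).ToList();

        body(source); // warm-up run, excluded from the timing

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            body(source);
        watch.Stop();

        return watch.Elapsed;
    }

    public static void Run()
    {
        // Small sizes matter: per-call overhead dominates there.
        foreach (int elementCount in new[] { 10, 1000, 100000 })
        {
            TimeSpan elapsed = Measure(elementCount, 100, s => s.Select(o => o + 1).ToList());
            Console.WriteLine($"{elementCount} elements: {elapsed.TotalMilliseconds} ms");
        }
    }
}
```

Timings from a loop like this are noisy, so they are best used to compare a change against its baseline on the same machine rather than as absolute numbers.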

Casting to IReadOnlyCollection<T> is a slow operation, so checking whether this interface is implemented is not always a good idea. The performance drop will be most noticeable on small collections.

Tasks:

  • Select: add iterators for List<T>, T[], ICollection<T>;
  • ToArray, ToDictionary: add special support for ICollection<T> (and, very carefully, for IReadOnlyCollection<T>) to get the initial capacity;
  • ToList (???): add special support for IReadOnlyCollection<T> to get the initial capacity (separated as it will affect System.Collections.Generic);
  • OrderBy(Descending)/ThenBy(Descending): implement a special iterator for ICollection<T> to propagate Count;
  • Cast, Reverse: add iterators for ICollection<T> to propagate Count;
  • Range, Repeat: add an iterator that implements ICollection<T>;
  • Skip, Take: add an iterator that handles ICollection<T>;
  • Add performance tests for System.Linq
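
To make the Range task concrete, here is one possible shape for such an iterator. RangeCollection is a hypothetical name and this is only a sketch under the assumption that a read-only ICollection<int> is acceptable; the argument check mirrors the overflow rule that Enumerable.Range enforces.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration; not the actual System.Linq implementation.
public sealed class RangeCollection : ICollection<int>
{
    private readonly int _start;
    private readonly int _count;

    public RangeCollection(int start, int count)
    {
        if (count < 0 || (long)start + count - 1 > int.MaxValue)
            throw new ArgumentOutOfRangeException(nameof(count));
        _start = start;
        _count = count;
    }

    // The count is known up front, so consumers can presize their buffers.
    public int Count => _count;
    public bool IsReadOnly => true;

    public bool Contains(int item) => item >= _start && (long)item - _start < _count;

    // Lets List<T> and similar consumers fill their storage in one pass.
    public void CopyTo(int[] array, int arrayIndex)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        if (arrayIndex < 0 || array.Length - arrayIndex < _count)
            throw new ArgumentException("Destination is too small.", nameof(array));
        for (int i = 0; i < _count; i++)
            array[arrayIndex + i] = _start + i;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = 0; i < _count; i++)
            yield return _start + i;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // Mutation is not supported on a computed range.
    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Remove(int item) => throw new NotSupportedException();
}
```

Because the type implements ICollection<int>, new List<int>(new RangeCollection(5, 3)) can read Count and call CopyTo instead of growing the list element by element.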
