Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
Do Where and OfType preserve List capacity?
Unless I am mistaken, myList.Select(a => a).ToList() initializes the resulting list to the capacity of myList.Count. Does myList.Where(a => true).ToList() do so or does it build up from the initial capacity? What about OfType?
2 answers
TL;DR: for .Where<T>(), no, for .Cast<T>(), yes. At least, when possible, when the prior operator in the chain passes count info along.
Looking at the implementation of .ToList(), we can see how its optimizations work.
public static List<TSource> ToList<TSource>(this IEnumerable<TSource> source)
{
if (source is null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.source);
}
if (!IsSizeOptimized && source is Iterator<TSource> iterator)
{
return iterator.ToList();
}
return new List<TSource>(source);
}
There's some optimizations within the List<T> constructor for initializing the capacity when source is an ICollection<T>, but the real optimizations within LINQ are built around that Iterator<T> type we're checking for. It's an abstract class with it's own ToList() implementation, as we can see, which different operators can subclass and override.
For example, we can see the optimization in play within ArraySelectIterator<T>.ToList(), which is what .Select<T>() returns when its input is a List<T>:
public override List<TResult> ToList()
{
TSource[] source = _source;
Debug.Assert(source.Length > 0);
var results = new List<TResult>(source.Length);
Fill(source, SetCountAndGetSpan(results, source.Length), _selector);
return results;
}
The operator not only constructs the new List<T> with an initial capacity to match the source array, but also bypasses normal safety checks to populate it, with that internal Fill() helper method. The implementation for ListSelectIterator<T>.ToList() looks basically identical.
Now, looking at .ArrayWhereIterator<T>.ToList(), we can see there's no such capacity optimizations:
public override List<TSource> ToList() => ToList(_source, _predicate);
public static List<TSource> ToList(ReadOnlySpan<TSource> source, Func<TSource, bool> predicate)
{
SegmentedArrayBuilder<TSource>.ScratchBuffer scratch = default;
SegmentedArrayBuilder<TSource> builder = new(scratch);
foreach (TSource item in source)
{
if (predicate(item))
{
builder.Add(item);
}
}
List<TSource> result = builder.ToList();
builder.Dispose();
return result;
}
And this makes sense. We don't know how many items are actually going to pass the predicate, until after we evaluate them all, so we can't really know how much space we need to pre-allocate. Better to not over-allocate, and let the consumer do their own optimization, if that's important to them.
We can do the same inspection for .Cast<T>() and see that it all looks mostly the same as .Select<T>(). Which also makes sense, since it functionally IS just a .Select<T>().
0 comment threads
I did a quick test using LINQPad and it looks as though .Select() does not necessarily preserve capacity. For .Where() and .OfType() the capacity depends on the count, though it doesn't necessarily match it. In my case .Select() reduced the capacity to match the count. .Where() and .OfType() halved the capacity when it could contain the result, and preserved the capacity when it couldn't. This doesn't answer how it determines the exact capacity to use in any case.
List<object> fruits = new()
{
"Mango",
"Orange",
null,
"Apple",
3.0,
"Banana"
};
Console.WriteLine($"Original - Capactity: {fruits.Capacity}; Count: {fruits.Count()}");
List<object> everything = fruits.Select(x => x).ToList();
Console.WriteLine($"Selected - Capactity: {everything.Capacity}; Count: {everything.Count()}");
List<object> hasN = fruits.Where(x => x?.ToString()?.Contains("n") ?? false).ToList();
Console.WriteLine($"Whered - Capactity: {hasN.Capacity}; Count: {hasN.Count()}");
List<string> isString = fruits.OfType<string>().ToList();
Console.WriteLine($"OfTyped - Capactity: {isString.Capacity}; Count: {isString.Count()}");
List<object> notNull = fruits.Where(x => x != null).ToList();
Console.WriteLine($"Whered 2 - Capactity: {notNull.Capacity}; Count: {notNull.Count()}");
/* Output
Original - Capactity: 8; Count: 6
Selected - Capactity: 6; Count: 6
Whered - Capactity: 4; Count: 3
OfTyped - Capactity: 4; Count: 4
Whered 2 - Capactity: 8; Count: 5
*/

1 comment thread