Description
Hi! I noticed that the System.Linq.Enumerable.Chunk operator is currently (.NET 7) implemented in a way that could potentially result in delaying the garbage collection of some objects.
The issue arises when processing each element of a TSource[] chunk involves allocating a large amount of memory. Even after the consumer removes an element from the chunk, that element is still referenced by the underlying List<TSource>, so it is not eligible for garbage collection until the whole chunk has been processed. Below is a minimal demonstration of this behavior:
```csharp
using System;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        var source = Enumerable
            .Range(1, 15)
            .Select(n => new Item() { Id = n });
        var chunkified = source.Chunk(10);
        foreach (var chunk in chunkified) ProcessChunk(chunk);
        Console.WriteLine("After foreach");
    }

    private class Item
    {
        private byte[] _bytes;
        public int Id { get; init; }
        // Simulates expensive processing by allocating about 5 MB per item.
        public void Load() => _bytes = new byte[5_000_000];
    }

    static void ProcessChunk(Item[] chunk)
    {
        Console.WriteLine($"Processing chunk of {chunk.Length} items");
        for (int i = 0; i < chunk.Length; i++)
        {
            var item = chunk[i];
            chunk[i] = null; // Drop the chunk's reference before loading the item.
            item.Load();
            Console.WriteLine($"After processing item #{item.Id,-2} chunk[{i}], Memory: {GC.GetTotalMemory(true):#,0} bytes");
        }
    }
}
```

Output with the .NET 6 implementation:
Processing chunk of 10 items
After processing item #1 chunk[0], Memory: 5,103,968 bytes
After processing item #2 chunk[1], Memory: 5,113,624 bytes
After processing item #3 chunk[2], Memory: 5,113,592 bytes
After processing item #4 chunk[3], Memory: 5,113,560 bytes
After processing item #5 chunk[4], Memory: 5,113,528 bytes
After processing item #6 chunk[5], Memory: 5,113,496 bytes
After processing item #7 chunk[6], Memory: 5,113,464 bytes
After processing item #8 chunk[7], Memory: 5,113,432 bytes
After processing item #9 chunk[8], Memory: 5,113,400 bytes
After processing item #10 chunk[9], Memory: 5,113,368 bytes
Processing chunk of 5 items
After processing item #11 chunk[0], Memory: 5,113,392 bytes
After processing item #12 chunk[1], Memory: 5,113,504 bytes
After processing item #13 chunk[2], Memory: 5,113,472 bytes
After processing item #14 chunk[3], Memory: 5,113,440 bytes
After processing item #15 chunk[4], Memory: 5,113,408 bytes
After foreach
Output with the current (.NET 7) implementation:
Processing chunk of 10 items
After processing item #1 chunk[0], Memory: 5,104,248 bytes
After processing item #2 chunk[1], Memory: 10,122,064 bytes
After processing item #3 chunk[2], Memory: 15,117,680 bytes
After processing item #4 chunk[3], Memory: 20,122,176 bytes
After processing item #5 chunk[4], Memory: 25,122,200 bytes
After processing item #6 chunk[5], Memory: 30,122,224 bytes
After processing item #7 chunk[6], Memory: 35,122,248 bytes
After processing item #8 chunk[7], Memory: 40,122,272 bytes
After processing item #9 chunk[8], Memory: 45,122,296 bytes
After processing item #10 chunk[9], Memory: 50,122,320 bytes
Processing chunk of 5 items
After processing item #11 chunk[0], Memory: 5,120,088 bytes
After processing item #12 chunk[1], Memory: 10,113,816 bytes
After processing item #13 chunk[2], Memory: 15,113,840 bytes
After processing item #14 chunk[3], Memory: 20,113,864 bytes
After processing item #15 chunk[4], Memory: 25,113,888 bytes
After foreach
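The growth of about 5 MB per item in the .NET 7 output is what one would expect from a chunker that copies a reusable list into each array and clears the list only after the consumer comes back for the next chunk. The following is a simplified model of that pattern, written for illustration only; it is not the actual System.Linq source, and the names `RetainingChunkModel` and `ChunkRetaining` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class RetainingChunkModel
{
    // Hypothetical model of a list-backed chunker (not the real implementation).
    public static IEnumerable<T[]> ChunkRetaining<T>(IEnumerable<T> source, int size)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterate(source, size);
    }

    private static IEnumerable<T[]> Iterate<T>(IEnumerable<T> source, int size)
    {
        var buffer = new List<T>();
        foreach (T item in source)
        {
            buffer.Add(item);
            if (buffer.Count == size)
            {
                // The consumer works on the copy while 'buffer' still
                // references every element of the chunk.
                yield return buffer.ToArray();
                // Reached only when the consumer asks for the next chunk.
                buffer.Clear();
            }
        }
        // Final partial chunk, if any.
        if (buffer.Count > 0) yield return buffer.ToArray();
    }
}
```

In this model, setting `chunk[i] = null` in the consumer releases only the array's reference; the list keeps the element alive until `Clear()` runs.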
The conditions that can trigger this temporary memory leak are quite unusual, but nevertheless I am reporting it because the fix should be relatively easy.
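One possible direction for a fix, sketched under the assumption that the chunks are built from a reusable List<TSource>: clear the list before yielding the array, so that the array handed to the consumer holds the only chunk-level references. This is an illustration rather than a proposed patch, and the names `ChunkSketch` and `ChunkWithoutRetention` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkSketch
{
    // Hypothetical chunker that does not keep elements alive in its buffer
    // while the consumer processes a chunk.
    public static IEnumerable<T[]> ChunkWithoutRetention<T>(IEnumerable<T> source, int size)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterate(source, size);
    }

    private static IEnumerable<T[]> Iterate<T>(IEnumerable<T> source, int size)
    {
        var buffer = new List<T>();
        foreach (T item in source)
        {
            buffer.Add(item);
            if (buffer.Count == size)
            {
                T[] chunk = buffer.ToArray();
                buffer.Clear(); // Release the buffer's references before yielding.
                yield return chunk;
            }
        }
        if (buffer.Count > 0)
        {
            T[] last = buffer.ToArray();
            buffer.Clear(); // Same for the final partial chunk.
            yield return last;
        }
    }
}
```

Swapping `source.Chunk(10)` for `ChunkSketch.ChunkWithoutRetention(source, 10)` in the demonstration above should let each item be collected once the consumer nulls its slot, although that has not been measured here. Allocating a fresh array per chunk instead of reusing a list would be another way to get the same effect.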