The Enumerable.Chunk can leak memory (.NET 7) #72577

@theodorzoulias

Hi! I noticed that the System.Linq.Enumerable.Chunk operator is currently (.NET 7) implemented in a way that can delay the garbage collection of some objects.

The issue arises when processing each element of a TSource[] chunk involves allocating a large amount of memory. Even after the consumer removes an element from the chunk, the element is still referenced by the underlying List<TSource>, so it is not eligible for garbage collection until the whole chunk has been fully processed. Below is a minimal demonstration of this behavior:

public static void Main()
{
    var source = Enumerable
        .Range(1, 15)
        .Select(n => new Item() { Id = n });
    var chunkified = source.Chunk(10);
    foreach (var chunk in chunkified) ProcessChunk(chunk);
    Console.WriteLine("After foreach");
}

private class Item
{
    private byte[] _bytes;
    public int Id { get; init; }
    public void Load() => _bytes = new byte[5_000_000];
}

static void ProcessChunk(Item[] chunk)
{
    Console.WriteLine($"Processing chunk of {chunk.Length} items");
    for (int i = 0; i < chunk.Length; i++)
    {
        var item = chunk[i];
        chunk[i] = null;
        item.Load();
        Console.WriteLine($"After processing item #{item.Id,-2} chunk[{i}], Memory: {GC.GetTotalMemory(true):#,0} bytes");
    }
}

Output with the .NET 6 implementation:

Processing chunk of 10 items
After processing item #1  chunk[0], Memory: 5,103,968 bytes
After processing item #2  chunk[1], Memory: 5,113,624 bytes
After processing item #3  chunk[2], Memory: 5,113,592 bytes
After processing item #4  chunk[3], Memory: 5,113,560 bytes
After processing item #5  chunk[4], Memory: 5,113,528 bytes
After processing item #6  chunk[5], Memory: 5,113,496 bytes
After processing item #7  chunk[6], Memory: 5,113,464 bytes
After processing item #8  chunk[7], Memory: 5,113,432 bytes
After processing item #9  chunk[8], Memory: 5,113,400 bytes
After processing item #10 chunk[9], Memory: 5,113,368 bytes
Processing chunk of 5 items
After processing item #11 chunk[0], Memory: 5,113,392 bytes
After processing item #12 chunk[1], Memory: 5,113,504 bytes
After processing item #13 chunk[2], Memory: 5,113,472 bytes
After processing item #14 chunk[3], Memory: 5,113,440 bytes
After processing item #15 chunk[4], Memory: 5,113,408 bytes
After foreach

Output with the current (.NET 7) implementation:

Processing chunk of 10 items
After processing item #1  chunk[0], Memory: 5,104,248 bytes
After processing item #2  chunk[1], Memory: 10,122,064 bytes
After processing item #3  chunk[2], Memory: 15,117,680 bytes
After processing item #4  chunk[3], Memory: 20,122,176 bytes
After processing item #5  chunk[4], Memory: 25,122,200 bytes
After processing item #6  chunk[5], Memory: 30,122,224 bytes
After processing item #7  chunk[6], Memory: 35,122,248 bytes
After processing item #8  chunk[7], Memory: 40,122,272 bytes
After processing item #9  chunk[8], Memory: 45,122,296 bytes
After processing item #10 chunk[9], Memory: 50,122,320 bytes
Processing chunk of 5 items
After processing item #11 chunk[0], Memory: 5,120,088 bytes
After processing item #12 chunk[1], Memory: 10,113,816 bytes
After processing item #13 chunk[2], Memory: 15,113,840 bytes
After processing item #14 chunk[3], Memory: 20,113,864 bytes
After processing item #15 chunk[4], Memory: 25,113,888 bytes
After foreach

Online demo.

The conditions that can trigger this temporary memory leak are quite unusual, but nevertheless I am reporting it because the fix should be relatively easy.
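
To make the mechanism concrete, here is a minimal sketch of a chunking iterator that buffers elements in a List<T>. It is an illustration only, not the actual System.Linq source; the names ChunkSketch, Iterate and the clearBeforeYield flag are hypothetical. With clearBeforeYield set to false the buffer is cleared only after the consumer asks for the next chunk, which is the kind of ordering that would keep every element reachable while the chunk is being processed. With it set to true the buffer drops its references before the chunk is handed out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch for illustration; this is not the System.Linq source.
public static class ChunkSketch
{
    public static IEnumerable<T[]> Chunk<T>(IEnumerable<T> source, int size, bool clearBeforeYield)
    {
        if (source is null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));
        return Iterate(source, size, clearBeforeYield);
    }

    private static IEnumerable<T[]> Iterate<T>(IEnumerable<T> source, int size, bool clearBeforeYield)
    {
        var buffer = new List<T>();
        foreach (T item in source)
        {
            buffer.Add(item);
            if (buffer.Count < size) continue;

            T[] chunk = buffer.ToArray();

            // Clearing here means only the returned array references the elements.
            if (clearBeforeYield) buffer.Clear();

            yield return chunk;

            // Clearing here means the buffer kept every element reachable
            // for as long as the consumer was working on the chunk.
            if (!clearBeforeYield) buffer.Clear();
        }

        // Emit the final, possibly shorter, chunk.
        if (buffer.Count > 0)
        {
            T[] last = buffer.ToArray();
            if (clearBeforeYield) buffer.Clear();
            yield return last;
        }
    }
}
```

Both settings produce the same chunks; they differ only in how long the buffer holds on to the elements. Whether an item actually becomes collectable after `chunk[i] = null` also depends on anything else that still references it, for example the source enumerator's current element.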
