Ignoring of memory-pool limits & OOM on large cartesian-product join #5162

@DDtKey

Describe the bug
A query can run out of memory (OOM) instead of returning a ResourcesExhausted error.
This is probably related to the use of unbounded channels, which I believe should be avoided.
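
To illustrate the general concern about unbounded channels, here is a minimal sketch using only the Rust standard library. It is not DataFusion's code: the function names and the 1 KiB "batch" size are hypothetical. It shows that a bounded channel refuses new items once full (so a producer can be slowed down or fail cleanly), while an unbounded channel accepts everything and lets memory grow outside any limit.

```rust
use std::sync::mpsc::{channel, sync_channel, TrySendError};

/// Tries to queue `batches` items into a bounded channel of size `capacity`
/// while nothing is consuming. Returns how many items were accepted.
pub fn accepted_bounded(capacity: usize, batches: usize) -> usize {
    // `_rx` keeps the receiver alive so the channel stays connected.
    let (tx, _rx) = sync_channel::<Vec<u8>>(capacity);
    let mut accepted = 0;
    for _ in 0..batches {
        match tx.try_send(vec![0u8; 1024]) {
            Ok(()) => accepted += 1,
            // The buffer is full: the producer gets backpressure here.
            Err(TrySendError::Full(_)) => break,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

/// Queues `batches` items into an unbounded channel while nothing is
/// consuming. Every send succeeds, so all items sit in memory.
pub fn accepted_unbounded(batches: usize) -> usize {
    let (tx, rx) = channel::<Vec<u8>>();
    for _ in 0..batches {
        tx.send(vec![0u8; 1024]).expect("receiver is still alive");
    }
    // Count what has piled up in the queue without blocking.
    rx.try_iter().count()
}
```

With a capacity of 4 and 10 batches, the bounded variant accepts 4 items and the unbounded variant accepts all 10. Buffered data of the second kind would be invisible to a memory pool unless it is accounted for separately.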

To Reproduce

MRE in which the memory-pool limit is ignored on a large Cartesian product:

Example CSV file (250 MB): GDrive link. It is a random file, and the join column has the same value in every record, so the join is a Cartesian product.

Memory pool limit: FairSpillPool::new(4 * 1024 * 1024 * 1024)

SQL:
SELECT * FROM rnd rnd1 JOIN rnd rnd2 ON rnd1."s3_drive" = rnd2."s3_drive"
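
A rough sketch of why this self-join is so demanding: when every row has the same join key, the output has `left_rows * right_rows` rows. The row count and bytes-per-row values in the usage note are hypothetical, since the report does not state them. Only the 4 GiB limit comes from the report. The estimate is an upper bound on fully materialized output, not a claim about how the join operator buffers data.

```rust
/// Pool limit from the report: FairSpillPool::new(4 * 1024 * 1024 * 1024).
pub const POOL_LIMIT_BYTES: u128 = 4 * 1024 * 1024 * 1024;

/// Output rows of an equi-join when all rows on both sides share one key.
pub fn join_output_rows(left_rows: u64, right_rows: u64) -> u128 {
    left_rows as u128 * right_rows as u128
}

/// Bytes needed if the whole join output were materialized at once.
/// `bytes_per_row` is an assumed average width of one output row.
pub fn materialized_bytes(left_rows: u64, right_rows: u64, bytes_per_row: u64) -> u128 {
    join_output_rows(left_rows, right_rows).saturating_mul(bytes_per_row as u128)
}

/// True if the materialized output would not fit in the configured pool.
pub fn exceeds_pool(left_rows: u64, right_rows: u64, bytes_per_row: u64) -> bool {
    materialized_bytes(left_rows, right_rows, bytes_per_row) > POOL_LIMIT_BYTES
}
```

For example, assuming 1,000,000 rows per side and 100 bytes per output row, `exceeds_pool(1_000_000, 1_000_000, 100)` is true: the output would be 10^12 rows, far beyond 4 GiB, so a ResourcesExhausted error would be the expected outcome.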

Expected behavior

It should return a ResourcesExhausted error when a MemoryPool is configured.

Additional context

Part of this was described in the discussion here: #5108 (comment), but that discussion concerned a regression.
This example is not a regression; it is also reproducible on older versions.
