Skip to content

Check for mark join indexes in aggregate and group by#15691

Merged
Mytherin merged 3 commits intoduckdb:v1.2-histrionicusfrom
Tmonster:internal-3922-filter-pushdown-subquery
Jan 17, 2025
Merged

Check for mark join indexes in aggregate and group by#15691
Mytherin merged 3 commits intoduckdb:v1.2-histrionicusfrom
Tmonster:internal-3922-filter-pushdown-subquery

Conversation

@Tmonster
Copy link
Contributor

@Tmonster Tmonster commented Jan 13, 2025

Fixes https://github.com/duckdblabs/duckdb-internal/issues/3922

The failing query

SET order_by_non_integer_literal=true;
SELECT DISTINCT ON ( 'string' ) 'string', GROUP BY CUBE ( 'string', ), 'string' IN ( SELECT 'string' ), HAVING 'string' IN ( SELECT 'string');

The Plan generated before optimization is below. During optimization there is an attempt to convert the mark join into a semi. Before this conversion takes place, we usually check to make sure the mark join is not used in any operators above the mark join to prevent plan verification errors. Up until this point, only logical projections were checked for mark joins. Turns out this query is planned in such a way that the mark join is in one of the expressions of the aggregate operator. This was not checked, so the mark to semi conversion would take place. The fix is to modify the filter pushdown optimization so that it stores table indexes from logical aggregate operators.

┌───────────────────────────┐
│       PROJECTION #1       │
│    ────────────────────   │
│    Expressions: #[2.0]    │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│           FILTER          │
│    ────────────────────   │
│    Expressions: #[2.1]    │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│    AGGREGATE #2, #3, #4   │
│    ────────────────────   │
│          Groups:          │
│          'string'         │
│          #[14.0]          │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│      COMPARISON_JOIN      │
│    ────────────────────   │
│      Join Type: MARK      │
│                           ├──────────────┐
│        Conditions:        │              │
│    ('string' = #[8.0])    │              │
└─────────────┬─────────────┘              │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│       DUMMY_SCAN #0       ││       PROJECTION #8       │
│    ────────────────────   ││    ────────────────────   │
│                           ││   Expressions: 'string'   │
└───────────────────────────┘└─────────────┬─────────────┘
                             ┌─────────────┴─────────────┐
                             │       DUMMY_SCAN #7       │
                             │    ────────────────────   │
                             └───────────────────────────┘

@duckdb-draftbot duckdb-draftbot marked this pull request as draft January 13, 2025 15:43
@Tmonster Tmonster marked this pull request as ready for review January 14, 2025 08:06
@duckdb-draftbot duckdb-draftbot marked this pull request as draft January 16, 2025 09:11
@Tmonster Tmonster changed the base branch from main to v1.2-histrionicus January 16, 2025 09:15
@Tmonster Tmonster marked this pull request as ready for review January 16, 2025 09:15
@Mytherin Mytherin merged commit deed1eb into duckdb:v1.2-histrionicus Jan 17, 2025
48 of 49 checks passed
@Mytherin
Copy link
Collaborator

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants