Skip to content

Bug report: Incorrect ranges in key condition#13333

Closed
alesapin wants to merge 3 commits intomasterfrom
key_condition_bug
Closed

Bug report: Incorrect ranges in key condition#13333
alesapin wants to merge 3 commits intomasterfrom
key_condition_bug

Conversation

@alesapin
Copy link
Copy Markdown
Member

@alesapin alesapin commented Aug 4, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
To be added Fixes: #13248

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Aug 4, 2020
@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented Aug 4, 2020

Actually, we get correct key condition Key condition: (column 0 in [4294967280, +inf)), (column 0 in (-inf, 0]), and And it's always false for all ranges, but for some reason mayBeTrueAfter https://github.com/clickhouse/ClickHouse/blob/master/src/Storages/MergeTree/KeyCondition.cpp#L1372-L1378 return true.

@alesapin
Copy link
Copy Markdown
Member Author

alesapin commented Aug 5, 2020

Actually our code works badly with impossible KeyConditions. In this case, we have condition:

(column 0 in [4294967280, +inf)), (column 0 in (-inf, 0]), and
            >>>>>first<<<<<                  >>>>>second<<<<<

Which is obviously always false for any interval. But for example, for interval [0, +inf) our code will consider that it intersects first interval (on [4294967280, +inf)) and second interval in point 0. After that, we will evaluate AND (https://github.com/clickhouse/ClickHouse/blob/master/src/Storages/MergeTree/KeyCondition.cpp#L1304-L1311)[link] and yes, in both ranges our key condition "can be true" and we will try to read something from disk (actually full column).

I think we can simplify KeyCondition's RPN after it was parsed from WHERE expression, and get rid of similar bugs. For example, if we have OR expression and left interval contains the right interval than we can replace this element with the left interval. Or if a left interval of AND expression doesn't intersect the right interval than we can replace AND with ALWAYS_FALSE (this case). But it can be quite complicated for a multidimensional KeyConditions.

rpn_stack.back() = arg2;
else if (arg2.function == RPNElement::ALWAYS_TRUE)
rpn_stack.back() = arg1;
else if (arg1.function == RPNElement::FUNCTION_NOT_IN_RANGE && arg2.function == RPNElement::FUNCTION_NOT_IN_RANGE && arg1.range.intersectsRange(arg2.range))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks incorrect.

range1:          *****
range2:             *****
not range1: *****     *****
not range2: ********     **
not both:   *****        **

};


void KeyCondition::optimizeRPN()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of RPN optimization is 👍

@alesapin
Copy link
Copy Markdown
Member Author

Closed in favor to #14225.

@alesapin alesapin closed this Aug 31, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Assertion "Cannot read out of marks range"

3 participants