Skip to content

SQL: Fix bug with optimization of null related conditionals#41355

Merged
matriv merged 3 commits intoelastic:masterfrom
matriv:fix-cond-optimization
Apr 19, 2019
Merged

SQL: Fix bug with optimization of null related conditionals#41355
matriv merged 3 commits intoelastic:masterfrom
matriv:fix-cond-optimization

Conversation

@matriv
Copy link
Copy Markdown
Contributor

@matriv matriv commented Apr 18, 2019

The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

The `SimplifyConditional` rule is removing `NULL` literals from those
functions to simplify their evaluation. This happens in the `Optimizer`
and a new instance of the conditional function is generated. Previously,
the `dataType` was not set properly (defaulted to DataType.NULL) for
those new instances and since the `resolveType()` wasn't called again
it resulted in returning always `null`.

This issue was not visible before because the tests always used an alias
for the conditional function which caused the `resolvedType()` to be
called which sets the dataType properly.
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-search

Copy link
Copy Markdown
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue here is that the dataType detection is incorrect - namely instead of using commonType, the first not-null type should be used (otherwise it default to null).
Simple and effective and works regardless of optimizations

@matriv
Copy link
Copy Markdown
Contributor Author

matriv commented Apr 18, 2019

The commonType is there for flexibility. Let's say we have an "test" index with:

"n1" : {"type": "integer"},
"n2": {"type":"long"}

and we have a document:

{"n2": 10}

(n1 is null)

and we have a query:

SELECT COALESCE(n1, n2) FROM test

if the data type is set to the 1st non-null it will be set to "integer" instead of "long'.

@costin
Copy link
Copy Markdown
Member

costin commented Apr 18, 2019

Okay, then keep looping to find the common dataType that is not null.
Once computed the DataType should remain the same for the given expression - I don't see a reason for changing it or recalculating it (not that we would want to considering it does has side-effects in this case).

@matriv matriv requested a review from costin April 18, 2019 20:43
Copy link
Copy Markdown
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@matriv matriv merged commit c39980a into elastic:master Apr 19, 2019
@matriv matriv deleted the fix-cond-optimization branch April 19, 2019 16:03
matriv added a commit that referenced this pull request Apr 19, 2019
The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

(cherry picked from commit c39980a)
matriv added a commit that referenced this pull request Apr 19, 2019
The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

(cherry picked from commit c39980a)
matriv added a commit that referenced this pull request Apr 19, 2019
The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

(cherry picked from commit c39980a)
@colings86 colings86 added v6.7.2 and removed v6.7.3 labels Apr 24, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…41355)

The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants