Skip to content

SQL: Fix bug with optimization of null related conditionals #41355

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 19, 2019

Conversation

matriv
Copy link
Contributor

@matriv matriv commented Apr 18, 2019

The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

The `SimplifyConditional` rule is removing `NULL` literals from those
functions to simplify their evaluation. This happens in the `Optimizer`
and a new instance of the conditional function is generated. Previously,
the `dataType` was not set properly (defaulted to DataType.NULL) for
those new instances and since the `resolveType()` wasn't called again
it resulted in returning always `null`.

This issue was not visible before because the tests always used an alias
for the conditional function which caused the `resolvedType()` to be
called which sets the dataType properly.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue here is that the dataType detection is incorrect - namely instead of using commonType, the first not-null type should be used (otherwise it default to null).
Simple and effective and works regardless of optimizations

@matriv
Copy link
Contributor Author

matriv commented Apr 18, 2019

The commonType is there for flexibility. Let's say we have an "test" index with:

"n1" : {"type": "integer"},
"n2": {"type":"long"}

and we have a document:

{"n2": 10}

(n1 is null)

and we have a query:

SELECT COALESCE(n1, n2) FROM test

if the data type is set to the 1st non-null it will be set to "integer" instead of "long'.

@costin
Copy link
Member

costin commented Apr 18, 2019

Okay, then keep looping to find the common dataType that is not null.
Once computed the DataType should remain the same for the given expression - I don't see a reason for changing it or recalculating it (not that we would want to considering it does has side-effects in this case).

@matriv matriv requested a review from costin April 18, 2019 20:43
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@matriv matriv merged commit c39980a into elastic:master Apr 19, 2019
@matriv matriv deleted the fix-cond-optimization branch April 19, 2019 16:03
matriv added a commit that referenced this pull request Apr 19, 2019
The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

(cherry picked from commit c39980a)
matriv added a commit that referenced this pull request Apr 19, 2019
The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

(cherry picked from commit c39980a)
matriv added a commit that referenced this pull request Apr 19, 2019
The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo

(cherry picked from commit c39980a)
@colings86 colings86 added v6.7.2 and removed v6.7.3 labels Apr 24, 2019
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
…41355)

The SimplifyConditional rule is removing NULL literals from those
functions to simplify their evaluation. This happens in the Optimizer
and a new instance of the conditional function is generated. Previously,
the dataType was not set properly (defaulted to DataType.NULL) for
those new instances and since the resolveType() wasn't called again
it resulted in returning always null.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar')

COALESCE(null, 'foo', null, 'bar')
-----------------
null

This issue was not visible before because the tests always used an alias
for the conditional function which caused the resolveType() to be
called which sets the dataType properly.

E.g.:

SELECT COALESCE(null, 'foo', null, 'bar') as c

c
-----------------
foo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants