SQL: Implement CASE... WHEN... THEN... ELSE... END#41349
SQL: Implement CASE... WHEN... THEN... ELSE... END#41349matriv merged 12 commits intoelastic:masterfrom
Conversation
Implement the ANSI SQL CASE expression which provides the if/else functionality common to most programming languages. The CASE expression can have multiple WHEN branches and becomes a powerful tool for SQL queries as it can be used in SELECT, WHERE, GROUP BY, HAVING and ORDER BY clauses. Closes: elastic#36200
|
Pinging @elastic/es-search |
| SUM_OF_SQUARES |AGGREGATE | ||
| VAR_POP |AGGREGATE | ||
| HISTOGRAM |GROUPING | ||
| CASE |CONDITIONAL |
There was a problem hiding this comment.
Do you agree with listing CASE as a conditional function?
Case is quite special and doesn't follow at all the traditional functional format.
| private final Expression defaultElse; | ||
|
|
||
| @SuppressWarnings("unchecked") | ||
| public Case(Source source, List<Expression> expressions) { |
There was a problem hiding this comment.
This "weird" ctor is here as public instead of a more useful (for the ExpressionBuilder) one which would like:
public Case(Source source, List<IfConditional> conditionals, Expression defaultElse) because of FunctionRegistry which gets confused if there are 2 public ctors, and obviously the one with IfConditional cannot be used in the FunctionRegistry.
| super(source, expressions); | ||
| this.conditions = (List<IfConditional>) (List<?>) expressions.subList(0, expressions.size() - 1); | ||
| this.defaultElse = expressions.get(expressions.size() - 1); | ||
| setDataType(); |
There was a problem hiding this comment.
will apply the same logic as here: https://github.com/elastic/elasticsearch/pull/41355/files#diff-026c7791c54d1c165a2cd3f516a500beR36 once that PR is merged.
astefan
left a comment
There was a problem hiding this comment.
Nice work! I left some comments.
|
|
||
| *Input*: | ||
|
|
||
| Multiple but at least one WHEN *condition* THEN *result* clause and optional ELSE *defaultResult* clause. |
There was a problem hiding this comment.
This sentence sounds weird. Maybe get rid of "Multiple" and rephrase?
At least one _WHEN *condition* THEN *result*_ clause is required and the expression can optionally have an _ELSE *default_result*_ clause. Every *condition* must be boolean expression.
There was a problem hiding this comment.
I would rephrase with something like "One or multiple WHEN ... are used and the ..."
| Multiple but at least one WHEN *condition* THEN *result* clause and optional ELSE *defaultResult* clause. | ||
| Every *condition* should be boolean expression. | ||
|
|
||
| *Output*: one of the *result* expressions if the corresponding WHEN *condition* evaluates to `true`, |
There was a problem hiding this comment.
*Output*: one of the *result* expressions if the corresponding _WHEN *condition*_ evaluates to trueor the *default_result* if all _WHEN *condition*_ clauses evaluate tofalse. If the optional _ELSE *default_result*_ clause is missing and all _WHEN *condition*_ clauses evaluate to falsethennull is returned.
| Case c = (Case) e; | ||
|
|
||
| // Remove or foldable conditions that fold to FALSE | ||
| // Stop at the 1st foldable condition taht folds to TRUE |
|
|
||
| @Override | ||
| protected Case mutate(Case instance) { | ||
| return randomCase(); |
There was a problem hiding this comment.
I don't think this mutate logic will always have a changed, non-equal to original, instance of CASE. I think the mutation should have as many IfConditional instances as constructor values that can change (2 in this case with randomStringLiteral()). And from Case point of view, it will also need to mutate the randomIntLiteral() parameter as well. Something like this: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/expression/function/scalar/string/ConcatFunctionPipeTests.java#L96
| } | ||
|
|
||
| @Override | ||
| public void testTransform() { |
There was a problem hiding this comment.
Why is this one empty? Same for testReplaceChildren.
There was a problem hiding this comment.
I added this Class as a quick fix to the failing NodeSubclassTests (because of the special IfConditional as an expression) and forgot to properly implement everything there.
Fixed.
|
|
||
| @Override | ||
| protected TypeResolution resolveType() { | ||
| DataType expectedResultDataType = null; |
There was a problem hiding this comment.
Why not initializing expectedResultDataType directly with defaultElse().dataType()?
There was a problem hiding this comment.
Because we want to match the results of the THEN <result> clauses to the 1st non-null result of them.
There was a problem hiding this comment.
Wouldn't that happen anyway in the following loop (that checks the conditions)?
| } | ||
|
|
||
| for (IfConditional conditional : conditions) { | ||
| dataType = DataTypeConversion.commonType(dataType, conditional.dataType()); |
There was a problem hiding this comment.
In case conditions.isEmpty() == false the first commonType check is with itself, isn't it? Maybe worth doing a for with an index that starts either at 0 or 1 depending on the previous isEmpty() check?
There was a problem hiding this comment.
Indeed, I would just start with the defaultElse() type and then iterate over the conditions - this avoid repetition.
| SELECT emp_no, CASE WHEN emp_no - 10000 < 10 THEN 'First 10' ELSE 'Second 10' END as "case" FROM test_emp WHERE emp_no >= 10005 | ||
| ORDER BY emp_no LIMIT 10; | ||
|
|
||
| emp_no | case |
There was a problem hiding this comment.
Following this PR - #41355 - can you change one of these tests to not have an alias?
costin
left a comment
There was a problem hiding this comment.
Looks good. This feature will make a lot of folks happy.
|
|
||
| *Input*: | ||
|
|
||
| Multiple but at least one WHEN *condition* THEN *result* clause and optional ELSE *defaultResult* clause. |
There was a problem hiding this comment.
I would rephrase with something like "One or multiple WHEN ... are used and the ..."
| @Override | ||
| public boolean foldable() { | ||
| return defaultElse.foldable() && (conditions.isEmpty() || | ||
| (conditions.size() == 1 && conditions.get(0).condition().foldable() && conditions.get(0).result().foldable())); |
There was a problem hiding this comment.
I'm not sure this method is correct - why is the defaultElse checked first - if a condition is foldable and true, defaultElse is not relevant.
I think it should be:
return (conditions.isEmpty() && defaultElse.foldable()) || (conditions.size() > 0 && conditions.get(0).condition().foldable());| return defaultElse.fold(); | ||
| } | ||
| if (conditions.get(0).condition().fold() == Boolean.TRUE) { | ||
| return conditions.get(0).result().fold(); |
There was a problem hiding this comment.
The defaultElse.fold() repetition should be avoided:
if (conditions.size() > 0 && conditions.get(0).condition().fold() == Boolean.TRUE) {
...
}
return defaultElse.fold();| } | ||
|
|
||
| for (IfConditional conditional : conditions) { | ||
| dataType = DataTypeConversion.commonType(dataType, conditional.dataType()); |
There was a problem hiding this comment.
Indeed, I would just start with the defaultElse() type and then iterate over the conditions - this avoid repetition.
| import java.util.Objects; | ||
|
|
||
| /** | ||
| * Helper function (cannot be directly from a query) to model a {@code WHEN <condition> ELSE <result>} |
There was a problem hiding this comment.
Considering this does not extend Function I wouldn't call it a Helper function - instead a conditional expression mapping WHEN ELSE backing CASE and potentially IIF
| this(source, Arrays.asList(condition, result)); | ||
| } | ||
|
|
||
| private IfConditional(Source source, List<Expression> children) { |
There was a problem hiding this comment.
No reason after all, it's a leftover from a different design. changing it.
| private final Expression result; | ||
|
|
||
| public IfConditional(Source source, Expression condition, Expression result) { | ||
| this(source, Arrays.asList(condition, result)); |
There was a problem hiding this comment.
this(source, asList(condition,result));
this.condition = condition;
this.result = result;| if (newChildren.size() < 2) { | ||
| throw new IllegalArgumentException("expected at least [2] children but received [" + newChildren.size() + "]"); | ||
| } | ||
| return new IfConditional(source(), newChildren); |
There was a problem hiding this comment.
IfConditional(source(), newChildren.get(0), newChildren(1));
| *Input*: | ||
|
|
||
| One or multiple _WHEN *condition* THEN *result_* clauses are used and the expression can optionally have | ||
| an _ELSE *default_result_* clause. Every *condition* should be boolean expression. |
Implement the ANSI SQL CASE expression which provides the if/else functionality common to most programming languages. The CASE expression can have multiple WHEN branches and becomes a powerful tool for SQL queries as it can be used in SELECT, WHERE, GROUP BY, HAVING and ORDER BY clauses. Closes: #36200 (cherry picked from commit 8b25774)
Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY. Follows: elastic#41349
* SQL: [Docs] Add example for custom bucketing with CASE Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY. Follows: #41349 * address comments * address comment
Implement the ANSI SQL CASE expression which provides the if/else functionality common to most programming languages. The CASE expression can have multiple WHEN branches and becomes a powerful tool for SQL queries as it can be used in SELECT, WHERE, GROUP BY, HAVING and ORDER BY clauses. Closes: elastic#36200
* SQL: [Docs] Add example for custom bucketing with CASE Add a TIP on how to use CASE to achieve custom bucketing with GROUP BY. Follows: elastic#41349 * address comments * address comment
Implement the ANSI SQL CASE expression which provides the if/else
functionality common to most programming languages.
The CASE expression can have multiple WHEN branches and becomes a
powerful tool for SQL queries as it can be used in SELECT, WHERE,
GROUP BY, HAVING and ORDER BY clauses.
Closes: #36200