FORK - allow EVAL/DISSECT/STATS in branches#125937
Conversation
| forkSubQueryProcessingCommand | ||
| : whereCommand | ||
| | sortCommand | ||
| : evalCommand |
There was a problem hiding this comment.
We can use processingCommand and exclude the commands that are not supported by Fork.
There was a problem hiding this comment.
I am actually having a bit of trouble with the grammar.
Even if I use processingCommand here, there are some combinations that fail with a parsing exception:
this works:
ROW a=[1,2,3]
| FORK (EVAL a = [2,3 ] | MV_EXPAND a | WHERE a == 2)
(MV_EXPAND a | WHERE a == 2 )
this fails with a parsing exception:
FROM search-movies
| FORK (STATS x = COUNT(*), y = VALUES(title) | MV_EXPAND y)
(WHERE title:"Journey")
error:
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "line 3:66: token recognition error at: ')'"
}
],
"type": "parsing_exception",
"reason": "line 3:66: token recognition error at: ')'",
"caused_by": {
"type": "lexer_no_viable_alt_exception",
"reason": null
}
},
"status": 400
}
I am able to use WHERE/LIMIT/SORT/DISSECT/EVAL/STATS without issues.
But using commands like MV_EXPAND/KEEP/RENAME/DROP/GROK in FORK branches fails with a parsing errors.
There was a problem hiding this comment.
I agree that ultimately we should be able to effectively remove this list and replace it with processingCommand (that was my original intention when I added it), but this PR is a good step forward in that direction. Let's decouple this, as it will need even more extensive and new testing which is better in a subsequent PR.
There was a problem hiding this comment.
I am able to use WHERE/LIMIT/SORT/DISSECT/EVAL/STATS without issues. But using commands like MV_EXPAND/KEEP/RENAME/DROP/GROK in FORK branches fails with a parsing errors.
GROK does not throw ParsingException for me if it is added under forkSubQueryProcessingCommand. I wonder if it is related to the mode of the commands, the commands that can be recognized under FORK have EXPRESSION_MODE.
+ curl -u elastic:password -v -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
"query": "from sample_data | fork (grok message \"%{WORD:x} %{WORD:y}\") (dissect message \"%{x} %{y}\") | keep message, x, y, _fork"
}
message | x | y | _fork
---------------------+---------------+---------------+---------------
Connected to 10.1.0.3|Connected |to |fork1
Connected to 10.1.0.2|Connected |to |fork1
Disconnected |null |null |fork1
Connection error |Connection |error |fork1
Connection error |Connection |error |fork1
Connection error |Connection |error |fork1
Connected to 10.1.0.1|Connected |to |fork1
Connected to 10.1.0.3|Connected |to 10.1.0.3 |fork2
Connected to 10.1.0.2|Connected |to 10.1.0.2 |fork2
Disconnected |null |null |fork2
Connection error |Connection |error |fork2
Connection error |Connection |error |fork2
Connection error |Connection |error |fork2
Connected to 10.1.0.1|Connected |to 10.1.0.1 |fork2
There was a problem hiding this comment.
I wonder if it is related to the mode of the commands
you are probably right - I'd like to follow up on the grammar issue separately if that's okay.
if I recall correctly for GROK I was hitting a parsing issue when the FORK subbranch contained multiple commands and not just GROK.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java
Outdated
Show resolved
Hide resolved
|
Pinging @elastic/es-analytical-engine (Team:Analytics) |
|
Pinging @elastic/kibana-esql (ES|QL-ui) |
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java
Show resolved
Hide resolved
| forkSubQueryProcessingCommand | ||
| : whereCommand | ||
| | sortCommand | ||
| : evalCommand |
There was a problem hiding this comment.
I agree that ultimately we should be able to effectively remove this list and replace it with processingCommand (that was my original intention when I added it), but this PR is a good step forward in that direction. Let's decouple this, as it will need even more extensive and new testing which is better in a subsequent PR.
bpintea
left a comment
There was a problem hiding this comment.
Had a first look.
I think the grammar changes need to be pushed a bit further: FROM employees | FORK (KEEP emp_no) fails with the same token recognition error at: ')'.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plan/logical/Fork.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
bpintea
left a comment
There was a problem hiding this comment.
LGTM, only left optional issues.
I'd maybe update the PR description, to mention which new commands are now allowed.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
| } | ||
| } | ||
|
|
||
| List<Alias> aliases = missing.stream() |
There was a problem hiding this comment.
I would use missing.forEach() instead of heavier streams, but optional / preference.
There was a problem hiding this comment.
I kept this as it is since I felt using map is more natural here - happy to change it if you have a strong preference to use forEach.
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
|
@ioanatia which are the additional commands we are going to support with this change? All of them? |
|
@stratoula for now we just add support for DISSECT/STATS/EVAL. |
fang-xing-esql
left a comment
There was a problem hiding this comment.
LGTM, thank you @ioanatia ! Supporting the other commands(or giving a more friendly error messages of unsupported commands) can be addressed as follow ups.
#121950
The FORK branches can now have different schema outputs.
Allows more than just
WHERE,LIMITandSORTin FORK branches.The commands for which we added support in the FORK branches are:
There is a parsing issue that at the moment does not allow us to add more commands.
This will be addressed in a follow up PR.
output:
If the sub plans schemas contain a conflicting types for the same column a verification error will be returned.
FORK will align the outputs by adding null columns in each subplan.
Example:
The planner will align the columns of each subplans, the previous example is equivalent to:
Note on the grammar parser changes
As supported sub commands I only added the ones that I tested and see that work without a parsing exception.
I am not sure why but if I try to expand that list to other commands like
MV_EXPANDI get a parsing error:I also don't want to be blocked by this.
If we can fix the grammar problem here or in a follow up, I am more than happy to add all the tests we think are needed to cover the usage of all commands in FORK branches.
If we can support FORK branches to include STATS/EVALS as part of this PR, we should be able to extend support for other commands when we fix the grammar change.