EQL: implement between function#54277
Conversation
|
Pinging @elastic/es-ql (:Query Languages/EQL) |
| * you may not use this file except in compliance with the Elastic License. | ||
| */ | ||
|
|
||
| package org.elasticsearch.xpack.eql.expression.function.scalar.between; |
There was a problem hiding this comment.
Why this package name? :-)
Shouldn't this be the same as for other string functions? org.elasticsearch.xpack.eql.expression.function.scalar.string
There was a problem hiding this comment.
Yep. Will move everything under scalar.string package.
|
|
||
| public class BetweenFunctionProcessor implements Processor { | ||
|
|
||
| public static final String NAME = "sbetween"; |
There was a problem hiding this comment.
I think sbtw is shorter and serves the same purpose.
| return (String) SubstringFunctionProcessor.doProcess(s, start, end); | ||
| } | ||
|
|
||
| public static String between(String s, String left, String right, Boolean greedy, Boolean caseSensitive) { |
There was a problem hiding this comment.
Can you move this one before substring (to follow the alphabetical order), please?
| # ASCII Functions | ||
| # | ||
| String substring(String, Number, Number) | ||
| String between(String, String, String, Boolean, Boolean) |
|
|
||
| import java.util.concurrent.Callable; | ||
|
|
||
| public abstract class BetweenBaseTestCase extends ESTestCase { |
There was a problem hiding this comment.
I'll let other reviewers weigh in on the necessity of this approach here.
That isn't an usual approach with randomized testing. The Elasticsearch infra in CI does run the tests continuously and having a randomized value for parameters is usually enough.
A more common approach is to introduce randomness in tests with various random* methods and leave CI do its thing. A personal approach (that others I believe are following) is to stress test such a random test class with -Dtests.iters=1000 or such a large value before creating the PR. I'm not saying this is enough, but it gives one a greater degree of confidence over the committed code.
There was a problem hiding this comment.
This seems to be the approach utilized by serializations tests, it's just extracted into the base class there, which we could consolidate across functions tests. Easy to remove. Let me know.
There was a problem hiding this comment.
Removed the class and the multiple tests runs from randomized tests.
|
|
||
| assertThat(BetweenUtils.between("System Idle Process", "s", "e", false, false), | ||
| equalTo("yst")); | ||
| } |
There was a problem hiding this comment.
No test with last parameter set to true.
There was a problem hiding this comment.
Good catch, thanks. Added more tests.
|
|
||
|
|
||
| // Translation table for error messaging in the following function | ||
| private static final String[] NUM_NAMES = { |
There was a problem hiding this comment.
This should go somewhere at the top of this class.
There was a problem hiding this comment.
Moved to the top of this class.
| if (hasOptionalParams && (children.size() > NUM_TOTAL_PARAMS || children.size() < NUM_TOTAL_PARAMS - numOptionalParams)) { | ||
| throw new QlIllegalArgumentException("expects between " + NUM_NAMES[NUM_TOTAL_PARAMS - numOptionalParams] | ||
| + " and " + NUM_NAMES[NUM_TOTAL_PARAMS] + " arguments"); | ||
| } else if (!hasOptionalParams && children.size() != NUM_TOTAL_PARAMS) { |
There was a problem hiding this comment.
hasOptionalParams == false
| throw new QlIllegalArgumentException("expects between " + NUM_NAMES[NUM_TOTAL_PARAMS - numOptionalParams] | ||
| + " and " + NUM_NAMES[NUM_TOTAL_PARAMS] + " arguments"); | ||
| } else if (!hasOptionalParams && children.size() != NUM_TOTAL_PARAMS) { | ||
| throw new QlIllegalArgumentException("expects exactly " + NUM_NAMES[NUM_TOTAL_PARAMS] + " arguments"); |
There was a problem hiding this comment.
I understand this wants to be a generic approach for dealing with a variable number of optional parameters, but:
- this solution needs to be generic throughout this class: here it's another method accepting optional parameters.
- I don't understand the purpose of this method... it tries to be generic, but
NUM_TOTAL_PARAMSis always 5. So, this method is for functions that can have between 0 and 5 optional parameters? Apart frombetweenis there any other such function?
There was a problem hiding this comment.
The approach by the here link in you comment allows only one optional parameter.
Here I created a kind of more generic implementation allowing to specify the number of optional parameters.
In the case of between function we need to support up to two optional parameters.
There was a problem hiding this comment.
Let's revisit this based on whether between actually needs case sensitivity or not - if it doesn't it will simplify things.
Same applies for greedy.
/cc @rw-access
There was a problem hiding this comment.
the greedy is not quite the same as case sensitivity, could be useful.
| @@ -0,0 +1,11 @@ | |||
| [[queries]] | |||
There was a problem hiding this comment.
why not just add these to test_queries_supported.toml?
There was a problem hiding this comment.
Sure can rename the file and we could start using it for all the extra queries that are not included in the original test_queries.toml. Will do that.
There was a problem hiding this comment.
Renamed the file into test_queries_supported.toml so we can add more queries there to avoid touching the original EQL test_queries.toml file.
| } | ||
|
|
||
| public static Object doProcess(Object source, Object left, Object right, Object greedy, Object caseSensitive) { | ||
| if (source == null) { |
There was a problem hiding this comment.
what if left and right are null?
| if (source == null) { | |
| if (source == null || left == null || right == null) { |
There was a problem hiding this comment.
Across functions that I was implementing so far I was assuming that we are returning null only if the source data (first param) is null.
The rest of params values should provided or there there will an exception, which is consistent with Java strings function overall that throw when the parameter is null.
Saw another approach was used by @astefan where was returning null if any of params are null.
Open for discussion, but would need to make consistent across of all of the functions.
Haven't heard a definite decision on this yet.
...ql/src/main/java/org/elasticsearch/xpack/eql/expression/function/scalar/between/Between.java
Show resolved
Hide resolved
...n/eql/src/main/java/org/elasticsearch/xpack/eql/expression/function/EqlFunctionRegistry.java
Show resolved
Hide resolved
costin
left a comment
There was a problem hiding this comment.
Left a few comments otherwise LGTM
| } | ||
|
|
||
| private static void throwIfNotString(Object obj) { | ||
| if (!(obj instanceof String || obj instanceof Character)) { |
There was a problem hiding this comment.
It might be worth extracting these assertions into the Check class inside QL and reusing them through-out the code.
We've been going back and forth regarding this style - the if check is fairly simple but repeating the string is tedious (and typically tends to not be consistent).
There was a problem hiding this comment.
yeah, I tried to avoid repetition of code and error strings. Moved these into QL Check class.
|
|
||
|
|
||
| @Override | ||
| public boolean equals(Object obj) { |
There was a problem hiding this comment.
Small nit - the getNameWritable tends to be towards the top after writeTo and hashCode before equals.
There was a problem hiding this comment.
It looks like getWriteableName presently it is at the end of every FunctionProcessor so far.
Updated.
|
|
||
| int idx = matchString.indexOf(left); | ||
| if (idx == -1) { | ||
| return ""; |
There was a problem hiding this comment.
You could return org.elasticsearch.xpack.ql.util.StringUtils.EMPTY instead.
|
|
||
| package org.elasticsearch.xpack.eql.expression.function.scalar.string; | ||
|
|
||
| import org.apache.directory.api.util.Strings; |
There was a problem hiding this comment.
Good catch, thank you! Found another place in tests. Updated both places.
| } | ||
|
|
||
| // Test from EQL doc https://eql.readthedocs.io/en/latest/query-guide/functions.html | ||
| public void testBetweenBasicEQLExamples() { |
| throw new QlIllegalArgumentException("expects between " + NUM_NAMES[NUM_TOTAL_PARAMS - numOptionalParams] | ||
| + " and " + NUM_NAMES[NUM_TOTAL_PARAMS] + " arguments"); | ||
| } else if (!hasOptionalParams && children.size() != NUM_TOTAL_PARAMS) { | ||
| throw new QlIllegalArgumentException("expects exactly " + NUM_NAMES[NUM_TOTAL_PARAMS] + " arguments"); |
There was a problem hiding this comment.
Let's revisit this based on whether between actually needs case sensitivity or not - if it doesn't it will simplify things.
Same applies for greedy.
/cc @rw-access
| } | ||
|
|
||
| private static Expression toDefault(Expression exp) { | ||
| return exp != null ? exp : Literal.FALSE; |
There was a problem hiding this comment.
Better to create a new exception preservation the source - helps with tracking down the location of the argument:
new Literal(exp.source(), Boolean.FALSE, DataTypes.Boolean)
|
Scratch that - I mixed up the issues. |
| compile project(':test:framework') | ||
| compile project(path: xpackModule('core'), configuration: 'default') | ||
| compile project(path: xpackModule('core'), configuration: 'testArtifacts') | ||
| compile project(path: xpackModule('ql'), configuration: 'default') |
There was a problem hiding this comment.
Since this import is just for CollectionUtils, I'd rather keep the projects separated and simply concatenate the lists through addAll.
There was a problem hiding this comment.
Good point. Removed.
* EQL: implement between function * Address WIP TODOs. Add more tests * Fix linter complaint * Add more query folder tests to cover invalid parameter types * Update Between toDefault. Fix typo in BetweenFunctionProcessor. Add more tests. * Remove unneeded null checks in BetweenFunctionProcessor::doProcess * Address code review comments * Address additional code review comments * Remove dependency on QL from eql/qa project
Related to #54135