Esql/create DATE_DIFF function#103208
Conversation
|
Documentation preview: |
|
💚 CLA has been signed |
|
Pinging @elastic/es-ql (Team:QL) |
|
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL) |
There was a problem hiding this comment.
Thanks for your interest and contribution, @nicolasgras.
I had a first pass. My main question is about the code duplication. Any reason you didn't extract the common parts (in the ql module) and reused it here?
| return aliases; | ||
| } | ||
|
|
||
| private static int safeInt(long diff) { |
There was a problem hiding this comment.
There's DataTypeConverter#safeToInt available. But I'm mostly wondering why not return long.
There was a problem hiding this comment.
I have updated the code to use DataTypeConverter#safeToInt and removed the one I implemented that was identical.
|
|
||
| static { | ||
| NAME_TO_PART = DateTimeField.initializeResolutionMap(values()); | ||
| VALID_VALUES = DateTimeField.initializeValidValues(values()); |
There was a problem hiding this comment.
This isn't used anywhere atm. I guess the initialization of NAME_TO_PART can then be inlined too.
There was a problem hiding this comment.
I agree, VALID_VALUES List is not used. I should have cleant this part.
The initialization of NAME_TO_PART is inlined now.
| private BiFunction<ZonedDateTime, ZonedDateTime, Integer> diffFunction; | ||
| private Set<String> aliases; | ||
|
|
There was a problem hiding this comment.
I agree, they are final now.
| description = "Subtract 2 dates and return their difference in multiples of a unit specified in the 1st argument") | ||
| public DateDiff( | ||
| Source source, | ||
| @Param(name = "unit", type = { "text" }, description = "A valid date unit") Expression unit, |
There was a problem hiding this comment.
The type can also be keyword.
There was a problem hiding this comment.
I agree, I have updated the code.
| this.endTimestamp = endTimestamp; | ||
| } | ||
|
|
||
| @Evaluator |
There was a problem hiding this comment.
We'll want to catch the thrown exceptions in the evaluator implementor so that we can return a null, rather than stop the execution. Have a look at DateParse for an example.
There was a problem hiding this comment.
I agree, I have updated the code.
| ZonedDateTime zdtStart = ZonedDateTime.ofInstant(Instant.ofEpochMilli(startTimestamp), UTC); | ||
| ZonedDateTime zdtEnd = ZonedDateTime.ofInstant(Instant.ofEpochMilli(endTimestamp), UTC); | ||
|
|
||
| Part datePartField = Part.resolve(unit.utf8ToString()); |
There was a problem hiding this comment.
We could maybe have two evaluators here: in case you'd do smth like date_diff(some_string_field, ..., ...), then we'd need to evaluate the first parameter to establish what's the unit we need to apply, as done already. But if we write something like date_diff("hours", ..., ...), we can do this work upfront, not with every evaluation. And it's likely this would be the most common use case.
There was a problem hiding this comment.
I have updated the code by adding a new evaluator (I get something similar to DateParse now).
| type = { "date" }, | ||
| description = "A string representing a start timestamp" | ||
| ) Expression startTimestamp, | ||
| @Param(name = "endTimestamp", type = { "date" }, description = "A string representing a end timestamp") Expression endTimestamp |
There was a problem hiding this comment.
This typo has been fixed.
| if (datePartField == null) { | ||
| List<String> similar = Part.findSimilar(unit.utf8ToString()); | ||
| if (similar.isEmpty()) { | ||
| throw new InvalidArgumentException( |
There was a problem hiding this comment.
A nit on style: you could prepare the message in the two branches and throw just once.
There was a problem hiding this comment.
I agree, I have updated the code.
Thank you for you review @bpintea. I have also added unit tests in esql/qa/testFixtures directory to have a better code coverage. Let me know if it is ok for you. |
luigidellaquila
left a comment
There was a problem hiding this comment.
Thank you very much @nicolasgras!
Just two comments from my side:
- I see the CI still complains about CLA signature; I checked your commits and I see they appear with two different email addresses (one is invalid of course):
Author: Nicolas Gras <nicolasgras01@gmail.com>
Author: Nicolas Gras <nicolas@system76-pc.localdomain>
You have to sign the CLA with both to make the checks green, even though the second one is obviously a local one.
- I think @bpintea referred to the content of
org.elasticsearch.xpack.sql.expression.function.scalar.datetime.DateDiff(theenum Partand some logic) when he asked about common parts that can be extracted to QL.
IMHO it's not mandatory, we could also keep the two things separated and have this PR self-contained in ESQL module, but in that case I'd avoid to move DateTimeField to QL
|
Pinging @elastic/es-analytics-geo (Team:Analytics) |
0bf177d to
bb62dfd
Compare
|
Thank you @luigidellaquila for your review.
I have rebased my commits on top of the branch and updated the author of them, then the CI does not complain anymore about CLA signature.
@bpintea , what do you think about it? |
In some cases sharing code between other *ql modules within the |
|
@bpintea and @luigidellaquila, I have moved back |
bpintea
left a comment
There was a problem hiding this comment.
I've left two notes, otherwise it LGTM.
I guess with SQL's DateDiff we've follow other implementations returning int and we follow suite also here. This is OK, though we might want to update it, if we'll add support for nano timestamps (which we don't have yet), in which case an int will not cover even 3s.
...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateDiff.java
Outdated
Show resolved
Hide resolved
...sql/src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/date/DateDiff.java
Outdated
Show resolved
Hide resolved
luigidellaquila
left a comment
There was a problem hiding this comment.
Thank you very much @nicolasgras, LGTM
This reverts commit ec2e185.
|
Unfortunately this commit broke the build (checkstyle violations) so I reverted it in 943b2ea. Luigi will open a PR to reinstate it. |
This pull request resolves #101942.
A new functionality DATE_DIFF is added to ESQL. This functionaliy is similar to the ones in SQL plugin.
It calculates the diffenrence between this 2 dates using a unit specified and takes 3 input parameters: