Support offset in composite aggs#50609
Conversation
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes elastic#48757
|
Pinging @elastic/es-docs (>docs) |
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
Please have a close look at this one! It has wire changes and it has been more than a year since I made any of those.... |
|
|
||
| @Override | ||
| public void innerWriteTo(StreamOutput out) throws IOException { | ||
| if (out.getVersion().before(Version.V_8_0_0)) { |
There was a problem hiding this comment.
We don't actually serialize this Rounding but I believe I've got this bit right so I kept it. I'd like to move the other offset code over to this Rounding which means that we will eventually serialize it. At that point we'll get exhaustive tests for this code.
There was a problem hiding this comment.
Modulo the version check which I discuss below.
| super(in); | ||
| dateHistogramInterval = new DateIntervalWrapper(in); | ||
| timeZone = in.readOptionalZoneId(); | ||
| if (in.getVersion().after(Version.V_8_0_0)) { |
There was a problem hiding this comment.
I think this is the way to do this, but it has been a long time. I want to land this change in master and 7.x, but these tests have no change of passing the bwc tests until I backport it.
Now that I think about it, maybe this version check should be V_7_6_0 actually. Help!
There was a problem hiding this comment.
The tests caught an error - this at least be onOrAfter but I'll switch it to V_7_6_0 while I'm there.
|
|
||
| *Offset* | ||
|
|
||
| include::datehistogram-aggregation.asciidoc[tag=offset-explanation] |
There was a problem hiding this comment.
I figured because the option is the same I should just include the docs from the normal date histogram.
|
Hurray! Integration test failure! Will debug. Good robots, catching things. |
|
@elasticmachine, run elasticsearch-ci/2 |
|
The default distro failure is due to the thing not being backported. I'll find a way to get this working. |
I believe I've got this sorted thanks to @DaveCTurner and @cbuescher. |
polyfractal
left a comment
There was a problem hiding this comment.
Left a few comments/questions, but I think this looks good. Main question was around the new Rounding class, most of the rest is cosmetic :)
|
|
||
| *Offset* | ||
|
|
||
| include::datehistogram-aggregation.asciidoc[tag=offset-explanation] |
| } | ||
| } | ||
|
|
||
| static class OffsetRounding extends Rounding { |
There was a problem hiding this comment.
Is the intention to migrate the existing users of offsets (regular date histo agg, etc) over to this wrapper? E.g. it seems a bit heavyweight to create a whole new rounding, but if the idea is to followup with changes to the existing users of offset (rather than baking the logic into the aggs) it makes sense to me.
There was a problem hiding this comment.
Also, we should probably add a javadoc explaining why/when to use this class
There was a problem hiding this comment.
Yeah, I did intend to migrate the rest over of the uses of offset over to it.
I added javadoc on the public interface to the class which is what the rest of the classes in this file do. Do you think I should duplicate it onto this one? Or add something maybe?
There was a problem hiding this comment.
Ah, missed that it was documented on the public method. Since the rest of the class does it that way, fine with me :)
Thanks for the explanation!
|
|
||
| OffsetRounding(StreamInput in) throws IOException { | ||
| delegate = Rounding.read(in); | ||
| offset = in.readLong(); |
There was a problem hiding this comment.
maybe read/write ZLong instead? I suspect offsets will be small-to-medium'ish sized and either positive or negative, so zlong might be a win?
There was a problem hiding this comment.
In other places offsets are written as long and I just copied it. But I'd be fine with zlong.
There was a problem hiding this comment.
Pushed the switch to zlong.
| case TimeUnitRounding.ID: | ||
| rounding = new TimeUnitRounding(in); | ||
| break; | ||
| return new TimeUnitRounding(in); |
There was a problem hiding this comment.
🙏 for the cleanup here :)
There was a problem hiding this comment.
I'm not sure everyone would consider that cleaner, but I sure do!
| @@ -195,6 +195,16 @@ public void testTimeUnitRoundingDST() { | |||
| assertThat(tzRounding_chg.round(time("2014-11-02T06:01:01", chg)), isDate(time("2014-11-02T06:00:00", chg), chg)); | |||
| } | |||
|
|
|||
There was a problem hiding this comment.
Hmm, do you know if we have any serialization tests for Rounding? I was looking to see if we test that the id() bytes are correct and don't accidentally change (similar to what we do with AbstractWriteableEnumTestCase enum tests)... but couldn't find any serialization tests at all.
If we have them somewhere, let's add a test for the id byte. If not, probably too much to add to this PR but we should file a ticket so we don't forget to add some tests... makes me uneasy that such a widespread class doesn't have serialization tests :)
There was a problem hiding this comment.
I believe we end up testing serialization as part of testing things like extended bounds bucket response. Adding a unit test for just this class's serialization makes sense to me though. I figured I'd wait until I used it in a context where we serialized it but since I'm writing the serialization code now I probably ought to write the test now.
There was a problem hiding this comment.
I pushed a wire test case.
| assertThat(tzRounding_chg.round(time("2014-11-02T06:01:01", chg)), isDate(time("2014-11-02T06:00:00", chg), chg)); | ||
| } | ||
|
|
||
| public void testOffsetRounding() { |
There was a problem hiding this comment.
Let's add another test that does negative offsets too?
There was a problem hiding this comment.
I pushed a test with negative offsets.
The serialization was subtly wrong, particularly in the Eire time zone.
|
@polyfractal, I've pushed everything you asked for. |
|
I reworked the serialization a bit, moving the write methods so they are near the read methods and replacing |
| } | ||
| TimeUnitRounding other = (TimeUnitRounding) obj; | ||
| return Objects.equals(unit, other.unit) && Objects.equals(timeZone, other.timeZone); | ||
| return unit == other.unit && timeZone.equals(other.timeZone); |
There was a problem hiding this comment.
I'll revert this change. It snuck in because I thought the comparison was wrong.
|
|
||
| @Override | ||
| public String toString() { | ||
| return "Rounding[" + interval + " in " + timeZone + "]"; |
There was a problem hiding this comment.
The toString implementations weren't consistent which made reading the error messages hard. So I changed them.
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes elastic#48757
|
Thanks for the reviews @polyfractal and @jimczi ! |
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes #48757
When deserializing time zones in the Rounding classes we used to include a tiny normalization step via `DateUtils.of(in.readString())` that was lost in elastic#50609. Its at least necessary for some tests, e.g. the cause of elastic#50827 is that when sending the default time zone ZoneOffset.UTC on a stream pre 7.0 we convert it to a "UTC" string id via `DateUtils.zoneIdToDateTimeZone`. This gets then read back as a UTC ZoneRegion, which should behave the same but fails the equality tests in our serialization tests. Reverting to the previous behaviour with an additional normalization step on 7.x.
When I implemented elastic#50609 I suppressed a few backwards compatibility checks until I finished the backport. I've now finished the backport so this enables the tests.
When I implemented #50609 I suppressed a few backwards compatibility checks until I finished the backport. I've now finished the backport so this enables the tests.
We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
When deserializing time zones in the Rounding classes we used to include a tiny normalization step via `DateUtils.of(in.readString())` that was lost in #50609. Its at least necessary for some tests, e.g. the cause of #50827 is that when sending the default time zone ZoneOffset.UTC on a stream pre 7.0 we convert it to a "UTC" string id via `DateUtils.zoneIdToDateTimeZone`. This gets then read back as a UTC ZoneRegion, which should behave the same but fails the equality tests in our serialization tests. Reverting to the previous behaviour with an additional normalization step on 7.x. Co-authored-by: Nik Everett <nik9000@gmail.com> Closes #50827
We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
…0873) We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
Adds support for the `offset` parameter to the `date_histogram` source of composite aggs. The `offset` parameter is supported by the normal `date_histogram` aggregation and is useful for folks that need to measure things from, say, 6am one day to 6am the next day. This is implemented by creating a new `Rounding` that knows how to handle offsets and delegates to other rounding implementations. That implementation doesn't fully implement the `Rounding` contract, namely `nextRoundingValue`. That method isn't used by composite aggs so I can't be sure that any implementation that I add will be correct. I propose to leave it throwing `UnsupportedOperationException` until I need it. Closes elastic#48757
When I implemented elastic#50609 I suppressed a few backwards compatibility checks until I finished the backport. I've now finished the backport so this enables the tests.
We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.
) We added a new rounding in elastic#50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset. This is a redo of elastic#50873 with more integration tests. This reverts commit d114c9d.
…51495) We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset. This is a redo of #50873 with more integration tests. This reverts commit d114c9d.
Adds support for the
offsetparameter to thedate_histogramsourceof composite aggs. The
offsetparameter is supported by the normaldate_histogramaggregation and is useful for folks that need tomeasure things from, say, 6am one day to 6am the next day.
This is implemented by creating a new
Roundingthat knows how tohandle offsets and delegates to other rounding implementations. That
implementation doesn't fully implement the
Roundingcontract, namelynextRoundingValue. That method isn't used by composite aggs so I can'tbe sure that any implementation that I add will be correct. I propose to
leave it throwing
UnsupportedOperationExceptionuntil I need it.Closes #48757