Implement from_sort_value Parameter in Get Snapshots API#77618
Implement from_sort_value Parameter in Get Snapshots API#77618original-brownbear merged 15 commits intoelastic:masterfrom original-brownbear:after-value
Conversation
Add `after_value` parameter to allow for filtering snapshots by comparing to concrete sort column values similar to the existing `after` parameter`.
|
Pinging @elastic/es-distributed (Team:Distributed) |
| assertThat(getAllSnapshotsForPolicies(GetSnapshotsRequest.NO_POLICY_PATTERN, "*"), is(allSnapshots)); | ||
| } | ||
|
|
||
| public void testSortAfterStartTime() { |
There was a problem hiding this comment.
I did not explicitly add tests for all columns as they use the same functionality and limited myself to those that I needed for full coverage only here.
| SortOrder order | ||
| ) { | ||
| if (snapshotName == null) { | ||
| assert repoName == null : "no snapshot name given but saw repo name [" + repoName + "]"; |
There was a problem hiding this comment.
There's definitely ways of making this nicer, but since it's a temporary solution I went for the shortest+easiest-to-review version again to get this into the API (and thus complete the API side of this work) asap.
|
Jenkins run elasticsearch-ci/part-1 |
|
@henningandersen this one should be good for review now :) as discussed it's using |
| blockAllDataNodes(repoName); | ||
| final ActionFuture<CreateSnapshotResponse> snapshot2Future = startFullSnapshot(repoName, "snapshot-2"); | ||
| awaitNumberOfSnapshotsInProgress(1); | ||
| TimeUnit.MILLISECONDS.sleep(snapshot1.endTime() - snapshot1.startTime() + 1); |
There was a problem hiding this comment.
I'm not sure if this complexity really worth it? Maybe we could just snapshot indices with random docs, retrieve the duration and test after_value after that?
There was a problem hiding this comment.
Agreed, this seems overly complex. I think we could even just do one test with all 3 validations in one go?
There was a problem hiding this comment.
It's pretty easy to get collisions no matter what things you do in between snapshots if you don't make it deterministic like this. That said, this wasn't even good enough for Windows I think and I went with a different approach now, running the snapshot create in a loop until we get unique timestamps. Should be safe on all systems now :)
| // TESTRESPONSE[s/"duration_in_millis": 1/"duration_in_millis": $body.snapshots.1.duration_in_millis/] | ||
|
|
||
|
|
||
| The following request returns information for all snapshots that come after `snapshot_2` when sorted by snapshot name in the default |
There was a problem hiding this comment.
I suspect that sorting after a given start/end time will be the most common use case. Maybe it's worth showing as an example?
There was a problem hiding this comment.
I'd love to but I don't see a way of doing this with a test because the timestamp on the request would have to be dynamic right?
There was a problem hiding this comment.
the timestamp on the request would have to be dynamic right?
Or low enough to always return some snapshots in the response. But that's just a suggestion, let's ignore if that's too complicated or ugly.
There was a problem hiding this comment.
Fair point :) Added a docs run for this with a low timestamp now.
henningandersen
left a comment
There was a problem hiding this comment.
Left a few smaller comments, otherwise this looks good.
| (Optional, string) | ||
| Sort order. Valid values are `asc` for ascending and `desc` for descending order. Defaults to `asc`, meaning ascending order. | ||
|
|
||
| `after_value`:: |
There was a problem hiding this comment.
I wonder if we should call the parameter after_sort_value to make the coupling to sort explicit in the name?
I am still pondering on after_. Reading your text, perhaps start_sort_value is better, though I am not really contend with that either...
There was a problem hiding this comment.
Not sure about this or the alternatives. How about just from maybe? It's short and it indicates "start" + "inclusive"?
There was a problem hiding this comment.
I like from, but perhaps still qualify it as from_sort_value? I think that makes it easier to see from the request what it means.
|
|
||
| assertThat(allAfterStartTimeAscending(startTime1 - 1), is(allSnapshotInfo)); | ||
| assertThat(allAfterStartTimeAscending(startTime1), is(allSnapshotInfo)); | ||
| assertThat(allAfterStartTimeAscending(startTime2), is(List.of(snapshot2, snapshot3))); |
There was a problem hiding this comment.
Are we sure that snapshot1 and snapshot2 get different timestamps?
There was a problem hiding this comment.
Maybe not on Windows where the clock is less accurate actually ... let me make sure they will.
There was a problem hiding this comment.
Adjusted the code to ensure the timestamps never collide now.
There was a problem hiding this comment.
I am not sure I see that here in the rest test case, only in the internal cluster test? Maybe I missed it?
There was a problem hiding this comment.
Oh right missed this one, adding a fix here as well :)
| .getSnapshots(); | ||
| } | ||
|
|
||
| public void testSortAfterName() { |
There was a problem hiding this comment.
This test and the one sorting by timestamp are very similar, perhaps we can share most of the code?
There was a problem hiding this comment.
Merged all tests into one :)
| blockAllDataNodes(repoName); | ||
| final ActionFuture<CreateSnapshotResponse> snapshot2Future = startFullSnapshot(repoName, "snapshot-2"); | ||
| awaitNumberOfSnapshotsInProgress(1); | ||
| TimeUnit.MILLISECONDS.sleep(snapshot1.endTime() - snapshot1.startTime() + 1); |
There was a problem hiding this comment.
Agreed, this seems overly complex. I think we could even just do one test with all 3 validations in one go?
| if (after != null) { | ||
| validationException = addValidationError("can't use after and offset simultaneously", validationException); | ||
| } | ||
| if (afterValue != null) { |
There was a problem hiding this comment.
Is there a reason we are not allowing this? offset could just be evaluated after after_value?
Obviously we can do that in a follow-up rather than here.
There was a problem hiding this comment.
Huh yea actually we have to that for Kibana (otherwise any filtered results can't be paginated through which seems weird). Let's push it to a follow-up though as it will require some more code-heavy tests I think, though the production code change is easy/small :)
There was a problem hiding this comment.
Nevermind, this turned out to be entirely trivial with the way offset works at the moment. I just enabled it and added 2 simple tests for offset + after.
|
|
||
| NOTE: The parameters `size`, `order`, `after`, `offset`, `slm_policy_filter` and `sort` are not supported when using `verbose=false` and | ||
| the sort order for requests with `verbose=false` is undefined. | ||
| NOTE: The parameters `size`, `order`, `after`, `after_value`, `offset`, `slm_policy_filter` and `sort` are not supported when using |
There was a problem hiding this comment.
We should also note the incompatibility of after_value with offset and after?
There was a problem hiding this comment.
Well now that you pointed it out below, it would only be with "after" in the final thing which I'd code up very shortly after this one. Not sure it's worth the added text though, it obviously makes no sense setting both, the request validation will tell you so and we already point this out in the after section?
There was a problem hiding this comment.
I see you added the incompatibility note to offset, I missed that on my initial read. I think it still makes sense to add to either after or after_value that they are incompatible. I could see this being slightly surprising, since after_value is like a filter so it could be surprising that it does not work with pagination using after (and this could slip through some UI testing).
There was a problem hiding this comment.
Done, added a note to after :)
| final Predicate<SnapshotInfo> isAfter; | ||
| switch (sortBy) { | ||
| case START_TIME: | ||
| isAfter = filterByLongOffset(SnapshotInfo::startTime, Long.parseLong(after), snapshotName, repoName, order); |
There was a problem hiding this comment.
I think we should add explicit error handling when parsing the arguments, perhaps to the request validation or here. That would allow the message to indicate that it is the after_value causing the problem.
Also, I think < 0 is illegal so we could guard against that.
Similarly for duration, indices and failed shards.
There was a problem hiding this comment.
I just realized we have the safe problem for the after value as well (though due to the way we encode things there it's less likely). May I push this into a follow-up? :) Then I can just deal with both in one go and make it a little nicer without blowing this one up.
There was a problem hiding this comment.
That said, I refactored this a little to build the predicate early on now. This makes sure that we at least don't run any actual requests unless we have a valid predicate. Refactoring this nicer and working in a fix for the after param + easy-to-read validation I'd still like to push to a follow-up though if possible :)
There was a problem hiding this comment.
I think that for after we expect it to be handle opaquely by clients so if they mess with it, I am OK to have a worse error (though I will not object to improving it).
I am OK with pushing the error handling to a separate PR.
| assertThat(allBeforeStartTimeDescending(startTime2), is(List.of(snapshot2, snapshot1))); | ||
| assertThat(allBeforeStartTimeDescending(startTime1), is(List.of(snapshot1))); | ||
| assertThat(allBeforeStartTimeDescending(startTime1 - 1), empty()); | ||
| } |
There was a problem hiding this comment.
Can we add a combination test too, validating that after_value and for instance snapshot name filtering works together? Just one of these is enough, no need to do several combinations.
There was a problem hiding this comment.
Added a test for that and also for offset + after_value which turned out to be trivial.
|
Thanks so much for reading through this monster @henningandersen @tlrx . I think I was able to work in all your points now :) The BwC failure is unrelated and know and this should be good for another review I hope. |
henningandersen
left a comment
There was a problem hiding this comment.
Thanks Armin, this looks good, just a comment on the name and some minor comments.
| (Optional, string) | ||
| Sort order. Valid values are `asc` for ascending and `desc` for descending order. Defaults to `asc`, meaning ascending order. | ||
|
|
||
| `after_value`:: |
There was a problem hiding this comment.
I like from, but perhaps still qualify it as from_sort_value? I think that makes it easier to see from the request what it means.
| (Optional, string) | ||
| Offset identifier to start pagination from as returned by the `next` field in the response body. | ||
| Offset identifier to start pagination from as returned by the `next` field in the response body. Using this parameter is mutually exclusive | ||
| with using the `after_value` parameter. |
There was a problem hiding this comment.
I think I saw a comment where you mentioned you did this anyway since it was so easy?
There was a problem hiding this comment.
I didn't do it for after, just for numeric offsets. We don't want after and from_sort_value working in parallel do we?
There was a problem hiding this comment.
Sorry, I got confused by the "offset" part of the sentence, this is fine as is. About after and from_sort_value, let us see what kibana thinks, I think they can handle it.
|
|
||
| assertThat(allAfterStartTimeAscending(startTime1 - 1), is(allSnapshotInfo)); | ||
| assertThat(allAfterStartTimeAscending(startTime1), is(allSnapshotInfo)); | ||
| assertThat(allAfterStartTimeAscending(startTime2), is(List.of(snapshot2, snapshot3))); |
There was a problem hiding this comment.
I am not sure I see that here in the rest test case, only in the internal cluster test? Maybe I missed it?
| ) throws Exception { | ||
| while (true) { | ||
| final SnapshotInfo snapshotInfo = createFullSnapshot(repoName, snapshotName); | ||
| final long duration = snapshotInfo.endTime() - snapshotInfo.startTime(); |
There was a problem hiding this comment.
I am mildly concerned about the duration being either 0 or 100ms on some platform meaning we could loop infinitely here. I think we will be ok so let us leave it, though a comment to not go beyond the 3 snapshots we generate now seems in order.
There was a problem hiding this comment.
Good point, comment added
|
Thanks @henningandersen renamed + other comments addressed as well now :) |
| // TESTRESPONSE[s/"duration_in_millis": 1/"duration_in_millis": $body.snapshots.1.duration_in_millis/] | ||
|
|
||
|
|
||
| The following request returns information for all snapshots that come after `snapshot_2` when sorted by snapshot name in the default |
|
Thanks Henning & Tanguy!! |
Add
from_sort_valueparameter to allow for filtering snapshots by comparing to concrete sort columnvalues similar to the existing
afterparameter`.