Skip to content

Add Filtering by SLM Policy to Get Snapshots API#77321

Merged
original-brownbear merged 10 commits intoelastic:masterfrom
original-brownbear:match-by-slm-policy-id
Sep 7, 2021
Merged

Add Filtering by SLM Policy to Get Snapshots API#77321
original-brownbear merged 10 commits intoelastic:masterfrom
original-brownbear:match-by-slm-policy-id

Conversation

@original-brownbear
Copy link
Copy Markdown
Contributor

It's in the title, add new slm_policy field as a filter to
the get snapshots API.

It's in the title, add new `slm_policy` field as a filter to
the get snapshots API.
@original-brownbear original-brownbear added >enhancement :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.16.0 labels Sep 6, 2021
@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team. label Sep 6, 2021
@elasticmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@arteam arteam self-requested a review September 7, 2021 07:06
Copy link
Copy Markdown
Contributor

@arteam arteam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I've left a couple of small comments

Numeric offset to start pagination from based on the snapshots matching this request. Using a non-zero value for this parameter is mutually
exclusive with using the `after` parameter. Defaults to `0`.

`slm_policy`::
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we can have multiple polices here, does it make sense to call the parameter slm_policies or slm_policy_patterns?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think there is something we could do about naming here, perhaps pre of suffix it with filter likeslm_policy_filter? It would be nice to be able to clearly identify the filtering options (in case we add more of them).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I went with slm_policy_filter now :)

final String[] excludes = excludePatterns.toArray(Strings.EMPTY_ARRAY);
filteredSnapshotInfos = snapshotInfos.stream().filter(snapshotInfo -> {
final Map<String, Object> metadata = snapshotInfo.userMetadata();
final String policy;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if were possible to extract getting policy from the metadata to own method in order to make policy final.

String getPolicy(Metadata metadata) {
  if (metadata != null) {
    final Object policyFound = metadata.get(SnapshotsService.POLICY_ID_METADATA_FIELD);
    if (policyFound instanceof String) {
       return (String) policyFound;
    }
  } 
  return "";
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll extract this logic in a follow-up then we can dry it up with SLM code in one go :)

Copy link
Copy Markdown
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a few comments, primarily tests and structure.

Numeric offset to start pagination from based on the snapshots matching this request. Using a non-zero value for this parameter is mutually
exclusive with using the `after` parameter. Defaults to `0`.

`slm_policy`::
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think there is something we could do about naming here, perhaps pre of suffix it with filter likeslm_policy_filter? It would be nice to be able to clearly identify the filtering options (in case we add more of them).

Numeric offset to start pagination from based on the snapshots matching this request. Using a non-zero value for this parameter is mutually
exclusive with using the `after` parameter. Defaults to `0`.

`slm_policy`::
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using this option is only supported in verbose mode? We should note that here and validate it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, added it to the note and added a UT for the validation

GetSnapshotsRequest.NO_LIMIT,
order
order,
slmPolicies
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think slmPolicies is not really supporetd here, since it is not verbose?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ it's always an empty array here, resolved by splitting this method up now

final SortOrder order,
final String[] slmPolicies
) {
final List<SnapshotInfo> filteredSnapshotInfos;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we make the filtering here a separate method instead? It looks like there is at least one usage where this is not needed and it is a bit odd to include into the sort method which is also already long enough as it stands. Perhaps add a sortAndFilter that calls both too.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, made it into 2 methods and kept the old sort method for those spots that have no filtering :)

.setPolicies("*", "-" + policyName)
.get()
.getSnapshots();
assertThat(snapshotsNoPolicy, is(snapshotsWithoutPolicy));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add tests of:

  1. A slm-policy that is not there.
  2. Multiple slm-policies (where some may not exist).
  3. An actual inclusion pattern, i.e., policyName + "*".
  4. An actual exclusion pattern, i.e., `, "-" + policyName + ""

Could be randomly picking from some of these if that is easiest.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ Added all 4 cases explicitly to rest and internal cluster tests :)

@original-brownbear
Copy link
Copy Markdown
Contributor Author

Thanks @henningandersen and @arteam all comments addressed now I think :)

Copy link
Copy Markdown
Contributor

@arteam arteam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Copy Markdown
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Armin. On my second round I realized a couple more things to consider.


private static List<SnapshotInfo> getAllSnapshotsForPolicies(String... policies) throws IOException {
final Request requestWithPolicy = new Request(HttpGet.METHOD_NAME, "/_snapshot/*/*");
requestWithPolicy.addParameter("slm_policy_filter", Strings.arrayToCommaDelimitedString(policies));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we need an explicit sort here or to compare the results by converrting to sets? Since we spawn the creation of snapshots in createNSnapshots, I think the start times of two snapshots could come out of order? We also risk same timestamp but are then rescued by tiebreaking on name, which works because there are max 5 snapshots (i.e., less than 10), but it also seems a bit strange to rely on at least without a comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, we could actually theoretically fail if everything executes in the same millisecond (... looking at some/windows systems that seems possible). Forced sort-by-name everywhere now in both tests. Thanks!

}

private static List<SnapshotInfo> getAllSnapshotsForPolicies(String... policies) {
return clusterAdmin().prepareGetSnapshots("*").setSnapshots("*").setPolicies(policies).get().getSnapshots();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like for the rest test, some stable sort seems necessary.

final Map<String, Object> metadata = snapshotInfo.userMetadata();
final String policy;
if (metadata == null) {
policy = "";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a choice here. Since there could be no slm-policy (in contrast to repo and name), we could choose between a few options:

  1. * also matches no policy (like here). This allows *,-a* to give all snapshots with no policy starting with a (including all those with no policy), but no easy way to find all those with no policy.
  2. * only matches those that have a policy. This allows * to find all those with a policy and we could define -* to find all those without a policy.

So 2 seems more flexible but deviates from how we handle repo and name. But then again, we are defining this one from scratch so we have the freedom. Let me know your thoughts.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I think I like 2 better as well. I wonder though, the -* trick we'll have to document anyway. Maybe we should use a special parameter _none (like we have for the ingest pipeline name to turn off a pipeline) for this. That way we could even do queries that involve a set of policies + all snapshots without any policies (though we could do this with this pattern as well, it seem incredibly confusing to have this as the only exclude pattern that is technically a positive match pattern for null)?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ to using _none as a special null value.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to the null value pattern, too

@original-brownbear
Copy link
Copy Markdown
Contributor Author

@henningandersen explicit _none pattern and some tests for it have been added :)

Copy link
Copy Markdown
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the extra iterations on this.

*/
public class TransportGetSnapshotsAction extends TransportMasterNodeAction<GetSnapshotsRequest, GetSnapshotsResponse> {

public static final String NO_POLICY_PATTERN = "_none";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this belongs on the request more than the transport action. It is part of the interface of the action.

}
if (policies.length != 0) {
validationException = addValidationError("can't use slm policy filter with verbose=false", validationException);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a generic validation to not use -_none and perhaps even not allow _ or -_ prefixed policies?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-_none "theoretically" you could have a policy by that name I think :D

@@ -220,6 +226,19 @@ public void testFilterBySLMPolicy() throws Exception {
);
assertThat(getAllSnapshotsForPolicies(policyName, otherPolicyName), is(List.of(withOtherPolicy, withPolicy)));
assertThat(getAllSnapshotsForPolicies(policyName, otherPolicyName, "no-such-policy*"), is(List.of(withOtherPolicy, withPolicy)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just * case is missing?

assertThat(getAllSnapshotsForPolicies("*"), is(List.of(withOtherPolicy, withPolicy)));

@@ -244,6 +249,17 @@ public void testFilterBySLMPolicy() throws Exception {
);
assertThat(getAllSnapshotsForPolicies(policyName, otherPolicyName), is(List.of(withOtherPolicy, withPolicy)));
assertThat(getAllSnapshotsForPolicies(policyName, otherPolicyName, "no-such-policy*"), is(List.of(withOtherPolicy, withPolicy)));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also add just * case here:

assertThat(getAllSnapshotsForPolicies("*"), is(List.of(withOtherPolicy, withPolicy)));

@original-brownbear
Copy link
Copy Markdown
Contributor Author

Thanks Henning & Artem!

@original-brownbear original-brownbear merged commit 7508720 into elastic:master Sep 7, 2021
@original-brownbear original-brownbear deleted the match-by-slm-policy-id branch September 7, 2021 16:24
original-brownbear added a commit that referenced this pull request Oct 17, 2021
It's in the title, add new `slm_policy_filter` param as a filter to the get snapshots API.
@original-brownbear original-brownbear restored the match-by-slm-policy-id branch April 18, 2023 20:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed Meta label for distributed team. v7.16.0 v8.0.0-alpha2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants