Reject creating ILM policies with phase timings are not >= previous phase#70089
Conversation
…hase It can be confusing to configure policies with phase timings that get smaller, because phase timings are absolute. To make things a little clearer, this commit now rejects policies where a configured min_age is less than a previous phase's min_age. This validation is added only to the `PutLifecycleAction.Request` instead of the `TimeseriesLifecycleType` class because we cannot do this validation every time a lifecycle is created or else we will block cluster state from being recoverable for existing clusters that may have invalid policies. Resolves elastic#70032
|
Pinging @elastic/es-core-features (Team:Core/Features) |
|
@elasticmachine update branch |
andreidan
left a comment
There was a problem hiding this comment.
Thanks for working on this Lee. I've left a few rather minor comments
...plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/TimeseriesLifecycleTypeTests.java
Outdated
Show resolved
Hide resolved
...plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/TimeseriesLifecycleTypeTests.java
Outdated
Show resolved
Hide resolved
...plugin/core/src/test/java/org/elasticsearch/xpack/core/ilm/TimeseriesLifecycleTypeTests.java
Show resolved
Hide resolved
| TimeValue phaseMinAge = phase.getMinimumAge(); | ||
| Set<String> followingPhases = new HashSet<>(ORDERED_VALID_PHASES.subList(i + 1, ORDERED_VALID_PHASES.size())); | ||
| Set<Phase> phasesWithBadAges = phases.stream() | ||
| .filter(p -> followingPhases.contains(p.getName())) |
There was a problem hiding this comment.
I believe that would always be true, but maybe I'm misunderstanding the intent here?
There was a problem hiding this comment.
So here we are looping through the phases, if we're on the "warm" phase then the followingPhases will be cold, frozen, delete. Then, this takes the phases configured in the policy (let's say it has hot, warm, frozen, and delete) and filters them to return a stream with "frozen" and "delete" in it because there is no "cold" phase.
Hopefully that makes sense?
There was a problem hiding this comment.
That makes sense, but I thought the sublist above doest that already?
ORDERED_VALID_PHASES.subList(i + 1, ORDERED_VALID_PHASES.size())
Which I'm just seeing is placed in a HashSet so the order might be lost? I think we should keep an ArrayList above?
List<String> followingPhases = new ArrayList<>(ORDERED_VALID_PHASES.subList(i + 1, ORDERED_VALID_PHASES.size()));
There was a problem hiding this comment.
Ah, I finally got there. I mixed-up the collections.
So I think the order can be indeed lost because we don't expect the validateMonotonicallyIncreasingPhaseTimings to be called with the list of ordered phases (and it is not https://github.com/elastic/elasticsearch/pull/70089/files#diff-dbca592978c7f9e3f747487f76b2e7327bdcac2fa635a23949aa35fba7411bfbR65) but we then iterate over the phases collection we received as if it is ordered - https://github.com/elastic/elasticsearch/pull/70089/files#diff-5964cde83b3b053cf693079cb18a73f4a5743c59590f09fda7dd268613098ce9R390
This is where the mix was in my mind. I think if we keep the order in the followingPhases collection (by not converting it to a Set) and iterate over it we can overcome this:
eg.
followingPhases.stream().filter(p -> p.getMinimumAge() != null && p.getMinimumAge().equals(TimeValue.ZERO) == false).filter(p -> p.getMinimumAge().compareTo(phaseMinAge) < 0)
There was a problem hiding this comment.
So I think the order can be indeed lost because we don't expect the validateMonotonicallyIncreasingPhaseTimings to be called with the list of ordered phases (and it is not #70089 (files)) but we then iterate over the phases collection we received as if it is ordered
We don't iterate over the phases collection received though, instead we iterate over ORDERED_VALID_PHASES, picking out the phases from the provided collection one-by-one.
followingPhases doesn't need to be ordered because it is a subset of the phases occurring after the maybePhase, so if phase were warm then followingPhases would be cold, frozen, delete, and the order for those doesn't matter because they are all compared independently to the timings for the "warm" phase.
Hopefully that was clear? I'm probably not explaining it in the clearest way.
There was a problem hiding this comment.
Thanks for the clarification. I was thrown by the phases.stream() line which is iterating over the possibly unordered parameter list, however the .filter(p -> followingPhases.contains(p.getName())) filtering makes sure we do only process the following phases, and you're right the order at that stage does not matter anymore.
Thanks for bearing with me!
| Phase warmPhase = new Phase(WARM_PHASE, TimeValue.timeValueDays(2), Collections.emptyMap()); | ||
| Phase coldPhase = new Phase(COLD_PHASE, null, Collections.emptyMap()); | ||
| Phase frozenPhase = new Phase(FROZEN_PHASE, TimeValue.timeValueDays(1), Collections.emptyMap()); | ||
| Phase deletePhase = new Phase(DELETE_PHASE, TimeValue.timeValueHours(36), Collections.emptyMap()); |
There was a problem hiding this comment.
I suspect we might get more errors than we want (or maybe we do want them?) in the following case:
hot 1d, warm 3d, cold 0, frozen 2d, delete 1d
Can we add a test for this please? (I think we either want to support to multiple errors in this case, or otherwise we need to add a break after this line https://github.com/elastic/elasticsearch/pull/70089/files#diff-5964cde83b3b053cf693079cb18a73f4a5743c59590f09fda7dd268613098ce9R400
There was a problem hiding this comment.
Yep, I've added a test for this and some logic to avoid it in 77dea57
|
@elasticmachine update branch |
andreidan
left a comment
There was a problem hiding this comment.
Thanks for iterating on this and for adding this validation Lee.
LGTM
…hase (elastic#70089) It can be confusing to configure policies with phase timings that get smaller, because phase timings are absolute. To make things a little clearer, this commit now rejects policies where a configured min_age is less than a previous phase's min_age. This validation is added only to the PutLifecycleAction.Request instead of the TimeseriesLifecycleType class because we cannot do this validation every time a lifecycle is created or else we will block cluster state from being recoverable for existing clusters that may have invalid policies. Resolves elastic#70032
…ious phase (#70089) (#70110) It can be confusing to configure policies with phase timings that get smaller, because phase timings are absolute. To make things a little clearer, this commit now rejects policies where a configured min_age is less than a previous phase's min_age. This validation is added only to the PutLifecycleAction.Request instead of the TimeseriesLifecycleType class because we cannot do this validation every time a lifecycle is created or else we will block cluster state from being recoverable for existing clusters that may have invalid policies. Resolves #70032
It can be confusing to configure policies with phase timings that get smaller, because phase timings
are absolute. To make things a little clearer, this commit now rejects policies where a configured
min_age is less than a previous phase's min_age.
This validation is added only to the
PutLifecycleAction.Requestinstead of theTimeseriesLifecycleTypeclass because we cannot do this validation every time a lifecycle iscreated or else we will block cluster state from being recoverable for existing clusters that may
have invalid policies.
Resolves #70032