[CCR] Changed AutoFollowCoordinator to keep track of certain statistics#33684
[CCR] Changed AutoFollowCoordinator to keep track of certain statistics#33684martijnvg merged 10 commits intoelastic:masterfrom
Conversation
The following stats are being kept track of: 1) The total number of times that auto following a leader index succeed. 2) The total number of times that auto following a leader index failed. 3) The total number of times that fetching a remote cluster state failed. 4) The most recent 256 auto follow failures per auto leader index (e.g. create_and_follow api call fails) or cluster alias (e.g. fetching remote cluster state fails). Each auto follow run now produces a result that is being used to update the stats being kept track of in AutoFollowCoordinator. Relates to elastic#33007
|
Pinging @elastic/es-distributed |
|
\cc @jasontedor |
dnhatn
left a comment
There was a problem hiding this comment.
@martijnvg This looks good. I left some comments.
x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/AutoFollowCoordinator.java
Show resolved
Hide resolved
| LOGGER.warn("failure occurred during auto-follower coordination", e); | ||
| Consumer<List<AutoFollowResult>> handler = results -> { | ||
| for (AutoFollowResult result : results) { | ||
| if (result.clusterStateFetchException != null) { |
There was a problem hiding this comment.
This flow is quite similar to updateStats. Should we combine these to a single method?
|
|
||
| private final CountDown autoFollowPatternsCountDown; | ||
| private final AtomicReference<Exception> autoFollowPatternsErrorHolder = new AtomicReference<>(); | ||
| private final AtomicArray<AutoFollowResult> clusterAliasResults; |
There was a problem hiding this comment.
maybe just call it autoFollowResults?
| for (Index indexToFollow : leaderIndicesToFollow) { | ||
| final AtomicArray<Tuple<Index, Exception>> results = new AtomicArray<>(leaderIndicesToFollow.size()); | ||
| for (int i = 0; i < leaderIndicesToFollow.size(); i++) { | ||
| final int slot = i; |
There was a problem hiding this comment.
I am wondering if we can use another data-structure (instead of AtomicArray) to avoid passing the slot-index around. It may be less error-prone because we now have two slot-indexes (clusterAliasSlot and slot) in handleClusterAlias method.
There was a problem hiding this comment.
maybe it would be better if we just do not pass the clusterAliasSlot around. Let me see if this would be cleaner.
| } | ||
| if (leaderIndicesCountDown.countDown()) { | ||
| finalise(leaderIndicesErrorHolder.get()); | ||
| finalise(clusterAliasSlot, new AutoFollowResult(clusterAlias, results.asList())); |
There was a problem hiding this comment.
Can we assert that every slot is assigned?
| clusterAliasResults.set(slot, result); | ||
| if (autoFollowPatternsCountDown.countDown()) { | ||
| handler.accept(autoFollowPatternsErrorHolder.get()); | ||
| handler.accept(clusterAliasResults.asList()); |
There was a problem hiding this comment.
Can we assert that every slot is assigned?
|
|
||
| AutoFollowResult(String clusterAlias) { | ||
| this.clusterAlias = clusterAlias; | ||
| this.clusterStateFetchException = null; |
There was a problem hiding this comment.
Maybe just delegate to this(clusterAlias, null)?
| public class AutoFollowStats implements Writeable, ToXContentObject { | ||
|
|
||
| private static final ParseField NUMBER_OF_SUCCESSFUL_INDICES_AUTO_FOLLOWED = | ||
| new ParseField("number_of_successful_indices_auto_followed"); |
There was a problem hiding this comment.
I am not sure if indices_auto_followed is a right term. @jasontedor WDYT?
There was a problem hiding this comment.
maybe just number_of_successful_followed_indices?
There was a problem hiding this comment.
Yes, I like this name. And I think we don't need ed here.
…method, this avoids handleClusterAlias() having to know about clusterAliasSlot parameter.
| getLeaderIndicesToFollow(autoFollowPattern, leaderClusterState, followerClusterState, followedIndices); | ||
| if (leaderIndicesToFollow.isEmpty()) { | ||
| finalise(slot, new AutoFollowResult(clusterAlias)); | ||
| }else { |
…cs (#33684) The following stats are being kept track of: 1) The total number of times that auto following a leader index succeed. 2) The total number of times that auto following a leader index failed. 3) The total number of times that fetching a remote cluster state failed. 4) The most recent 256 auto follow failures per auto leader index (e.g. create_and_follow api call fails) or cluster alias (e.g. fetching remote cluster state fails). Each auto follow run now produces a result that is being used to update the stats being kept track of in AutoFollowCoordinator. Relates to #33007
The following stats are being kept track of:
(e.g. create_and_follow api call fails) or cluster alias
(e.g. fetching remote cluster state fails).
Each auto follow run now produces a result that is being used to update
the stats being kept track of in AutoFollowCoordinator.
The transport and rest actions are added in a follow up PR.
Relates to #33007