[CCR] Add time since last auto follow fetch to auto follow stats by martijnvg · Pull Request #36542 · elastic/elasticsearch

martijnvg · 2018-12-12T13:16:54Z

For each remote cluster, the auto follow coordinator starts an auto
follower that checks the remote cluster state and determines whether an
index needs to be auto followed. The time since the last auto follow is
reported per remote cluster and gives insight into whether the auto
follow process is alive.

Relates to #33007
Originates from #35895
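
As a rough illustration of the idea in the description, the sketch below records a timestamp each time a per-cluster auto follower runs and reports the elapsed time per remote cluster. The class `AutoFollowLivenessSketch` and its method names are hypothetical and are not taken from the PR; the clock is injected as a `LongSupplier` and is assumed to return relative milliseconds.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch, not the actual AutoFollowCoordinator.
class AutoFollowLivenessSketch {
    // Injected relative clock; assumed to return milliseconds.
    private final LongSupplier relativeTimeInMillis;
    // Last time (per remote cluster) an auto follow check was started.
    private final Map<String, Long> lastCheckMillis = new ConcurrentHashMap<>();

    AutoFollowLivenessSketch(LongSupplier relativeTimeInMillis) {
        this.relativeTimeInMillis = relativeTimeInMillis;
    }

    // Called whenever the auto follower for a remote cluster starts a check.
    void onAutoFollowCheck(String remoteCluster) {
        lastCheckMillis.put(remoteCluster, relativeTimeInMillis.getAsLong());
    }

    // Elapsed milliseconds since the last check, keyed and sorted by cluster name.
    Map<String, Long> timeSinceLastCheckMillis() {
        long now = relativeTimeInMillis.getAsLong();
        Map<String, Long> result = new TreeMap<>();
        lastCheckMillis.forEach((cluster, last) -> result.put(cluster, now - last));
        return result;
    }
}
```

A value that keeps growing for one remote cluster would suggest that its auto follower has stopped checking, which is the liveness signal the description refers to.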


elasticmachine · 2018-12-12T13:16:56Z

Pinging @elastic/es-distributed

martijnvg · 2018-12-12T13:20:24Z

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ccr/AutoFollowStats.java

    private final long numberOfFailedRemoteClusterStateRequests;
    private final long numberOfSuccessfulFollowIndices;
    private final NavigableMap<String, ElasticsearchException> recentAutoFollowErrors;
+    private final NavigableMap<String, Long> trackingRemoteClusters;


I wonder whether all stats should be reported per remote cluster, so that numberOfSuccessfulFollowIndices, recentAutoFollowErrors, etc. would be reported on a per remote cluster basis. This makes more sense to me now, considering how the auto follow coordinator currently operates.

dnhatn

Thanks @martijnvg. This looks good. I left some comments.

dnhatn · 2018-12-12T16:57:16Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/Ccr.java

        return Arrays.asList(
            ccrLicenseChecker,
-            new AutoFollowCoordinator(client, clusterService, ccrLicenseChecker)
+            new AutoFollowCoordinator(client, clusterService, ccrLicenseChecker, System::nanoTime)


Maybe use ThreadPool::relativeTimeInMillis.

dnhatn · 2018-12-12T17:03:12Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/AutoFollowCoordinator.java

        private final Supplier<ClusterState> followerClusterStateSupplier;
+        private final LongSupplier relativeTimeProvider;

+        private volatile long lastAutoFollowTime = -1;


Can we add the time unit to the variable names (relativeTimeProvider and lastAutoFollowTime)?

dnhatn · 2018-12-12T17:04:48Z

x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ccr/AutoFollowStats.java

+        long numberOfFailedRemoteClusterStateRequests,
+        long numberOfSuccessfulFollowIndices,
+        NavigableMap<String, ElasticsearchException> recentAutoFollowErrors,
+        NavigableMap<String, Long> trackingRemoteClusters


Not sure about this parameter name (trackingRemoteClusters).

Agreed, the current name is not good. I'm thinking of auto_followed_cluster, and of also including the last seen metadata version together with time_since_last_auto_follow_millis.
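
To make the naming discussion concrete, here is a minimal sketch of what one per-cluster stats entry could look like with the names floated at this point in the review. The class `AutoFollowedClusterSketch` is hypothetical, the JSON keys are assumptions based on the names mentioned in this thread (`cluster_name` comes from the monitoring mapping diff), and the final field names in the merged change may differ.

```java
// Hypothetical value class; not the actual AutoFollowStats code.
final class AutoFollowedClusterSketch {
    private final String clusterName;
    private final long timeSinceLastAutoFollowMillis;
    private final long lastSeenMetadataVersion;

    AutoFollowedClusterSketch(String clusterName,
                              long timeSinceLastAutoFollowMillis,
                              long lastSeenMetadataVersion) {
        this.clusterName = clusterName;
        this.timeSinceLastAutoFollowMillis = timeSinceLastAutoFollowMillis;
        this.lastSeenMetadataVersion = lastSeenMetadataVersion;
    }

    // Renders one per-cluster entry. No string escaping is done, so this
    // assumes cluster names without quotes or backslashes.
    String toJson() {
        return "{\"cluster_name\":\"" + clusterName + "\","
            + "\"time_since_last_auto_follow_millis\":" + timeSinceLastAutoFollowMillis + ","
            + "\"last_seen_metadata_version\":" + lastSeenMetadataVersion + "}";
    }
}
```

Keeping the unit in the key name (`_millis`) matches the reviewer's request elsewhere in this thread to put time units in names.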


martijnvg · 2018-12-13T16:41:05Z

@dnhatn Thanks for taking a look. I've updated the PR.

dnhatn

I left some comments around the time unit. Feel free to merge it after addressing these.

dnhatn · 2018-12-13T22:54:41Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/AutoFollowCoordinator.java

            ClusterService clusterService,
-            CcrLicenseChecker ccrLicenseChecker) {
+            CcrLicenseChecker ccrLicenseChecker,
+            LongSupplier relativeNanoTimeProvider) {


nit: this became milliseconds.

Yikes, I completely missed that. Good catch!

dnhatn · 2018-12-13T22:55:31Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/AutoFollowCoordinator.java

+            long lastSeenMetadataVersion = entry.getValue().metadataVersion;
+            if (lastAutoFollowTimeInNanos != -1) {
+                long timeSinceLastAutoFollowInMillis =
+                    TimeUnit.NANOSECONDS.toMillis(relativeNanoTimeProvider.getAsLong() - lastAutoFollowTimeInNanos);


Since the time unit is ms, we should remove this conversion.
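
The effect of the leftover conversion can be shown with a small sketch. The class `TimeUnitMismatchSketch` and its method names are hypothetical; the only assumption is the one stated in the review, namely that the time provider now returns milliseconds.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the unit mismatch flagged in the review comment.
class TimeUnitMismatchSketch {
    // With a millisecond provider, plain subtraction already yields milliseconds.
    static long elapsedMillis(long nowMillis, long lastMillis) {
        return nowMillis - lastMillis;
    }

    // The stale conversion treats a millisecond difference as nanoseconds,
    // so the result is too small by a factor of one million (truncated).
    static long elapsedWithStaleConversion(long nowMillis, long lastMillis) {
        return TimeUnit.NANOSECONDS.toMillis(nowMillis - lastMillis);
    }
}
```

For a real gap of 3000 ms, the first method returns 3000 while the second truncates to 0, which would make a stalled auto follower look as if it had just run.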

dnhatn · 2018-12-13T22:57:43Z

x-pack/plugin/core/src/main/resources/monitoring-es.json

+                "cluster_name": {
+                  "type": "keyword"
+                },
+                "time_since_last_auto_follow_started_millis": {


maybe drop started in the field name?

dnhatn · 2018-12-13T23:04:26Z

x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/AutoFollowCoordinator.java

        private final Supplier<ClusterState> followerClusterStateSupplier;
+        private final LongSupplier relativeTimeProvider;

+        private volatile long lastAutoFollowTimeInNanos = -1;


This also became ms now.

dnhatn

LGTM

martijnvg · 2018-12-14T10:45:02Z

run the gradle build tests 1


bleskes · 2018-12-17T08:11:52Z

I wonder if we should call this time_since_last_check. Fetch sounds implementation specific IMO.

martijnvg · 2018-12-17T09:25:51Z

> I wonder if we should call this time_since_last_check. Fetch sounds implementation specific IMO.

👍 I will rename it to time_since_last_check



codebrain · 2019-05-08T02:10:21Z

Would it be possible to update the documentation? https://www.elastic.co/guide/en/elasticsearch/reference/6.7/ccr-get-stats.html

martijnvg added the >enhancement, v7.0.0, :Distributed/CCR, and v6.6.0 labels Dec 12, 2018

martijnvg requested a review from jasontedor December 12, 2018 13:16

dnhatn reviewed Dec 12, 2018

martijnvg mentioned this pull request Dec 13, 2018

[CCR] Auto follow patterns #33007

Closed

10 tasks

martijnvg added 7 commits December 13, 2018 10:52

iter (7b2bd3c)

added lastSeenMetadataVersion and added serialization version checks (6dcd88d)

renamed tracking_remote_clusters to auto_followed_clusters and fixed monitor mapping tests (8f08cb5)

fixed docs (ff0724f)

Merge remote-tracking branch 'es/master' into ccr_add_time_since_last_auto_follow_fetch (b9aa581)

fixed hlrc (29d4645)

fixed serialization bug (e99790f)

dnhatn approved these changes Dec 13, 2018

iter (c0c0e83)

dnhatn approved these changes Dec 14, 2018

Merge remote-tracking branch 'es/master' into ccr_add_time_since_last_auto_follow_fetch (ae5fa2f)

rename (c196e01)

martijnvg merged commit a181a25 into elastic:master Dec 17, 2018

colings86 added the v7.0.0-beta1 label Feb 7, 2019

colings86 removed the v7.0.0 label Feb 7, 2019

codebrain mentioned this pull request May 8, 2019

Add time since last auto follow fetch to auto follow stats elastic/elasticsearch-net#3727

Merged

6 participants