Cache completion stats between refreshes#52872
Merged
DaveCTurner merged 1 commit intoelastic:7.xfrom Feb 27, 2020
Merged
Conversation
Computing the stats for completion fields may involve a significant amount of work since it walks every field of every segment looking for completion fields. Innocuous-looking APIs like `GET _stats` or `GET _cluster/stats` do this for every shard in the cluster. This repeated work is unnecessary since these stats do not change between refreshes; in many indices they remain constant for a long time. This commit introduces a cache for these stats which is invalidated on a refresh, allowing most stats calls to bypass the work needed to compute them on most shards. Closes elastic#51915 Backport of elastic#51991
Collaborator
|
Pinging @elastic/es-distributed (:Distributed/Engine) |
Member
Author
|
Slightly nontrivial backport since the I went for the naive solution and implemented the same thing using a mutex to keep things as close as possible to $ diff -U3 elasticsearch-{master,7.x}/server/src/main/java/org/elasticsearch/index/engine/CompletionStatsCache.java
--- elasticsearch-master/server/src/main/java/org/elasticsearch/index/engine/CompletionStatsCache.java 2020-02-27 08:47:44.000000000 +0000
+++ elasticsearch-7.x/server/src/main/java/org/elasticsearch/index/engine/CompletionStatsCache.java 2020-02-27 07:55:08.000000000 +0000
@@ -29,10 +29,10 @@
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.support.PlainActionFuture;
import org.elasticsearch.common.FieldMemoryStats;
+import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.regex.Regex;
import org.elasticsearch.search.suggest.completion.CompletionStats;
-import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;
class CompletionStatsCache implements ReferenceManager.RefreshListener {
@@ -45,7 +45,13 @@
* Futures are eventually completed with stats that include all fields, requiring further filtering (see
* {@link CompletionStatsCache#filterCompletionStatsByFieldName}).
*/
- private final AtomicReference<PlainActionFuture<CompletionStats>> completionStatsFutureRef = new AtomicReference<>();
+ @Nullable
+ private PlainActionFuture<CompletionStats> completionStatsFuture;
+
+ /**
+ * Protects accesses to {@code completionStatsFuture} since we can't use {@link java.util.concurrent.atomic.AtomicReference} in JDK8.
+ */
+ private final Object completionStatsFutureMutex = new Object();
CompletionStatsCache(Supplier<Engine.Searcher> searcherSupplier) {
this.searcherSupplier = searcherSupplier;
@@ -53,7 +59,18 @@
CompletionStats get(String... fieldNamePatterns) {
final PlainActionFuture<CompletionStats> newFuture = new PlainActionFuture<>();
- final PlainActionFuture<CompletionStats> oldFuture = completionStatsFutureRef.compareAndExchange(null, newFuture);
+
+ // final PlainActionFuture<CompletionStats> oldFuture = completionStatsFutureRef.compareAndExchange(null, newFuture);
+ // except JDK8 doesn't have compareAndExchange so we emulate it:
+ final PlainActionFuture<CompletionStats> oldFuture;
+ synchronized (completionStatsFutureMutex) {
+ if (completionStatsFuture == null) {
+ completionStatsFuture = newFuture;
+ oldFuture = null;
+ } else {
+ oldFuture = completionStatsFuture;
+ }
+ }
if (oldFuture != null) {
// we lost the race, someone else is already computing stats, so we wait for that to finish
@@ -91,7 +108,13 @@
} finally {
if (success == false) {
// invalidate the cache (if not already invalidated) so that future calls will retry
- completionStatsFutureRef.compareAndSet(newFuture, null);
+
+ // completionStatsFutureRef.compareAndSet(newFuture, null); except we're not using AtomicReference in JDK8
+ synchronized (completionStatsFutureMutex) {
+ if (completionStatsFuture == newFuture) {
+ completionStatsFuture = null;
+ }
+ }
}
}
@@ -121,7 +144,10 @@
@Override
public void afterRefresh(boolean didRefresh) {
if (didRefresh) {
- completionStatsFutureRef.set(null);
+ // completionStatsFutureRef.set(null); except we're not using AtomicReference in JDK8
+ synchronized (completionStatsFutureMutex) {
+ completionStatsFuture = null;
+ }
}
}
} |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Computing the stats for completion fields may involve a significant amount of
work since it walks every field of every segment looking for completion fields.
Innocuous-looking APIs like
GET _statsorGET _cluster/statsdo this forevery shard in the cluster. This repeated work is unnecessary since these stats
do not change between refreshes; in many indices they remain constant for a
long time.
This commit introduces a cache for these stats which is invalidated on a
refresh, allowing most stats calls to bypass the work needed to compute them on
most shards.
Closes #51915
Backport of #51991