Skip to content

Cache completion stats between refreshes #52872

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

DaveCTurner
Copy link
Contributor

Computing the stats for completion fields may involve a significant amount of
work since it walks every field of every segment looking for completion fields.
Innocuous-looking APIs like GET _stats or GET _cluster/stats do this for
every shard in the cluster. This repeated work is unnecessary since these stats
do not change between refreshes; in many indices they remain constant for a
long time.

This commit introduces a cache for these stats which is invalidated on a
refresh, allowing most stats calls to bypass the work needed to compute them on
most shards.

Closes #51915
Backport of #51991

Computing the stats for completion fields may involve a significant amount of
work since it walks every field of every segment looking for completion fields.
Innocuous-looking APIs like `GET _stats` or `GET _cluster/stats` do this for
every shard in the cluster. This repeated work is unnecessary since these stats
do not change between refreshes; in many indices they remain constant for a
long time.

This commit introduces a cache for these stats which is invalidated on a
refresh, allowing most stats calls to bypass the work needed to compute them on
most shards.

Closes elastic#51915
Backport of elastic#51991
@DaveCTurner DaveCTurner added >enhancement :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. backport v7.7.0 labels Feb 27, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Engine)

@DaveCTurner
Copy link
Contributor Author

Slightly nontrivial backport since the master version uses AtomicReference#compareAndExchange but that's not available in JDK8 ☹️

I went for the naive solution and implemented the same thing using a mutex to keep things as close as possible to master; I don't think that lock-freedom is a big deal here. The diff to the master version is worth a quick check to make sure I haven't done something daft:

$ diff -U3 elasticsearch-{master,7.x}/server/src/main/java/org/elasticsearch/index/engine/CompletionStatsCache.java
--- elasticsearch-master/server/src/main/java/org/elasticsearch/index/engine/CompletionStatsCache.java	2020-02-27 08:47:44.000000000 +0000
+++ elasticsearch-7.x/server/src/main/java/org/elasticsearch/index/engine/CompletionStatsCache.java	2020-02-27 07:55:08.000000000 +0000
@@ -29,10 +29,10 @@
 import org.elasticsearch.action.ActionListener;
 import org.elasticsearch.action.support.PlainActionFuture;
 import org.elasticsearch.common.FieldMemoryStats;
+import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.regex.Regex;
 import org.elasticsearch.search.suggest.completion.CompletionStats;

-import java.util.concurrent.atomic.AtomicReference;
 import java.util.function.Supplier;

 class CompletionStatsCache implements ReferenceManager.RefreshListener {
@@ -45,7 +45,13 @@
      * Futures are eventually completed with stats that include all fields, requiring further filtering (see
      * {@link CompletionStatsCache#filterCompletionStatsByFieldName}).
      */
-    private final AtomicReference<PlainActionFuture<CompletionStats>> completionStatsFutureRef = new AtomicReference<>();
+    @Nullable
+    private PlainActionFuture<CompletionStats> completionStatsFuture;
+
+    /**
+     * Protects accesses to {@code completionStatsFuture} since we can't use {@link java.util.concurrent.atomic.AtomicReference} in JDK8.
+     */
+    private final Object completionStatsFutureMutex = new Object();

     CompletionStatsCache(Supplier<Engine.Searcher> searcherSupplier) {
         this.searcherSupplier = searcherSupplier;
@@ -53,7 +59,18 @@

     CompletionStats get(String... fieldNamePatterns) {
         final PlainActionFuture<CompletionStats> newFuture = new PlainActionFuture<>();
-        final PlainActionFuture<CompletionStats> oldFuture = completionStatsFutureRef.compareAndExchange(null, newFuture);
+
+        // final PlainActionFuture<CompletionStats> oldFuture = completionStatsFutureRef.compareAndExchange(null, newFuture);
+        // except JDK8 doesn't have compareAndExchange so we emulate it:
+        final PlainActionFuture<CompletionStats> oldFuture;
+        synchronized (completionStatsFutureMutex) {
+            if (completionStatsFuture == null) {
+                completionStatsFuture = newFuture;
+                oldFuture = null;
+            } else {
+                oldFuture = completionStatsFuture;
+            }
+        }

         if (oldFuture != null) {
             // we lost the race, someone else is already computing stats, so we wait for that to finish
@@ -91,7 +108,13 @@
         } finally {
             if (success == false) {
                 // invalidate the cache (if not already invalidated) so that future calls will retry
-                completionStatsFutureRef.compareAndSet(newFuture, null);
+
+                // completionStatsFutureRef.compareAndSet(newFuture, null); except we're not using AtomicReference in JDK8
+                synchronized (completionStatsFutureMutex) {
+                    if (completionStatsFuture == newFuture) {
+                        completionStatsFuture = null;
+                    }
+                }
             }
         }

@@ -121,7 +144,10 @@
     @Override
     public void afterRefresh(boolean didRefresh) {
         if (didRefresh) {
-            completionStatsFutureRef.set(null);
+            // completionStatsFutureRef.set(null); except we're not using AtomicReference in JDK8
+            synchronized (completionStatsFutureMutex) {
+                completionStatsFuture = null;
+            }
         }
     }
 }

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@DaveCTurner DaveCTurner merged commit 52fa465 into elastic:7.x Feb 27, 2020
@DaveCTurner DaveCTurner deleted the 2020-02-27-backport-51991-completion-stats-cache branch February 27, 2020 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport :Distributed Indexing/Engine Anything around managing Lucene and the Translog in an open shard. >enhancement v7.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants