Introduce FetchContext #62357

romseygeek · 2020-09-15T08:55:02Z

We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.

This commit introduces a new FetchContext class that exposes only those
methods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.
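As a rough illustration of the idea (not the code in this PR), the sketch below wraps a large context in a narrow view that delegates only the calls a fetch subphase needs. `SketchSearchContext`, `SketchFetchContext` and every method on them are hypothetical stand-ins invented for this example.

```java
// Hypothetical stand-in for SearchContext: only a handful of its many methods are shown.
class SketchSearchContext {
    private final String indexName;
    private final boolean explain;

    SketchSearchContext(String indexName, boolean explain) {
        this.indexName = indexName;
        this.explain = explain;
    }

    String indexName() { return indexName; }
    boolean explain() { return explain; }

    // Query-phase and lifecycle methods that fetch subphases have no need to call.
    void preProcess() { }
    void close() { }
}

// Narrow view handed to fetch subphases: it delegates to the wrapped context
// but exposes only the methods the fetch phase needs.
final class SketchFetchContext {
    private final SketchSearchContext searchContext;

    SketchFetchContext(SketchSearchContext searchContext) {
        this.searchContext = searchContext;
    }

    String getIndexName() { return searchContext.indexName(); }
    boolean explain() { return searchContext.explain(); }
}
```

A subphase that receives a `SketchFetchContext` cannot reach `preProcess()` or `close()`, which is the reduction in API surface described above.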

elasticmachine · 2020-09-15T08:55:04Z

Pinging @elastic/es-search (:Search/Search)

romseygeek · 2020-09-15T08:56:57Z

server/src/main/java/org/elasticsearch/search/fetch/FetchContext.java

+    /**
+     * The name of the index that documents are being fetched from
+     */
+    public String getIndexName() {


I think this is only used in error messages, and it duplicates information in FetchPhaseExecutionException, so we may be able to remove it.

romseygeek · 2020-09-15T08:57:24Z

server/src/main/java/org/elasticsearch/search/fetch/FetchContext.java

+
+    /**
+     * Gets index field data for a specific fieldtype
+     */


It would be really nice to be able to replace this with a SearchLookup

What do you mean? A method returning search lookup instead of index field data?

I don't follow 100% how that makes things better but I guess I need to see this updated once #61995 gets merged to understand how the two interact.

This is now gone entirely, but I think it would be useful to move SearchLookup directly into the FetchContext - it's not clear at present how to do that and get rid of the new QueryShardContext.newFetchLookup() method though.

Agreed on moving search lookup into fetch context; I think Nik proposed it as well somewhere.

romseygeek · 2020-09-15T09:03:01Z

server/src/main/java/org/elasticsearch/search/fetch/FetchPhase.java

-        Map<String, Set<String>> storedToRequestedFields = new HashMap<>();
-        FieldsVisitor fieldsVisitor = createStoredFieldsVisitor(context, storedToRequestedFields);
+        if (context.docIdsToLoadSize() == 0) {
+            // no individual hits to process, so we shortcut


We were checking in a couple of subphases for things like "how many docs are we executing against" or "is this a suggest-only response". This pulls this check up to the top level, which also means we don't need to include this information in the FetchContext.
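A minimal sketch of that hoisting, assuming a simplified fetch loop: the empty-hits check runs once at the top level, so individual subphases never see the zero-document case. `SketchHitCallback` and `SketchShortcutFetchPhase` are hypothetical names, not classes from this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-hit callback standing in for a fetch subphase processor.
interface SketchHitCallback {
    String process(int docId);
}

final class SketchShortcutFetchPhase {
    // Returns one output entry per (doc, subphase) pair, or an empty list when there are no hits.
    static List<String> execute(int[] docIdsToLoad, List<SketchHitCallback> subPhases) {
        List<String> output = new ArrayList<>();
        if (docIdsToLoad.length == 0) {
            // No individual hits to process, so shortcut before any subphase is consulted.
            return output;
        }
        for (int docId : docIdsToLoad) {
            for (SketchHitCallback subPhase : subPhases) {
                output.add(subPhase.process(docId));
            }
        }
        return output;
    }
}
```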

romseygeek · 2020-09-15T09:04:31Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/ExplainPhase.java

@@ -33,8 +33,8 @@
 public final class ExplainPhase implements FetchSubPhase {

    @Override
-    public FetchSubPhaseProcessor getProcessor(SearchContext context) {
-        if (context.explain() == false || context.hasOnlySuggest()) {


The hasOnlySuggest check is now done at the top level

Slight update: instead of checking for hasOnlySuggest(), we check that query() != null, which is more general.
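For illustration only, the guard could look roughly like the sketch below, which follows the FetchSubPhase convention of returning null when a subphase has nothing to do. The types are hypothetical, and placing the null-query check inside the explain subphase is an assumption of this sketch rather than something the thread confirms.

```java
// Hypothetical processor type: turns a doc id into an explanation string.
interface SketchExplainProcessor {
    String explain(int docId);
}

final class SketchExplainPhase {
    // Returns null when there is nothing to do, so the caller can skip this subphase.
    static SketchExplainProcessor getProcessor(boolean explainRequested, Object query) {
        // Assumption: a null query covers the suggest-only case and any other query-less request.
        if (explainRequested == false || query == null) {
            return null;
        }
        return docId -> "explanation of " + query + " for doc " + docId;
    }
}
```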

romseygeek · 2020-09-15T09:07:31Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/FetchScorePhase.java


 import java.io.IOException;

 public class FetchScorePhase implements FetchSubPhase {

    @Override
-    public FetchSubPhaseProcessor getProcessor(SearchContext context) throws IOException {
-        if (context.trackScores() == false || context.docIdsToLoadSize() == 0 ||


This logic is all contained in FetchContext#fetchScores()

romseygeek · 2020-09-15T09:09:59Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/highlight/UnifiedHighlighter.java

-        try {
-            fieldSnippets = highlighter.highlightField(hitContext.reader(), hitContext.docId(), loadFieldValues);
-        } catch (IOException e) {
-            throw new FetchPhaseExecutionException(fieldContext.shardTarget,


IOExceptions are handled in the top-level FetchPhase
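A hedged sketch of the pattern: per-hit processors simply declare `IOException`, and a single top-level loop wraps any failure with the doc and shard information. `SketchFetchException` is a hypothetical unchecked exception standing in for FetchPhaseExecutionException, and the message format is invented for this example.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical per-hit processor that is allowed to throw IOException.
interface SketchHitProcessor {
    void process(int docId) throws IOException;
}

// Hypothetical unchecked exception standing in for FetchPhaseExecutionException.
class SketchFetchException extends RuntimeException {
    SketchFetchException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class SketchTopLevelFetch {
    static void run(String shardTarget, int[] docIds, List<SketchHitProcessor> processors) {
        for (int docId : docIds) {
            try {
                for (SketchHitProcessor processor : processors) {
                    processor.process(docId);
                }
            } catch (IOException e) {
                // The only place that translates IOException, so subphases need no try/catch.
                throw new SketchFetchException(
                    "Error running fetch phase for doc [" + docId + "] on " + shardTarget, e);
            }
        }
    }
}
```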

romseygeek · 2020-09-15T11:16:58Z

@elasticmachine run elasticsearch-ci/2 (seems unrelated)

jtibshirani

It seems really nice to scope down the context passed around during fetch.

I'm wondering about the higher-level plan/design related to this direction. Will SearchContext eventually be broken into separate objects, QueryContext and FetchContext?

jtibshirani · 2020-09-15T17:55:59Z

server/src/test/java/org/elasticsearch/search/fetch/subphase/FetchSourcePhaseTests.java

@@ -173,30 +172,4 @@ private HitContext hitExecuteMultiple(XContentBuilder source, boolean fetchSourc
        return hitContext;
    }

-    private static class FetchSourcePhaseTestSearchContext extends TestSearchContext {


This sort of deletion is a good sign!

jtibshirani · 2020-09-15T18:04:08Z

server/src/main/java/org/elasticsearch/search/fetch/subphase/MatchedQueriesPhase.java

-            context.parsedQuery() == null) {
-            return null;
+    public FetchSubPhaseProcessor getProcessor(FetchContext context) throws IOException {
+        Map<String, Query> namedQueries = new HashMap<>();


Checking I understand, does this actually fix a bug where we wouldn't return named query information if there was only a post filter?

I don't think it's actually possible to have a post filter without a query, but the logic of what gets set where is pretty hairy so I thought it best to be extra defensive here.

jtibshirani · 2020-09-15T18:13:48Z

server/src/internalClusterTest/java/org/elasticsearch/search/fetch/FetchSubPhasePluginIT.java

-                    hitContext.hit().getId());
-            TermVectorsResponse termVector = TermVectorsService.getTermVectors(context.indexShard(), termVectorsRequest);
-            try {
+            Terms terms = hitContext.reader().getTermVector(hitContext.docId(), field);


Checking I understand, is this just an optional clean-up?

Right, it means that we don't need to expose the whole index shard to the FetchContext.

nik9000 · 2020-09-16T13:12:43Z

server/src/main/java/org/elasticsearch/search/fetch/FetchContext.java

+    /**
+     * Create a FetchContext based on a SearchContext
+     */
+    public static FetchContext fromSearchContext(SearchContext context) {


Why the private ctor and static factory? I know it's nice sometimes when you need the name to be descriptive, but I don't think we need that here?

I started out with a few ways of building the FetchContext but have ended up with just the one, so the factory method is unnecessary. I'll get rid of it.

nik9000 · 2020-09-16T13:13:13Z

server/src/main/java/org/elasticsearch/search/fetch/FetchContext.java

+        return new FetchContext(context);
+    }
+
+    private final SearchContext searchContext;


Could you float it to the top so it doesn't get lost? I get really confused if member variables aren't above member methods. I'm just super used to that.

nik9000 · 2020-09-16T13:15:57Z

server/src/main/java/org/elasticsearch/search/fetch/FetchSubPhase.java

     * implementation should return {@code null}
     */
-    FetchSubPhaseProcessor getProcessor(SearchContext searchContext, SearchLookup lookup) throws IOException;
+    FetchSubPhaseProcessor getProcessor(FetchContext fetchContext, SearchLookup lookup) throws IOException;


Do you think lookup should be a member on FetchContext? I know I just added it, but now that you are proposing our own "context" I think it'd be nice to fold it in.

Yes! I gave it a quick try after the merge but I think we really want to remove the QueryShardContext#newFetchLookup() method at the same time and it's not obvious to me how to do that yet, so I thought I'd leave it for a follow-up.

I'd be ok moving to a member now if you're happy with it. It can wait too. Either way.

It isn't so much that we want to get rid of newFetchLookup as that we want it to return a FetchLookup instead of a SearchLookup. We just haven't boiled out the common interface for those.
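To make the proposal concrete, here is a minimal sketch of a context that owns the lookup and builds it lazily, so every subphase shares one instance and `getProcessor` would no longer need a second argument. `SketchLookup` and `SketchLookupFetchContext` are hypothetical stand-ins, and the sketch assumes the context is used from a single thread.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for SearchLookup.
final class SketchLookup { }

final class SketchLookupFetchContext {
    private final Supplier<SketchLookup> lookupFactory;
    private SketchLookup lookup; // built on first request, then shared by all subphases

    SketchLookupFetchContext(Supplier<SketchLookup> lookupFactory) {
        this.lookupFactory = lookupFactory;
    }

    // Not thread-safe: assumes the fetch phase for one request runs on a single thread.
    SketchLookup searchLookup() {
        if (lookup == null) {
            lookup = lookupFactory.get();
        }
        return lookup;
    }
}
```

The factory passed to the constructor plays the role that a method such as newFetchLookup() plays in the discussion above. How the two would actually be wired together is left open, as it is in the thread.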

romseygeek · 2020-09-16T13:31:10Z

I'm wondering about the higher-level plan/design related to this direction. Will SearchContext eventually be broken into separate objects, QueryContext and FetchContext?

I don't really have further plans around SearchContext itself. What I'd like to do here is to move SearchLookup (and the associated SourceLookup) to FetchContext, and then look again at source fetching, because the whole 'loading source and additional stored fields' logic at the moment is very hairy - adding logic to load values from external sources will make it almost impossible to follow as things stand.

romseygeek · 2020-09-16T14:27:51Z

@elasticmachine run elasticsearch-ci/packaging-sample-windows

romseygeek · 2020-09-16T14:29:27Z

@elasticmachine run elasticsearch-ci/packaging-sample-windows again

romseygeek · 2020-09-16T15:21:50Z

@elasticmachine update branch

jtibshirani

Thanks @romseygeek, all my comments have been addressed.

I don't really have further plans around SearchContext itself. What I'd like to do here is to move SearchLookup (and the associated SourceLookup) to FetchContext, and then look again at source fetching...

That makes sense -- even if we don't push further in making SearchContext more modular, this PR is an improvement. And I'm looking forward to future PRs/discussions around SearchLookup sharing.

romseygeek · 2020-09-17T07:50:57Z

@elasticmachine update branch

romseygeek · 2020-09-17T07:52:39Z

@elasticmachine run elasticsearch-ci/packaging-sample-windows and do it properly this time

javanna

LGTM

In #62357 we introduced an additional optimization that allows us to skip most of the fetch phase early if no results are found. This change caused failures in some cancellation tests that relied on definitive cancellation during the fetch phase. This commit adds a quick cancellation check at the very beginning of the fetch phase to make the cancellation process more deterministic. Fixes #62530
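The interaction can be sketched as follows, under the assumption that cancellation is exposed as a simple boolean check: testing it before the empty-hits shortcut means a cancelled request fails the same way whether or not it has hits. `SketchCancelledException` and `SketchCancellableFetch` are hypothetical names, not the classes used in #62577.

```java
import java.util.function.BooleanSupplier;

// Hypothetical exception signalling that the task was cancelled.
class SketchCancelledException extends RuntimeException {
    SketchCancelledException(String message) {
        super(message);
    }
}

final class SketchCancellableFetch {
    // Returns the number of hits processed.
    static int execute(BooleanSupplier isCancelled, int[] docIdsToLoad) {
        // Check first, so cancellation is observed even when the shortcut below would apply.
        if (isCancelled.getAsBoolean()) {
            throw new SketchCancelledException("cancelled");
        }
        if (docIdsToLoad.length == 0) {
            return 0; // no hits: skip the rest of the fetch phase
        }
        int processed = 0;
        for (int i = 0; i < docIdsToLoad.length; i++) {
            // Re-check between hits so a long fetch can still be interrupted.
            if (isCancelled.getAsBoolean()) {
                throw new SketchCancelledException("cancelled");
            }
            processed++;
        }
        return processed;
    }
}
```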

romseygeek added 10 commits September 10, 2020 14:05

WIP

f3153aa

Rationalise fetch phase exceptions

7966f1e

Merge branch 'fetch/exceptions' into fetch/fetchcontext

9c0ad25

Introduce FetchContext

66541df

Merge remote-tracking branch 'origin/master' into fetch/fetchcontext

a994425

reduce our surface area a little

8df1535

Merge remote-tracking branch 'origin/master' into fetch/fetchcontext

39f599c

null check on matchedqueries

e66798a

Merge remote-tracking branch 'origin/master' into fetch/fetchcontext

8c13e61

javadocs

d292e4f

romseygeek added the :Search/Search, >breaking-java, >refactoring, v8.0.0 and v7.10.0 labels Sep 15, 2020

romseygeek requested review from nik9000 and jtibshirani September 15, 2020 08:55

romseygeek self-assigned this Sep 15, 2020

elasticmachine added the Team:Search label Sep 15, 2020

precommit

762f3a8

romseygeek commented Sep 15, 2020

romseygeek added 2 commits September 15, 2020 11:11

check for null query in explain

fa61e5b

test plugin failure

9f76bd7

jtibshirani reviewed Sep 15, 2020

Merge remote-tracking branch 'origin/master' into fetch/fetchcontext

12e9bbb

nik9000 reviewed Sep 16, 2020

feedback

5b429a8

Merge branch 'master' into fetch/fetchcontext

1ccde5e

jtibshirani approved these changes Sep 16, 2020

Merge branch 'master' into fetch/fetchcontext

ff13ae3

javanna approved these changes Sep 17, 2020

romseygeek merged commit 41bce50 into elastic:master Sep 17, 2020

romseygeek deleted the fetch/fetchcontext branch September 17, 2020 08:46

imotov mentioned this pull request Sep 17, 2020

Add an additional cancellation check to the fetch phase #62577

Merged

imotov mentioned this pull request Sep 17, 2020

[7.x] Add an additional cancellation check to the fetch phase (#62577) #62587

Merged

romseygeek mentioned this pull request Oct 6, 2020

Introduce FetchContext #62228

Closed

jakelandis added v8.0.0-alpha1 and removed v8.0.0 labels Jul 26, 2021
