-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Introduce FetchContext #62357
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce FetchContext #62357
Conversation
Pinging @elastic/es-search (:Search/Search) |
/** | ||
* The name of the index that documents are being fetched from | ||
*/ | ||
public String getIndexName() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is I think only used in error messages, and is duplicated by information in FetchPhaseExecutionException so we may be able to remove it.
|
||
/** | ||
* Gets index field data for a specific fieldtype | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be really nice to be able to replace this with a SearchLookup
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean? A method returning search lookup instead of index field data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't follow 100% how that makes things better but I guess I need to see this updated once #61995 gets merged to understand how the two interact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now gone entirely, but I think it would be useful to move SearchLookup directly into the FetchContext - it's not clear at present how to do that and get rid of the new QueryShardContext.newFetchLookup() method though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed on moving search lookup into fetch context, I think Nik proposed it as well somewhere.
Map<String, Set<String>> storedToRequestedFields = new HashMap<>(); | ||
FieldsVisitor fieldsVisitor = createStoredFieldsVisitor(context, storedToRequestedFields); | ||
if (context.docIdsToLoadSize() == 0) { | ||
// no individual hits to process, so we shortcut |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were checking in a couple of subphases for things like "how many docs are we executing against" or "is this a suggest-only response". This pulls this check up to the top level, which also means we don't need to include this information in the FetchContext.
@@ -33,8 +33,8 @@ | |||
public final class ExplainPhase implements FetchSubPhase { | |||
|
|||
@Override | |||
public FetchSubPhaseProcessor getProcessor(SearchContext context) { | |||
if (context.explain() == false || context.hasOnlySuggest()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hasOnlySuggest
check is now done at the top level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
slight update; instead of checking for hasOnlySuggest()
, we check that query() != null
which is more general.
server/src/main/java/org/elasticsearch/search/fetch/subphase/FetchDocValuesPhase.java
Show resolved
Hide resolved
|
||
import java.io.IOException; | ||
|
||
public class FetchScorePhase implements FetchSubPhase { | ||
|
||
@Override | ||
public FetchSubPhaseProcessor getProcessor(SearchContext context) throws IOException { | ||
if (context.trackScores() == false || context.docIdsToLoadSize() == 0 || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic is all contained in FetchContext#fetchScores()
try { | ||
fieldSnippets = highlighter.highlightField(hitContext.reader(), hitContext.docId(), loadFieldValues); | ||
} catch (IOException e) { | ||
throw new FetchPhaseExecutionException(fieldContext.shardTarget, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IOExceptions are handled in the top-level FetchPhase
@elasticmachine run elasticsearch-ci/2 (seems unrelated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems really nice to scope down the context passed around during fetch.
I'm wondering about the higher-level plan/ design related to this direction. Will SearchContext
eventually be broken into separate objects QueryContext
and FetchContext
?
@@ -173,30 +172,4 @@ private HitContext hitExecuteMultiple(XContentBuilder source, boolean fetchSourc | |||
return hitContext; | |||
} | |||
|
|||
private static class FetchSourcePhaseTestSearchContext extends TestSearchContext { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sort of deletion is a good sign !
context.parsedQuery() == null) { | ||
return null; | ||
public FetchSubPhaseProcessor getProcessor(FetchContext context) throws IOException { | ||
Map<String, Query> namedQueries = new HashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking I understand, does this actually fix a bug where we wouldn't return named query information if there was only a post filter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's actually possible to have a post filter without a query, but the logic of what gets set where is pretty hairy so I thought it best to be extra defensive here.
hitContext.hit().getId()); | ||
TermVectorsResponse termVector = TermVectorsService.getTermVectors(context.indexShard(), termVectorsRequest); | ||
try { | ||
Terms terms = hitContext.reader().getTermVector(hitContext.docId(), field); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checking I understand, is this just an optional clean-up ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, it means that we don't need to expose the whole index shard to the FetchContext.
server/src/main/java/org/elasticsearch/search/fetch/subphase/FetchDocValuesPhase.java
Show resolved
Hide resolved
/** | ||
* Create a FetchContext based on a SearchContext | ||
*/ | ||
public static FetchContext fromSearchContext(SearchContext context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the private ctor and static factory? I know its nice sometimes when you need the name to describe, but I think we don't here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started out with a few ways of building the FetchContext but have ended up with just the one, so the factory method is unnecessary. I'll get rid.
return new FetchContext(context); | ||
} | ||
|
||
private final SearchContext searchContext; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you float it to the top so it doesn't get lost? I get really confused if member variables aren't above member methods. I'm just super used to that.
* implementation should return {@code null} | ||
*/ | ||
FetchSubPhaseProcessor getProcessor(SearchContext searchContext, SearchLookup lookup) throws IOException; | ||
FetchSubPhaseProcessor getProcessor(FetchContext fetchContext, SearchLookup lookup) throws IOException; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think lookup
should be a member on FetchContext
? I know I just added it, but now that you are proposing our own "context" I think I'd be nice to fold it in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes! I gave it a quick try after the merge but I think we really want to remove the QueryShardContext#newFetchLookup() method at the same time and it's not obvious to me how to do that yet, so I thought I'd leave it for a follow-up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be ok moving to a member now if you're happy with it. It can wait too. Either way.
It isn't so much that we want to get rid of newFetchLookup
as that we want it to return a FetchLookup
instead of a SearchLookup
. We just haven't boiled out the common interface for those.
server/src/main/java/org/elasticsearch/search/fetch/subphase/FetchDocValuesPhase.java
Show resolved
Hide resolved
I don't really have further plans around SearchContext itself. What I'd like to do here is to move SearchLookup (and the associated SourceLookup) to FetchContext, and then look again at source fetching, because the whole 'loading source and additional stored fields' logic at the moment is very hairy - adding logic to load values from external sources will make it almost impossible to follow as things stand. |
@elasticmachine run elasticsearch-ci/packaging-sample-windows |
@elasticmachine run elasticsearch-ci/packaging-sample-windows again |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @romseygeek, all my comments have been addressed.
I don't really have further plans around SearchContext itself. What I'd like to do here is to move SearchLookup (and the associated SourceLookup) to FetchContext, and then look again at source fetching...
That makes sense -- even if we don't push further in making SearchContext
more modular, this PR is an improvement. And I'm looking forward to future PRs/ discussions around SearchLookup sharing.
@elasticmachine update branch |
@elasticmachine run elasticsearch-ci/packaging-sample-windows and do it properly this time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
We currently pass a SearchContext around to share configuration among FetchSubPhases. With the introduction of runtime fields, it would be useful to start storing some state on this context to be shared between different subphases (for example, stored fields or search lookups can be loaded lazily but referred to by many different subphases). However, SearchContext is a very large and unwieldy class, and adding more methods or state here feels like a bridge too far. This commit introduces a new FetchContext class that exposes only those methods on SearchContext that are required for fetch phases. This reduces the API surface area for fetch phases considerably, and should give us some leeway to add further state.
In #62357 we introduced an additional optimization that allows us to skip the most of the fetch phase early if no results are found. This change caused some cancellation test failures that were relying on definitive cancellation during the fetch phase. This commit adds an additional quick cancellation check at the very beginning of the fetch phase to make cancellation process more deterministic. Fixes #62530
) In #62357 we introduced an additional optimization that allows us to skip the most of the fetch phase early if no results are found. This change caused some cancellation test failures that were relying on definitive cancellation during the fetch phase. This commit adds an additional quick cancellation check at the very beginning of the fetch phase to make cancellation process more deterministic. Fixes #62530
We currently pass a SearchContext around to share configuration among
FetchSubPhases. With the introduction of runtime fields, it would be useful
to start storing some state on this context to be shared between different
subphases (for example, stored fields or search lookups can be loaded lazily
but referred to by many different subphases). However, SearchContext is a
very large and unwieldy class, and adding more methods or state here feels
like a bridge too far.
This commit introduces a new
FetchContext
class that exposes only thosemethods on SearchContext that are required for fetch phases. This reduces
the API surface area for fetch phases considerably, and should give us some
leeway to add further state.