-
Notifications
You must be signed in to change notification settings - Fork 25.2k
ESQL: Add a pre-mapping logical plan processing step #120368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This adds a processing step that runs rules which can modify an optimized LogicalPlan by executing async calls to the services avaialble to ESQL's transport query action. The rules need to implement a new interface (`MappingPreProcessor`). The resulting plan won't be further optimized logically, but mapped and further processed as a physical plan.
Hi @bpintea, I've created a changelog YAML for you. |
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a first round of comments.
} | ||
|
||
return queryAsObject; | ||
return BytesRefs.toString(queryAsObject); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch
ResolvedIndices resolvedIndices = ResolvedIndices.resolveWithIndexNamesAndOptions( | ||
indexNames.toArray(String[]::new), | ||
IndexResolver.FIELD_CAPS_INDICES_OPTIONS, | ||
services.clusterService().state(), | ||
services.indexNameExpressionResolver(), | ||
services.transportService().getRemoteClusterService(), | ||
System.currentTimeMillis() | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indices have already been resolved through field-caps; double check if this information can be reused instead of going back to string format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ResolvedIndices
splits the indices by local and remote and uses the string format for that. Not sure if/how we can simplify this.
* {@link FullTextFunctionMapperPreprocessor#preprocess(LogicalPlan, TransportActionServices, ActionListener)} will rewrite the plan by | ||
* replacing {@link FullTextFunction} expression with new ones that hold rewritten {@link QueryBuilder}s. | ||
*/ | ||
public class FullTextFunctionMapperPreprocessor implements MappingPreProcessor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses a pull approach where a built-in pre-processor searches for FullTextFunctions before rewriting the query.
A push approach would be better, similar to the Aware interfaces: can the TranslationAware be used instead or potentially an extension to it.
Specifically this stage is important only for queries that depend on QueryRewriteContext (not its subclasses) - creating this context is not trivial so checking whether this is needed is valuable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reworked this part, to have the FullTextFunction
s return the preprocessor as a singleton. This way the preprocessors are "discovered" in the plan.
); | ||
} | ||
|
||
public LogicalPlan updateQueryBuilders(LogicalPlan plan, Map<FullTextFunction, QueryBuilder> newQueryBuilders) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why isn't the rewrite applied on the spot and instead an associated map is used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reworked this bit. But my guess is that the intention was to avoid multiple tree passes.
* Note that the LogicalPlan following the rules' changes will not undergo another logical optimization round. The changes these rules | ||
* should apply are only those that require access to services that need to be performed asynchronously. | ||
*/ | ||
public interface MappingPreProcessor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be multiple preprocessors?
Why not have a dedicated "stage" such as preMapping() (similar to preAnalyze) and encapsulate the logic there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will there be multiple preprocessors?
That was my assumption, yes.
Why not have a dedicated "stage" such as preMapping() (similar to preAnalyze) and encapsulate the logic there?
There are some particularities to how queries are repeatedly rewritten (and stopping that is done by identity checking of a return object) -- these should remain particular to FullTextFunction
IMO and this logic stay within a dedicated class.
But the execution of the preprocessor is done at a dedicated stage.
Not sure if the current format meets your expectation, but happy to shape it further if it can be done better.
...ck/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/capabilities/TranslationAware.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are needed to make the FullTextFunction
s instances distinguishable by their SemanticQueryBuilder
field having or not an inferenceResultsSupplier
; which is needed to update the query tree in place.
Deprecated by a simpler #121260. |
WIP
This adds a processing step that runs rules which can modify an optimized LogicalPlan by executing async calls to the services avaialble to ESQL's transport query action. The rules need to implement a new interface (
MappingPreProcessor
). The resulting plan won't be further optimized logically, but mapped and further processed as a physical plan.