ESQL: Add optimization to purge join on null merge key #127583

bpintea · 2025-04-30T18:22:46Z

This adds a new logical optimization rule to purge a Join in case the merge key(s) are null. The null detection is based on recognizing a tree pattern where the join sits atop a project and/or eval (possibly a few nodes deep) which contains a reference to a null, reference which matches the join key.

It works at coordinator planning level, but it's most useful locally, after insertions of nulls in the plan on detecting missing fields.

The Join is substituted with a projection with the same attributes as the join, atop an eval with all join's right fields aliased to null.

Closes #125577.

This adds a new logical optimization rule to purge a Join in case the merge key(s) are null. The null detection is based on recognizing a tree pattern where the join sits atop a project and/or eval which contains a reference to a null, reference which matches the join key. It works at coordinator planning level, but it's most useful locally, after insertions of nulls in the plan on detecting missing fields.

elasticsearchmachine · 2025-04-30T18:23:10Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-04-30T18:23:10Z

Hi @bpintea, I've created a changelog YAML for you.

bpintea · 2025-04-30T18:23:49Z

x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/expression/Alias.java

@@ -138,7 +138,7 @@ public String toString() {

    @Override
    public String nodeString() {
-        return child.nodeString() + " AS " + name();
+        return child.nodeString() + " AS " + name() + "#" + id();


Not strictly related, but not sure why we wouldn't include the id, it's easier to track which exactly reference points to an alias.

astefan · 2025-05-06T13:38:57Z

...ql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LocalPhysicalPlanOptimizerTests.java

+        var exchange = as(agg.child(), ExchangeExec.class);
+        agg = as(exchange.child(), AggregateExec.class);
+        var extract = as(agg.child(), FieldExtractExec.class);
+        var eval = as(extract.child(), EvalExec.class);


I think here we need to see the languages == null part.

Sure, added it.

astefan · 2025-05-06T15:12:03Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+        var joinType = join.config().type();
+        if (joinType == INNER || joinType == LEFT) { // other types will have different replacement logic
+            AttributeMap.Builder<Expression> attributeMapBuilder = AttributeMap.builder();
+            loop: for (var child = join.left();; child = ((UnaryPlan) child).child()) { // cast is safe as both plans are UnaryPlans


In this loop you are collecting all the aliases that are contained in every Project and Eval on the left hand side of the LOOKUP until a LogicalPlan of a different type is found. Are there cases where evals and projects were not merged together already by CombineEvals and CombineProjections, respectively?

Are there cases where evals and projects were not merged together already by CombineEvals and CombineProjections, respectively?

Yes; but:

if the plan goes through multiple transformations, that's OK, this rule will apply apply as soon as the pattern Join - Project/Eval is detected; and the plan will converge to a stable state eventually.

the plan is most useful on the data node, where the plan has stabilised already (before taking into account the local conditions, that is).

alex-spies

Thanks @bpintea !

I focused on the actual optimizer rule, as I think @astefan already reviewed more deeply.

I left some remarks that I think should be addressed, but could yield a good optimizer rule. That said, I think there is potential for simplifying this together with other optimizer rules, namely ReplaceMissingFieldsWithNull and PropagateEvalFoldables, and this would hinge more on placing literals (null or not) into the join config.

I'll reach out offline to discuss this.

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

alex-spies · 2025-05-09T14:50:08Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+import static org.elasticsearch.xpack.esql.plan.logical.join.JoinTypes.LEFT;
+
+/**
+ * The rule matches a plan pattern having a Join on top of a Project and/or Eval. It then checks if the join's performed on a field which


Looking for this pattern is restrictive IMO, but it's also not what the rule actually does. Leftover?

Also, this is repeating some of the logic of PropagateEvalFoldables. Could they share code? That one collects aliases from the plan when they point to literals (via potentially several indirections), which this rule also does. But PropagateEvalFoldables does only 1 pass to collect all aliases, while this rule descends back into the children whenever it finds a join, forgetting about the previous resolutions.

PropagateEvalFoldables has a boolean called shouldCollect. That's looks like it could become a more general predicate?

Looking for this pattern is restrictive IMO

It is more restrictive than I'd wish for it to be more effective, indeed, but I found no other way to detect null join keys (in stabilised plans, after all transformations). Any other command doing data transformations would need executing or "invasive" analysis to determine if a null just passes through. But happy to apply modifications if I overlooked solutions.
Non-data-transforming commands are pushed out of the way (a.t.p.).

but it's also not what the rule actually does. Leftover?

No, I think that's actually what the rule does; but maybe I misunderstood the question? :)

this is repeating some of the logic of PropagateEvalFoldables. Could they share code?

The refs collection in this rule is more restrictive than that in PropagateEvalFoldables and operates on few node types. I guess there might be a way to share code, but not sure it'll make it more legible.

But PropagateEvalFoldables does only 1 pass to collect all aliases, while this rule descends back into the children whenever it finds a join, forgetting about the previous resolutions.

Right; that's because it can only apply this rule on a specific tree pattern; if there were more nodes in-between (of different types), the rule wouldn't work.

PropagateEvalFoldables has a boolean called shouldCollect. That's looks like it could become a more general predicate?

The tree traversing would be different, though.

Looking for this pattern is restrictive IMO, but it's also not what the rule actually does. Leftover?

Sorry, you are correct; I didn't notice the break in the switch below and thought we'd just go over all children but looking only for evals and projections in it.

The refs collection in this rule is more restrictive than that in PropagateEvalFoldables and operates on few node types. I guess there might be a way to share code, but not sure it'll make it more legible.

My point is that both rules look for chains of aliases and propagate the result into a command that (transitively) depneds on a literal. I think we should have only 1 way to do this; if the approach is correct for PropagateEvalFoldables, it should also be correct here - and if it's not, then PropagateEvalFoldables probably has a bug and we need to find a different solution.

Conceptually, the difference is that PropagateEvalFoldables just places a literal in the downstream command - while we rather wish to prune it - but that could be solved either by to place a literal into the join config (which I think is interesting anyway because that allows even more optimizations).

alex-spies · 2025-05-09T14:50:12Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+import static org.elasticsearch.xpack.esql.plan.logical.join.JoinTypes.LEFT;
+
+/**
+ * The rule matches a plan pattern having a Join on top of a Project and/or Eval. It then checks if the join's performed on a field which


Looking for this pattern is restrictive IMO, but it's also not what the rule actually does. Leftover?

Also, this is repeating some of the logic of PropagateEvalFoldables. Could they share code? That one collects aliases from the plan when they point to literal (via potentially several indirections), which this rule also does. But PropagateEvalFoldables does only 1 pass to collect all aliases, while this rule descends back into the children whenever it finds a join, forgetting about the previous resolutions.

alex-spies · 2025-05-09T14:54:53Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+                    default -> {
+                        break loop;
+                    }


There could be other aliasing further down the plan. E.g. FROM ... | EVAL x = NULL | SORT whatever | LOOKUP JOIN ... ON x. In particular, I think this will not mesh well with the current implementation of ReplaceMissingFieldsWithNull which places EVAL missing = NULL at the very source of the query plan, right on top the EsRelation.

The rule applies last in a rule batch on coordinator, but more importantly, it applies on data nodes after the tree has stabilised on the coordinator, and only taking into account the local conditions can modify it (such as missing fields detection). The SORT (and LIMIT) will have been pushed further down by now. This is true for the coordinator too, btw, it just might take more rules batch iterations.

I think we shouldn't assume certain patterns for a rule to work. When more commands are added to ESQL, our assumptions will break and this means that unrelated changes can lead to (potentially very noticeable) performance regressions. Also, we might just not have thought about all the examples. For instance, couldn't there be a non-pushable WHERE clause that depends on an EVAL upstream from the join? With this approach, we wouldn't prune the join in this case even when the join key is missing from the index, I believe.

For this reason, I'd prefer if we went with a less pattern-reliant approach like in PropagateEvalFoldables.

I think we shouldn't assume certain patterns for a rule to work.

I took another look and reused the PropagateEvalFoldables logic to build the references. No particular pattern is assumed anymore. Thanks for insisting. :)

alex-spies · 2025-05-09T15:07:26Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+                    }
+                }
+            }
+            for (var attr : AttributeSet.of(join.config().matchFields())) {


I think this shouldn't go over the match fields, but over the left + right fields. I don't trust the match fields right now, as their contract is never enforced and they only exist because our Join modeling got wonky.

That said, for the current implementation it's probably correct: only for the left fields can we ever know that they will be null before running the execution.

alex-spies · 2025-05-09T15:08:16Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+    protected LogicalPlan rule(Join join) {
+        LogicalPlan plan = join;
+        var joinType = join.config().type();
+        if (joinType == INNER || joinType == LEFT) { // other types will have different replacement logic


If a field is always null in an INNER join, isn't the result empty? No match => no row goes into the result, no?

If a field is always null in an INNER join, isn't the result empty? No match => no row goes into the result, no?

You're right, INNER should not be considered. Fixed, thanks! Also renamed the rule to reflect this change.

alex-spies · 2025-05-09T15:17:35Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+
+    private static LogicalPlan replaceJoin(Join join) {
+        var joinRightOutput = join.rightOutputFields();
+        if (joinRightOutput.isEmpty()) { // can be empty when the join key is null and the other right side entries pruned (by an agg)


They right side attributes can also be pruned by a keep, drop or by being shadowed - not just by aggs.

:) Yes.
The (silly) intention was not to go over the line limit, this being an inlined comment. Pulled the comment on its own line and slightly extended it. (Also, the projection-related commands would be moved out by now.)

alex-spies · 2025-05-09T15:17:54Z

...org/elasticsearch/xpack/esql/optimizer/rules/logical/local/PruneJoinOnNullMatchingField.java

+            return join.left();
+        }
+        List<Alias> aliases = new ArrayList<>(joinRightOutput.size());
+        // TODO: cache aliases by type, à la ReplaceMissingFieldWithNull#missingToNull (tho lookup indices won't have Ks of fields)


Yep, let's re-use that code, maybe!

Not sure if lookup indices will have much fewer fields, I expect them to grow over time.

Not sure if lookup indices will have much fewer fields, I expect them to grow over time.

Assuming here, but I believe that the Ks of fields occurrence stems mostly from ECS and lookup indices would probably grow too, but controlled (as they're more "manual"?). However this can be an issue with "generic" indices joins, indeed.

Yep, let's re-use that code, maybe!

If we won't remodel our approach, "simplifying this together with other optimizer rules, namely ReplaceMissingFieldsWithNull and PropagateEvalFoldables", I'd add this in an immediate follow-up, as it'd touch ReplaceMissingFieldsWithNull... but can also do it here, if you feel strongly about this. :)

No, I don't think this is a blocker. A follow-up is perfectly fine IMO. Thank you!

I'd add this in an immediate follow-up

I've pushed the change in this PR, after all.

…ll_merge_key

alex-spies

Unblocking because the ClassCastException that I thought I saw can't actually happen. My bad.

I'd prefer if we try consolidating the alias propagation logic with that from PropagateEvalFoldables and shared code. I believe this will be less brittle over time and sharing code will mean we have 1 mechanism to keep correct rather than 2.

However, the added tests and the initial approach already are an improvement in itself, so there's no reason to hold this PR.

I won't be available to review more deeply - could you please continue iterating with @astefan ?(Who already reviewed more precisely, anyway - thanks Andrei!)

…ll_merge_key

bpintea · 2025-05-20T14:59:26Z

I'd prefer if we try consolidating the alias propagation logic with that from PropagateEvalFoldables and shared code. I believe this will be less brittle over time and sharing code will mean we have 1 mechanism to keep correct rather than 2.

Pushed an update to reuse code from both PropagateEvalFoldables and ReplaceMissingFieldWithNull.

bpintea added 2 commits April 30, 2025 19:38

restrict join types the rule affects

d0ed566

bpintea added >enhancement auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v8.19.0 v9.1.0 labels Apr 30, 2025

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Apr 30, 2025

Update docs/changelog/127583.yaml

43922e8

github-actions bot deployed to docs-preview April 30, 2025 18:23 View deployment

bpintea commented Apr 30, 2025

View reviewed changes

alex-spies self-requested a review May 2, 2025 09:35

bpintea added 2 commits May 2, 2025 20:55

stabilize tests

8a484b8

Merge branch 'main' into enh/drop_join_on_null_merge_key

c82823f

github-actions bot deployed to docs-preview May 2, 2025 18:57 View deployment

bpintea requested review from costin and astefan May 2, 2025 21:07

astefan reviewed May 6, 2025

View reviewed changes

alex-spies requested changes May 9, 2025

View reviewed changes

bpintea added 2 commits May 12, 2025 15:14

Address review comments

33c0575

Merge remote-tracking branch 'upstream/main' into enh/drop_join_on_nu…

95037e9

…ll_merge_key

github-actions bot deployed to docs-preview May 12, 2025 22:21 View deployment

alex-spies approved these changes May 13, 2025

View reviewed changes

bpintea added 2 commits May 20, 2025 15:40

Refactor to reuse code

d7dc024

Merge remote-tracking branch 'upstream/main' into enh/drop_join_on_nu…

a81cefd

…ll_merge_key

github-actions bot deployed to docs-preview May 20, 2025 13:44 View deployment

[CI] Auto commit changes from spotless

099b965

github-actions bot deployed to docs-preview May 20, 2025 13:52 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ESQL: Add optimization to purge join on null merge key #127583

ESQL: Add optimization to purge join on null merge key #127583

bpintea commented Apr 30, 2025 •

edited

Loading

elasticsearchmachine commented Apr 30, 2025

elasticsearchmachine commented Apr 30, 2025

bpintea Apr 30, 2025

astefan May 6, 2025

bpintea May 12, 2025

astefan May 6, 2025 •

edited

Loading

bpintea May 12, 2025

alex-spies left a comment •

edited

Loading

alex-spies May 9, 2025

bpintea May 12, 2025

alex-spies May 13, 2025

alex-spies May 9, 2025

alex-spies May 9, 2025

bpintea May 12, 2025

alex-spies May 13, 2025

bpintea May 20, 2025

alex-spies May 9, 2025

alex-spies May 9, 2025

bpintea May 12, 2025

alex-spies May 9, 2025

bpintea May 12, 2025

alex-spies May 9, 2025

bpintea May 12, 2025

alex-spies May 13, 2025

bpintea May 20, 2025

alex-spies left a comment

bpintea commented May 20, 2025

ESQL: Add optimization to purge join on null merge key #127583

Are you sure you want to change the base?

ESQL: Add optimization to purge join on null merge key #127583

Conversation

bpintea commented Apr 30, 2025 • edited Loading

elasticsearchmachine commented Apr 30, 2025

elasticsearchmachine commented Apr 30, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

astefan May 6, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alex-spies left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alex-spies left a comment

Choose a reason for hiding this comment

bpintea commented May 20, 2025

bpintea commented Apr 30, 2025 •

edited

Loading

astefan May 6, 2025 •

edited

Loading

alex-spies left a comment •

edited

Loading