
SQL: Convert ST_Distance into query when possible #40595


Merged
merged 3 commits into from
Apr 2, 2019

Conversation

imotov
Contributor

@imotov imotov commented Mar 28, 2019

Adds optimization logic that converts ST_Distance function calls
into a geo_distance query when they are used in WHERE clauses.
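
To make the rewrite concrete, here is a minimal sketch of the idea. It is not the code from this PR: `DistanceCall`, `GeoDistanceSketch` and `translate` are hypothetical stand-ins for the SQL plugin's expression and translator classes, and the use of meters as the unit is an assumption. Only the shape of the resulting `geo_distance` query body follows the Elasticsearch query DSL. The sketch handles only `<`, mirroring the `LessThan` check in the diff under review.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for an ST_Distance(field, constant point) call.
final class DistanceCall {
    final String field; // name of the geo_point field
    final double lat;   // latitude of the constant point
    final double lon;   // longitude of the constant point

    DistanceCall(String field, double lat, double lon) {
        this.field = field;
        this.lat = lat;
        this.lon = lon;
    }
}

final class GeoDistanceSketch {

    // Translates "ST_Distance(field, point) < value" into a geo_distance query body.
    // Returns empty when the pattern does not match, so the caller can fall back
    // to whatever it would have done otherwise.
    static Optional<Map<String, Object>> translate(Object left, String op, Object value) {
        if (!op.equals("<") || !(left instanceof DistanceCall) || !(value instanceof Number)) {
            return Optional.empty();
        }
        DistanceCall call = (DistanceCall) left;

        Map<String, Object> point = new LinkedHashMap<>();
        point.put("lat", call.lat);
        point.put("lon", call.lon);

        Map<String, Object> body = new LinkedHashMap<>();
        body.put("distance", ((Number) value).doubleValue() + "m"); // assumes meters
        body.put(call.field, point);

        Map<String, Object> query = new LinkedHashMap<>();
        query.put("geo_distance", body);
        return Optional.of(query);
    }

    public static void main(String[] args) {
        DistanceCall call = new DistanceCall("location", 40.0, -70.0);
        System.out.println(translate(call, "<", 10));   // matches the pattern
        System.out.println(translate("name", "<", 10)); // does not match
    }
}
```

Returning an empty `Optional` for anything that does not match keeps the optimization strictly opt-in: conditions the sketch does not recognize are left untouched.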

@imotov imotov added >enhancement :Analytics/Geo Indexing, search aggregations of geo points and shapes :Analytics/SQL SQL querying v8.0.0 labels Mar 28, 2019
@imotov imotov requested review from costin, astefan and matriv March 28, 2019 15:54
@elasticmachine
Collaborator

Pinging @elastic/es-search

@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo

Member

@costin costin left a comment

Left a comment about BinaryOperator class and operator swapping.

@@ -134,6 +135,8 @@ public LogicalPlan optimize(LogicalPlan verified) {
// needs to occur before BinaryComparison combinations (see class)
new PropagateEquals(),
new CombineBinaryComparisons(),
// Geo
Member

There's already a rule that does this for BinaryOperators - if StDistance extends that class (and it looks like it can), the swapping will happen automatically.
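
The swapping mentioned here can be pictured roughly as follows. This is a simplified sketch, not the optimizer rule itself: `Expr`, `Literal`, `Field`, `Distance` and `LiteralsOnTheRightSketch` are hypothetical types, and the only assumption is that the function is symmetric, so a constant on the left can be moved to the right without changing the result.

```java
// Hypothetical, simplified expression types; the real optimizer works on its own tree.
interface Expr {}

final class Literal implements Expr {
    final Object value;
    Literal(Object value) { this.value = value; }
}

final class Field implements Expr {
    final String name;
    Field(String name) { this.name = name; }
}

final class Distance implements Expr {
    final Expr left;
    final Expr right;
    Distance(Expr left, Expr right) { this.left = left; this.right = right; }

    // Distance is symmetric, so swapping the operands keeps the result the same.
    Distance swapLeftAndRight() { return new Distance(right, left); }
}

final class LiteralsOnTheRightSketch {
    // Moves a literal from the left operand to the right, so that later rules
    // only have to recognize the "field, constant" form.
    static Distance normalize(Distance d) {
        boolean literalOnLeftOnly = d.left instanceof Literal && !(d.right instanceof Literal);
        return literalOnLeftOnly ? d.swapLeftAndRight() : d;
    }

    public static void main(String[] args) {
        Distance d = normalize(new Distance(new Literal("POINT (10 20)"), new Field("location")));
        System.out.println(d.left instanceof Field);
    }
}
```

With the operands normalized this way, a translation step only needs to check for one operand order instead of two.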

@@ -675,6 +680,21 @@ private static Query translateQuery(BinaryComparison bc) {
return new RangeQuery(source, name, value, true, null, false, format);
}
if (bc instanceof LessThan) {
if (bc.left() instanceof StDistance && value instanceof Number) {
Member

Do we expect more similar functions to be optimized in a similar fashion? If the answer is yes, then likely we could have more infrastructure to help identify such patterns.

Contributor Author

I don't think we will. There will be some cases for shapes, but I don't see the similarity clearly enough at the moment to create a good abstraction. I would rather postpone this until we get there and then try to wrap it.

@@ -675,6 +680,21 @@ private static Query translateQuery(BinaryComparison bc) {
return new RangeQuery(source, name, value, true, null, false, format);
}
if (bc instanceof LessThan) {
Member

Shouldn't a similar query be applied for LessThanOrEqual?

Contributor Author

That's a good question. Now that I am thinking about it and after discussions with @iverase, it looks like it should be applied only to LessThanOrEqual, but since the calculation is not precise, maybe we can get away with implementing it for both.
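
To illustrate why the strict and non-strict comparisons are hard to tell apart in practice, here is a small sketch. It uses the haversine formula with a mean Earth radius, both of which are assumptions made for illustration; it is not necessarily the calculation Elasticsearch performs. Because the computed distance is a floating-point approximation, `<` and `<=` give different answers only when the distance equals the threshold exactly.

```java
final class DistanceBoundarySketch {
    // Mean Earth radius in meters; an assumption made for this sketch.
    static final double EARTH_RADIUS_METERS = 6_371_008.7714;

    // Great-circle distance in meters via the haversine formula.
    static double haversine(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_METERS * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    // True when "<" and "<=" give the same answer for this distance and threshold.
    static boolean strictAndInclusiveAgree(double distance, double threshold) {
        return (distance < threshold) == (distance <= threshold);
    }

    public static void main(String[] args) {
        double d = haversine(0, 0, 0, 1); // one degree of longitude on the equator
        System.out.println(d);
        System.out.println(strictAndInclusiveAgree(d, 111_200.0)); // ordinary threshold
        System.out.println(strictAndInclusiveAgree(d, d));         // exact boundary
    }
}
```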

@@ -83,4 +72,22 @@ protected Pipe makePipe() {
protected String scriptMethodName() {
return "stDistance";
}

public static class StDistanceFunction implements PredicateBiFunction<Object, Object, Double> {
Contributor

I would define it as a normal class in a separate file instead.

Contributor Author

👍

import static org.elasticsearch.xpack.sql.expression.gen.script.ParamsBuilder.paramsBuilder;

/**
* Calculates the distance between two points
*/
-public class StDistance extends BinaryScalarFunction {
+public class StDistance extends BinaryOperator<Object, Object, Double, StDistance.StDistanceFunction> {
Contributor

I'd rename it to StDistanceOperator

Contributor Author

@imotov imotov Mar 31, 2019

ST_DISTANCE(sh1, sh2) is not an operator, we are just reusing some logic that can be applied to both.

Contributor

Sorry, I made a mistake. My suggestion is to rename StDistance to StDistanceOperation and StDistanceFunction to StDistance, similarly to other functions/operators in SQL.

Contributor

@matriv matriv left a comment

LGTM

Contributor

@astefan astefan left a comment

LGTM

Contributor Author

@imotov imotov left a comment

@matriv @astefan @costin Thanks for the reviews!

@costin I think I addressed your comments fully, so I am going to merge this in because my next PR builds on top of this one. Please let me know if I missed something and I will address it as a follow-up.

@imotov imotov merged commit ed0c0e6 into elastic:geosql Apr 2, 2019
@elasticmachine elasticmachine mentioned this pull request Apr 10, 2019
13 tasks
@imotov imotov removed the v8.0.0 label Apr 24, 2019
@imotov imotov deleted the geosql-distance-as-range branch May 1, 2020 22:11
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes :Analytics/SQL SQL querying >enhancement

5 participants