Add bounds support for geogrid agg on shapes #51973
This PR cleans up some aspects of GeoShapeCellValues to support the specialization of bounded geo_shape geo-grid aggregations.
Pinging @elastic/es-analytics-geo (:Analytics/Geo) |
This PR is still a WIP because its tests exposed existing bugs in the TriangleTreeReader#relate logic. This logic is well tested in [...] The picture below depicts the unwanted behavior: these edge tiles should not be accounted for. Outer edges in the triangle tree that are collinear with the queried tiles should not be considered CROSSES when those tiles are not meant to be included. The example above is actually solved in the non-dateline-wrapping case using a hack in the GeoGridTiler, but that hack does not help when the shape's bounds cross the dateline. |
run elasticsearch-ci/packaging-sample-matrix-unix |
I think the biggest issue in the PR at the moment is that we need to filter tiles when we collect them from fully contained tiles. This makes me wonder whether this should differ between the bounded and unbounded cases. What do you think?
I am not sure the current approach is the most efficient. We are wrapping the bounds around the geo value; I think it would be more efficient to wrap them around the GridTiler. The tiler can call an abstract method to check whether the tile is in bounds. For the unbounded case the answer will always be true; for the bounded case, it will perform the check against the provided bounds. |
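To make the suggestion above concrete, here is a minimal sketch of a tiler that delegates the bounds decision to an abstract hook. All names (SketchTiler, validTile, collect) are hypothetical and are not the classes in this PR, and tiles are addressed by plain integer x/y coordinates rather than real geotile or geohash cells.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified tiler: walks a rectangle of tile coordinates and collects the valid ones. */
abstract class SketchTiler {
    /** Hook called for every candidate tile; subclasses decide whether it is in bounds. */
    protected abstract boolean validTile(int x, int y);

    /** Collects every tile in [minX, maxX] x [minY, maxY] that passes the hook. */
    final List<String> collect(int minX, int maxX, int minY, int maxY) {
        List<String> tiles = new ArrayList<>();
        for (int x = minX; x <= maxX; x++) {
            for (int y = minY; y <= maxY; y++) {
                if (validTile(x, y)) {
                    tiles.add(x + "/" + y);
                }
            }
        }
        return tiles;
    }
}

/** Unbounded case: every tile is accepted, so behavior matches a tiler with no bounds check. */
final class UnboundedSketchTiler extends SketchTiler {
    @Override
    protected boolean validTile(int x, int y) {
        return true;
    }
}

/** Bounded case: only tiles inside the provided tile-coordinate bounds are accepted. */
final class BoundedSketchTiler extends SketchTiler {
    private final int minX, maxX, minY, maxY;

    BoundedSketchTiler(int minX, int maxX, int minY, int maxY) {
        this.minX = minX;
        this.maxX = maxX;
        this.minY = minY;
        this.maxY = maxY;
    }

    @Override
    protected boolean validTile(int x, int y) {
        return x >= minX && x <= maxX && y >= minY && y <= maxY;
    }
}
```

With this shape, the geo value itself stays unaware of the bounds, and the unbounded tiler pays only the cost of a method call that always returns true.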
This refactor reverts some of the BoundedCellValues constructs. Instead, BoundedGeoTileGridTiler and BoundedGeoHashGridTiler are introduced. As part of this change, the semantics of geo_grid aggs with bounds on geo_point are modified to match the behavior for geo_shapes: it is the tile of the point that must intersect the bounds in order for the point to be accounted for.
I am still thinking about how best to reduce the duplicate code around the GeoBoundingBox parts of both Geohash and Geotile, but let me know if the update is what you had in mind! Thanks, Ignacio |
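The change in geo_point semantics described above can be illustrated with a small sketch. It assumes a hypothetical equirectangular grid with 2^zoom cells per axis (not the real geotile or geohash encoding), ignores dateline-crossing bounds, and treats cells that merely touch the bounds as intersecting. The class and method names are invented for illustration.

```java
/** Hypothetical comparison of "point in bounds" versus "tile of the point intersects bounds". */
final class PointBoundsSketch {
    /** Point-based semantics: the point itself must lie inside the bounds. */
    static boolean pointInBounds(double lon, double lat,
                                 double minLon, double maxLon, double minLat, double maxLat) {
        return lon >= minLon && lon <= maxLon && lat >= minLat && lat <= maxLat;
    }

    /** Tile-based semantics: the grid cell containing the point must intersect the bounds. */
    static boolean cellIntersectsBounds(double lon, double lat, int zoom,
                                        double minLon, double maxLon, double minLat, double maxLat) {
        int cells = 1 << zoom;               // cells per axis in this simplified grid
        double width = 360.0 / cells;        // cell width in degrees of longitude
        double height = 180.0 / cells;       // cell height in degrees of latitude
        // Clamp so that lon = 180 or lat = 90 falls into the last cell.
        int cx = Math.min(cells - 1, (int) Math.floor((lon + 180.0) / width));
        int cy = Math.min(cells - 1, (int) Math.floor((lat + 90.0) / height));
        double cellMinLon = -180.0 + cx * width;
        double cellMaxLon = cellMinLon + width;
        double cellMinLat = -90.0 + cy * height;
        double cellMaxLat = cellMinLat + height;
        // Closed rectangle intersection test (an assumption of this sketch).
        return cellMinLon <= maxLon && cellMaxLon >= minLon
            && cellMinLat <= maxLat && cellMaxLat >= minLat;
    }
}
```

For example, with bounds of longitude 10 to 20 and latitude 10 to 20 at zoom 1, the point (100, 50) lies outside the bounds, yet its cell spans longitude 0 to 180 and latitude 0 to 90 and so intersects them. Under the tile-based semantics that point is counted, which is the behavior the comment describes for shapes.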
@tal, yes, I think in this way the unbounded case behaves the same as before. Thanks! |
I am not sure there is a much better, less verbose way to re-use the bounded logic. I think it may only be worth revisiting once an additional geogrid agg is ever introduced, so this PR is ready for a final pass if you'd like. Thanks @iverase! |
LGTM
thanks! |
This commit adds support for geo-grid aggregations with the bounds parameter on geo_shape doc values.
This also modifies the existing geo_point handling of bounds to be consistent with shapes.