CentroidCalculator needs protection against very small polygons #52774

Closed
iverase opened this issue Feb 25, 2020 · 3 comments
Labels
:Analytics/Geo Indexing, search aggregations of geo points and shapes

Comments

@iverase
Contributor

iverase commented Feb 25, 2020

I was working with one of my datasets on the geoshape doc values branch and some of the documents are now erroring out when indexing. After a closer look, these polygons are very small (probably lines) and have only four points. For example:

POLYGON((-98.1912833 50.9375,-98.1912812 50.9375,-98.1912832 50.9375002,-98.1912833 50.9375))

They can also be more complicated, with the problematic ring being a hole:

POLYGON((-4.385064 55.2259599,-4.385056 55.2259224,-4.3850466 55.2258994,-4.3849755 55.2258574,-4.3849339 55.2258589,-4.3847033 55.2258742,-4.3846805 55.2258818,-4.3846282 55.2259132,-4.3846215 55.2259247,-4.3846121 55.2259683,-4.3846147 55.2259798,-4.3846369 55.2260157,-4.3846472 55.2260241,-4.3846697 55.2260409,-4.3846952 55.2260562,-4.384765 55.22608,-4.3848199 55.2260861,-4.3848481 55.2260845,-4.3849245 55.2260761,-4.3849393 55.22607,-4.3849996 55.2260432,-4.3850131 55.2260364,-4.3850426 55.2259989,-4.385064 55.2259599),(-4.3850104 55.2259583,-4.385005 55.2259752,-4.384997 55.2259892,-4.3849339 55.2259981,-4.3849272 55.2259308,-4.3850016 55.2259262,-4.385005 55.2259377,-4.3850104 55.2259583),(-4.3849996 55.2259193,-4.3847502 55.2259331,-4.3847548 55.2258921,-4.3848012 55.2258895,-4.3849219 55.2258811,-4.3849514 55.2258818,-4.3849728 55.2258933,-4.3849996 55.2259193),(-4.3849917 55.2259984,-4.3849849 55.2260103,-4.3849771 55.2260192,-4.3849701 55.2260019,-4.3849917 55.2259984),(-4.3846608 55.2259374,-4.384663 55.2259316,-4.3846711 55.2259201,-4.3846992 55.225904,-4.384718 55.2258941,-4.3847434 55.2258927,-4.3847314 55.2259407,-4.3849098 55.2259316,-4.3849098 55.2259492,-4.3848843 55.2259515,-4.3849017 55.2260119,-4.3849567 55.226005,-4.3849701 55.2260272,-4.3849299 55.2260486,-4.3849192 55.2260295,-4.384883 55.2260188,-4.3848776 55.2260119,-4.3848441 55.2260149,-4.3848441 55.2260226,-4.3847864 55.2260241,-4.384722 55.2259652,-4.3847053 55.2259706,-4.384683 55.225954,-4.3846608 55.2259374),(-4.3846541 55.2259549,-4.384698 55.2259883,-4.3847173 55.2259828,-4.3847743 55.2260333,-4.3847891 55.2260356,-4.3848146 55.226031,-4.3848199 55.2260409,-4.3848387 55.2260417,-4.3848494 55.2260593,-4.3848092 55.2260616,-4.3847623 55.2260539,-4.3847341 55.2260432,-4.3847046 55.2260279,-4.3846738 55.2260062,-4.3846496 55.2259844,-4.3846429 55.2259737,-4.3846523 55.2259714,-4.384651 55.2259629,-4.3846541 55.2259549),(-4.3846608 55.2259374,-4.3846559 55.2259502,-4.3846541 
55.2259549,-4.3846608 55.2259374))

The issue is that the calculated area is zero, which makes the centroid calculator generate NaN for the coordinates, and that errors out when trying to serialise them.
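
To illustrate the failure mode: a standard area-weighted centroid divides the accumulated sums by the signed area, so a ring with zero area ends up as 0.0 / 0.0, which in Java evaluates to NaN rather than throwing. Below is a minimal Python sketch of one possible guard. It is not the actual CentroidCalculator code; the name `ring_centroid` and the vertex-mean fallback are assumptions made only for illustration.

```python
import math


def ring_centroid(ring):
    """Centroid of a closed ring given as [(x, y), ..., (x0, y0)].

    Uses the shoelace (area-weighted) formula and falls back to the mean of
    the distinct vertices when the signed area is zero or the result is not
    finite. The fallback is a hypothetical guard, not Elasticsearch behaviour.
    """
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 != 0.0:
        x, y = cx / (3.0 * area2), cy / (3.0 * area2)
        if math.isfinite(x) and math.isfinite(y):
            return x, y

    # Degenerate ring (zero area): average the vertices, skipping the
    # repeated closing point.
    pts = ring[:-1]
    return (sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts))
```

With this guard, a collinear ring such as (0 0, 1 1, 2 2, 0 0) yields (1.0, 1.0) rather than a division by zero, while a unit square still yields (0.5, 0.5).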

cc @talevy

iverase added the :Analytics/Geo Indexing, search aggregations of geo points and shapes label Feb 25, 2020
@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Geo)

@talevy
Contributor

talevy commented Feb 25, 2020

I'll dig into this. The problem seems to be that when a triangle in a polygon has zero weight, its centroid degrades to infinity.

For example, the second polygon you shared has a total ring area of 0, while the weighted centroid sum for each coordinate is close to zero, at around 1e-11.
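
As a rough sketch of the idea of ignoring such components: when per-triangle centroids are combined into a weighted sum, an infinite coordinate with zero weight turns the running sum into NaN (inf * 0.0 is NaN in IEEE 754 arithmetic). The Python below is only an illustration under that assumption; `CentroidAccumulator` and its methods are hypothetical names, not the Elasticsearch class or the code in the fix.

```python
import math


class CentroidAccumulator:
    """Hypothetical weighted-centroid accumulator, for illustration only."""

    def __init__(self):
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_weight = 0.0

    def add(self, x, y, weight):
        # Skip components that would poison the running sums: a zero-weight
        # component contributes nothing useful, and a non-finite coordinate
        # multiplied by a zero weight produces NaN.
        if weight == 0.0 or not (math.isfinite(x) and math.isfinite(y)):
            return
        self.sum_x += x * weight
        self.sum_y += y * weight
        self.sum_weight += weight

    def centroid(self):
        # No usable components: report "no centroid" instead of dividing by 0.
        if self.sum_weight == 0.0:
            return None
        return self.sum_x / self.sum_weight, self.sum_y / self.sum_weight
```

Whether to skip such components, or to fall back to a lower-dimensional centroid for the whole shape, is a design choice; this sketch only shows the skipping variant.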

talevy added a commit to talevy/elasticsearch that referenced this issue Feb 25, 2020
There are times when triangles within a polygon have
really small areas (around 1e-11), while the whole polygon's area is
zero. This results in an infinite value for the centroid point
representing that triangle. This commit ignores the addition of
such values.

Addresses elastic#52774
talevy added a commit that referenced this issue Feb 26, 2020
@talevy
Contributor

talevy commented Feb 26, 2020

Fixed in #52782

talevy closed this as completed Feb 26, 2020