Skip to content

Significant text aggregation nullpointers if subject field does not exist. #64045

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
markharwood opened this issue Oct 22, 2020 · 8 comments · Fixed by #64144
Closed

Significant text aggregation nullpointers if subject field does not exist. #64045

markharwood opened this issue Oct 22, 2020 · 8 comments · Fixed by #64144
Assignees
Labels
:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@markharwood
Copy link
Contributor

This bug looks to have crept in since version 7-something.
Given a query like:

{
	"size": 0,
	"aggs": {
		"s": {
			"significant_text": {
				"field": "thisDoesNotExist"
			}
		}
	}
}

Version 6.6 responds with

"Aggregation [s] cannot process field [thisDoesNotExist] since it is not present"

But 7.10 onwards errors with:

{
  "error": {
	"root_cause": [
	  {
		"type": "null_pointer_exception",
		"reason": "Cannot invoke \"org.elasticsearch.index.mapper.MappedFieldType.indexAnalyzer()\" because the return value of \"org.elasticsearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource.access$400(org.elasticsearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource)\" is null"
	  }
	],
	"type": "search_phase_execution_exception",
	"reason": "all shards failed",
	"phase": "query",

Stack trace is

Caused by: java.lang.NullPointerException: Cannot invoke "org.elasticsearch.index.mapper.MappedFieldType.indexAnalyzer()" because the return value of "org.elasticsearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource.access$400(org.elasticsearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource)" is null
		at org.elasticsearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource$1.collectFromSource(SignificantTextAggregatorFactory.java:219) ~[elasticsearch-7.10.0-SNAPSHOT.jar:7.10.0-SNAPSHOT]
		at org.elasticsearch.search.aggregations.bucket.terms.SignificantTextAggregatorFactory$SignificantTextCollectorSource$1.collect(SignificantTextAggregatorFactory.java:189) ~[elasticsearch-7.10.0-SNAPSHOT.jar:7.10.0-SNAPSHOT]
		at org.elasticsearch.search.aggregations.LeafBucketCollectorBase.collect(LeafBucketCollectorBase.java:60) ~[elasticsearch-7.10.0-SNAPSHOT.jar:7.10.0-SNAPSHOT]
		at org.elasticsearch.search.aggregations.bucket.terms.MapStringTermsAggregator$SignificantTermsResults$1.collect(MapStringTermsAggregator.java:472) ~[elasticsearch-7.10.0-SNAPSHOT.jar:7.10.0-SNAPSHOT]
		at org.elasticsearch.search.aggregations.LeafBucketCollector.collect(LeafBucketCollector.java:107) ~[elasticsearch-7.10.0-SNAPSHOT.jar:7.10.0-SNAPSHOT]
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 22, 2020
@markharwood
Copy link
Contributor Author

markharwood commented Oct 22, 2020

@jtibshirani - this may be going back a bit but it looks like null field types became acceptable with field alias work in 15ff3da . Can you offer any insight on when might be the best point to validate the field name is not resolved?

@nik9000
Copy link
Member

nik9000 commented Oct 22, 2020

@not-napoleon is who I'd turn to to ask about this sort of thing in aggs. I don't know this code super duper well so I don't have a strong opinion other than to bring @not-napoleon into the conversation.

@romseygeek
Copy link
Contributor

romseygeek commented Oct 22, 2020

I think the change in behaviour actually came in 91813ea when we merged sigterms and sigtext; that reworked the collectFromSource method that had previously been passed a MappedFieldType and so could check whether or not it was null.

I think the simplest fix is to enforce field existence in the SignificantTextAggregatorFactory constructor?

@markharwood
Copy link
Contributor Author

I think the simplest fix is to enforce field existence in the SignificantTextAggregatorFactory constructor?

It was this comment from 15ff3da that made me hold off from that:

    // Note that if the field is unmapped (its field type is null), we don't fail,
    // and just use the given field name as a placeholder.

@nik9000
Copy link
Member

nik9000 commented Oct 22, 2020

Oh boy! Past Nik is at it again, breaking stuff.

@jtibshirani
Copy link
Contributor

this may be going back a bit but it looks like null field types became acceptable with field alias work in 15ff3da

Reviewing the change where I added this comment, it seems I was just trying to maintain the current logic (which didn't explicitly check for a null field type), and didn't mean to assert that we should support significant_text on unmapped fields.

@markharwood
Copy link
Contributor Author

Thanks y'all for the feedback.
I added a null test to the SignificantTextAggregatorFactory constructor

markharwood added a commit that referenced this issue Oct 26, 2020
…t exist (#64144)

Fixed Nullpointer in significant text agg - added test
Closes #64045
markharwood added a commit to markharwood/elasticsearch that referenced this issue Oct 26, 2020
…t exist (elastic#64144)

Fixed Nullpointer in significant text agg - added test
Closes elastic#64045
markharwood added a commit that referenced this issue Oct 26, 2020
…t exist (#64144) (#64157)

Fixed Nullpointer in significant text agg - added test
Closes #64045
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/Aggregations Aggregations >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
5 participants