Skip to content

Fix NPE when values is omitted on percentile_ranks agg #26046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Aug 15, 2017

Conversation

polyfractal
Copy link
Contributor

An array of values is required because there is no default (or reasonable way to set a default). But validation for values only happens if it is actually set on the builder. If the values param is omitted entirely then values is passed in as null, and later NPEs when attempting to iterate over the ranks.

I wasn't sure the best place to do this validation, since regular aggs don't have a validate() or equivalent. And I couldn't find a way to make params required in the ObjectParser. So checking right before the factory is built seemed like the best/only place?

@cbuescher would you mind reviewing this?

@jpountz
Copy link
Contributor

jpountz commented Aug 7, 2017

I'm ok with this change, but I think that a better fix might be to make values a required argument of PercentileRanksAggregationBuilder and remove the setter? This way it won't be possible to create illegal instances of PercentileRanksAggregationBuilder?

@cbuescher cbuescher self-assigned this Aug 7, 2017
Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@polyfractal I think checking values is a good change, however I'd also like to try removing the setter and try to validate the values as part of the constructor if it is mandatory. I think using ConstructingObjectParser should work in this case. I haven't tried but I can give it a go if you are stuck somewhere.

@polyfractal
Copy link
Contributor Author

Ah right, I didn't think about ConstructingObjectParser. ++ will make the changes!

@polyfractal polyfractal force-pushed the fixPercentileRanksNPE branch 2 times, most recently from e6784d4 to b33e6f1 Compare August 9, 2017 17:55
@polyfractal
Copy link
Contributor Author

Alright, so @cbuescher, @nik9000 and I chatted about this earlier today. The way aggregations are built makes it tricky to use ConstructingObjectParser. To construct the object, we need both an externally provided value (aggregationName) and a value from the request (values).

We kicked around a few ideas:

  1. Seeing if the NamedObject parsers would potentially work
  2. Closing over the name when calling parse
  3. Merging as-is and enhancing ConstructingObjectParser in a follow up PR to support this workflow a bit better
  4. Using the parser context as a way to pass in the aggregation name

We agreed to see if the context approach would work, and if not, fall back to option 3 (merge as-is and follow up with another PR).

The most recent commit adjusts to use ConstructingObjectParser. The change was a bit more invasive than I wanted; the helpers that add common numeric fields (missing, etc) only supported ObjectParser. So I added support for Constructing, as well as generifying the methods so that the context wasn't hardcoded to Void.

The rest of the changes are just fixing up the tests.

Sooo... thoughts? Keep this, or revert to the simple exception and trying to enhance ConstructingObjectParser?

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @polyfractal, thats a great change. I left a couple of comments how to maybe simplify things a little more and some nits but I think this is almost good to go.

@@ -295,6 +295,10 @@ public Value parse(XContentParser parser, Context context) throws IOException {
}
}

public String getName() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, I think I did this in some WIP branch as well the other day. Didn't get back to it though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is implemented also in ObjectParser, I would make this an abstract method in AbstractObjectParser. I think this way ValuesSourceParserHelper.declareFields() could work AbstractObjectParser without all the current duclication. Not 100% sure though but I'd give it a try.

@@ -99,4 +100,72 @@ public static void declareGeoFields(
}
}

public static <T> void declareAnyFields(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this whole block can go if all the existing declare* methods in this class take AbstractObjectParser instead of ObjectParser as an argument. That works when getName() (see above) is mave abstract in AbstractObjectParser.

static {
PARSER = new ObjectParser<>(PercentileRanksAggregationBuilder.NAME);
PARSER = new ConstructingObjectParser<>(PercentileRanksAggregationBuilder.NAME, false,
(a, context) -> new PercentileRanksAggregationBuilder(context, (List)a[0]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space between after (List)

(b, v) -> b.values(v.stream().mapToDouble(Double::doubleValue).toArray()),
VALUES_FIELD);

PARSER.declareDoubleArray(ConstructingObjectParser.constructorArg(), VALUES_FIELD);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe static import for ConstructingObjectParser.constructorArg, that makes it more readable IMHO.

@@ -97,7 +98,7 @@
}

public static AggregationBuilder parse(String aggregationName, XContentParser parser) throws IOException {
return PARSER.parse(parser, new PercentileRanksAggregationBuilder(aggregationName), null);
return PARSER.parse(parser, aggregationName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see now how you are piggybacking the name in the Context slot. maybe worth a comment here to remember the "why" in a few weeks from now.

@@ -152,8 +153,10 @@ public void testUnmapped() throws Exception {
.prepareSearch("idx_unmapped")
.setQuery(matchAllQuery())
.addAggregation(
percentileRanks("percentile_ranks").method(PercentilesMethod.HDR).numberOfSignificantValueDigits(sigDigits)
.field("value").values(0, 10, 15, 100))
percentileRanks("percentile_ranks", new double[]{0,10,15,100})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: spaces between the values

super(name, ValuesSourceType.NUMERIC, ValueType.NUMERIC);
if (values == null) {
throw new IllegalArgumentException("[values] must not be null: [" + name + "]");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this include a check for an empy array? Not sure, just a though. In case the answer is yes, maybe also add a test for that.

@polyfractal
Copy link
Contributor Author

Review comments addressed, definitely cleaner with the change to AbstractObjectParser :)

Also added a few more tests to the AggregatorTestCase tests, since they are easy and fast to execute.

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM. I left two small nits about comments but nothing that needs another review round once CI is green.

@@ -74,12 +76,15 @@
new ParseField("number_of_significant_value_digits"));
}

// The builder requires to parameters for the constructor: aggregation name and values array. The
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/to/two

@@ -98,6 +103,8 @@
}

public static AggregationBuilder parse(String aggregationName, XContentParser parser) throws IOException {
// the aggregation name is supplied to the parser as a Context, a bit hacky but it works well
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I don't think its hacky ;-) Its okay as long as we have the context argument, when we try to remove it one day it might be useful to know why we use it here.

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I forgot one thing, since this changes the aggregation constructor and the AggregationBuilders helper signature without keeping the old version, I think this should be marked as breaking. I don't know which versions this was supposed to go to but in any case I think some migration docs would be needed.

An array of values is required because there is no default (or
reasonable way to set a default).  But validation for values
only happens if it is actually set.  If the values param is omitted
entirely than the agg builder will NPE.
- Moves the parser to ConstructingObjectParser
- Uses the context to pass in the aggregation name
- Adds getName() to ConstructingObjectParser
- Adds more helpers to ValuesSourceParserHelper to support Constructing variety
@polyfractal polyfractal force-pushed the fixPercentileRanksNPE branch from f600ea5 to d5b5eb8 Compare August 15, 2017 15:08
@polyfractal polyfractal merged commit d26becc into elastic:master Aug 15, 2017
polyfractal added a commit that referenced this pull request Aug 15, 2017
An array of values is required because there is no default (or
reasonable way to set a default).  But validation for values
only happens if it is actually set.  If the values param is omitted
entirely than the agg builder will NPE.
polyfractal added a commit that referenced this pull request Aug 15, 2017
An array of values is required because there is no default (or
reasonable way to set a default).  But validation for values
only happens if it is actually set.  If the values param is omitted
entirely than the agg builder will NPE.
jasontedor added a commit to glefloch/elasticsearch that referenced this pull request Aug 16, 2017
* master: (458 commits)
  Prevent cluster internal `ClusterState.Custom` impls to leak to a client (elastic#26232)
  Add packaging test for systemd runtime directive
  [TEST] Reenable RareClusterStateIt#testDeleteCreateInOneBulk
  Serialize and expose timeout of acknowledged requests in REST layer (elastic#26189)
  (refactor) some opportunities to use diamond operator (elastic#25585)
  [DOCS] Clarified readme for testing a single page
  Settings: Add keystore.seed auto generated secure setting (elastic#26149)
  Update version information (elastic#25226)
  "result" : created -> "result" : "created" (elastic#25446)
  Set RuntimeDirectory (elastic#23526)
  Drop upgrade from full cluster restart tests (elastic#26224)
  Further improve docs for requests_per_second
  Docs disambiguate reindex's requests_per_second (elastic#26185)
  [DOCS] Cleanup link for ec2 discovery (elastic#26222)
  Fix document field equals and hash code test
  Use holder pattern for lazy deprecation loggers
  Settings: Add keystore creation to add commands (elastic#26126)
  Docs: Cleanup docs for ec2 discovery (elastic#26065)
  Fix NPE when `values` is omitted on percentile_ranks agg (elastic#26046)
  Several internal improvements to internal test cluster infra (elastic#26214)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants