-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Aggregation Framework Refactoring meta #41713
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Pinging @elastic/es-analytics-geo |
…() (#43214) A number of the aggregation base classes have an abstract doEquals() and doHashCode() (e.g. InternalAggregation.java, AbstractPipelineAggregationBuilder.java). Theoretically this is so the sub-classes can add to the equals/hashCode and don't need to worry about calling super.equals(). In practice, it's mostly just confusing/inconsistent. And if there are more than two levels, we end up with situations like InternalMappedSignificantTerms which has to call super.doEquals() which defeats the point of having these overridable methods. This PR removes the do versions and just use equals/hashCode ensuring the super when necessary. This PR is part of #41713 refactoring meta * Refactored all subclasses of InternalAggregation to remove doEquals() and doHashCode() * Refactored all subclasses of AbstractPipelineAggregationBuilder to remove doEquals() and doHashCode() * Refactored all subclasses of AbstractAggregationBuilder and CompositeValuesSourceBuilder to remove doEquals() and doHashCode()
…() (elastic#43214) A number of the aggregation base classes have an abstract doEquals() and doHashCode() (e.g. InternalAggregation.java, AbstractPipelineAggregationBuilder.java). Theoretically this is so the sub-classes can add to the equals/hashCode and don't need to worry about calling super.equals(). In practice, it's mostly just confusing/inconsistent. And if there are more than two levels, we end up with situations like InternalMappedSignificantTerms which has to call super.doEquals() which defeats the point of having these overridable methods. This PR removes the do versions and just use equals/hashCode ensuring the super when necessary. This PR is part of elastic#41713 refactoring meta * Refactored all subclasses of InternalAggregation to remove doEquals() and doHashCode() * Refactored all subclasses of AbstractPipelineAggregationBuilder to remove doEquals() and doHashCode() * Refactored all subclasses of AbstractAggregationBuilder and CompositeValuesSourceBuilder to remove doEquals() and doHashCode() Backport from v8.0.0
Procedural note: I've moved the test-related items to their own ticket: #51324 |
Adding |
I think most of what's left on this one is #82273. |
I moved "Allow aggs to "see" up the tree" into its own issue #82290. This was the last outstanding issue in this meta issue, so we can now close it. |
This is a running list of things in the agg framework we might like to refactor.
Agg framework
Agg boilerplate simplification.
Aggregations require a lot of boilerplate. Most aggregations will require a Builder, Aggregator, AggregatorFactory, result interface, parsed result interface and Internal result object. Simplifying this would make it easier to develop new aggs and collapse the agg package considerably
#43664
ValuesSource/ValuesSourceConfig/ValueType/ValueSourceType
We have a number of classes and enums that represent what kind of data a field is. They all exist for variously good reasons, but it's difficult to grok their purpose as a new developer and how they all interact. If we could simplify or unify them somehow it would be helpful.
Or at the very least come up with clearer names so their purpose is more obvious. And javadocs explaining their purpose
#42949
HashCode/doHashCode/doInnerHashCode
DONE: The agg framework has mixed style when it comes to asking child objects for their hashcode. Some override hashcode directly, some require
doHashCode
or evendoInnerHashCode()
. Doesn't really matter which style is used but it should be consistent#43214
equals/doEquals/doInnerEquals
DONE: Same as hashcodes, but with equals
#43214
Replace
DocValuesIndexFieldData
builder with static methods?The
DocValuesIndexFieldData
builder is a bit funky; instantiated by itself it returns a sortedSetDV, invoked withnumericType()
it returns a sortedNumericDV, withscriptFunction
it returns a sortedSet with the script. And there's a catch-all override for special binary DVs. None of this is very builder'y since the parameters are mutually exclusive, and it's really just a thin wrapper for constructing DVs.It might be simpler to replace this with a set of static helper methods that callers can use to generate DVs and ditch the builder model.
Separate XContent serialization from reduce logic?
Some classes, like agg buckets, combine both xcontent serialization and reduction logic. This leads to very large classes, and is a confusing mixture of concerns. Is it possible to separate this? Is it a good idea?
Allow aggs to "see" up the tree?
Aggs are isolated by design, and only know about their direct sub-aggs (and only in a limited manner). If aggs could look up the tree, it would open up new agg possibilities. And it would likely help with the pipeline agg refactoring. This would add complexity, and the isolation helps keep everything simple/composable.
The text was updated successfully, but these errors were encountered: