[Rollup] Use composite's missing_bucket #31402

polyfractal · 2018-06-18T17:03:45Z

We can leverage the composite agg's new missing_bucket feature on terms groupings. This means the aggregation criteria used in the indexer will now return null buckets for missing keys. We then index these and rely on a default null_value on the Rollup's mapping to inject a placeholder.

On the search side of the house, we can remove the placeholder when unrolling the response.

By indexing null values, we can guarantee correct doc counts with "combined" jobs (where a job rolls up multiple schemas). This was previously impossible since composite would ignore documents that didn't have all the keys, meaning non-overlapping schemas would cause composite to return no buckets.
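To make the difference concrete, here is a minimal sketch in plain Java. It is not the Elasticsearch composite implementation; `MissingBucketSketch`, `group`, `total`, `doc`, and the sample field values are hypothetical names chosen for illustration. It groups documents by a list of keys and either skips documents that lack a key or keeps them under a null key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for composite grouping; not the Elasticsearch implementation.
class MissingBucketSketch {

    // Groups docs by the given keys and returns bucket key -> doc count.
    // With missingBucket == false, a doc lacking any key is skipped entirely.
    static Map<List<Object>, Long> group(List<Map<String, Object>> docs,
                                         List<String> keys,
                                         boolean missingBucket) {
        Map<List<Object>, Long> buckets = new LinkedHashMap<>();
        for (Map<String, Object> doc : docs) {
            List<Object> bucketKey = new ArrayList<>();
            boolean complete = true;
            for (String key : keys) {
                Object value = doc.get(key); // null when the field is absent
                if (value == null) {
                    complete = false;
                }
                bucketKey.add(value);
            }
            if (complete || missingBucket) {
                buckets.merge(bucketKey, 1L, Long::sum);
            }
        }
        return buckets;
    }

    // Sums the doc counts of all buckets.
    static long total(Map<List<Object>, Long> buckets) {
        long sum = 0;
        for (long count : buckets.values()) {
            sum += count;
        }
        return sum;
    }

    // Builds a single-field document.
    static Map<String, Object> doc(String field, Object value) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put(field, value);
        return d;
    }

    public static void main(String[] args) {
        // Two non-overlapping schemas rolled up by one "combined" job.
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc("process", "nginx"));
        docs.add(doc("process", "nginx"));
        docs.add(doc("filesystem", "/dev/sda1"));
        List<String> keys = Arrays.asList("process", "filesystem");
        System.out.println(group(docs, keys, false));
        System.out.println(group(docs, keys, true));
    }
}
```

In this sketch, no document has both keys, so grouping without the missing bucket yields no buckets at all, while grouping with it accounts for all three documents.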

The documentation has been adjusted to recommend a single, combined job. It also makes reference to the previous issue to help users that are upgrading (rather than just deleting the sections).

BWC
This change is somewhat incompatible with older rollup jobs, because all jobs in a shared index share the same mapping templates. So if the template was updated, new jobs would be "fixed" but old jobs would start getting null_values and end up in an inconsistent state. But not updating the index meant the new jobs would also be "broken".

I opted to just disallow new jobs from being created in rollup indices that are pre-6.4.0, so the user will have to create a new rollup index for new jobs.

Important Note:
This PR enables missing_bucket on all groups (terms, histo, date_histo), but we only enable a null_value on string terms. There's not a reasonable default missing value for numerics as all numerics are technically a real value. I want to follow this PR up with another that allows the user to configure a missing value for histo/date_histo if they desire.

Otherwise, the null values will just be indexed as null and dropped as usual, so this only "fixes" strings. But that's the major problem at the moment.

We can leverage the composite agg's new `missing_bucket` feature on terms groupings. This means the aggregation criteria used in the indexer will now return null buckets for missing keys. We then index these and rely on a default `null_value` on the Rollup's mapping to inject a placeholder. On the search side of the house, we can remove the placeholder when unrolling the response. By indexing null values, we can guarantee correct doc counts with "combined" jobs (where a job rolls up multiple schemas). This was previously impossible since composite would ignore documents that didn't have _all_ the keys, meaning non-overlapping schemas would cause composite to return no buckets. The docs have been adjusted to recommend a single, combined job. It also makes reference to the previous issue to help users that are upgrading (rather than just deleting the sections). Because the mapping change is incompatible with prior versions, this PR forbids 6.4.0+ jobs from being created in a pre-6.4.0 index.

elasticmachine · 2018-06-18T17:03:47Z

Pinging @elastic/es-search-aggs

colings86 · 2018-06-22T11:47:17Z

x-pack/docs/en/rollup/understanding-groups.asciidoc

-_all_ of the keys `A, B` and `C`.
+There was previously a limitation in how Rollup could handle indices that had heterogeneous mappings (multiple, unrelated/non-overlapping
+mappings).  The recommendation at the time was to configure a separate job per data "type".  For example, you might configure a separate
+job for each Beat module that you had enabled (one for `process`, another for `filesystem`, etc).


nit: I think it should be Beats module?

colings86 · 2018-06-22T11:51:10Z

x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/Rollup.java

@@ -85,6 +85,8 @@
    public static final String ROLLUP_TEMPLATE_VERSION_FIELD = "rollup-version";
    public static final String ROLLUP_TEMPLATE_VERSION_PATTERN =
            Pattern.quote("${rollup.dynamic_template.version}");
+    public static final String ROLLUP_NULL_VALUE_PLACEHOLDER = "ROLLUP_NULL_VALUE_PLACEHOLDER";
+    public static final String ROLLUP_NULL_VALUE = "__ROLLUP_NULL_VALUE_PLACEHOLDER__";


These look like they might be for the same purpose, maybe we should add comments to them to explain why they are both needed and what for?

colings86 · 2018-06-22T11:52:26Z

x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/RollupResponseTranslator.java

+                // Hide our `null_value` placeholder so it doesn't show up in the terms list
+                if (bucket.getKeyAsString().equals(Rollup.ROLLUP_NULL_VALUE)) {
+                    return null;
+                }


Why is this not needed for LongTerms and the histograms? Do they not have the concept of a null bucket as well?

colings86 · 2018-06-22T11:54:35Z

.../rollup/src/main/java/org/elasticsearch/xpack/rollup/action/TransportPutRollupJobAction.java

+
+            String stringVersion = (String)((Map<String, Object>) m).get(Rollup.ROLLUP_TEMPLATE_VERSION_FIELD);
+            if (stringVersion == null) {
+                logger.warn("Could not determine version of existing rollup metadata for index [" + indexName + "]");


Is a warning enough here? I am wondering, when stringVersion is null: a) will we end up throwing an NPE below, such as in the Version.fromString() method, and b) will we be able to "do the right thing"?
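One way to address this concern is to fail fast instead of logging and continuing. The following is only a sketch in plain Java under stated assumptions: `RollupVersionCheckSketch` and `requireVersion` are hypothetical names, the version is parsed by hand rather than through Elasticsearch's `Version` class, and the exception types are illustrative choices. Whether `Version.fromString()` actually throws on null is not established in this thread.

```java
import java.util.Map;

// Hypothetical sketch; not the code in TransportPutRollupJobAction.
class RollupVersionCheckSketch {
    // Mirrors the value of ROLLUP_TEMPLATE_VERSION_FIELD shown in the diff.
    static final String VERSION_FIELD = "rollup-version";

    // Returns {major, minor, revision}, or throws if the metadata carries no usable version.
    static int[] requireVersion(Map<String, Object> meta, String indexName) {
        Object raw = meta.get(VERSION_FIELD);
        if (!(raw instanceof String)) {
            // Stop here rather than warn, so nothing downstream ever sees a null version.
            throw new IllegalStateException(
                "Could not determine version of existing rollup metadata for index [" + indexName + "]");
        }
        String[] parts = ((String) raw).split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Unexpected version format [" + raw + "]");
        }
        int[] version = new int[3];
        for (int i = 0; i < 3; i++) {
            version[i] = Integer.parseInt(parts[i]);
        }
        return version;
    }
}
```

With this shape, a caller either gets a parsed version or an exception that names the index, so the "null version" case cannot reach a later comparison.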

colings86 · 2018-06-22T11:55:41Z

.../rollup/src/main/java/org/elasticsearch/xpack/rollup/action/TransportPutRollupJobAction.java

+            if (parsedVersion.before(Version.V_6_4_0)) {
+                String msg = "Cannot create rollup job [" + job.getConfig().getId() + "] because the rollup index contains " +
+                    "jobs from pre-6.4.0.  The mappings for these jobs are not compatible with 6.4.0+.  Please specify a new rollup " +
+                    "index.";


Does this mean that users are going to need to recreate their existing rollup jobs and rollup indexes when they upgrade from 6.3 to 6.4+?

colings86 · 2018-06-22T11:56:43Z

x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/IndexerUtils.java

-                if (v instanceof String) {
+                if (v == null) {
+                    // Arbitrary value to update the doc ID with for nulls
+                    docID.update(19);


disappointed you didn't make this value 42 😃

polyfractal · 2018-07-03T02:14:25Z

@colings86 sorry for the delay on this, I think it's ready for the next round of reviews.

jimczi

Can you explain why you need to translate null values to concrete values in the rollup index? IMO this is not needed. The issue without the missing_bucket option is that rollup documents with a missing value are omitted. With missing_bucket these rollup documents are taken into account and will be present in the rollup index. Now why can't you just index them with their null values explicitly? The count should be correct if you aggregate on another field, since documents with null values are indexed in the rollup.
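The argument can be illustrated with a small sketch in plain Java. It is not rollup code: `NullRollupCountSketch`, the `host`/`process`/`count` field names, and the sample values are all hypothetical. Rollup documents keep their null grouping values as-is, and counts aggregated on another field still add up.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of rollup documents stored with explicit nulls.
class NullRollupCountSketch {

    // A rollup document: grouping values (possibly null) plus the doc count it represents.
    static Map<String, Object> rollupDoc(String host, String process, long count) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("host", host);
        doc.put("process", process); // may be null; stored without a placeholder
        doc.put("count", count);
        return doc;
    }

    // Sums the stored counts per value of `field`, skipping docs where that field is null.
    static Map<String, Long> countsBy(List<Map<String, Object>> rollupDocs, String field) {
        Map<String, Long> result = new TreeMap<>();
        for (Map<String, Object> doc : rollupDocs) {
            Object key = doc.get(field);
            if (key != null) {
                result.merge((String) key, (Long) doc.get("count"), Long::sum);
            }
        }
        return result;
    }
}
```

Aggregating on `host` includes the documents whose `process` is null, because they were indexed rather than dropped; aggregating on `process` simply ignores them, as a terms aggregation would for a missing field.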

polyfractal · 2018-07-09T12:39:59Z

Hmm hmm... I think you're right. I was assuming we had to represent those nulls for accurate counts, but we don't care if the field is null so long as the rest of the fields are present (whereas before the entire document was omitted if one of the keys was gone).

Let me think about it, but I think you're right. That also means we don't have the compatibility issue anymore if we don't have to set a null_value.

Which also means no need for compatibility check or restart test. Non-keyword keys needed a null check for doc ID generation however, plus a test for null keys.

polyfractal · 2018-07-09T18:53:58Z

Ok, so I just pushed a commit which:

* Removed null_value
* Removed the rollup version compatibility check (since it was the use of null_value which made it incompatible)
* Removed the full cluster restart test, since there's no incompatibility to check anymore
* Added some logic to handle null numeric/histo keys which I missed before... those would have thrown NPEs. Added a test to verify that behavior

I left in the check for rollup version in general (since that's a useful validation) as well as the tweaks to the full cluster restart helper methods, since that will make future tests easier.
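For readers following along, the null handling for doc ID generation can be sketched roughly as below. This is plain Java using `java.util.zip.CRC32`, as the diff does, but it is not the code in `IndexerUtils`: `NullSafeDocIdSketch`, `docId`, and the set of supported key types are assumptions for illustration. The value 19 is the arbitrary null marker from the diff.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch of a null-safe checksum over composite key values.
class NullSafeDocIdSketch {
    // Arbitrary byte folded into the checksum when a key is null (19 in the diff).
    static final int NULL_MARKER = 19;

    static long docId(List<Object> keyValues) {
        CRC32 crc = new CRC32();
        for (Object v : keyValues) {
            if (v == null) {
                // Check for null first; casting or unboxing a null numeric key would throw.
                crc.update(NULL_MARKER);
            } else if (v instanceof String) {
                byte[] bytes = ((String) v).getBytes(StandardCharsets.UTF_8);
                crc.update(bytes, 0, bytes.length);
            } else if (v instanceof Long) {
                byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong((Long) v).array();
                crc.update(bytes, 0, bytes.length);
            } else if (v instanceof Double) {
                byte[] bytes = ByteBuffer.allocate(Double.BYTES).putDouble((Double) v).array();
                crc.update(bytes, 0, bytes.length);
            } else {
                throw new IllegalArgumentException("Unsupported key type: " + v.getClass());
            }
        }
        return crc.getValue();
    }
}
```

The important part is the ordering of the checks: the null branch comes before any type-specific branch, so a missing terms or histogram key contributes a fixed byte instead of failing.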

jimczi

Thanks @polyfractal . I left some comments regarding null buckets and date histograms.

jimczi · 2018-07-11T15:35:23Z

.../plugin/core/src/main/java/org/elasticsearch/xpack/core/rollup/job/DateHistoGroupConfig.java

@@ -159,7 +159,7 @@ public Rounding createRounding() {
        vsBuilder.dateHistogramInterval(interval);
        vsBuilder.field(field);
        vsBuilder.timeZone(timeZone);
-
+        vsBuilder.missingBucket(true);


We shouldn't allow missing bucket here. We use this config for the timestamp field, which is a required field for rollup documents. Moreover, since it's the first field in the composite, enabling missing buckets would disable the composite optimization when the field is indexed. IMO it's ok to require the timestamp field to be present since otherwise we have no way to know if the document should be part of the rollup.

polyfractal · 2018-07-11

Ah yeah, silly me. Absolutely shouldn't allow missing on the timestamp field :)

Good catch thanks
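The agreed rule (no missing bucket on the timestamp, missing bucket on the other groups) can be summarised in a short plain-Java sketch. It does not use the Elasticsearch builders; `CompositeSourcesSketch`, `missingBucketFlags`, `canBeRolledUp`, and the field names are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names are illustrative, not the rollup plugin's API.
class CompositeSourcesSketch {

    // Returns source name -> whether a null bucket is allowed, in composite order.
    // The timestamp source comes first and never allows a missing bucket.
    static Map<String, Boolean> missingBucketFlags(String timestampField, List<String> otherGroupFields) {
        Map<String, Boolean> flags = new LinkedHashMap<>();
        flags.put(timestampField, false);
        for (String field : otherGroupFields) {
            flags.put(field, true);
        }
        return flags;
    }

    // A document can only be assigned to a rollup interval if it has a timestamp.
    static boolean canBeRolledUp(Map<String, Object> doc, String timestampField) {
        return doc.get(timestampField) != null;
    }

    public static void main(String[] args) {
        System.out.println(missingBucketFlags("@timestamp", Arrays.asList("process", "bytes")));
    }
}
```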

jimczi · 2018-07-11T15:37:52Z

x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/IndexerUtils.java

@@ -83,14 +83,27 @@ private static CRC32 processKeys(Map<String, Object> keys, Map<String, Object> d
                doc.put(k + "." + RollupField.TIMESTAMP, v);
                doc.put(k  + "." + RollupField.INTERVAL, groupConfig.getDateHisto().getInterval());
                doc.put(k  + "." + DateHistoGroupConfig.TIME_ZONE, groupConfig.getDateHisto().getTimeZone().toString());
-                docID.update(Numbers.longToBytes((Long)v), 0, 8);
+                if (v == null) {


we should not allow null here

jimczi · 2018-07-11T15:39:37Z

x-pack/docs/en/rollup/understanding-groups.asciidoc

-when this particular scenario is fixed.
+The rollup job will automatically use a placeholder term (`__ROLLUP_NULL_VALUE_PLACEHOLDER__`) as the `null_value` for keyword fields,
+which allows it to handle documents that may be missing some of the grouping fields.  This placeholder is then removed from search
+results, resulting in correct doc counts in a manner that is invisible to the user.


This part should be updated since we don't use the placeholder term anymore.

polyfractal · 2018-07-12T17:54:14Z

Review comments addressed, tests back to passing too.

jimczi

LGTM

polyfractal · 2018-07-13T14:04:43Z

Woo, thanks @jimczi! And good catch on the null_value, I'm happy we didn't need to do that. :)

We can leverage the composite agg's new `missing_bucket` feature on terms groupings. This means the aggregation criteria used in the indexer will now return null buckets for missing keys. Because all buckets are now returned (even if a key is null), we can guarantee correct doc counts with "combined" jobs (where a job rolls up multiple schemas). This was previously impossible since composite would ignore documents that didn't have _all_ the keys, meaning non-overlapping schemas would cause composite to return no buckets. Note: date_histo does not use `missing_bucket`, since a timestamp is always required. The docs have been adjusted to recommend a single, combined job. It also makes reference to the previous issue to help users that are upgrading (rather than just deleting the sections).

* 6.x:
  Watcher: Make settings reloadable (#31746)
  [Rollup] Histo group config should support scaled_floats (#32048)
  lazy snapshot repository initialization (#31606)
  Add secure setting for watcher email password (#31620)
  Watcher: cleanup ensureWatchExists use (#31926)
  Add second level of field collapsing (#31808)
  Added lenient flag for synonym token filter (#31484) (#31970)
  Test: Fix a second case of bad watch creation
  [Rollup] Use composite's missing_bucket (#31402)
  Docs: Restyled cloud link in getting started
  Docs: Change formatting of Cloud options
  Re-instate link in StringFunctionUtils javadocs
  Correct spelling of AnalysisPlugin#requriesAnalysisSettings (#32025)
  Fix problematic chars in javadoc
  [ML] Move open job failure explanation out of root cause (#31925)
  [ML] Switch ML native QA tests to use a 3 node cluster (#32011)

polyfractal added the >enhancement, review, v7.0.0, :StorageEngine/Rollup, v6.4.0 labels Jun 18, 2018

polyfractal requested a review from colings86 June 18, 2018 17:03

colings86 reviewed Jun 22, 2018

polyfractal and others added 4 commits June 22, 2018 17:39

Review cleanup (d29f681)
Merge remote-tracking branch 'origin/master' into rollup_missing_bucket (d3197bb)
Add upgrade/restart tests (d75d2b0)
Make Restart tests play nicely with multiple jobs (bae94c5)

jimczi reviewed Jul 9, 2018

Review: no need for null_value (e43a975)

jimczi requested changes Jul 11, 2018

polyfractal added 2 commits July 11, 2018 17:49

Review cleanup (38ed727)
Merge remote-tracking branch 'origin/master' into rollup_missing_bucket (f7be757)

jimczi approved these changes Jul 13, 2018

polyfractal merged commit b7f07f0 into elastic:master Jul 13, 2018

colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
