[Docs] clarification about cardinality accuracy (#34616)

polyfractal · polyfractal · commit e259689b1a01 · 2018-10-22T13:16:11.000-04:00
Adds a bit more clarification about how accuracy is dependent on the dataset in question. Closes #18231
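
To illustrate why accuracy depends on the dataset, here is a minimal sketch of the classic HyperLogLog estimator. It is not the HyperLogLog++ implementation Elasticsearch uses: it omits the bias and small-range corrections, and SHA-1 stands in for the real hash function. The names `hash64`, `rank`, and `estimate_cardinality` are hypothetical and exist only for this example.

```python
import hashlib


def hash64(value: str) -> int:
    """Map a string to a 64-bit integer (SHA-1 is a stand-in hash, an assumption)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def rank(bits: int, width: int) -> int:
    """Number of leading zeros in a width-bit word, plus one."""
    return width - bits.bit_length() + 1


def estimate_cardinality(values, p: int = 12) -> float:
    """Raw HyperLogLog estimate using 2**p registers (no corrections applied)."""
    m = 1 << p
    registers = [0] * m
    for value in values:
        h = hash64(value)
        index = h >> (64 - p)               # first p bits select a register
        rest = h & ((1 << (64 - p)) - 1)    # remaining bits supply the leading zeros
        registers[index] = max(registers[index], rank(rest, 64 - p))
    alpha = 0.7213 / (1 + 1.079 / m)        # standard constant, valid for m >= 128
    return alpha * m * m / sum(2.0 ** -r for r in registers)


if __name__ == "__main__":
    print(estimate_cardinality(f"user-{i}" for i in range(50_000)))
```

Each register keeps only the largest leading-zero run seen among the hashes routed to it, so the estimate is a function of how the hashes happen to be distributed. Two datasets with the same true cardinality can therefore produce slightly different errors.
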
diff --git a/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc b/docs/reference/aggregations/metrics/cardinality-aggregation.asciidoc
@@ -150,10 +150,18 @@ public static void main(String[] args) {
 
 image:images/cardinality_error.png[]
 
-For all 3 thresholds, counts have been accurate up to the configured threshold
-(although not guaranteed, this is likely to be the case). Please also note that
-even with a threshold as low as 100, the error remains very low, even when
-counting millions of items.
+For all 3 thresholds, counts have been accurate up to the configured threshold.
+Although not guaranteed, this is likely to be the case.  Accuracy in practice depends
+on the dataset in question.  In general, most datasets show consistently good
+accuracy. Also note that even with a threshold as low as 100, the error
+remains very low (1-6% as seen in the above graph) even when counting millions of items.
+
+The HyperLogLog++ algorithm depends on the leading zeros of hashed
+values, so the exact distribution of hashes in a dataset can affect the
+accuracy of the cardinality estimate.
+
+Please also note that even with a threshold as low as 100, the error remains
+very low, even when counting millions of items.
 
 ==== Pre-computed hashes