Add a metric aggs to support TSDB high performance computing #84930
Labels: :Analytics/Aggregations, >enhancement, :StorageEngine/TSDB, Team:Analytics
Description
There is a common case in TSDB, e.g. the total CPU cost of a cluster. The metric is collected per node and is named node.cpu_percent. To get the metric line, the DSL is:
It uses the sum_bucket pipeline aggregation to calculate the total cpu_percent across all nodes.
But it will face many problems:
the search.max_buckets count limit.
The example is a common case in TSDB. We can describe the requirement as follows:
To calculate the metric from a bucket, the data is as follows:
If we want to get a metric from the bucket, all 9 numbers must be reduced together in a single step, e.g.:
But in the time series case, the requirement is always: first calculate the metric of each time series line, and then calculate the metric across all time series lines.
E.g. in the node.cpu_percent case above, we first calculate the avg value of each time series line:
And then get the sum of all time series lines: sum = a' + b' + c'.
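The two-stage computation can be illustrated with a small Python sketch. The sample values, the series names a, b, c, and the helper two_stage are hypothetical and only stand in for three node.cpu_percent time series with three points each. This is not Elasticsearch code.

```python
from statistics import mean

# Hypothetical samples: three time series (one per node), three points each.
series = {
    "a": [10.0, 20.0, 30.0],
    "b": [40.0, 50.0, 60.0],
    "c": [70.0, 80.0, 90.0],
}


def two_stage(series, per_series, across_series):
    """Reduce each time series first, then combine the per-series results."""
    per_line = {tsid: per_series(values) for tsid, values in series.items()}
    return across_series(per_line.values()), per_line


# Stage 1: avg of each line (a', b', c'). Stage 2: sum = a' + b' + c'.
total, per_line = two_stage(series, mean, sum)

# For contrast: a single-step reduction over all 9 numbers at once.
flat_avg = mean(v for values in series.values() for v in values)

print(per_line)  # per-series averages
print(total)     # sum of the per-series averages
print(flat_avg)  # one avg over every sample
```

With this data the per-series averages are 20, 50 and 80, so the two-stage result is 150, while a single avg over all nine samples gives 50. A single-step metric cannot express the two-stage requirement.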
The requirement can be implemented with pipeline aggs, but that approach has the problems described above.
To implement the requirement, we can add a new metric aggs operator.
time_series_metric contains three fields:
The result is:
Since the data is sorted by _tsid, we can calculate the aggregator results in the data node, and aggregate the results of the _tsid downsampled value one by one.
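A rough Python sketch of why sorting by _tsid helps, assuming documents arrive as (tsid, value) pairs already ordered by tsid. The function name stream_sum_of_avgs and the sample data are hypothetical. The sketch ignores time bucketing and the merge of per-shard partial results.

```python
from itertools import groupby
from operator import itemgetter


def stream_sum_of_avgs(sorted_docs):
    """Sum of per-series averages over (tsid, value) pairs sorted by tsid.

    Because the input is sorted, only the running state of the current
    series is kept in memory; no bucket per series has to be materialized.
    """
    total = 0.0
    for _tsid, group in groupby(sorted_docs, key=itemgetter(0)):
        running = 0.0
        count = 0
        for _, value in group:
            running += value
            count += 1
        # The series is complete here: fold its average into the total.
        total += running / count
    return total


# Hypothetical input, already sorted by tsid.
docs = [
    ("a", 10.0), ("a", 20.0), ("a", 30.0),
    ("b", 40.0), ("b", 60.0),
    ("c", 70.0),
]
result = stream_sum_of_avgs(docs)
print(result)
```

Memory use stays constant with respect to the number of time series, which is the property that would let such an aggregation avoid the bucket count limit that the pipeline approach hits.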