Skip to content

Inaccurate values when calculating SUM or AVG aggregations #29345

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
aurimasplu opened this issue Apr 3, 2018 · 5 comments
Closed

Inaccurate values when calculating SUM or AVG aggregations #29345

aurimasplu opened this issue Apr 3, 2018 · 5 comments

Comments

@aurimasplu
Copy link

aurimasplu commented Apr 3, 2018

Elasticsearch version : 6.1.1-1

Plugins installed: X-pack

JVM version: java-1.8.0-openjdk-1.8.0.151-5.b12.el7_4.x86_64

OS version : RHEL 7.4

Description of the problem including expected versus actual behavior:
When running sum or avg aggregations within terms aggregation getting inaccurate results.

Steps to reproduce:

I run this query:

{
	"size": 1000,
		"_source": [
		"tag.agent_host",
		"tag.ifDescr",
		"interface.ifHCInOctets"
	],
	"query": {
		"bool": {
			"filter": [{
					"range": {
						"@timestamp": {
							"gte": "1522330307503",
							"lte": "1522330607503",
							"format": "epoch_millis"
						}
					}
				}, {
					"query_string": {
						"analyze_wildcard": true,
						"query": "tag.agent_host:testrouter1.testdomain.net AND tag.ifDescr:GigabitEthernet*0 AND _exists_:interface.ifHCInOctets"
					}
				}
			]
		}
	},
	"aggs": {
		"DBF_Device": {
			"terms": {
				"field": "tag.agent_host",
				"size": 10
			},
			"aggs": {
				"DBF_Interface": {
					"terms": {
						"field": "tag.ifDescr",
						"size": 10
					},
					"aggs": {
						"DBF_Metric_SUM": {
							"sum": {
								"field": "interface.ifHCInOctets"
							}
						}
					}
				}
			}
		}
	}
}

and get this response:

{
    "took": 154,
    "timed_out": false,
    "_shards": {
        "total": 1155,
        "successful": 1155,
        "skipped": 1123,
        "failed": 0
    },
    "hits": {
        "total": 5,
        "max_score": 0,
        "hits": [
            {
                "_index": "test-index-2018.03.29",
                "_type": "metrics",
                "_id": "QrH1cWIBrVLc4RYS48TM",
                "_score": 0,
                "_source": {
                    "tag": {
                        "agent_host": "testrouter1.testdomain.net",
                        "ifDescr": "GigabitEthernet0/0/0"
                    },
                    "interface": {
                        **"ifHCInOctets": 402423415108**
                    }
                }
            },
            {
                "_index": "test-index-2018.03.29",
                "_type": "metrics",
                "_id": "BsP4cWIBrVLc4RYSomgl",
                "_score": 0,
                "_source": {
                    "tag": {
                        "agent_host": "testrouter1.testdomain.net",
                        "ifDescr": "GigabitEthernet0/0/0"
                    },
                    "interface": {
                        **"ifHCInOctets": 402849967058**
                    }
                }
            },
            {
                "_index": "test-index-2018.03.29",
                "_type": "metrics",
                "_id": "x7f2cWIBrVLc4RYSzXP8",
                "_score": 0,
                "_source": {
                    "tag": {
                        "agent_host": "testrouter1.testdomain.net",
                        "ifDescr": "GigabitEthernet0/0/0"
                    },
                    "interface": {
                        **"ifHCInOctets": 402533814871**
                    }
                }
            },
            {
                "_index": "test-index-2018.03.29",
                "_type": "metrics",
                "_id": "e733cWIBrVLc4RYSt7HE",
                "_score": 0,
                "_source": {
                    "tag": {
                        "agent_host": "testrouter1.testdomain.net",
                        "ifDescr": "GigabitEthernet0/0/0"
                    },
                    "interface": {
                        **"ifHCInOctets": 402662297513**
                    }
                }
            },
            {
                "_index": "test-index-2018.03.29",
                "_type": "metrics",
                "_id": "wsn5cWIBrVLc4RYSjcrG",
                "_score": 0,
                "_source": {
                    "tag": {
                        "agent_host": "testrouter1.testdomain.net",
                        "ifDescr": "GigabitEthernet0/0/0"
                    },
                    "interface": {
                        **"ifHCInOctets": 402989750842**
                    }
                }
            }
        ]
    },
    "aggregations": {
        "DBF_Device": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {
                    "key": "testrouter1.testdomain.net",
                    "doc_count": 5,
                    "DBF_Interface": {
                        "doc_count_error_upper_bound": 0,
                        "sum_other_doc_count": 0,
                        "buckets": [
                            {
                                "key": "GigabitEthernet0/0/0",
                                "doc_count": 5,
                                "DBF_Metric_SUM": {
                                    **"value": 2013459218432**
                                }
                            }
                        ]
                    }
                }
            ]
        }
    }
}

When calculating output by myself from hits section I get that SUM of ifHCInOctets fields is 2013459245392 but when elastic calculates its SUM aggregation result is 2013459218432 which is lesser by 26960.
I also tried adding various sorting or removing child element- result always remained inaccurate. Same problem appears with avg aggregation.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@colings86
Copy link
Contributor

@aurimasplu I have not been able to reproduce this bug. I tried with the following script which is based on the information you provided but with a few modifications to get it to work (your query uses a @timestamp field that doesn't exist in your hits because of source filtering in your request) and some assumptions:

  1. That your tag and interface fields are object fields and not nested fields
  2. That interface.ifHCInOctets is a long field
  3. That tag.ifDescr and tag.agent_host are keyword fields
  • Could you provide the mappings for this index so we can validate the assumptions above?
  • Could you try my script on your system to see if you can reproduce the bug with my script?
  • Could you modify my script so it does reproduce the bug?

@aurimasplu
Copy link
Author

Hi @colings86 , thanks for quick answer.
It might be that you are right regarding field type as these metrics are 64 bit counters, they should be long but right now they are float. We will do remapping today and will see how it goes.
Regarding the script I do not see anything attached :)

Here is whole index mapping:


{
    "test-index-2018.04.03": {
        "aliases": {},
        "mappings": {
            "metrics": {
                "_all": {
                    "enabled": false
                },
                "dynamic_templates": [
                    {
                        "tags": {
                            "path_match": "tag.*",
                            "match_mapping_type": "string",
                            "mapping": {
                                "ignore_above": 512,
                                "type": "keyword"
                            }
                        }
                    },
                    {
                        "metrics_long": {
                            "match_mapping_type": "long",
                            "mapping": {
                                "index": false,
                                "type": "float"
                            }
                        }
                    },
                    {
                        "metrics_double": {
                            "match_mapping_type": "double",
                            "mapping": {
                                "index": false,
                                "type": "float"
                            }
                        }
                    },
                    {
                        "text_fields": {
                            "match": "*",
                            "mapping": {
                                "norms": false
                            }
                        }
                    }
                ],
                "properties": {
                    "@timestamp": {
                        "type": "date"
                    },
                    "interface": {
                        "properties": {
                            "ifAlias": {
                                "type": "text",
                                "norms": false
                            },
                            "ifErrCongestedPktsDrops": {
                                "type": "float",
                                "index": false
                            },
                            "ifErrCongestionLimitPktDrops": {
                                "type": "float",
                                "index": false
                            },
                            "ifErrRxNoBuffs": {
                                "type": "float",
                                "index": false
                            },
                            "ifHCInOctets": {
                                "type": "float",
                                "index": false
                            },
                            "ifHCInUcastPkts": {
                                "type": "float",
                                "index": false
                            },
                            "ifHCOutOctets": {
                                "type": "float",
                                "index": false
                            },
                            "ifHCOutUcastPkts": {
                                "type": "float",
                                "index": false
                            },
                            "ifInDiscards": {
                                "type": "float",
                                "index": false
                            },
                            "ifInErrors": {
                                "type": "float",
                                "index": false
                            },
                            "ifOutDiscards": {
                                "type": "float",
                                "index": false
                            },
                            "ifOutErrors": {
                                "type": "float",
                                "index": false
                            },
                            "ifTotXoffSent": {
                                "type": "float",
                                "index": false
                            },
                            "ifnicTxStalls": {
                                "type": "float",
                                "index": false
                            }
                        }
                    },
                    "measurement_name": {
                        "type": "keyword"
                    },
                    "tag": {
                        "properties": {
                            "agent_host": {
                                "type": "keyword",
                                "ignore_above": 512
                            },
                            "host": {
                                "type": "keyword",
                                "ignore_above": 512
                            },
                            "ifAlias": {
                                "type": "keyword",
                                "ignore_above": 512
                            },
                            "ifDescr": {
                                "type": "keyword",
                                "ignore_above": 512
                            },
                            "ifIndex": {
                                "type": "keyword",
                                "ignore_above": 512
                            },
                            "measurement_tag": {
                                "type": "keyword",
                                "ignore_above": 512
                            },
                            "platform_tag": {
                                "type": "keyword",
                                "ignore_above": 512
                            }
                        }
                    }
                }
            },
            "_default_": {
                "_all": {
                    "enabled": false
                },
                "dynamic_templates": [
                    {
                        "tags": {
                            "path_match": "tag.*",
                            "match_mapping_type": "string",
                            "mapping": {
                                "ignore_above": 512,
                                "type": "keyword"
                            }
                        }
                    },
                    {
                        "metrics_long": {
                            "match_mapping_type": "long",
                            "mapping": {
                                "index": false,
                                "type": "float"
                            }
                        }
                    },
                    {
                        "metrics_double": {
                            "match_mapping_type": "double",
                            "mapping": {
                                "index": false,
                                "type": "float"
                            }
                        }
                    },
                    {
                        "text_fields": {
                            "match": "*",
                            "mapping": {
                                "norms": false
                            }
                        }
                    }
                ],
                "properties": {
                    "@timestamp": {
                        "type": "date"
                    },
                    "measurement_name": {
                        "type": "keyword"
                    }
                }
            }
        },
        "settings": {
            "index": {
                "mapping": {
                    "total_fields": {
                        "limit": "5000"
                    }
                },
                "refresh_interval": "10s",
                "number_of_shards": "4",
                "translog": {
                    "flush_threshold_size": "2048mb"
                },
                "provided_name": "test-index-2018.04.03",
                "creation_date": "1522713612149",
                "number_of_replicas": "0",
                "uuid": "kXS2qTXJQu66lQUlSx5OyA",
                "version": {
                    "created": "6010199"
                }
            }
        }
    }
}
 

@colings86
Copy link
Contributor

Oops I forgot to paste the script. Here it is:

DELETE test

PUT test
{
  "mappings": {
    "doc": {
      "properties": {
        "tag": {
          "type": "object",
          "properties": {
            "agent_host": {
              "type": "keyword"
            },
            "ifDescr": {
              "type": "keyword"
            }
          }
        },
        "interface": {
          "type": "object",
          "properties": {
            "ifHCInOctets": {
              "type": "long"
            }
          }
        }
      }
    }
  }
}

POST test/doc/1
{
  "tag": {
    "agent_host": "testrouter1.testdomain.net",
    "ifDescr": "GigabitEthernet0/0/0"
  },
  "interface": {
    "ifHCInOctets": 402423415108
  }
}

POST test/doc/2
{
  "tag": {
    "agent_host": "testrouter1.testdomain.net",
    "ifDescr": "GigabitEthernet0/0/0"
  },
  "interface": {
    "ifHCInOctets": 402849967058
  }
}

POST test/doc/3
{
  "tag": {
    "agent_host": "testrouter1.testdomain.net",
    "ifDescr": "GigabitEthernet0/0/0"
  },
  "interface": {
    "ifHCInOctets": 402533814871
  }
}

POST test/doc/4
{
  "tag": {
    "agent_host": "testrouter1.testdomain.net",
    "ifDescr": "GigabitEthernet0/0/0"
  },
  "interface": {
    "ifHCInOctets": 402662297513
  }
}

POST test/doc/5
{
  "tag": {
    "agent_host": "testrouter1.testdomain.net",
    "ifDescr": "GigabitEthernet0/0/0"
  },
  "interface": {
    "ifHCInOctets": 402989750842
  }
}

GET test/_search
{
  "size": 1000,
  "_source": [
    "tag.agent_host",
    "tag.ifDescr",
    "interface.ifHCInOctets"
  ],
  "query": {
    "bool": {
      "filter": [
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "tag.agent_host:testrouter1.testdomain.net AND tag.ifDescr:GigabitEthernet*0 AND _exists_:interface.ifHCInOctets"
          }
        }
      ]
    }
  },
  "aggs": {
    "DBF_Device": {
      "terms": {
        "field": "tag.agent_host",
        "size": 10
      },
      "aggs": {
        "DBF_Interface": {
          "terms": {
            "field": "tag.ifDescr",
            "size": 10
          },
          "aggs": {
            "DBF_Metric_SUM": {
              "sum": {
                "field": "interface.ifHCInOctets"
              }
            }
          }
        }
      }
    }
  }
}

I think the problem will come down the floating point rounding since your fields are floats. Note that from 6.3 the basic metric aggregations (sum, avg, stats, extended_stats) will use Kahan summation which will significantly reduce the error introduced by floating point rounding on these calculations (see #27848).

I'm going to close this for now assuming that your issue will be fixed when you remap to a long field type but please comment if that doesn't solve your problem and we can look into this more

@aurimasplu
Copy link
Author

Remapping to long worked as a charm, now sum and avg are calculated as expected. @colings86 thanks a lot for your help :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants