Extended Stats Bucket Aggregation returns parse exception when sigma is an Integer #17499

Closed
russcam opened this issue Apr 4, 2016 · 5 comments
russcam commented Apr 4, 2016

Elasticsearch version:

2.3.0

JVM version:

java version "1.8.0_77"
Java(TM) SE Runtime Environment (build 1.8.0_77-b03)
Java HotSpot(TM) 64-Bit Server VM (build 25.77-b03, mixed mode)

OS version:

Windows 10

Description of the problem including expected versus actual behavior:

The documentation for Extended Stats Aggregation specifies that sigma can be any non-negative double. In the example in the docs, 3 is passed as the value for sigma.

Extended Stats Bucket Aggregation also has a sigma parameter, but its documentation does not specify the parameter's type. I assumed it is also a double and confirmed this with a (failing) integration test for NEST: a parse exception is returned when an integer (i.e. a number with no decimal places) is passed as the value for sigma. Based on the example provided for Extended Stats Aggregation, I would expect an integer to be a valid input for a double.
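To illustrate how this kind of failure can arise, here is a minimal sketch of a strict type check. It assumes the JSON parser hands back a `java.lang.Integer` for `2` and a `java.lang.Double` for `2.0`. The class and method names are hypothetical and this is not the actual Elasticsearch parser code; only the wording of the message is modelled on the parse exception reported in this issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the actual Elasticsearch parser code.
final class StrictSigmaCheckSketch {

    // Mimics a strict check that accepts only java.lang.Double values.
    static double requireDouble(String name, Object value) {
        if (value instanceof Double) {
            return (Double) value;
        }
        String type = value == null ? "null" : value.getClass().getSimpleName();
        throw new IllegalArgumentException("Parameter [" + name
                + "] must be a Double, type `" + type + "` provided instead");
    }

    public static void main(String[] args) {
        // Assumption: parsed request parameters arrive as a Map<String, Object>.
        Map<String, Object> params = new LinkedHashMap<>();

        params.put("sigma", 2.0); // autoboxed to Double, accepted
        System.out.println(requireDouble("sigma", params.get("sigma")));

        params.put("sigma", 2); // autoboxed to Integer, rejected
        try {
            requireDouble("sigma", params.get("sigma"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that `Integer` and `Double` are unrelated boxed types in Java, so an `instanceof Double` test rejects a whole-number literal even though its value converts to a double without loss.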

Steps to reproduce:

  1. Failing request (from NEST integration test). Notice the value of 2 for sigma:
{
  "size": 0,
  "aggs": {
    "projects_started_per_month": {
      "date_histogram": {
        "field": "startedOn",
        "interval": "month"
      },
      "aggs": {
        "commits": {
          "sum": {
            "field": "numberOfCommits"
          }
        }
      }
    },
    "extended_stats_commits_per_month": {
      "extended_stats_bucket": {
        "buckets_path": "projects_started_per_month>commits",
        "sigma": 2
      }
    }
  }
}

returns:

{
  "error" : {
    "root_cause" : [ {
      "type" : "parse_exception",
      "reason" : "Parameter [sigma] must be a Double, type `Integer` provided instead"
    } ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [ {
      "shard" : 0,
      "index" : "project",
      "node" : "afZfSF7hQoqmZQip5n3-LQ",
      "reason" : {
        "type" : "parse_exception",
        "reason" : "Parameter [sigma] must be a Double, type `Integer` provided instead"
      }
    } ]
  },
  "status" : 500
}
  2. Change the value of sigma in the request from step 1 from 2 to 2.0:
{
  "size": 0,
  "aggs": {
    "projects_started_per_month": {
      "date_histogram": {
        "field": "startedOn",
        "interval": "month"
      },
      "aggs": {
        "commits": {
          "sum": {
            "field": "numberOfCommits"
          }
        }
      }
    },
    "extended_stats_commits_per_month": {
      "extended_stats_bucket": {
        "buckets_path": "projects_started_per_month>commits",
        "sigma": 2.0
      }
    }
  }
}

successfully returns results:

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 100,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "projects_started_per_month" : {
      "buckets" : [ {
        "key_as_string" : "2015-04-01T00:00:00.000Z",
        "key" : 1427846400000,
        "doc_count" : 2,
        "commits" : {
          "value" : 796.0
        }
      }, {
        "key_as_string" : "2015-05-01T00:00:00.000Z",
        "key" : 1430438400000,
        "doc_count" : 12,
        "commits" : {
          "value" : 4982.0
        }
      }, {
        "key_as_string" : "2015-06-01T00:00:00.000Z",
        "key" : 1433116800000,
        "doc_count" : 13,
        "commits" : {
          "value" : 5847.0
        }
      }, {
        "key_as_string" : "2015-07-01T00:00:00.000Z",
        "key" : 1435708800000,
        "doc_count" : 7,
        "commits" : {
          "value" : 3693.0
        }
      }, {
        "key_as_string" : "2015-08-01T00:00:00.000Z",
        "key" : 1438387200000,
        "doc_count" : 7,
        "commits" : {
          "value" : 3848.0
        }
      }, {
        "key_as_string" : "2015-09-01T00:00:00.000Z",
        "key" : 1441065600000,
        "doc_count" : 8,
        "commits" : {
          "value" : 4421.0
        }
      }, {
        "key_as_string" : "2015-10-01T00:00:00.000Z",
        "key" : 1443657600000,
        "doc_count" : 9,
        "commits" : {
          "value" : 5119.0
        }
      }, {
        "key_as_string" : "2015-11-01T00:00:00.000Z",
        "key" : 1446336000000,
        "doc_count" : 1,
        "commits" : {
          "value" : 339.0
        }
      }, {
        "key_as_string" : "2015-12-01T00:00:00.000Z",
        "key" : 1448928000000,
        "doc_count" : 12,
        "commits" : {
          "value" : 4218.0
        }
      }, {
        "key_as_string" : "2016-01-01T00:00:00.000Z",
        "key" : 1451606400000,
        "doc_count" : 10,
        "commits" : {
          "value" : 4804.0
        }
      }, {
        "key_as_string" : "2016-02-01T00:00:00.000Z",
        "key" : 1454284800000,
        "doc_count" : 7,
        "commits" : {
          "value" : 4316.0
        }
      }, {
        "key_as_string" : "2016-03-01T00:00:00.000Z",
        "key" : 1456790400000,
        "doc_count" : 10,
        "commits" : {
          "value" : 5009.0
        }
      }, {
        "key_as_string" : "2016-04-01T00:00:00.000Z",
        "key" : 1459468800000,
        "doc_count" : 2,
        "commits" : {
          "value" : 1664.0
        }
      } ]
    },
    "extended_stats_commits_per_month" : {
      "count" : 13,
      "min" : 339.0,
      "max" : 5847.0,
      "avg" : 3773.5384615384614,
      "sum" : 49056.0,
      "sum_of_squares" : 2.21307799E8,
      "variance" : 2784084.325443786,
      "std_deviation" : 1668.55755832509,
      "std_deviation_bounds" : {
        "upper" : 7110.653578188641,
        "lower" : 436.42334488828146
      }
    }
  }
}
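For reference, the `std_deviation_bounds` values in the response above are consistent with the bounds being computed as avg ± sigma × std_deviation with sigma = 2.0 (figures rounded from the response):

```latex
\begin{aligned}
\text{upper} &= \text{avg} + \sigma \cdot \text{std\_deviation}
             = 3773.5385 + 2.0 \times 1668.5576 \approx 7110.65 \\
\text{lower} &= \text{avg} - \sigma \cdot \text{std\_deviation}
             = 3773.5385 - 2.0 \times 1668.5576 \approx 436.42
\end{aligned}
```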
@colings86

@russcam Are you sure this is an Elasticsearch issue rather than a NEST issue? I have just tested this on both master and 2.3 (see command line output below for what I did for 2.3) and have not been able to reproduce the issue. Could you maybe provide the stack trace from the Elasticsearch logs?

$ curl -XDELETE localhost:9200/test
{"acknowledged":true}

$ curl -XPOST 'localhost:9200/test/doc/1' -d '{ "i": 2 } '
{"_index":"test","_type":"doc","_id":"1","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}

$ curl -XGET 'localhost:9200/test/_search?pretty' -d '{
  "aggs": {
    "orders": {
      "extended_stats": {
        "field": "i",
        "sigma": 2
      }
    }
  }
}'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "test",
      "_type" : "doc",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "i" : 2
      }
    } ]
  },
  "aggregations" : {
    "orders" : {
      "count" : 1,
      "min" : 2.0,
      "max" : 2.0,
      "avg" : 2.0,
      "sum" : 2.0,
      "sum_of_squares" : 4.0,
      "variance" : 0.0,
      "std_deviation" : 0.0,
      "std_deviation_bounds" : {
        "upper" : 2.0,
        "lower" : 2.0
      }
    }
  }
}

colings86 removed the help wanted adoptme label Apr 6, 2016

russcam commented Apr 6, 2016

@colings86 I think you've made the same mistake I initially did in looking at extended_stats and not extended_stats_bucket 😄 The former works fine with any number but the latter has the issue.

Looking at ExtendedStatsBucketParser in 2.3, it looks like the parsing logic for this should be brought in line with the logic used in ExtendedStatsParser.
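As a rough illustration of what more lenient parsing could look like, the sketch below accepts any numeric JSON value and converts it to a double. The class and method names are hypothetical, and this is neither the code in ExtendedStatsParser nor the fix in the PR. Rejecting negative values is also an assumption, borrowed from the Extended Stats Aggregation docs, which describe sigma as a non-negative double.

```java
// Hypothetical sketch; names are illustrative, not Elasticsearch's API.
final class LenientSigmaParseSketch {

    // Accepts Integer, Long, Float, Double, etc. and widens to double.
    static double parseSigma(Object value) {
        if (!(value instanceof Number)) {
            String type = value == null ? "null" : value.getClass().getSimpleName();
            throw new IllegalArgumentException(
                    "Parameter [sigma] must be a number, type `" + type + "` provided instead");
        }
        double sigma = ((Number) value).doubleValue();
        // Assumption: sigma must be non-negative, as documented for extended_stats.
        if (Double.isNaN(sigma) || sigma < 0.0) {
            throw new IllegalArgumentException(
                    "Parameter [sigma] must be non-negative, got [" + sigma + "]");
        }
        return sigma;
    }
}
```

With a check like this, `"sigma": 2` and `"sigma": 2.0` would both resolve to the same double value, while non-numeric input such as a string would still be rejected.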

I see @alexshadow007 has opened a PR with a fix; happy to submit a fix if needed.

NEST had an issue with modelling the Extended Stats Bucket Aggregation's Sigma property as a nullable int; that is now fixed 👍

@colings86

@russcam Yep, you are right; I missed that it was extended_stats_bucket we were talking about here. The PR from @alexshadow007 looks pretty good, so I'll work with him to address the few comments I have on it and get it merged.

@colings86

Fixed by #17562
