Elasticsearch version (bin/elasticsearch --version): 7.6.1
Description of the problem including expected versus actual behavior:
Steps to reproduce:
DELETE test

PUT test
{
  "mappings": {
    "properties": {
      "dateRange": { "type": "date_range" }
    }
  }
}

PUT test/_doc/1
{
  "dateRange": {
    "gte": "2020-03-01"
  }
}

GET test/_search
{
  "aggs": {
    "test": {
      "date_histogram": {
        "field": "dateRange",
        "interval" : "day"
      }
    }
  }
}
This triggers the circuit breaker with the following error:
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[request] Data too large, data for [<reused_arrays>] would be [805344256/768mb], which is larger than the limit of [622775500/593.9mb]",
        "bytes_wanted" : 805344256,
        "bytes_limit" : 622775500,
        "durability" : "TRANSIENT"
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "test",
        "node" : "I8l1uS_GSKO8LfCDlV69OQ",
        "reason" : {
          "type" : "circuit_breaking_exception",
          "reason" : "[request] Data too large, data for [<reused_arrays>] would be [805344256/768mb], which is larger than the limit of [622775500/593.9mb]",
          "bytes_wanted" : 805344256,
          "bytes_limit" : 622775500,
          "durability" : "TRANSIENT"
        }
      }
    ]
  },
  "status" : 429
}
The above snippet trips the circuit breaker after a few seconds (which is good!). But a single document with an unbounded upper range in your dataset makes aggregations on date range fields impossible and slows down your system. Maybe we want to exit earlier in that case?
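To give a feel for the scale of the problem, here is a rough back-of-the-envelope sketch of how many daily buckets a single open-ended range would need. It assumes the missing upper bound is treated as the largest signed 64-bit millisecond timestamp; that is an assumption about the internals made for illustration, not something confirmed in this thread. The helper names are hypothetical.

```python
from datetime import datetime, timezone

DAY_MS = 24 * 60 * 60 * 1000

# Assumption (not confirmed in this thread): an open-ended "gte" range is
# stored with the largest signed 64-bit millisecond timestamp as upper bound.
ASSUMED_OPEN_UPPER_MS = 2**63 - 1


def to_epoch_ms(year, month, day):
    """Midnight UTC of the given date as epoch milliseconds."""
    return int(datetime(year, month, day, tzinfo=timezone.utc).timestamp() * 1000)


def day_bucket_count(gte_ms, lte_ms):
    """Number of daily buckets touched by the inclusive range [gte_ms, lte_ms]."""
    return lte_ms // DAY_MS - gte_ms // DAY_MS + 1


start = to_epoch_ms(2020, 3, 1)

# A range closed at 2020-03-10 touches ten daily buckets.
bounded = day_bucket_count(start, to_epoch_ms(2020, 3, 10))

# The open-ended range from the reproduction, under the assumption above.
unbounded = day_bucket_count(start, ASSUMED_OPEN_UPPER_MS)

print(bounded, unbounded)
```

Under that assumption the open-ended document alone asks for more than a hundred billion daily buckets, which would explain why the request breaker trips regardless of how small the index is.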
I don't know what the best fix is within the scope of this issue, but I think it would be great to have more options for the date_histogram aggregation.
We already have an extended_bounds setting; maybe we need a new setting to restrict bucket creation?
It could be a restricted_bounds setting, for instance, where we could specify min/max values.
I think this new setting would also be useful for multi-valued date fields, in addition to date_range, when we are not interested in all of the auto-generated buckets.
Thanks @spinscale! Going to close this as a duplicate of: #50109
@ajacob agreed! That's pretty much the direction we are thinking as well. In the most recent comment of that thread (#50109 (comment)) we suggested an extra flag on extended_bounds which lets you indicate that unbounded ranges should be "truncated" at the provided limits.
Incidentally, this would also fix a common complaint about extended_bounds: that it creates more buckets than expected, even for normal field types. People expect it to work as a hard threshold, when in reality the histogram covers whichever is wider, the extended_bounds or the bounds of the data, i.e. a max(extended_bounds, data bounds) situation.
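The difference between the two behaviours can be sketched in a few lines. This is only an illustration of the semantics described in this thread, not Elasticsearch code: both function names are hypothetical, and bounds are plain integers (think of them as day numbers).

```python
def extended_bounds_range(data_min, data_max, ext_min, ext_max):
    """Current behaviour as described above: the histogram covers whichever
    is wider, the data or the requested bounds."""
    return min(data_min, ext_min), max(data_max, ext_max)


def truncated_range(data_min, data_max, limit_min, limit_max):
    """Hypothetical 'truncate' behaviour: the data bounds are clamped to the
    requested limits, so values outside them create no buckets."""
    return max(data_min, limit_min), min(data_max, limit_max)


# Data starts at day 18322 and has an effectively unbounded upper end.
data_min, data_max = 18322, 10**11

# Requested bounds: days 18300 through 18400.
print(extended_bounds_range(data_min, data_max, 18300, 18400))
print(truncated_range(data_min, data_max, 18300, 18400))
```

With extended bounds the huge upper end of the data wins, so the bucket range stays enormous. With truncation the range stops at day 18400, which is the hard-threshold behaviour people tend to expect.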