Remove _field_stats endpoint #25577
Comments
We no longer use field_stats to get field info at index pattern creation time, but we do still use it prior to executing searches for some index patterns. There are plans to remove that option (it's already deprecated) but it does not look like that's happened yet. I'm not sure if it's targeted for 6.0. @epixa ? |
Pretty sure we are still planning to ship the "expand indices" option with K 6.0 (where we convert If we are confident that removing this optimization from Kibana won't result in worse, or specifically unusable, performance then I'm not opposed to removing it now. |
If the optim is to check if an index matches document in a specific timeframe with |
I agree |
With _field_stats gone, how does one easily calculate "old indices"? https://www.elastic.co/blog/managing-time-based-indices-efficiently has us using _field_stats and I don't think a min/max aggregation will be performant against lots of indices. I had mentioned this at #23914 (comment) as well but no one seemed to respond. A problem with the min/max rewriting that I've just noticed is that it appears to require some warming up that is painful on our cluster. Our Kibana currently uses the _field_stats for index calculation and requests usually are <10 seconds. But if I run the same query without the index subset, the first time takes several minutes. After that, they are <10 seconds (though a few seconds slower). This warm up seems to happen if I haven't run a query in an hour and it increases the load average in my cluster fairly substantially. |
@spalger, thanks for pointing that out. Unfortunately, we do have deletions and updates. Not a lot (<5% of the data), though. |
@trevan the "expand time based indices" feature should be done on ES side transparently. That's what we're trying to achieve. Kibana does not replace |
@jimczi, I know Kibana is trying to let ES handle the time range filters. I guess that must have been lost in what I said. Here's the query that can take several minutes: localhost:9200/logstash-*/_search?pretty' -d'{"size":0,"aggs":{"suggestions":{"terms":{"field":"ip.raw"}}},"query":{"range":{"@timestamp":{"gte":1499640596171,"lte":1499726996171,"format":"epoch_millis"}}}}' I just did a test to grab the times using both the old _field_stats method and the new "let ES do it all". I did the following 3 steps in order.
This is using version 5.4.1. Should this be moved to a separate issue? |
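For anyone who wants to reproduce the raw request quoted above, here is a minimal Python sketch that rebuilds the same search body. The helper name `build_search_body` and its parameters are assumptions made for illustration; only the JSON structure and the two timestamps come from the comment above.

```python
import json


def build_search_body(gte_ms, lte_ms, field="ip.raw", time_field="@timestamp"):
    """Build a terms aggregation restricted to a time range (hypothetical helper)."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {"suggestions": {"terms": {"field": field}}},
        "query": {
            "range": {
                time_field: {
                    "gte": gte_ms,
                    "lte": lte_ms,
                    "format": "epoch_millis",
                }
            }
        },
    }


# Timestamps taken from the query quoted in the thread.
GTE_MS = 1499640596171
LTE_MS = 1499726996171

body = build_search_body(GTE_MS, LTE_MS)
payload = json.dumps(body, separators=(",", ":"))

# Length of the queried window in hours.
window_hours = (LTE_MS - GTE_MS) / 3_600_000

print(payload)
print(window_hours)
```

The resulting `payload` can be sent as the body of a `_search` request against `logstash-*`. The window works out to 24 hours, so only about a day of indices should actually hold matching documents.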
An update for those following along: the field_stats API was removed from master today, and the corresponding "expand indices" option in Kibana was removed as well. This was made possible by a change to the search/msearch APIs that automatically restricts requests to the subset of shards that could actually contain documents matching the given filters, when a non-trivial number of shards match the given index pattern. This is a pretty naive summary of the change, so I encourage folks to take a look at the PR that added this improvement for more details: #25658 |
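To illustrate the idea behind that optimization, here is a toy Python model of skipping shards that cannot match a time range. It is only a sketch of the concept and not the Elasticsearch implementation; `ShardRange`, `can_match`, `shards_to_query`, and the daily index names are all hypothetical.

```python
from dataclasses import dataclass

DAY_MS = 86_400_000
BASE_MS = 1499472000000  # 2017-07-08T00:00:00Z in epoch milliseconds


@dataclass(frozen=True)
class ShardRange:
    """Smallest and largest timestamp stored in one shard (toy model)."""
    index: str
    min_ts: int
    max_ts: int


def can_match(shard, gte, lte):
    # The shard may hold matches only if its range overlaps [gte, lte].
    return shard.max_ts >= gte and shard.min_ts <= lte


def shards_to_query(shards, gte, lte):
    # Keep only the shards worth sending the real search to.
    return [s for s in shards if can_match(s, gte, lte)]


# Four made-up daily indices, each covering exactly one UTC day.
shards = [
    ShardRange(f"logstash-2017.07.{8 + i:02d}",
               BASE_MS + i * DAY_MS,
               BASE_MS + (i + 1) * DAY_MS - 1)
    for i in range(4)
]

# Same time window as the query quoted earlier in the thread.
selected = shards_to_query(shards, 1499640596171, 1499726996171)
print([s.index for s in selected])
```

In this toy model, only the indices for 9 and 10 July overlap the window, so the other two would be skipped without running the aggregation on them.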
@trevan we fixed a long list of issues to make |
@jimczi, can these changes be backported to 5.x? I'd like to be able to see if these changes will actually help while staying in 5.x, where _field_stats is still available as a backup option. |
we don't plan to backport these changes into 5.x |
That said, I will spend some time looking into backporting it, since I see the benefit here for a broader audience. So bear with me, I will update this issue @trevan |
@s1monw, are all the changes basically on the coordinating node? So if I have a 6.x client node talking to my 5.x cluster, would I be able to see if this is performant enough? My worry is that I currently have to use _field_stats in 5.x because of the performance hit, and I won't be able to know if you've made it performant enough in 6.x until after I upgrade, and by that time I'm stuck. That's why I'm asking about a backport. I would like some way to test these changes in my cluster before losing _field_stats. |
@trevan I can't answer your question in terms of Elasticsearch, but Kibana 5.5 will fail with a red status if it targets an ES cluster that has a 6.x node in it. This requirement will be relaxed in the final version of 5.x that is released to support a migration/rolling upgrade scenario, but you'll need to test the performance of this change with a raw query to Elasticsearch in that scenario. |
@epixa, yeah, I was planning on doing a raw query. That's how I've been testing out the non-_field_stats queries so far, since otherwise Kibana would kill our cluster. |
@trevan you can only see these changes once you have upgraded to the upcoming 5.6 and use a 6.0 client node, that is correct. I will nevertheless look into backporting this feature; it seems low risk at this point. |
@trevan FYI, I backported the change that went into master to 5.x; it will be released with 5.6. |
@trevan it would be very much appreciated if you could report back once you upgraded to |
@s1monw, thanks for the backport. I'll try and get us upgraded in the next month or two and report back. |
@trevan |
@s1monw, we just upgraded to 5.6.x. I ran the logstash-* version and then waited 5 hours before running the index list version. Both of them took about 16 seconds the first time. So it is looking good. Thanks. |
_field_stats endpoint has been deprecated in 5 and should be removed in 6.