
Remove _field_stats endpoint #25577


Closed
jimczi opened this issue Jul 6, 2017 · 24 comments
Labels
blocker >breaking :Search/Search Search-related issues that do not fall into other categories

Comments

jimczi (Contributor) commented Jul 6, 2017

_field_stats endpoint has been deprecated in 5 and should be removed in 6.

jimczi added the :Search/Search, blocker, and >breaking labels on Jul 6, 2017
jimczi (Contributor, Author) commented Jul 6, 2017

@Bargs @spalger can you confirm that Kibana 6 will not use _field_stats at all and rely only on _field_caps and ES request cache for aggs ?

Bargs commented Jul 6, 2017

We no longer use field_stats to get field info at index pattern creation time, but we do still use it prior to executing searches for some index patterns. There are plans to remove that option (it's already deprecated) but it does not look like that's happened yet. I'm not sure if it's targeted for 6.0. @epixa ?

spalger (Contributor) commented Jul 6, 2017

Pretty sure we are still planning to ship the "expand indices" option with Kibana 6.0 (where we convert index* into an index list using the _field_stats API). IIRC it was intended as a backup plan in case the wildcard optimizations that ES has implemented aren't sufficient for some users.

If we are confident that removing this optimization from Kibana won't result in worse, or specifically unusable, performance then I'm not opposed to removing it now.

jimczi (Contributor, Author) commented Jul 6, 2017

> If we are confident that removing this optimization from Kibana won't result in worse, or specifically unusable, performance then I'm not opposed to removing it now.

If the optimization is to use _field_stats to check whether an index has documents in a specific timeframe, then I think it's safe to remove. The only difference from the _field_stats solution is that the optimization will happen locally on each shard rather than on the Kibana side. If you have 2,000 shards you'll send 2,000 queries, but they should return very fast for shards whose min/max timeframe doesn't overlap the query.
@colings86 should confirm, but the min/max rewriting for time range queries plus the request cache should be enough to get decent performance even when doing a gazillion-shard query.
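The shard-local rewrite described here can be sketched in a few lines of Python. This is purely an illustration: `Shard`, `rewrite_range`, and the returned labels are made-up names for this sketch, not the actual Lucene/Elasticsearch implementation.

```python
from dataclasses import dataclass

@dataclass
class Shard:
    """Per-shard min/max for the @timestamp field (Lucene keeps such
    statistics per segment; this class is illustrative only)."""
    min_ts: int
    max_ts: int

def rewrite_range(shard, gte, lte):
    """Sketch of the shard-local rewrite: a range query whose window
    misses the shard's value range becomes match_none, which returns
    instantly and is trivially cacheable by the request cache."""
    if lte < shard.min_ts or gte > shard.max_ts:
        return "match_none"
    if gte <= shard.min_ts and lte >= shard.max_ts:
        return "match_all"      # every doc matches; no per-doc check needed
    return "range"              # must actually evaluate docs in the window

# A "gazillion shard" query: only shards overlapping the window do real work.
shards = [Shard(0, 100), Shard(100, 200), Shard(200, 300)]
print([rewrite_range(s, gte=150, lte=250) for s in shards])
# ['match_none', 'range', 'range']
```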

colings86 (Contributor) commented

> @colings86 should confirm but the min/max rewriting for time range queries plus the request cache should be enough to have decent perf even when doing a gazillion shard query.

I agree

trevan commented Jul 10, 2017

With _field_stats gone, how does one easily calculate "old indices"? https://www.elastic.co/blog/managing-time-based-indices-efficiently has us using _field_stats, and I don't think a min/max aggregation will be performant against lots of indices. I mentioned this at #23914 (comment) as well, but no one seemed to respond.

A problem with the min/max rewriting that I've just noticed is that it appears to require some warming up that is painful on our cluster. Our Kibana currently uses _field_stats for index calculation, and requests usually take <10 seconds. But if I run the same query without the index subset, the first run takes several minutes. After that, runs are <10 seconds (though a few seconds slower than before). This warm-up seems to happen if I haven't run a query in an hour, and it increases the load average on my cluster fairly substantially.
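On the "old indices" question: one common replacement for the _field_stats approach in that blog post is to derive index age from the date embedded in time-based index names, the way curator-style tooling does. A minimal sketch, assuming logstash-YYYY.MM.DD naming (the function and parameter names are illustrative, not part of any API):

```python
from datetime import datetime, timedelta

def old_indices(names, now, keep_days, pattern="logstash-%Y.%m.%d"):
    """Return time-based indices older than keep_days, judging age by
    the date embedded in the name rather than by field statistics."""
    cutoff = now - timedelta(days=keep_days)
    old = []
    for name in names:
        try:
            if datetime.strptime(name, pattern) < cutoff:
                old.append(name)
        except ValueError:
            pass  # skip indices that don't follow the naming scheme
    return old

names = ["logstash-2017.07.01", "logstash-2017.07.09", ".kibana"]
print(old_indices(names, now=datetime(2017, 7, 10), keep_days=7))
# ['logstash-2017.07.01']
```

This only works when document timestamps line up with the index naming scheme, which is the usual case for daily logstash indices.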

spalger (Contributor) commented Jul 10, 2017

@trevan sounds like you would benefit from #14835

trevan commented Jul 10, 2017

@spalger, thanks for pointing that out. Unfortunately, we do have deletions and updates. Not a lot (<5% of the data), though.

jimczi (Contributor, Author) commented Jul 10, 2017

@trevan the "expand time based indices" feature should be handled transparently on the ES side. That's what we're trying to achieve. Kibana does not replace _field_stats with a min/max aggregation; it simply removes the optimization and lets ES do the right thing with the time range filters.
Now, what you're describing is not expected. Can you share the query that takes several minutes "without the index subset"?

trevan commented Jul 10, 2017

@jimczi, I know Kibana is trying to let ES handle the time range filters. I guess that must have been lost in what I said.

Here's the query that can take several minutes:

curl 'localhost:9200/logstash-*/_search?pretty' -d '{"size":0,"aggs":{"suggestions":{"terms":{"field":"ip.raw"}}},"query":{"range":{"@timestamp":{"gte":1499640596171,"lte":1499726996171,"format":"epoch_millis"}}}}'

I just did a test to grab the times using both the old _field_stats method and the new "let ES do it all". I did the following 3 steps in order.

  1. _field_stats call took <1s
  2. Above request using the index list took 6 seconds the first time and then 2 seconds the next two
  3. Above request using the wildcard pattern took 47 seconds the first time and then 3 seconds the next two

This is using version 5.4.1.

Should this be moved to a separate issue?

epixa (Contributor) commented Jul 13, 2017

An update for those following along: the field_stats API was removed from master today, and the corresponding "expand indices" option in Kibana was removed as well.

This was made possible by a new change to the search/msearch APIs that automatically optimizes requests to hit only the subset of shards that could actually have documents matching the given filters, when a non-trivial number of shards match the given index pattern. This is a pretty naive summary of the change, so I encourage folks to look at the PR that added this improvement for more details: #25658
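The pre-filter phase this summary describes can be sketched roughly as follows. All names and the threshold value are illustrative, not the actual Elasticsearch implementation (see #25658 for the real one):

```python
def prefilter_shards(shards, gte, lte, threshold=128):
    """Sketch of the coordinating-node optimization: when an index
    pattern expands to many shards, first check (cheaply, from each
    shard's min/max stats) which shards *could* match the time range,
    and fan the real search out only to those. The threshold models
    "non-trivial amount of shards"; its value here is made up."""
    if len(shards) < threshold:
        return shards                      # not worth the extra round-trip
    return [s for s in shards
            if not (lte < s["min_ts"] or gte > s["max_ts"])]

# 200 daily shards, each covering a 100ms-wide slice of time.
shards = [{"min_ts": i * 100, "max_ts": i * 100 + 99} for i in range(200)]
hit = prefilter_shards(shards, gte=450, lte=520)
print(len(hit))  # 2
```

With a narrow time window, the actual search touches only the handful of shards whose ranges overlap it, which is what made the Kibana-side "expand indices" workaround unnecessary.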

jimczi (Contributor, Author) commented Jul 13, 2017

@trevan we fixed a long list of issues in the last few days to make _field_stats completely obsolete: #25658 and #25632 are the main ones.
They are all merged in master (v6) and should bring significant improvements to Kibana's handling of time range filters. The issue you described should also be fixed. _field_stats is effectively gone, but much more has been added to ES core in the meantime to make the transition transparent in v6.

trevan commented Jul 13, 2017

@jimczi, can these changes be backported to 5.x? I'd like to be able to see if these changes will actually help while staying on 5.x, where _field_stats is still available as a backup option.

s1monw (Contributor) commented Jul 13, 2017

> @jimczi, can these changes be backported to 5.x? I'd like to be able to see if these changes will actually help while staying in 5.x where _field_stats is still available as a back up option.

we don't plan to backport these changes into 5.x

s1monw (Contributor) commented Jul 13, 2017

> we don't plan to backport these changes into 5.x

That said, I will spend some time looking into backporting it, since I see the benefit here for a broader audience. So bear with me; I will update this issue. @trevan

trevan commented Jul 13, 2017

@s1monw, are all the changes basically on the coordinating node? So if I have a 6.x client node talking to my 5.x cluster, would I be able to see if this is performant enough? My worry is that I currently have to use _field_stats in 5.x because of the performance hit, and I won't be able to know whether you've made it performant enough in 6.x until after I upgrade, and by that time I'm stuck. That's why I'm asking about a backport. I would like some way to test these changes on my cluster before losing _field_stats.

epixa (Contributor) commented Jul 13, 2017

@trevan I can't answer your question in terms of Elasticsearch, but Kibana 5.5 will fail with a red status if it targets an ES cluster that has a 6.x node in it. This requirement will be relaxed in the final version of 5.x that is released to support a migration/rolling upgrade scenario, but you'll need to test the performance of this change with a raw query to Elasticsearch in that scenario.

trevan commented Jul 13, 2017

@epixa, yeah, I was planning on doing a raw query. That's how I've been testing the non-_field_stats queries so far, since otherwise Kibana would kill our cluster.

s1monw (Contributor) commented Jul 13, 2017

@trevan that is correct: you'll only see these changes once you've upgraded to the upcoming 5.6 and use a 6.0 client node. I will nevertheless look into backporting this feature; it seems low risk at this point.

s1monw (Contributor) commented Jul 15, 2017

@trevan FYI, I back-ported the change that went into master to 5.x; it will be released with 5.6.

s1monw (Contributor) commented Jul 17, 2017

@trevan it would be very much appreciated if you could report back, once you've upgraded to 5.6, on whether this helps you or not. The sooner the better; we might still have time to fix things if needed. Note that all of your nodes must run 5.6 or higher in order to make use of the optimization. If you have questions about how you should structure your query, please feel free to share your current setup. Ideally we would do this in a discuss forum instead of here, but you are more than welcome to paste a link to the discuss thread once you've created it so we can follow up.

trevan commented Jul 17, 2017

@s1monw, thanks for the backport. I'll try to get us upgraded in the next month or two and report back.

s1monw (Contributor) commented Sep 11, 2017

@trevan 5.6.0 is out. It would be fantastic if you could report back on this.

trevan commented Oct 30, 2017

@s1monw, we just upgraded to 5.6.x. I ran the logstash-* version and then waited 5 hours before running the index-list version. Both took about 16 seconds the first time. So it is looking good. Thanks.
