Skip to content

The Jest Elasticsearch health indicator uses a potentially heavy call to check its status #9379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rodol-fo opened this issue Jun 1, 2017 · 14 comments
Labels
status: superseded An issue that has been superseded by another type: bug A general bug

Comments

@rodol-fo
Copy link

rodol-fo commented Jun 1, 2017

The spring-boot-actuator Elasticsearch health check performs a call to /_all/_stats which potentially comes with a big response. In my case, this response is ~8MB at the moment and regular health checks were causing a huge load on the system. We create indices at an hourly basis. IMHO, using /_cat/health is better suited for this task

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Jun 1, 2017
@bclozel
Copy link
Member

bclozel commented Jun 2, 2017

I agree, the output of that command can be quite large.

It seems that the /_cat/health endpoint may not be supported in SaaS environments though; they tend to use REST gateways and limit features in shared environments, see searchly documentation for example. I've just tried with a searchly instance (ES 5.1+) and this doesn't work (I'm getting an HTTP 500).

Do you know of other potential endpoints that we could use?

@bclozel bclozel added status: waiting-for-feedback We need additional information before we can continue and removed status: waiting-for-triage An issue we've not yet triaged labels Jun 2, 2017
@rodol-fo
Copy link
Author

rodol-fo commented Jun 5, 2017

In the case of Searchly, all administrative features of ElasticSearch are restricted from the API, according to their documentation. That doesn't leave us with a lot of room, does it? I'm sure that Searchly have their reasons to do so, but that is one case out of millions of users, and many of them are not developing in shared environments with special restrictions.

I still believe that the right endpoint to call by default should be /_cat/health which could be overridden via configuration properties.

Instead of having management.health.elasticsearch.indices= we could have management.health.elasticsearch.endpoint=/_cat/health

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Jun 5, 2017
@philwebb philwebb added priority: normal type: bug A general bug and removed status: feedback-provided Feedback has been provided labels Jun 6, 2017
@philwebb philwebb added this to the 1.5.4 milestone Jun 6, 2017
@philwebb
Copy link
Member

philwebb commented Jun 6, 2017

I think we should treat this one as a bug so I've targeted it to 1.5.4. We've only got a few days before that release so we might not get to it in time. It still feels like a solution in 1.5.x would be useful.

@philwebb philwebb added the for: team-attention An issue we'd like other members of the team to review label Jun 6, 2017
@bclozel
Copy link
Member

bclozel commented Jun 6, 2017

Note: it's not a case of Searchly vs. rest of the world. It's more a case of ElasticSearch instances running on premise vs. in the cloud (the official elastic.co cloud offering or the AWS one). Even companies running their own cluster can have X-Pack enabled and disable cluster info access from applications.

@snicoll snicoll modified the milestones: 1.5.4, 1.5.5 Jun 8, 2017
@snicoll snicoll removed the for: team-attention An issue we'd like other members of the team to review label Jun 21, 2017
@bclozel
Copy link
Member

bclozel commented Jun 21, 2017

The team just discussed this topic and it seems that changing the default for all setups out there might not be the solution we're looking for. Having a strategy that uses one or the other might be more useful; in fact, the HealthIndicator bean could be that strategy.

Right now, defining your own bean named elasticsearchHealthIndicator should replace the existing one and provide something more suited to your needs.

We could consider switching strategies and providing different information automatically, but I'm not aware of any clear signal in the configuration that says if you've got access to those endpoints or not (besides trying and possibly failing). Any idea about that @rodol-fo ?

Let's keep this issue opened until we've figured this out.

@wilkinsona wilkinsona modified the milestones: 1.5.6, 1.5.5 Jul 25, 2017
@wilkinsona
Copy link
Member

@rodol-fo Your input would be useful here. Please see the comment from @bclozel above.

@wilkinsona wilkinsona modified the milestones: 1.5.8, 1.5.7 Sep 11, 2017
@wilkinsona wilkinsona added the status: blocked An issue that's blocked on an external project change label Sep 11, 2017
@rodol-fo
Copy link
Author

rodol-fo commented Sep 14, 2017

Hello @bclozel,

I think you are right. The only bulletproof way of knowing if a particular ES endpoint is available is by trying a request on it. However, in my experience /_cluster/health was always available when I worked with cloud based ES clusters, especifically AWS and Elastic.

I'm not so sure about /_cat/health in the case Elastic Cloud but it is definitely available for Amazon. See the below:

http://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-supported-es-operations.html

I hope this helps

@wilkinsona
Copy link
Member

Thanks, @rodol-fo.

I think I've changed my opinion on this. Having spent some more time reading around, and reading the Elasticsearch documentation in particular, I now think that /_cluster/health is a more appropriate default for the health indicator. However, I can't see a way to make that change in 1.5.x without possibly breaking people's applications.

Perhaps we should consider leaving this as-is in 1.5 and making a change in 2.0 instead?

@wilkinsona wilkinsona added the for: team-attention An issue we'd like other members of the team to review label Oct 13, 2017
@snicoll
Copy link
Member

snicoll commented Oct 14, 2017

That last option has my vote.

@wilkinsona wilkinsona removed this from the 1.5.8 milestone Oct 16, 2017
@wilkinsona wilkinsona removed for: team-attention An issue we'd like other members of the team to review status: blocked An issue that's blocked on an external project change labels Oct 18, 2017
@wilkinsona wilkinsona added this to the 2.0.0.M6 milestone Oct 18, 2017
@bclozel
Copy link
Member

bclozel commented Oct 30, 2017

It seems the /_cat/** endpoints can only be leveraged using REST calls directly; this means you can use any HTTP client directly or the official elasticsearch-rest-client (ES 5.0+) / elasticsearch-rest-high-level-client (ES 6.0+) dependencies.

In other words, we can't use the Java native transport API to fetch that information.
The org.elasticsearch.client.Client native transport API or io.searchbox.client.JestClient are what we're auto-configuring, and none of those seem to support that endpoint.

The Java native transport should be deprecated in the future, see here and here.
Ultimately, using elasticsearch-rest-client means using Apache httpclient (4.5.x, so in line with what we've got right now).

With all that in mind, a couple of options:

  1. Leave the current behavior as it is and revisit things in the future
  2. Deprecate both ES health indicators in 1.5.x and replace them with a single indicator that uses elasticsearch-rest-client, for both native and Jest use cases. Note that this choice probably means altering existing starters to include that dependency. Of course, it would also make sense to auto-configure a elasticsearch-rest-client instance for application usage as well.

@snicoll, @wilkinsona, @philwebb, @rodol-fo - let me know what you think and if I've missed something.

@philwebb
Copy link
Member

I don't think we should do much in 1.5.x except perhaps add a warning to the documentation. For 2.x, I guess option 2 makes the most sense in the long-term, but it's certainly something we can tackle in 2.1 if necessary.

@bclozel bclozel modified the milestones: 2.0.0.M6, 2.1 Oct 31, 2017
@bclozel
Copy link
Member

bclozel commented Mar 22, 2018

Depends on #12600

@bclozel bclozel added the status: blocked An issue that's blocked on an external project change label Mar 22, 2018
@bclozel bclozel removed their assignment Mar 22, 2018
@bclozel bclozel removed the status: blocked An issue that's blocked on an external project change label May 7, 2018
@bclozel
Copy link
Member

bclozel commented May 7, 2018

#12600 is now fixed, and we can try a global approach for that change now.
/_cluster/health is still probably the best way to go, even if some providers don't support it (at least one of them doesn't seem to support /_all/_stats anyway).

@bclozel bclozel modified the milestones: General Backlog, 2.1.x Oct 12, 2018
@bclozel bclozel changed the title The Elasticsearch health indicator uses a potentially heavy call to check its status The Jest Elasticsearch health indicator uses a potentially heavy call to check its status Oct 29, 2018
@bclozel bclozel added the status: superseded An issue that has been superseded by another label Oct 29, 2018
@bclozel bclozel removed this from the General Backlog milestone Oct 29, 2018
@bclozel
Copy link
Member

bclozel commented Oct 29, 2018

The ES REST client will be supported by Spring Data ElasticSearch in the next Moore release train. As of #12600, we're auto-configuring those clients in Spring Boot.

I'm closing this issue in favour of #14914 since Jest is now supporting the cluster endpoint. This aligns the Jest health indicator with what we've got for the transport client already.

We'll figure the rest out in #15008.

@bclozel bclozel closed this as completed Oct 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: superseded An issue that has been superseded by another type: bug A general bug
Projects
None yet
Development

No branches or pull requests

6 participants