-
Notifications
You must be signed in to change notification settings - Fork 25.2k
Rest client: provide the ability to define a specific timeout for a request #21789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Hi @gsmet thanks for reporting this. How big of an overhead would be for you to use a specific client instance for long running operations? Do you have an example of requests that would need that? |
Hi @javanna, For instance, it might be useful when you use the bulk update API. This API call might take significantly longer that what you would expect for a "normal" search query. Setting the global timeout higher is not a good solution IMHO. In Hibernate Search (https://github.com/hibernate/hibernate-search - we are working on the Elasticsearch support for the next version), we also allow the user to set the timeout of a query. This is used to set the Lucene timeout in the Lucene backend and the Elasticsearch timeout in the Elasticsearch backend but, if we are using the Rest client, we need to be sure the HTTP timeout is higher than that. That being said, I think you're right about the number of method arguments and maybe a DSL would be in order. To be completely thorough, I tried to implement a backend for Jest (https://github.com/searchbox-io/Jest) based on the Elasticsearch rest client the other day and I found that the Jest client has a lot of useful options (the reaper for connections idle for too long being one but there are a few others which might be interesting). I think taking a look to what is done there might give you some good ideas. That being said, they do not support setting the timeout per query atm either. |
I think you mean the elasticsearch The socket timeout is quite different though, it is the timeout when waiting for individual packets, it doesn't mean the request will never take more than n seconds overall. That said I am not sure these two timeout mechanisms should be related to each other. There may still be use-cases for increasing the socket timeout per request though, we just need to find a good way to do it in the low-level client. Also important if we want to expose it to the high-level client that is in the works. |
We discussed this in our FixItFriday and we are still not convinced that we should expose the socket timeout per request. I explained above why the search timeout example shouldn't necessarily require to increase the socket timeout too, please let me know if that makes sense. Did you actually experience socket timeout problems when playing with search timeout? Or was that a theoretical use-case? |
Hello,
We did experience socket timeout issues in one particular case. In Hibernate Search, we setup indexes for users and then allow users to use those indexes (index, search). But in the initial setup phase, we have to wait for the index to reach a particular status (yellow or green). This is done through a request to the Health API, with a particularly long timeout, because in this particular case we can wait. Which means that, for this single Health API call to work, we have to raise the socket timeout to a high value, which will affect other calls (search API in particular). This is just an example, though: some requests to the Search API could require a longer timeout too, but only some of them. Here are the workarounds we could think of, and why they're not satisfactory:
Configurable timeouts are a feature we provide when searching on a local Lucene instance, and it would just feel weird that when the indexing service is remote (and thus timeout issues are more relevant), we provide less control. |
Hi @yrodiere thanks for the detailed explanation. The call to cluster health after creating an index should not be needed anymore against elasticsearch 5.x. Also, I wonder how long you would want to wait for green, 30 seconds (which is the default socket timeout) seems already quite high to me. I do agree that having to globally increase the timeout is not ideal, as well as having separate client instances just because you need a different socket timeout. But I would like to understand first whether you really need a higher timeout in those calls and why. |
Thanks for the info. We target ES2 too, though. Not because we want to, but, you know, there are still users out there.
We use 60s as a default. 30s would probably be alright, I agree. But I also think 30s is way too high for a global default. 30s for a random, reasonably tuned search requests (low hits size, sorts on optimized fields, etc.) is way too much. When your requests are used to build web pages, you generally don't want to wait for more than, say, one second.
The problem is not so much that we can't set arbitrarily high timeouts, but it's more that we can't set multiple, different timeouts. With a Health request I'd want something like 30s. With a create index query, maybe something similar, 30s. With a normal search query I'd want something like 1s. With a very heavy search query (executed for some rarely used report, and thus not optimized at all), maybe 5s or 10s would be required. Anyway, I wonder why we need to argue on whether high timeouts are needed. The Elasticsearch APIs provides ways to set timeouts. We established that currently, the non-configurable socket timeout gets in the way of these API timeouts. So, unless there are there plans to drop the API timeouts, I'd say we need to add something to make them usable on the client... don't we? |
heya @yrodiere I wouldn't call this arguing, I am asking questions to better understand the usecase, as the change requires modifying public apis to provide the timeout per request. I am looking for the right compromise between making things configurable, yet keeping them simple, which is a challenge and I am sure you know that too :) I definitely agree with you that the default of 30 seconds is way too high, I would advocate to have a 1 second default too, but users asked us to raise it and we saw in practice that 1 second was too low. Our reindex from remote api uses the REST client and we saw in practice the same problems, which convinced us to raise the default. I tend to be in favour of exposing the timeout per request, my concerns are mainly around the fact that it can be misused. For instance when you mention search and apis that expose a timeout parameter, I would ask you to read my reply above. The semantics of the timeout query_string parameter varies api by api and is different compared to socket timeout. I am marking this for discussion so we can talk about it again with the team and figure out what we want to do given the info you provided. Thank you! |
Ah, I can only promise I will try to use it right :) But yes, more configuration options usually means more ways to blow up in your face.
Looking forward to hearing from you then. Thanks! |
Note: I just discovered there is another global timeout that takes precedence over all other timeouts. It is If a solution is ever implemented, it probably should allow to raise this timeout per-request, too; otherwise raising the socket timeout will be useless. Was there any discussion about the issue? Do you think there is a chance you would accept a pull request? |
Right, I had forgotten about Good point, if we allow to change the socket timeout per request, also max retry should be changed per request, probably we should allow to pass in some kind of request config argument like the apache http client does. I don't know yet what we want to do, sorry we haven't discussed this yet with the team, I will bring it up asap and let you know. |
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688 Signed-off-by: dpitera <[email protected]>
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688 Signed-off-by: David Pitera <[email protected]>
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688 Signed-off-by: David Pitera <[email protected]>
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688 Signed-off-by: David Pitera <[email protected]>
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: #688 Signed-off-by: David Pitera <[email protected]>
I may have a use case that requires this: We are creating a new index and bulk loading it. Per the ES recommendation, we are starting out at 0 replicas, and only once the index is fully loaded do we set the replicas higher. Once we do that, the index health goes yellow while it copies the data to the other nodes. We don't want to set the index live until the replicas are ready, so we're calling the health endpoint with wait_for_no_initializing_shards set to true. This can often take more than 30 seconds. Thoughts? |
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688 Signed-off-by: David Pitera <[email protected]>
The ES RestClient does [not support](elastic/elasticsearch#21789 (comment)) a per request timeout configuration, however it does at least support a timeout configuration option `maxRetriesInMillis` that will timeout the overall request, including retries. It can be useful for others to configure this timeout, especially given the fact that users can configure their gremlin-script-execution timeout. Issues: JanusGraph#688
Hi, I'd like to add our use case here as well. With the removal of |
Wasn't this solved in 7.9 by #57972 ? From what I understand, one can now provide per-request configuration, including timeouts. |
@yrodiere seems to me this only provides the ability to select a socket timeout (timeout when waiting for individual packets), and not a real client side timeout, like the |
@dorony As you noticed, This ticket (created by a coworker of mine) was initially about being able to raise the existing timeouts on a per-request basis so that a long-running operation (bulk updates, health check, ...) can be executed from time to time without raising the timeouts globally to some absurdly large value (e.g. 60s). As far as I can tell, that should now be doable. I still have to try and implement it in my library, though. As for the client timeout, I personally implement it with a separate |
Thanks @yrodiere. I think that from a user perspective a real client side timeout (implemented with a While implementing it outside the client is possible, it doesn't provide for the best experience - for example I won't be able to see which host the request has timed out against. Do you think I should open a different issue for that? |
I do. But I'm not a developer of this client, I'm a user just like you :) |
This has been open for quite a while, and hasn't had a lot of interest. For now I'm going to close this as something we aren't planning on implementing. We can re-open it later if needed. We also have the https://github.com/elastic/elasticsearch-java/ which is under more active development. |
Hi!
Describe the feature:
Currently, the new Rest client only allows to set a default
RequestConfigCallback
globally. It would be nice if we could set aRequestConfigCallback
specifically for a request.Typically, it can be very useful when you want to set a specific timeout for a long operation. It would be nice if this
RequestConfigCallback
could be provided with the information of the request (so that for instance you could set a different timeout for different types of requests).--
Guillaume
The text was updated successfully, but these errors were encountered: