Is the default max_concurrent_shard_requests (5) too low? #60197
Comments
Pinging @elastic/es-search (:Search/Search)
Pinging @elastic/es-search (Team:Search)
This came up recently as in some of our CCS benchmarks we noticed that the default setting was a bottleneck. The current default of 5 requests per data node is arbitrary, and it is hard for users to tune, as it also depends on how many clients are sending requests. The throttling happens on the coordinating node, which has no information about how the data nodes are doing, yet it limits the shard level requests sent to them. When we discussed this topic we said that data nodes are better equipped to decide how many requests to run in parallel, looking at their current load, and that coordinators should just send what they have without applying any preventive throttling. One prerequisite, though, would be to batch shard level requests rather than sending them one by one, which would help reduce the number of round trips and the load on the transport layer.
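The coordinator-side throttle described above can be sketched as a per-node semaphore. This is a hypothetical illustration, not Elasticsearch code: the class and method names are made up, and it only models the key behavior, namely that at most 5 shard level requests to a given data node are in flight at once.

```python
import threading
import time

class PerNodeThrottle:
    """Hypothetical model of a coordinating node capping in-flight
    shard requests to one data node (default cap of 5)."""

    def __init__(self, max_concurrent=5):
        self.sem = threading.Semaphore(max_concurrent)
        self.in_flight = 0
        self.peak = 0          # highest observed concurrency
        self.lock = threading.Lock()

    def send_shard_request(self, shard_id):
        with self.sem:         # blocks once max_concurrent requests are in flight
            with self.lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            time.sleep(0.01)   # stand-in for the data node executing the shard search
            with self.lock:
                self.in_flight -= 1

# 20 shard requests targeting one node: only 5 ever run concurrently,
# no matter how much spare capacity the data node has.
throttle = PerNodeThrottle()
threads = [threading.Thread(target=throttle.send_shard_request, args=(i,))
           for i in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("peak concurrency:", throttle.peak)
```

The sketch also shows why the knob is hard to tune: the cap is enforced blindly on the coordinator, with no feedback from the data node's actual load.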
Our search performance is being impacted by max_concurrent_shard_requests=5 being too low. It is hitting our cluster stability hard despite having enough compute power at hand. It would be really beneficial to at least make this parameter customizable.
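For what it's worth, the setting can already be overridden per request via the `max_concurrent_shard_requests` query parameter of the search API (a sketch; the index name here is made up):

```
GET /my-index/_search?max_concurrent_shard_requests=10
{
  "query": { "match_all": {} }
}
```

This raises the cap for that one search only; there is no cluster-wide setting to change the default for all requests.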
Pinging @elastic/es-search-foundations (Team:Search Foundations)
There is agreement that max_concurrent_shard_requests is not the right knob, because it gives control to the coordinating node, and it is quite a global knob too, as it affects how shard level requests are throttled on every data node. It is hard to determine the right number as well, and it has turned out to be a bottleneck in quite a few cases. We are not planning to increase the default value, though; rather we will prioritize #112306, which moves the control to each data node, as data nodes are better placed to decide the execution pace of shard level requests belonging to a given search request.
It's useful to cap the maximum number of shard requests that can go to a single node. However, I worry that the default value of 5 might be too low: it prevents users from leveraging the entire computing power of their cluster if their nodes have more than 5 cores and they don't run concurrent requests.