A data node is both source and destination during cluster recovery. #88876

linker-c · 2022-07-27T22:14:53Z

Elasticsearch Version

7.15

Installed Plugins

No response

Java Version

bundled

OS Version

5.4.0-1045-aws #47-Ubuntu

Problem Description

When I add several new data nodes to my existing cluster, the cluster starts the recovery process.
If I have set "cluster_concurrent_rebalance" and "node_concurrent_recoveries" to values greater than 5, I start to see shards being moved both into and out of the same data node. As a result, the recovery never finishes.
I even have "cluster.routing.allocation.balance.threshold" set to 6, which I assume means that data nodes can differ by up to 6 shards without needing a rebalance.
Please see the image below, where data node 102 is both the source and the destination for different shards being moved.
This data comes from "GET /_cat/recovery?v&active_only".
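
For anyone who wants to check for the same symptom without reading the table by eye, here is a minimal Python sketch. It assumes the recovery listing was fetched with `format=json` added to the request, so each row is a dict with `index`, `shard`, `source_node` and `target_node` keys. The helper name `nodes_in_both_roles` and the node and index names in the sample rows are made up for illustration and are not taken from my cluster.

```python
from collections import defaultdict


def nodes_in_both_roles(rows):
    """Map each node that is both a recovery source and a recovery target
    to a pair (shards moving out, shards moving in)."""
    outgoing = defaultdict(list)
    incoming = defaultdict(list)
    for row in rows:
        shard = (row["index"], row["shard"])
        source = row.get("source_node")
        target = row.get("target_node")
        # Assumption: recoveries without a peer source report "n/a" here.
        if source not in (None, "", "n/a"):
            outgoing[source].append(shard)
        if target not in (None, "", "n/a"):
            incoming[target].append(shard)
    both = outgoing.keys() & incoming.keys()
    return {node: (outgoing[node], incoming[node]) for node in both}


# Hypothetical rows shaped like the JSON form of the _cat/recovery output.
sample_rows = [
    {"index": "logs-a", "shard": "0", "source_node": "data-102", "target_node": "data-131"},
    {"index": "logs-b", "shard": "1", "source_node": "data-017", "target_node": "data-102"},
    {"index": "logs-c", "shard": "2", "source_node": "data-005", "target_node": "data-132"},
]

for node, (moving_out, moving_in) in nodes_in_both_roles(sample_rows).items():
    print(node, "out:", moving_out, "in:", moving_in)
```

Running this periodically during the recovery would show whether the same nodes keep appearing in both roles.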

Steps to Reproduce

With an existing ES cluster (I have 30 data nodes), add 3 new data nodes with the following cluster settings:

{
  "persistent" : {
    "action" : {
      "destructive_requires_name" : "true"
    },
    "cluster" : {
      "routing" : {
        "allocation" : {
          "balance" : {
            "threshold" : "6"
          },
          "cluster_concurrent_rebalance" : "10",
          "node_concurrent_recoveries" : "5"
        }
      }
    },
    "search" : {
      "default_search_timeout" : "30s"
    },
    "ingest" : {
      "geoip" : {
        "downloader" : {
          "enabled" : "false"
        }
      }
    }
  },
  "transient" : { }
}
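
The nested output above uses the same settings that the problem description refers to by their dotted names. As a rough sketch of how the two forms line up, the Python below flattens the allocation part of the nested block into dotted keys. The helper `flatten_settings` is a hypothetical name written for this illustration. Requesting the settings with `flat_settings=true` should give the dotted form directly.

```python
def flatten_settings(nested, prefix=""):
    """Turn nested settings into a flat dict keyed by dotted setting names."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into sub-objects, carrying the dotted prefix along.
            flat.update(flatten_settings(value, name))
        else:
            flat[name] = value
    return flat


# The allocation-related part of the "persistent" block shown above.
persistent = {
    "cluster": {
        "routing": {
            "allocation": {
                "balance": {"threshold": "6"},
                "cluster_concurrent_rebalance": "10",
                "node_concurrent_recoveries": "5",
            }
        }
    }
}

flat = flatten_settings(persistent)
for name in sorted(flat):
    print(name, "=", flat[name])
```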

Logs (if relevant)

No response

DaveCTurner · 2022-08-01T11:11:54Z

Closing as a duplicate of #87279.

linker-c added the >bug and needs:triage labels Jul 27, 2022

DaveCTurner closed this as completed Aug 1, 2022
