|
| 1 | +[[system-config-tcpretries]] |
| 2 | +=== TCP retransmission timeout |
| 3 | + |
| 4 | +Each pair of nodes in a cluster communicates via a number of TCP connections |
| 5 | +which remain open until one of the nodes shuts down or communication between |
| 6 | +the nodes is disrupted by a failure in the underlying infrastructure. |
| 7 | + |
| 8 | +TCP provides reliable communication over occasionally-unreliable networks by |
| 9 | +hiding temporary network disruptions from the communicating applications. Your |
| 10 | +operating system will retransmit any lost messages a number of times before |
| 11 | +informing the sender of any problem. Most Linux distributions default to |
| 12 | +retransmitting any lost packets 15 times. Retransmissions back off |
| 13 | +exponentially, so these 15 retransmissions take over 900 seconds to complete. |
| 14 | +This means it takes Linux many minutes to detect a network partition or a |
| 15 | +failed node with this method. Windows defaults to just 5 retransmissions, which |
| 16 | +corresponds to a timeout of around 6 seconds. |
| 17 | + |
| 18 | +The Linux default allows for communication over networks that may experience |
| 19 | +very long periods of packet loss, but this default is excessive for production |
| 20 | +networks within a single data centre, as is the case for most {es} clusters. |
| 21 | +Highly-available clusters must be able to detect node failures quickly so that |
| 22 | +they can react promptly by reallocating lost shards, rerouting searches and |
| 23 | +perhaps electing a new master node. Linux users should therefore reduce the |
| 24 | +maximum number of TCP retransmissions. |
| 25 | + |
| 26 | +You can decrease the maximum number of TCP retransmissions to `5` by running |
| 27 | +the following command as `root`. Five retransmissions correspond to a |
| 28 | +timeout of around 6 seconds. |
| 29 | + |
| 30 | +[source,sh] |
| 31 | +------------------------------------- |
| 32 | +sysctl -w net.ipv4.tcp_retries2=5 |
| 33 | +------------------------------------- |
| 34 | + |
| 35 | +To set this value permanently, update the `net.ipv4.tcp_retries2` setting in |
| 36 | +`/etc/sysctl.conf`. To verify after rebooting, run `sysctl |
| 37 | +net.ipv4.tcp_retries2`. |
| 38 | + |
| 39 | +{es} also implements its own health checks with timeouts that are much shorter |
| 40 | +than the default retransmission timeout on Linux. However, these health checks |
| 41 | +must allow for application-level effects such as garbage collection pauses. We |
| 42 | +do not recommend reducing any timeouts related to these application-level |
| 43 | +health checks. |
| 44 | + |
| 45 | +IMPORTANT: This setting applies to all TCP connections and will affect the |
| 46 | +reliability of communication with systems outside your cluster too. If your |
| 47 | +cluster communicates with external systems over an unreliable network, then you |
| 48 | +may need to select a higher value for `net.ipv4.tcp_retries2`. For this reason, |
| 49 | +{es} does not adjust this setting automatically. |
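The step of updating `net.ipv4.tcp_retries2` in `/etc/sysctl.conf` could be scripted roughly as follows. This is only a sketch: the helper name `set_sysctl_conf` is hypothetical and not part of `sysctl` or {es}, and it assumes the file uses plain `key = value` lines. It drops any existing assignment of the key and appends the new one, so running it twice leaves a single entry.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysctl or Elasticsearch):
#   set_sysctl_conf FILE KEY VALUE
# Ensures FILE contains exactly one "KEY = VALUE" assignment for KEY.
set_sysctl_conf() {
    file=$1
    key=$2
    value=$3
    # Escape the dots in the key so they match literally in the regex.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    tmp=$(mktemp)
    # Keep every line that does not already assign this key.
    # grep exits non-zero when nothing is kept or FILE is missing; ignore that.
    grep -Ev "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$tmp" 2>/dev/null || true
    # Append the new assignment.
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Write back in place so an existing file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage (run as root):
#   set_sysctl_conf /etc/sysctl.conf net.ipv4.tcp_retries2 5
#   sysctl -p                         # reload /etc/sysctl.conf
#   sysctl net.ipv4.tcp_retries2      # verify the active value
```

Taking the file path as a parameter makes it possible to try the helper against a scratch copy before touching the real `/etc/sysctl.conf`.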