modules/master-node-sizing.adoc (2 additions, 2 deletions)

@@ -49,10 +49,10 @@ On a large and dense cluster with three control plane nodes, the CPU and memory

 [IMPORTANT]
 ====
-The node sizing varies depending on the number of nodes and object counts in the cluster. It also depends on whether the objects are actively being created on the cluster. During object creation, the control plane is more active in terms of resource usage compared to when the objects are in the `running` phase.
+The node sizing varies depending on the number of nodes and object counts in the cluster. It also depends on whether the objects are actively being created on the cluster. During object creation, the control plane is more active in terms of resource usage compared to when the objects are in the `Running` phase.
 ====

-Operator Lifecycle Manager (OLM) runs on the control plane nodes and its memory footprint depends on the number of namespaces and user installed operators that OLM needs to manage on the cluster. Control plane nodes need to be sized accordingly to avoid OOM kills. Following data points are based on the results from cluster maximums testing.
+Operator Lifecycle Manager (OLM) runs on the control plane nodes and its memory footprint depends on the number of namespaces and user installed operators that OLM needs to manage on the cluster. Control plane nodes need to be sized accordingly to avoid OOM kills. Following data points are based on the results from cluster maximums testing.
modules/nodes-scheduler-taints-tolerations-about.adoc (1 addition, 1 deletion)

@@ -18,7 +18,7 @@ endif::[]

 A _taint_ allows a node to refuse a pod to be scheduled unless that pod has a matching _toleration_.

-You apply taints to a node through the `Node` specification (`NodeSpec`) and apply tolerations to a pod through the `Pod` specification (`PodSpec`). When you apply a taint a node, the scheduler cannot place a pod on that node unless the pod can tolerate the taint.
+You apply taints to a node through the `Node` specification (`NodeSpec`) and apply tolerations to a pod through the `Pod` specification (`PodSpec`). When you apply a taint to a node, the scheduler cannot place a pod on that node unless the pod can tolerate the taint.
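Editor's illustration, not part of this change: a minimal sketch of the two sides described in the paragraph above, using example key, value, and image names. The taint could be applied with `oc adm taint nodes <node-name> key1=value1:NoSchedule`, and a pod that tolerates it might declare the following in its `PodSpec`.

[source,yaml]
----
# Hypothetical example: a pod that tolerates the taint key1=value1:NoSchedule,
# which could be applied to a node with:
#   oc adm taint nodes <node-name> key1=value1:NoSchedule
apiVersion: v1
kind: Pod
metadata:
  name: toleration-example                  # example name
spec:
  containers:
  - name: app
    image: registry.example.com/app:latest  # placeholder image
  tolerations:
  - key: "key1"                              # must match the taint key
    operator: "Equal"
    value: "value1"                          # must match the taint value
    effect: "NoSchedule"                     # must match the taint effect
----

With the `NoSchedule` effect, the scheduler skips any node carrying that taint for pods that do not declare a matching toleration.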
modules/recommended-etcd-practices.adoc (2 additions, 2 deletions)

@@ -10,7 +10,7 @@ Because etcd writes data to disk and persists proposals on disk, its performance

 Those latencies can cause etcd to miss heartbeats, not commit new proposals to the disk on time, and ultimately experience request timeouts and temporary leader loss. High write latencies also lead to an OpenShift API slowness, which affects cluster performance. Because of these reasons, avoid colocating other workloads on the control-plane nodes that are I/O sensitive or intensive and share the same underlying I/O infrastructure.

-Run etcd on a block device that can write at least 50 IOPS of 8KB sequentially, including fdatasync, in under 10ms. For heavy loaded clusters, sequential 500 IOPS of 8000 bytes (2 ms) are recommended. To measure those numbers, you can use a benchmarking tool, such as fio.
+Run etcd on a block device that can write at least 50 IOPS of 8KB sequentially, including fdatasync, in under 10ms. For heavy loaded clusters, sequential 500 IOPS of 8000 bytes (2 ms) are recommended. To measure those numbers, you can use a benchmarking tool, such as the `fio` command.

 To achieve such performance, run etcd on machines that are backed by SSD or NVMe disks with low latency and high throughput. Consider single-level cell (SLC) solid-state drives (SSDs), which provide 1 bit per memory cell, are durable and reliable, and are ideal for write-intensive workloads.
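Editor's illustration, not part of this change: a rough sketch of the benchmark described in the hunk above. The `fio` invocation issues sequential 8 KB writes with an `fdatasync` after every write; the target directory and data size are example values and should point at the device that backs etcd, typically mounted at `/var/lib/etcd`.

[source,terminal]
----
# Sequential 8 KB writes with an fdatasync after each write (example values)
$ fio --name=etcd-perf --directory=/var/lib/etcd/fio-test --rw=write \
    --bs=8k --size=100m --ioengine=sync --fdatasync=1
----

In the resulting report, the fsync/fdatasync latency percentiles are the numbers to compare against the 10 ms (or 2 ms for heavily loaded clusters) guidance above.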
@@ -28,7 +28,7 @@ The following hard drive practices provide optimal etcd performance:

 * Use solid state drives as a minimum selection. Prefer NVMe drives for production environments.
 * Use server-grade hardware for increased reliability.
 * Avoid NAS or SAN setups and spinning drives. Ceph Rados Block Device (RBD) and other types of network-attached storage can result in unpredictable network latency. To provide fast storage to etcd nodes at scale, use PCI passthrough to pass NVM devices directly to the nodes.
-* Always benchmark by using utilities such as fio. You can use such utilities to continuously monitor the cluster performance as it increases.
+* Always benchmark by using utilities such as `fio`. You can use such utilities to continuously monitor the cluster performance as it increases.
 * Avoid using the Network File System (NFS) protocol or other network based file systems.

 Some key metrics to monitor on a deployed {product-title} cluster are p99 of etcd disk write ahead log duration and the number of etcd leader changes. Use Prometheus to track these metrics.
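Editor's illustration, not part of this change: a sketch of how those two metrics might be queried, assuming the upstream etcd metric names `etcd_disk_wal_fsync_duration_seconds_bucket` and `etcd_server_leader_changes_seen_total`.

[source,promql]
----
# p99 of etcd WAL fsync duration over the last 5 minutes, per etcd member
histogram_quantile(0.99, sum by (instance, le) (rate(etcd_disk_wal_fsync_duration_seconds_bucket[5m])))

# etcd leader changes observed over the last hour
increase(etcd_server_leader_changes_seen_total[1h])
----

Sustained p99 values above the thresholds mentioned earlier, or frequent leader changes, point to the storage or colocation issues described in this module.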