Commit 0dc40f9

Add a note about WMCO kubelet dependency
1 parent edd7c29 commit 0dc40f9

1 file changed

+10
-6
lines changed

Diff for: enhancements/machine-api/out-of-tree-provider-support.md

@@ -203,6 +203,8 @@ Without this information, the `scheduler` cannot schedule `Pods` that have any s
203203
To ensure that cluster disaster recovery procedures can still operate smoothly, we will ensure that core control plane components and their operators tolerate the uninitialized taint, to prevent `CCM` blocking new control plane hosts being added if `CCM` is non-functional.
204204
This will include, but is not limited to: Kube Controller Manager, Etcd, Kube API Server, Networking, Cluster Machine Approver.
205205

206+
*Note: The Windows Machine Config Operator (WMCO) manages `kubelet` configuration for Windows nodes; the changes described in this section still apply to those nodes.*
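
To illustrate why tolerating the uninitialized taint matters for disaster recovery, here is a minimal Python sketch of how a toleration is matched against a taint. It is a simplified model of the scheduler's behaviour, not the scheduler's code. The taint key `node.cloudprovider.kubernetes.io/uninitialized` is the one the kubelet applies when it runs with an external cloud provider; the class and function names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches any effect


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if this single toleration covers this single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.value == taint.value


def schedulable(tolerations, taints) -> bool:
    """A pod may be scheduled only if every blocking taint is tolerated."""
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


# Taint carried by a new node until the CCM initialises it.
UNINITIALIZED = Taint(
    "node.cloudprovider.kubernetes.io/uninitialized", "NoSchedule", "true"
)

# A pod with no tolerations versus a control plane pod that tolerates the taint.
plain_pod = []
control_plane_pod = [
    Toleration(key=UNINITIALIZED.key, operator="Exists", effect="NoSchedule")
]
```

With these definitions, `schedulable(plain_pod, [UNINITIALIZED])` is `False` while `schedulable(control_plane_pod, [UNINITIALIZED])` is `True`, which is the property the listed control plane components need so that a non-functional `CCM` does not block them.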
207+
206208
##### Example flag changes for kubelet
207209

208210
Current flag configuration for kubelet in AWS provider:
@@ -332,18 +334,19 @@ Once the provider is moved to out-of-tree, the migration mechanism will be disab
332334

333335
#### Bootstrap changes
334336

335-
One of the responsibilities of the initialisation process for Kubelet is to set the `Node`’s IP addresses within the status of the `Node` object. The remaining responsibilities are not important for this document bar the removal of a taint which prevents workloads running on the `Node` until the initialisation has completed.
337+
One of the responsibilities of the initialisation process for Kubelet is to set the `Node`’s IP addresses within the status of the `Node` object. The remaining responsibilities are not important for bootstrapping, bar the removal of a taint which prevents workloads running on the `Node` until the initialisation has completed.
336338

337339
A second part of the bootstrap process for a new `Node`, is to initialise the `CNI` (networking). Typically in an OpenShift cluster, this is handled once the Networking Operator starts.
338-
The Networking operator will create the `CNI` pods (typically OpenShift SDN), which schedule on the `Node`, use the `Node` IP addresses to create a `HostSubnet` resource within Kubernetes and then mark then complete the initialisation process for the `CNI`, in doing so, marking the `Node` as ready and allowing the remaining workloads to start.
340+
The Networking operator will create the `CNI` pods (typically OpenShift SDN), which schedule on the `Node`, use the `Node` IP addresses to create a `HostSubnet` resource within Kubernetes and then complete the initialisation process for the `CNI`, in doing so, marking the `Node` as ready and allowing the remaining workloads to start.
339341

340-
Before the `CNI` is initialized on a `Node`, in-cluster networking such as Service IPs, in particular the API server Service, will not work for any `Pod` on the `Node`. Additionally, any `Pod` that requires the Pod Networking implemented by `CNI`, cannot start. For this reason, `Pods` such as the Networking Operator must use host networking and the “API Int” load balancer to contact the Kube API Server.
342+
Before the `CNI` is initialized on a `Node`, in-cluster networking such as Service IPs, in particular the API server Service, will not work for any `Pod` on the `Node`. Additionally, any `Pod` that requires the Pod Networking implemented by `CNI`, cannot start.
343+
For this reason, `Pods` such as the Networking Operator must use host networking and the “API Int” load balancer to contact the Kube API Server.
341344

342345
Because the `CCM` is taking over the responsibility of setting the `Node` IP addresses, `CCM` will become a prerequisite for networking to become functional within any Cluster. Because the `CNI` is not initialised, we must ensure that the `CCCMO` and `CCM` Pods tolerate the scenario where `CNI` is non-functional.
343346

344347
To do so, we must tolerate the not-ready taint for these pods and they must all run with host networking and use the API load balancer, rather than using the internal Service. This will ensure that the cluster can bootstrap successfully and recover from any disaster recovery scenario.
345348

346-
Our operator will take precedence for CNI operator. It will tolerate `NotReady` `NoSchedule` taint and `CCM` specific `Uninitialized` taint. Operator would start as the first operator in the cluster when first `control-plane` is created, and be responsible for initializing `Nodes` which will allow latter operators to start.
349+
Our operator will become a prerequisite for the Network Operator. CCCMO will tolerate the `Node` `NotReady:NoSchedule` and `CCM` specific `Uninitialized` taints. CCCMO will start as the first operator on the control plane hosts, and be responsible for initializing `Nodes`, allowing other operators to start.
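
The bootstrap requirements above (host networking plus tolerations for the not-ready and uninitialized taints) can be sketched as a pod spec fragment with a small check. This is an illustrative Python sketch, not the manifest shipped by `CCCMO`: the container name, the image argument, and both function names are assumptions made for the example. The taint keys are the well-known Kubernetes keys. Use of the API load balancer instead of the internal Service is not modelled here.

```python
NOT_READY = "node.kubernetes.io/not-ready"
UNINITIALIZED = "node.cloudprovider.kubernetes.io/uninitialized"


def bootstrap_pod_spec(image: str) -> dict:
    """Build a pod spec fragment that can start before the CNI is functional."""
    return {
        # Host networking avoids any dependency on the pod network.
        "hostNetwork": True,
        "tolerations": [
            {"key": NOT_READY, "operator": "Exists", "effect": "NoSchedule"},
            {"key": UNINITIALIZED, "operator": "Exists", "effect": "NoSchedule"},
        ],
        # Hypothetical container name and image, for illustration only.
        "containers": [{"name": "cloud-controller-manager", "image": image}],
    }


def bootstrap_problems(spec: dict) -> list:
    """List the ways a pod spec fails the bootstrap requirements."""
    problems = []
    if not spec.get("hostNetwork"):
        problems.append("pod must use host networking")
    tolerated = {t.get("key") for t in spec.get("tolerations", [])}
    for key in (NOT_READY, UNINITIALIZED):
        if key not in tolerated:
            problems.append(f"missing toleration for {key}")
    return problems
```

A spec produced by `bootstrap_pod_spec` yields an empty problem list, while a spec with neither host networking nor tolerations reports three problems. A check of this kind could run in a unit test over rendered manifests.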
347350

348351
#### Metrics
349352

@@ -553,7 +556,8 @@ This functionality should not be required as OpenShift handles certificate appro
553556

554557
Q: Does every node need a CCM?
555558

556-
- A: No. The cluster only needs one active `CCM` at any time. A `Deployment` will manage the `CCM` pod and will have 2 replicas which will use leader election to nominate an active leader.
559+
- A: No. The cluster only needs one active `CCM` at any time. A `Deployment` will manage the `CCM` pod and will have 2 replicas which will use leader election to nominate an active leader and maintain HA by scheduling on control-plane nodes located in the different regions.
560+
Only in some scenarios, depending on the cloud provider implementation, does a component have to run on every `Node`. On Azure, for example, `cloud-node-manager` (`CNM`) runs on all `Nodes` because of the 1:1 relation between a `Node` and a `CNM` replica.
557561
[Source](https://kubernetes.io/docs/concepts/overview/components/#cloud-controller-manager) This assumption may change in the future, as `CCM` may run in worker nodes to determine the state of the instance.
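
As a rough illustration of how two replicas can share one active leader, the following Python sketch models lease-based leader election with an explicit clock. It is a simplified in-memory model and not the mechanism `CCM` actually uses: the 15 second lease duration, the replica names, and the function name are assumptions. A real implementation would update a shared lock object atomically through the API server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None
    renewed_at: float = 0.0
    duration: float = 15.0  # seconds; illustrative value


def try_acquire_or_renew(lease: Lease, candidate: str, now: float) -> bool:
    """Return True if the candidate holds the lease after this attempt."""
    expired = lease.holder is None or now - lease.renewed_at > lease.duration
    if lease.holder == candidate or expired:
        lease.holder = candidate
        lease.renewed_at = now
        return True
    return False


lease = Lease()
first = try_acquire_or_renew(lease, "ccm-0", now=0)       # ccm-0 becomes leader
blocked = try_acquire_or_renew(lease, "ccm-1", now=5)     # lease still held
renewed = try_acquire_or_renew(lease, "ccm-0", now=10)    # leader renews
# ccm-0 stops renewing after t=10; the standby keeps retrying.
too_early = try_acquire_or_renew(lease, "ccm-1", now=20)  # 10s since renewal
takeover = try_acquire_or_renew(lease, "ccm-1", now=26)   # 16s since renewal
```

In this run the standby replica takes over only once the lease has gone unrenewed for longer than its duration, so at most one replica acts as leader at a time.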
558562

559563
Q: How are metrics affected by the CCM migration?
@@ -708,4 +712,4 @@ Mandatory operator repository:
708712
- [The Kubernetes Cloud Controller Manager](https://medium.com/@m.json/the-kubernetes-cloud-controller-manager-d440af0d2be5) article
709713
https://hackmd.io/00IoVWBiSVm8mMByxerTPA#
710714
- [CSI support](https://github.com/openshift/enhancements/blob/master/enhancements/storage/csi-driver-install.md#ocp-45-kubernetes-118)
711-
- [CNI ]
715+
- [CCM role in bootstrap process](https://docs.google.com/document/d/1yAczhHNJ4rDqVFFvyi7AZ27DEQdvx8DmLNbavIjrjn0)
