Commit de9897b

Refactor InfraMachine contract
1 parent 404084f commit de9897b

5 files changed: +525 -294 lines changed

docs/book/src/developer/core/controllers/cluster.md

+2-2
@@ -5,7 +5,7 @@ The Cluster controller is responsible for reconciling the Cluster resource.
 In order to allow Cluster provisioning on different type of infrastructure, The Cluster resource references
 an InfraCluster object, e.g. AWSCluster, GCPCluster etc.
 
-The [InfraCluster resource contract](../../providers/contracts/infra-cluster.md) defines a set of rules a provider is expected to comply in order to allow
+The [InfraCluster resource contract](../../providers/contracts/infra-cluster.md) defines a set of rules a provider is expected to comply with in order to allow
 the expected interactions with the Cluster controller.
 
 Among those rules:
@@ -18,7 +18,7 @@ Among those rules:
 Similarly, in order to support different solutions for control plane management, The Cluster resource references
 an ControlPlane object, e.g. KubeadmControlPlane, EKSControlPlane etc.
 
-The [ControlPlane resource contract](../../providers/contracts/control-plane.md) defines a set of rules a provider is expected to comply in order to allow
+The [ControlPlane resource contract](../../providers/contracts/control-plane.md) defines a set of rules a provider is expected to comply with in order to allow
 the expected interactions with the Cluster controller.
 
 Considering all the info above, the Cluster controller's main responsibilities are:
@@ -1,20 +1,39 @@
-# Machine Controller
+# Machine Controller
 
-![](../../../images/cluster-admission-machine-controller.png)
+The Machine controller is responsible for reconciling the Machine resource.
+
+In order to allow Machine provisioning on different type of infrastructure, The Machine resource references
+an InfraMachine object, e.g. AWSMachine, GCMachine etc.
+
+The [InfraMachine resource contract](../../providers/contracts/infra-machine.md) defines a set of rules a provider is expected to comply with in order to allow
+the expected interactions with the Machine controller.
+
+Among those rules:
+- InfraMachine MUST report a [provider ID](../../providers/contracts/infra-machine.md#inframachine-provider-id) for the Machine
+- InfraMachine SHOULD take into account the [failure domain](../../providers/contracts/infra-machine.md#inframachine-failure-domain) where machines should be placed in
+- InfraMachine SHOULD surface machine's [addresses](../../providers/contracts/infra-machine.md#inframachine-addresses) to help operators when troubleshooting issues
+- InfraMachine MUST report when Machine's infrastructure is [fully provisioned](../../providers/contracts/infra-machine.md#inframachine-initialization-completed)
+- InfraMachine SHOULD report [conditions](../../providers/contracts/infra-machine.md#inframachine-conditions)
+- InfraMachine SHOULD report [terminal failures](../../providers/contracts/infra-machine.md#inframachine-terminal-failures)
+
+Similarly, in order to support different machine bootstrappers, The Machine resource references
+a BootstrapConfig object, e.g. KubeadmBoostrapConfig etc.
+
+The [BootstrapConfig resource contract](../../providers/contracts/bootstrap-config.md) defines a set of rules a provider is expected to comply with in order to allow
+the expected interactions with the Machine controller.
 
-The Machine controller's main responsibilities are:
+Considering all the info above, the Machine controller's main responsibilities are:
 
-* Setting an OwnerReference on:
-    * Each Machine object to the Cluster object.
-    * The associated BootstrapConfig object.
-    * The associated InfrastructureMachine object.
-* Copy data from `BootstrapConfig.Status.DataSecretName` to `Machine.Spec.Bootstrap.DataSecretName` if
-`Machine.Spec.Bootstrap.DataSecretName` is empty.
-* Setting NodeRefs to be able to associate machines and Kubernetes nodes.
-* Deleting Nodes in the target cluster when the associated machine is deleted.
-* Cleanup of related objects.
-* Keeping the Machine's Status object up to date with the InfrastructureMachine's Status object.
-* Finding Kubernetes nodes matching the expected providerID in the workload cluster.
+* Setting an OwnerReference on the infrastructure object referenced in `Machine.spec.infrastructureRef`.
+* Setting an OwnerReference on the bootstrap object referenced in `Machine.spec.bootstrap.configRef`.
+* Keeping the Machine's status in sync with the InfraMachine and BootstrapConfig's status.
+* Finding Kubernetes nodes matching the expected providerID in the workload cluster.
+* Setting NodeRefs to be able to associate machines and Kubernetes nodes.
+* Monitor Kubernetes nodes and propagate labels to them.
+* Cleanup of all owned objects so that nothing is dangling after deletion.
+* Drain nodes and wait for volumes being detached by CSI plugins.
+
+![](../../../images/cluster-admission-machine-controller.png)
 
 After the machine controller sets the OwnerReferences on the associated objects, it waits for the bootstrap
 and infrastructure objects referenced by the machine to have the `Status.Ready` field set to `true`. When
@@ -25,108 +44,3 @@ The machine controller uses the kubeconfig for the new workload cluster to watch
 When a node appears with `Node.Spec.ProviderID` matching `Machine.Spec.ProviderID`, the machine controller
 transitions the associated machine into the `Provisioned` state. When the infrastructure ref is also
 `Ready`, the machine controller marks the machine as `Running`.
-
-## Contracts
-
-### Cluster API
-
-Cluster associations are made via labels.
-
-#### Expected labels
-
-| what | label | value | meaning |
-| --- | --- | --- | --- |
-| Machine | `cluster.x-k8s.io/cluster-name` | `<cluster-name>` | Identify a machine as belonging to a cluster with the name `<cluster-name>`|
-| Machine | `cluster.x-k8s.io/control-plane` | `true` | Identifies a machine as a control-plane node |
-
-### Bootstrap provider
-
-The BootstrapConfig object **must** have a `status` object.
-
-To override the bootstrap provider, a user (or external system) can directly set the `Machine.Spec.Bootstrap.Data`
-field. This will mark the machine as ready for bootstrapping and no bootstrap data will be copied from the
-BootstrapConfig object.
-
-#### Required `status` fields
-
-The `status` object **must** have several fields defined:
-
-* `ready` - a boolean field indicating the bootstrap config data is generated and ready for use.
-* `dataSecretName` - a string field referencing the name of the secret that stores the generated bootstrap data.
-
-#### Optional `status` fields
-
-The `status` object **may** define several fields that do not affect functionality if missing:
-
-* `failureReason` - a string field explaining why a fatal error has occurred, if possible.
-* `failureMessage` - a string field that holds the message contained by the error.
-
-Note: once any of `failureReason` or `failureMessage` surface on the machine who is referencing the bootstrap config object,
-they cannot be restored anymore (it is considered a terminal error; the only way to recover is to delete and recreate the machine).
-Also, if the machine is under control of a MachineHealthCheck instance, the machine will be automatically remediated.
-
-Example:
-
-```yaml
-kind: MyBootstrapProviderConfig
-apiVersion: bootstrap.cluster.x-k8s.io/v1alpha3
-status:
-  ready: true
-  dataSecretName: "MyBootstrapSecret"
-```
-
-### Infrastructure provider
-
-The InfrastructureMachine object **must** have both `spec` and `status` objects.
-
-#### Required `spec` fields
-
-The `spec` object **must** at least one field defined:
-
-* `providerID` - a cloud provider ID identifying the machine.
-
-#### Optional `spec` fields
-
-The `spec` object **may** define several fields that do not affect functionality if missing:
-
-* `failureDomain` - is a string identifying the failure domain the instance is running in.
-
-#### Required `status` fields
-
-The `status` object **must** at least one field defined:
-
-* `ready` - a boolean field indicating if the infrastructure is ready to be used or not.
-
-#### Optional `status` fields
-
-The `status` object **may** define several fields that do not affect functionality if missing:
-
-* `failureReason` - is a string that explains why a fatal error has occurred, if possible.
-* `failureMessage` - is a string that holds the message contained by the error.
-* `addresses` - is a `MachineAddresses` (a list of `MachineAddress`) which represents host names, external IP addresses, internal IP addresses,
-external DNS names, and/or internal DNS names for the provider's machine instance. `MachineAddress` is
-defined as:
-    - `type` (string): one of `Hostname`, `ExternalIP`, `InternalIP`, `ExternalDNS`, `InternalDNS`
-    - `address` (string)
-
-Note: once any of `failureReason` or `failureMessage` surface on the machine who is referencing the infrastructureMachine object,
-they cannot be restored anymore (it is considered a terminal error; the only way to recover is to delete and recreate the machine).
-Also, if the machine is under control of a MachineHealthCheck instance, the machine will be automatically remediated.
-
-Example:
-```yaml
-kind: MyMachine
-apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
-spec:
-  providerID: cloud:////my-cloud-provider-id
-status:
-  ready: true
-```
-
-### Secrets
-
-The Machine controller will create a secret or use an existing secret in the following format:
-
-| secret name | field name | content |
-|:---:|:---:|---|
-|`<cluster-name>-kubeconfig`|`value`|base64 encoded kubeconfig that is authenticated with the child cluster|
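
Reviewer sketch (not part of the commit): the InfraMachine fields named in the contract text removed above (`providerID`, `failureDomain`, `ready`, `addresses`, `failureReason`, `failureMessage`) and the `Provisioned`/`Running` transition described in the unchanged context lines can be illustrated with a small Go program. The `FooMachine*` types, the `machinePhase` helper, and the `Pending` placeholder value are hypothetical names introduced for illustration; they are a simplified reading of the prose, not the Cluster API implementation.

```go
package main

import "fmt"

// MachineAddress follows the shape listed in the removed contract text.
type MachineAddress struct {
	Type    string `json:"type"` // Hostname, ExternalIP, InternalIP, ExternalDNS or InternalDNS
	Address string `json:"address"`
}

// FooMachineSpec is a hypothetical InfraMachine spec.
type FooMachineSpec struct {
	// ProviderID is the cloud provider ID identifying the machine.
	ProviderID string `json:"providerID,omitempty"`
	// FailureDomain identifies the failure domain of the instance.
	FailureDomain string `json:"failureDomain,omitempty"`
}

// FooMachineStatus is a hypothetical InfraMachine status.
type FooMachineStatus struct {
	// Ready reports that the machine infrastructure is fully provisioned.
	Ready          bool             `json:"ready"`
	Addresses      []MachineAddress `json:"addresses,omitempty"`
	FailureReason  string           `json:"failureReason,omitempty"`
	FailureMessage string           `json:"failureMessage,omitempty"`
}

// FooMachine groups spec and status, as the contract requires both objects.
type FooMachine struct {
	Spec   FooMachineSpec   `json:"spec"`
	Status FooMachineStatus `json:"status"`
}

// machinePhase applies the rule from the prose: a node whose provider ID
// matches the machine's provider ID means "Provisioned"; if the
// infrastructure is also ready, the machine is "Running". "Pending" is a
// placeholder (an assumption) for every earlier state.
func machinePhase(machineProviderID, nodeProviderID string, infraReady bool) string {
	nodeMatched := machineProviderID != "" && machineProviderID == nodeProviderID
	switch {
	case nodeMatched && infraReady:
		return "Running"
	case nodeMatched:
		return "Provisioned"
	default:
		return "Pending"
	}
}

func main() {
	infra := FooMachine{
		Spec:   FooMachineSpec{ProviderID: "foo://instance-1"},
		Status: FooMachineStatus{Ready: true},
	}
	// The node provider ID would normally be read from the workload cluster.
	fmt.Println(machinePhase(infra.Spec.ProviderID, "foo://instance-1", infra.Status.Ready))
}
```

The helper takes plain values rather than API objects so the decision can be exercised without a cluster; a real controller would read these values from the Machine, the InfraMachine, and the Node.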

docs/book/src/developer/providers/contracts/infra-cluster.md

+3-3
@@ -119,7 +119,7 @@ rules:
 - watch
 ```
 
-Note: The write permissions allow the Cluster controller to set owner references and labels on the InfraCluster resources;
+Note: The write permissions allow the Cluster controller to set owner references and labels on the InfraCluster resources;
 write permissions are not used for general mutations of InfraCluster resources, unless specifically required (e.g. when
 using ClusterClass and managed topologies).
 
@@ -271,7 +271,7 @@ Each InfraCluster MUST report when Cluster's infrastructure is fully provisioned
 
 ```go
 type FooClusterStatus struct {
-    // Ready denotes that the foo cluster infrastructure fully provisioned.
+    // Ready denotes that the foo cluster infrastructure is fully provisioned.
     // +optional
     Ready bool `json:"ready"`
 
@@ -282,7 +282,7 @@ type FooClusterStatus struct {
 
 Once `status.ready` the Cluster "core" controller will bubbles up this info in Cluster's `status.infrastructureReady`;
 If defined, also InfraCluster's `spec.controlPlaneEndpoint` and `status.failureDomains` will be surfaced on Cluster's
-corresponding field at the same time.
+corresponding fields at the same time.
 
 <aside class="note warning">
 
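
Reviewer sketch (not part of the commit): the "bubbling up" described in the hunk above can be illustrated in Go. All type and function names here (`APIEndpoint`, `FailureDomain`, `Cluster`, `surfaceInfraCluster`) are simplified, hypothetical stand-ins, and the sketch assumes that nothing is copied until `status.ready` is true, which is one reading of "at the same time".

```go
package main

import "fmt"

// APIEndpoint is a simplified stand-in for a control plane endpoint.
type APIEndpoint struct {
	Host string
	Port int32
}

// FailureDomain is a simplified stand-in for a failure domain entry.
type FailureDomain struct {
	ControlPlane bool
}

// FooClusterSpec and FooClusterStatus model a hypothetical InfraCluster.
type FooClusterSpec struct {
	ControlPlaneEndpoint APIEndpoint
}

type FooClusterStatus struct {
	Ready          bool
	FailureDomains map[string]FailureDomain
}

// Cluster holds only the fields this sketch writes to.
type Cluster struct {
	ControlPlaneEndpoint APIEndpoint
	InfrastructureReady  bool
	FailureDomains       map[string]FailureDomain
}

// surfaceInfraCluster copies InfraCluster information onto the Cluster once
// the infrastructure reports ready. Optional values are copied only when the
// provider actually defined them.
func surfaceInfraCluster(c *Cluster, spec FooClusterSpec, status FooClusterStatus) {
	if !status.Ready {
		return // nothing is surfaced before the infrastructure is ready
	}
	c.InfrastructureReady = true
	if spec.ControlPlaneEndpoint != (APIEndpoint{}) {
		c.ControlPlaneEndpoint = spec.ControlPlaneEndpoint
	}
	if len(status.FailureDomains) > 0 {
		c.FailureDomains = status.FailureDomains
	}
}

func main() {
	c := &Cluster{}
	surfaceInfraCluster(c,
		FooClusterSpec{ControlPlaneEndpoint: APIEndpoint{Host: "10.0.0.1", Port: 6443}},
		FooClusterStatus{Ready: true, FailureDomains: map[string]FailureDomain{"zone-a": {ControlPlane: true}}},
	)
	fmt.Printf("%+v\n", *c)
}
```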
