-
Notifications
You must be signed in to change notification settings - Fork 1.4k
⚠️ Add ClusterClass types #4928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -60,6 +60,73 @@ type ClusterSpec struct { | |
// for provisioning infrastructure for a cluster in said provider. | ||
// +optional | ||
InfrastructureRef *corev1.ObjectReference `json:"infrastructureRef,omitempty"` | ||
|
||
// This encapsulates the topology for the cluster. | ||
// NOTE: It is required to enable the ClusterTopology | ||
// feature gate flag to activate managed topologies support; | ||
// this feature is highly experimental, and parts of it might still be not implemented. | ||
// +optional | ||
Topology *Topology `json:"topology,omitempty"` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. General question about the PR: does it make master unreleasable? Should we be merging new types without corresponding implementation/controllers since we're not in a "breaking change" development phase anymore and might need to release v0.4.1 any time to fix bugs? Would it be a better strategy to start a new branch for ClusterClass like we did for the provider operator work? Apologies if this was already discussed and I missed it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No. Master should always be ready for a release. In order to make it possible while adding ClusterClass and managed topologies on the master branch we are avoiding breaking changes and we are keeping all the new features behind a feature flag, and if the flag is false (which is the default), Cluster API just works as of today with web hooks enforcing you can't set Cluster.Topology or Create/Update ClusterClass types. Also, keeping "strict" isolation between ClusterClass and existing controllers is a driving principle of the next steps of this effort, see e.g. notes on #4933. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1, as long as webhooks are installed (which we always assume) we should be fine. While the fields show up within the CRDs, enabling the feature gate won't do much more than enabling something non functional. One thing I'd add into the comment for now is |
||
} | ||
|
||
// Topology encapsulates the information of the managed resources. | ||
type Topology struct { | ||
// The name of the ClusterClass object to create the topology. | ||
Class string `json:"class"` | ||
|
||
// The Kubernetes version of the cluster. | ||
Version string `json:"version"` | ||
|
||
// RolloutAfter performs a rollout of the entire cluster one component at a time, | ||
// control plane first and then machine deployments. | ||
// +optional | ||
RolloutAfter *metav1.Time `json:"rolloutAfter,omitempty"` | ||
|
||
// ControlPlane describes the cluster control plane. | ||
ControlPlane ControlPlaneTopology `json:"controlPlane"` | ||
|
||
// Workers encapsulates the different constructs that form the worker nodes | ||
// for the cluster. | ||
// +optional | ||
Workers *WorkersTopology `json:"workers,omitempty"` | ||
} | ||
|
||
// ControlPlaneTopology specifies the parameters for the control plane nodes in the cluster. | ||
type ControlPlaneTopology struct { | ||
Metadata ObjectMeta `json:"metadata,omitempty"` | ||
vincepri marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
// Replicas is the number of control plane nodes. | ||
Replicas int `json:"replicas"` | ||
} | ||
|
||
// WorkersTopology represents the different sets of worker nodes in the cluster. | ||
type WorkersTopology struct { | ||
// MachineDeployments is a list of machine deployments in the cluster. | ||
MachineDeployments []MachineDeploymentTopology `json:"machineDeployments,omitempty"` | ||
} | ||
|
||
// MachineDeploymentTopology specifies the different parameters for a set of worker nodes in the topology. | ||
// This set of nodes is managed by a MachineDeployment object whose lifecycle is managed by the Cluster controller. | ||
type MachineDeploymentTopology struct { | ||
Metadata ObjectMeta `json:"metadata,omitempty"` | ||
vincepri marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
// Class is the name of the MachineDeploymentClass used to create the set of worker nodes. | ||
// This should match one of the deployment classes defined in the ClusterClass object | ||
// mentioned in the `Cluster.Spec.Class` field. | ||
Class string `json:"class"` | ||
|
||
// Name is the unique identifier for this MachineDeploymentTopology. | ||
// The value is used with other unique identifiers to create a MachineDeployment's Name | ||
// (e.g. cluster's name, etc). In case the name is greater than the allowed maximum length, | ||
// the values are hashed together. | ||
Name string `json:"name"` | ||
|
||
// Replicas is the number of worker nodes belonging to this set. | ||
// If the value is nil, the MachineDeployment is created without the number of Replicas (defaulting to zero) | ||
// and it's assumed that an external entity (like cluster autoscaler) is responsible for the management | ||
// of this value. | ||
// +optional | ||
Replicas *int `json:"replicas,omitempty"` | ||
} | ||
|
||
// ANCHOR_END: ClusterSpec | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,9 +17,16 @@ limitations under the License. | |
package v1alpha4 | ||
|
||
import ( | ||
"fmt" | ||
"strings" | ||
|
||
"github.com/blang/semver" | ||
apierrors "k8s.io/apimachinery/pkg/api/errors" | ||
"k8s.io/apimachinery/pkg/runtime" | ||
"k8s.io/apimachinery/pkg/util/sets" | ||
"k8s.io/apimachinery/pkg/util/validation/field" | ||
"sigs.k8s.io/cluster-api/feature" | ||
"sigs.k8s.io/cluster-api/util/version" | ||
ctrl "sigs.k8s.io/controller-runtime" | ||
"sigs.k8s.io/controller-runtime/pkg/webhook" | ||
) | ||
|
@@ -45,24 +52,36 @@ func (c *Cluster) Default() { | |
if c.Spec.ControlPlaneRef != nil && len(c.Spec.ControlPlaneRef.Namespace) == 0 { | ||
c.Spec.ControlPlaneRef.Namespace = c.Namespace | ||
} | ||
|
||
// If the Cluster uses a managed topology | ||
if c.Spec.Topology != nil { | ||
// tolerate version strings without a "v" prefix: prepend it if it's not there | ||
if !strings.HasPrefix(c.Spec.Topology.Version, "v") { | ||
c.Spec.Topology.Version = "v" + c.Spec.Topology.Version | ||
} | ||
Comment on lines
+58
to
+61
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's remove this assumption and only accept versions with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will make this version field behave differently from the other version fields, I would prefer consistency vs being strict. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @CecileRobertMichon thoughts? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm with Fabrizio on this one, consistency is better. But we can open an issue to consider requiring the "v" prefix on all version fields across the API. Although IMO it's nice to not have to worry about it as a user, functionally it's the same. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is similar toleration currently on Machine and MachineDeployment. We should have a single approach for all version fields. Either all or none! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll open an issue to not make it easier for users but rather acknowledge that it should always be prefixed with |
||
} | ||
} | ||
|
||
// ValidateCreate implements webhook.Validator so a webhook will be registered for the type. | ||
func (c *Cluster) ValidateCreate() error { | ||
return c.validate() | ||
return c.validate(nil) | ||
} | ||
|
||
// ValidateUpdate implements webhook.Validator so a webhook will be registered for the type. | ||
func (c *Cluster) ValidateUpdate(old runtime.Object) error { | ||
return c.validate() | ||
oldCluster, ok := old.(*Cluster) | ||
if !ok { | ||
return apierrors.NewBadRequest(fmt.Sprintf("expected a Cluster but got a %T", old)) | ||
} | ||
return c.validate(oldCluster) | ||
} | ||
|
||
// ValidateDelete implements webhook.Validator so a webhook will be registered for the type. | ||
func (c *Cluster) ValidateDelete() error { | ||
return nil | ||
} | ||
|
||
func (c *Cluster) validate() error { | ||
func (c *Cluster) validate(old *Cluster) error { | ||
var allErrs field.ErrorList | ||
if c.Spec.InfrastructureRef != nil && c.Spec.InfrastructureRef.Namespace != c.Namespace { | ||
allErrs = append( | ||
|
@@ -86,8 +105,145 @@ func (c *Cluster) validate() error { | |
) | ||
} | ||
|
||
// Validate the managed topology, if defined. | ||
if c.Spec.Topology != nil { | ||
fabriziopandini marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if topologyErrs := c.validateTopology(old); len(topologyErrs) > 0 { | ||
allErrs = append(allErrs, topologyErrs...) | ||
} | ||
} | ||
|
||
if len(allErrs) == 0 { | ||
return nil | ||
} | ||
return apierrors.NewInvalid(GroupVersion.WithKind("Cluster").GroupKind(), c.Name, allErrs) | ||
} | ||
|
||
func (c *Cluster) validateTopology(old *Cluster) field.ErrorList { | ||
// NOTE: ClusterClass and managed topologies are behind ClusterTopology feature gate flag; the web hook | ||
// must prevent the usage of Cluster.Topology in case the feature flag is disabled. | ||
if !feature.Gates.Enabled(feature.ClusterTopology) { | ||
return field.ErrorList{ | ||
field.Forbidden( | ||
field.NewPath("spec", "topology"), | ||
"can be set only if the ClusterTopology feature flag is enabled", | ||
), | ||
} | ||
} | ||
|
||
fabriziopandini marked this conversation as resolved.
Show resolved
Hide resolved
|
||
var allErrs field.ErrorList | ||
|
||
// class should be defined. | ||
if len(c.Spec.Topology.Class) == 0 { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "class"), | ||
c.Spec.Topology.Class, | ||
"cannot be empty", | ||
), | ||
) | ||
} | ||
|
||
// version should be valid. | ||
if !version.KubeSemver.MatchString(c.Spec.Topology.Version) { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "version"), | ||
c.Spec.Topology.Version, | ||
"must be a valid semantic version", | ||
), | ||
) | ||
} | ||
|
||
// MachineDeployment names must be unique. | ||
if c.Spec.Topology.Workers != nil { | ||
names := sets.String{} | ||
for _, md := range c.Spec.Topology.Workers.MachineDeployments { | ||
if names.Has(md.Name) { | ||
allErrs = append(allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "workers", "machineDeployments"), | ||
md, | ||
fmt.Sprintf("MachineDeployment names should be unique. MachineDeployment with name %q is defined more than once.", md.Name), | ||
), | ||
) | ||
} | ||
names.Insert(md.Name) | ||
} | ||
} | ||
|
||
switch old { | ||
case nil: // On create | ||
// c.Spec.InfrastructureRef and c.Spec.ControlPlaneRef could not be set | ||
if c.Spec.InfrastructureRef != nil { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "infrastructureRef"), | ||
c.Spec.InfrastructureRef, | ||
"cannot be set when a Topology is defined", | ||
), | ||
) | ||
} | ||
if c.Spec.ControlPlaneRef != nil { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "controlPlaneRef"), | ||
c.Spec.ControlPlaneRef, | ||
"cannot be set when a Topology is defined", | ||
), | ||
) | ||
} | ||
default: // On update | ||
// Class could not be mutated. | ||
if c.Spec.Topology.Class != old.Spec.Topology.Class { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "class"), | ||
c.Spec.Topology.Class, | ||
"class cannot be changed", | ||
), | ||
) | ||
} | ||
|
||
// Version could only be increased. | ||
inVersion, err := semver.ParseTolerant(c.Spec.Topology.Version) | ||
if err != nil { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "version"), | ||
c.Spec.Topology.Version, | ||
"is not a valid version", | ||
), | ||
) | ||
} | ||
oldVersion, err := semver.ParseTolerant(old.Spec.Topology.Version) | ||
if err != nil { | ||
// NOTE: this should never happen. Nevertheless, handling this for extra caution. | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "version"), | ||
c.Spec.Topology.Class, | ||
"cannot be compared with the old version", | ||
), | ||
) | ||
} | ||
if inVersion.NE(semver.Version{}) && oldVersion.NE(semver.Version{}) && !inVersion.GTE(oldVersion) { | ||
allErrs = append( | ||
allErrs, | ||
field.Invalid( | ||
field.NewPath("spec", "topology", "version"), | ||
c.Spec.Topology.Version, | ||
"cannot be decreased", | ||
), | ||
) | ||
} | ||
} | ||
|
||
return allErrs | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is our strategy for introducing new CRDs in terms of API versions? We should discuss, there was also #4917 on the topic
seems like it'd make more sense to introduce new APIs at alpha once we move to beta for the rest of the project.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 in defining policies, but I think the two usa case have a key difference.
In the case of ClusterClass we are adding a new type to the main CAPI group, so it shares the API version of the existing types (the strategy is implicit here). In the case of the operator instead we are creating a new, separated API group so we have freedom to choose a different API version.