From c7ac6895c0111218d53f51405d7732d66206a53b Mon Sep 17 00:00:00 2001 From: Connor Rogers Date: Mon, 28 Sep 2020 17:39:19 +0000 Subject: [PATCH 1/4] WIP: Proposal to reduce size of CRD --- docs/proposals/20200928-crd-size.md | 86 +++++++++++++++++++++++++++++ 1 file changed, 86 insertions(+) create mode 100644 docs/proposals/20200928-crd-size.md diff --git a/docs/proposals/20200928-crd-size.md b/docs/proposals/20200928-crd-size.md new file mode 100644 index 000000000..a5bb96cbc --- /dev/null +++ b/docs/proposals/20200928-crd-size.md @@ -0,0 +1,86 @@ +--- +title: Proposal for reducing the size of the CRD +authors: + - "@coro" +reviewers: + - "unassigned" +creation-date: 2020-09-28 +last-updated: 2020-09-28 +status: provisional +--- + +# Proposal for reducing the size of the CRD + +## Table of Contents + + + * [Proposal for reducing the size of the CRD](#proposal-for-reducing-the-size-of-the-crd) + * [Table of Contents](#table-of-contents) + * [Summary](#summary) + * [Motivation](#motivation) + * [Why is our CRD so large?](#why-is-our-crd-so-large) + * [Upstream discussions](#upstream-discussions) + * [Proposal](#proposal) + * [Strip CRD field descriptions](#strip-crd-field-descriptions) + * [Only create / replace CRD, don't apply](#only-create--replace-crd-dont-apply) + * [Implementation History](#implementation-history) + + + + + +## Summary +There are two size limits that I know of that we may be close to overflowing. +Objects stored in etcd have a hard limit of 1MB in size. Etcd does not (or cannot?) support larger objects, and so our entire CRD must be smaller than 1MB. + +There is also a size limit on `.metadata.annotations` for objects of 256 KiB. + +At time of writing, after a `kubectl apply` our CRD is 857kB, and the annotations block within the CRD is 252KiB. This puts us at 85.7% and 98.4% of the limits, respectively. This proposal +outlines methods by which we might reduce these sizes, and the consequences of doing so. + +## Motivation +As per the [Kubernetes Deprecation Policy](https://kubernetes.io/docs/reference/using-api/deprecation-policy/), we must maintain older versions of our CRD for some time +after we create a new API version. As it stands, a single version of the CRD constitutes a whopping 510KiB, or 522kB. + +As a result, if we ever introduce a new version of the CRD, we instantly go over the 1MB limit in etcd, and install operations on the CRD will fail. +Similarly, any new fields will appear in the annotations field for the CRD (see [below](#why-is-our-crd-so-large)), which is already very close to being over the limit. + +### Why is our CRD so large? +Firstly, the Makefile generated by kubebuilder installs the CRD into a cluster by running `kubectl apply`. This results in an object with a `kubectl.kubernetes.io/last-applied-configuration` +annotation (see [Kubernetes Declarative Configuration](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/)). Given this contains the entire config of the CRD, this +becomes quite large. + +Secondly, our CRD embeds several large Core API types, such as [PodSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L214) and +[PersistentVolumeClaimSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L228). When generating the manifest for the CRD to install +on the cluster, controller-gen recursively includes the field names, descriptions, etc. of any nested objects. As we include several large core object types, this massively inflates +the size of our CRD. + +### Upstream discussions + +The reason the `kubectl.kubernetes.io/last-applied-configuration` annotation needs to be so large is due to the `kubectl apply` logic being largely clientside. The kubectl CLI uses this +annotation to calculate the patch to send to the API server (see [documentation](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#how-apply-calculates-differences-and-merges-changes)). +Others have hit this issue with specifically the annotation: https://github.com/kubernetes/kubectl/issues/712 + +There is a Kubernetes enhancement (https://github.com/kubernetes/enhancements/issues/555) that has been in beta since August 2019, which changes `apply` to be serverside, handled by `apiserver` instead of `kubectl`. +Once this is released as GA, it will be possible to reduce the amount of information sent to the apiserver through annotations, since the decalrative state management is already performed by the server. + +There is also a k/k issue (https://github.com/kubernetes/kubernetes/issues/82292) which addresses the potential issue of CRDs going over the 1MB limit. There is an interesting analysis +of CRD sizes that currently exist, though ours seems far larger than any of the ones mentioned in the issue (ours is 614kB, if I've measured it correctly!). + +## Proposal + +### Strip CRD field descriptions +`controller-gen` supports the option `crd:maxDescLen`, which truncates the length of field descriptions in CRD manifests to the integer specified in the option. +If set to `-1`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent +large CRDs from going past the annotations limit with `kubectl apply`. + +Doing this would save 610kB in the CRD, and 190KiB in the annotations. However, this would effectively remove any useful API documentation from the CRD. +For example, the output of `kubectl explain` would contain no field descriptions, only the name of the fields. + +### Only create / replace CRD, don't apply +https://community.spinnaker.io/t/halyard-with-v2-metadata-annotations-too-long/894/2 + +## Implementation History + +- [ ] MM/DD/YYYY: Open proposal PR + From a5ac88eddd699aa8867493735b94460b0178313f Mon Sep 17 00:00:00 2001 From: Connor Rogers Date: Tue, 29 Sep 2020 12:05:16 +0100 Subject: [PATCH 2/4] Update 20200928-crd-size.md --- docs/proposals/20200928-crd-size.md | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/docs/proposals/20200928-crd-size.md b/docs/proposals/20200928-crd-size.md index a5bb96cbc..e8da5ef24 100644 --- a/docs/proposals/20200928-crd-size.md +++ b/docs/proposals/20200928-crd-size.md @@ -5,7 +5,7 @@ authors: reviewers: - "unassigned" creation-date: 2020-09-28 -last-updated: 2020-09-28 +last-updated: 2020-09-29 status: provisional --- @@ -69,6 +69,11 @@ of CRD sizes that currently exist, though ours seems far larger than any of the ## Proposal +### Wait for Server-side Apply +Once server-side apply is released as GA, the apiserver will be responsible for declarative state management of resources. This means that kubectl no longer needs +the last-applied-configuration annotation in order to calculate the scope of the patch request to the server on `kubectl apply`. This will reduce the footprint of +our CRD when `kubectl apply`-ed significantly. + ### Strip CRD field descriptions `controller-gen` supports the option `crd:maxDescLen`, which truncates the length of field descriptions in CRD manifests to the integer specified in the option. If set to `-1`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent @@ -77,10 +82,23 @@ large CRDs from going past the annotations limit with `kubectl apply`. Doing this would save 610kB in the CRD, and 190KiB in the annotations. However, this would effectively remove any useful API documentation from the CRD. For example, the output of `kubectl explain` would contain no field descriptions, only the name of the fields. +We will implement this as a stop-gap solution until Server-side apply or kubebuilder enhancements are released. + +### Contribute to kubebuilder +There is a [great idea](https://github.com/kubernetes/kubernetes/issues/82292#issuecomment-601851309) by a Kubernetes maintainer for an enhancement to kubebuilder +which has not been implemented yet. The `crd:maxDescLen` option in controller-gen is applied to an entire CRD, with no option to only trim descriptions of certain fields. + +We will investigate & hopefully create a PR to enhance kubebuilder to allow for recursive stripping of descriptions in CRDs, similar to as described in the linked comment. + +## Not implementing + ### Only create / replace CRD, don't apply -https://community.spinnaker.io/t/halyard-with-v2-metadata-annotations-too-long/894/2 +It is possible to install & upgrade CRDs by using kubectl create/replace, rather than apply: https://community.spinnaker.io/t/halyard-with-v2-metadata-annotations-too-long/894/2 + +We believe this will result in operator pods being garbage collected once the CRD is deleted as part of `kubectl replace`, and so is unsuitable for our needs. ## Implementation History -- [ ] MM/DD/YYYY: Open proposal PR +- [ ] 28/09/2020: Open draft proposal PR +- [ ] 29/09/2020: Discussed in sync-up From e3802974585b1b8edc0b9fa18712e399fbd5d90c Mon Sep 17 00:00:00 2001 From: Connor Rogers Date: Tue, 29 Sep 2020 11:08:42 +0000 Subject: [PATCH 3/4] Update TOC & Implementation History --- docs/proposals/20200928-crd-size.md | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/docs/proposals/20200928-crd-size.md b/docs/proposals/20200928-crd-size.md index e8da5ef24..0f943e699 100644 --- a/docs/proposals/20200928-crd-size.md +++ b/docs/proposals/20200928-crd-size.md @@ -6,7 +6,7 @@ reviewers: - "unassigned" creation-date: 2020-09-28 last-updated: 2020-09-29 -status: provisional +status: implementable --- # Proposal for reducing the size of the CRD @@ -21,11 +21,14 @@ status: provisional * [Why is our CRD so large?](#why-is-our-crd-so-large) * [Upstream discussions](#upstream-discussions) * [Proposal](#proposal) + * [Wait for Server-side Apply](#wait-for-server-side-apply) * [Strip CRD field descriptions](#strip-crd-field-descriptions) + * [Contribute to kubebuilder](#contribute-to-kubebuilder) + * [Not implementing](#not-implementing) * [Only create / replace CRD, don't apply](#only-create--replace-crd-dont-apply) * [Implementation History](#implementation-history) - + @@ -99,6 +102,6 @@ We believe this will result in operator pods being garbage collected once the CR ## Implementation History -- [ ] 28/09/2020: Open draft proposal PR -- [ ] 29/09/2020: Discussed in sync-up +- [x] 28/09/2020: Open draft proposal PR +- [x] 29/09/2020: Discussed in sync-up From 59308c3af04e3334255f80de265dbb9625ebf7bd Mon Sep 17 00:00:00 2001 From: Connor Rogers Date: Tue, 29 Sep 2020 15:59:47 +0100 Subject: [PATCH 4/4] Update sizing proposal to set maxDescLen to 0 --- docs/proposals/20200928-crd-size.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/proposals/20200928-crd-size.md b/docs/proposals/20200928-crd-size.md index 0f943e699..3eedfd5fc 100644 --- a/docs/proposals/20200928-crd-size.md +++ b/docs/proposals/20200928-crd-size.md @@ -79,7 +79,7 @@ our CRD when `kubectl apply`-ed significantly. ### Strip CRD field descriptions `controller-gen` supports the option `crd:maxDescLen`, which truncates the length of field descriptions in CRD manifests to the integer specified in the option. -If set to `-1`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent +If set to `0`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent large CRDs from going past the annotations limit with `kubectl apply`. Doing this would save 610kB in the CRD, and 190KiB in the annotations. However, this would effectively remove any useful API documentation from the CRD.