rabbitmq · coro · Sep 30, 2020 · Sep 28, 2020 · Sep 29, 2020 · Sep 29, 2020
diff --git a/docs/proposals/20200928-crd-size.md b/docs/proposals/20200928-crd-size.md
@@ -0,0 +1,107 @@
+---
+title: Proposal for reducing the size of the CRD
+authors:
+  - "@coro"
+reviewers:
+  - "unassigned"
+creation-date: 2020-09-28
+last-updated: 2020-09-29
+status: implementable
+---
+
+# Proposal for reducing the size of the CRD
+
+## Table of Contents
+
+<!--ts-->
+   * [Proposal for reducing the size of the CRD](#proposal-for-reducing-the-size-of-the-crd)
+      * [Table of Contents](#table-of-contents)
+      * [Summary](#summary)
+      * [Motivation](#motivation)
+         * [Why is our CRD so large?](#why-is-our-crd-so-large)
+         * [Upstream discussions](#upstream-discussions)
+      * [Proposal](#proposal)
+         * [Wait for Server-side Apply](#wait-for-server-side-apply)
+         * [Strip CRD field descriptions](#strip-crd-field-descriptions)
+         * [Contribute to kubebuilder](#contribute-to-kubebuilder)
+      * [Not implementing](#not-implementing)
+         * [Only create / replace CRD, don't apply](#only-create--replace-crd-dont-apply)
+      * [Implementation History](#implementation-history)
+
+<!-- Added by: coro, at: Tue Sep 29 11:07:35 UTC 2020 -->
+
+<!--te-->
+
+## Summary
+We are close to exceeding two size limits that I know of.
+Objects stored in etcd have a hard size limit of 1MB. Etcd does not (or cannot?) support larger objects, so our entire CRD must be smaller than 1MB.
+
+There is also a 256KiB limit on the total size of an object's `.metadata.annotations`.
+
+At the time of writing, after a `kubectl apply` our CRD is 857kB, and the annotations block within the CRD is 252KiB. This puts us at 85.7% and 98.4% of the limits, respectively. This proposal
+outlines methods by which we might reduce these sizes, and the consequences of doing so.
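+
+As a rough way to reproduce these measurements (a sketch, not necessarily how the figures above were obtained), the installed object and its annotation can be sized with `kubectl` and `wc`. The CRD name `rabbitmqclusters.rabbitmq.com` is an assumption about what is installed in the cluster:
+
+```shell
+# Total size in bytes of the CRD as returned by the API server (JSON encoding).
+kubectl get crd rabbitmqclusters.rabbitmq.com -o json | wc -c
+
+# Size in bytes of the last-applied-configuration annotation alone.
+# Dots inside the annotation key are escaped for kubectl's JSONPath parser.
+kubectl get crd rabbitmqclusters.rabbitmq.com \
+  -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}' | wc -c
+```
+
+The byte counts reflect the JSON encoding returned by the server, so treat them as approximate.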
+
+## Motivation
+As per the [Kubernetes Deprecation Policy](https://kubernetes.io/docs/reference/using-api/deprecation-policy/), we must maintain older versions of our CRD for some time
+after we create a new API version. As it stands, a single version of the CRD constitutes a whopping 510KiB, or 522kB.
+
+As a result, if we ever introduce a new version of the CRD, we instantly go over the 1MB limit in etcd, and install operations on the CRD will fail.
+Similarly, any new fields will appear in the annotations field for the CRD (see [below](#why-is-our-crd-so-large)), which is already very close to being over the limit.
+
+### Why is our CRD so large?
+Firstly, the Makefile generated by kubebuilder installs the CRD into a cluster by running `kubectl apply`. This results in an object with a `kubectl.kubernetes.io/last-applied-configuration`
+annotation (see [Kubernetes Declarative Configuration](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/)). Given this contains the entire config of the CRD, this
+becomes quite large.
+
+Secondly, our CRD embeds several large Core API types, such as [PodSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L214) and
+[PersistentVolumeClaimSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L228). When generating the manifest for the CRD to install
+on the cluster, controller-gen recursively includes the field names, descriptions, etc. of any nested objects. As we include several large core object types, this massively inflates
+the size of our CRD.
+
+### Upstream discussions
+
+The `kubectl.kubernetes.io/last-applied-configuration` annotation is so large because the `kubectl apply` logic is largely client-side. The kubectl CLI uses this
+annotation to calculate the patch to send to the API server (see [documentation](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#how-apply-calculates-differences-and-merges-changes)).
+Others have hit the size limit specifically because of this annotation: https://github.com/kubernetes/kubectl/issues/712
+
+There is a Kubernetes enhancement (https://github.com/kubernetes/enhancements/issues/555) that has been in beta since August 2019, which changes `apply` to be server-side, handled by `apiserver` instead of `kubectl`.
+Once this is released as GA, it will be possible to reduce the amount of information sent to the apiserver through annotations, since the declarative state management is already performed by the server.
+
+There is also a k/k issue (https://github.com/kubernetes/kubernetes/issues/82292) which addresses the potential issue of CRDs going over the 1MB limit. There is an interesting analysis
+of CRD sizes that currently exist, though ours seems far larger than any of the ones mentioned in the issue (ours is 614kB, if I've measured it correctly!).
+
+## Proposal
+
+### Wait for Server-side Apply
+Once server-side apply is released as GA, the apiserver will be responsible for declarative state management of resources. This means that kubectl will no longer need
+the last-applied-configuration annotation in order to calculate the scope of the patch request to the server on `kubectl apply`. This will significantly reduce the footprint of
+our CRD after a `kubectl apply`.
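+
+For illustration, server-side apply can already be requested explicitly where the cluster and kubectl version support the beta feature. This is a sketch only; the manifest path is an assumed location for the generated CRD:
+
+```shell
+# Ask the apiserver to compute the merge. Field ownership is then tracked in
+# .metadata.managedFields rather than in the last-applied-configuration annotation.
+kubectl apply --server-side -f config/crd/bases/rabbitmq.com_rabbitmqclusters.yaml
+```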
+
+### Strip CRD field descriptions
+`controller-gen` supports the option `crd:maxDescLen`, which truncates the length of field descriptions in CRD manifests to the integer specified in the option.
+If set to `0`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent
+large CRDs from going past the annotations limit with `kubectl apply`.
+
+Doing this would save 610kB in the CRD, and 190KiB in the annotations. However, this would effectively remove any useful API documentation from the CRD.
+For example, the output of `kubectl explain` would contain no field descriptions, only the field names.
+
+We will implement this as a stop-gap solution until server-side apply or kubebuilder enhancements are released.
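+
+A minimal sketch of a controller-gen invocation using this option follows. The `paths` and `output:crd:artifacts:config` values are assumptions based on the default kubebuilder project layout, not copied from our Makefile:
+
+```shell
+# Generate CRD manifests with all field descriptions removed.
+controller-gen crd:maxDescLen=0 paths="./..." output:crd:artifacts:config=config/crd/bases
+```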
+
+### Contribute to kubebuilder
+There is a [great idea](https://github.com/kubernetes/kubernetes/issues/82292#issuecomment-601851309) by a Kubernetes maintainer for an enhancement to kubebuilder
+which has not been implemented yet. The `crd:maxDescLen` option in controller-gen is applied to an entire CRD, with no option to only trim descriptions of certain fields.
+
+We will investigate and, if feasible, create a PR to enhance kubebuilder to allow for recursive stripping of descriptions in CRDs, similar to what is described in the linked comment.
+
+## Not implementing
+
+### Only create / replace CRD, don't apply
+It is possible to install & upgrade CRDs by using kubectl create/replace, rather than apply: https://community.spinnaker.io/t/halyard-with-v2-metadata-annotations-too-long/894/2
+
+We believe this will result in operator pods being garbage collected once the CRD is deleted as part of `kubectl replace`, and so is unsuitable for our needs.
+
+## Implementation History
+
+- [x] 28/09/2020: Open draft proposal PR
+- [x] 29/09/2020: Discussed in sync-up
+