Proposal to reduce size of CRD #365

Merged
merged 4 commits into from
Sep 30, 2020
107 changes: 107 additions & 0 deletions docs/proposals/20200928-crd-size.md
@@ -0,0 +1,107 @@
---
title: Proposal for reducing the size of the CRD
authors:
- "@coro"
reviewers:
- "unassigned"
creation-date: 2020-09-28
last-updated: 2020-09-29
status: implementable
---

# Proposal for reducing the size of the CRD

## Table of Contents

<!--ts-->
* [Proposal for reducing the size of the CRD](#proposal-for-reducing-the-size-of-the-crd)
   * [Table of Contents](#table-of-contents)
   * [Summary](#summary)
   * [Motivation](#motivation)
      * [Why is our CRD so large?](#why-is-our-crd-so-large)
      * [Upstream discussions](#upstream-discussions)
   * [Proposal](#proposal)
      * [Wait for Server-side Apply](#wait-for-server-side-apply)
      * [Strip CRD field descriptions](#strip-crd-field-descriptions)
      * [Contribute to kubebuilder](#contribute-to-kubebuilder)
   * [Not implementing](#not-implementing)
      * [Only create / replace CRD, don't apply](#only-create--replace-crd-dont-apply)
   * [Implementation History](#implementation-history)

<!-- Added by: coro, at: Tue Sep 29 11:07:35 UTC 2020 -->

<!--te-->

## Summary
There are two size limits that I know of that we are close to exceeding.
First, objects stored in etcd have a hard limit of 1MB in size. Etcd does not (or cannot?) support larger objects, so our entire CRD must be smaller than 1MB.

Second, the `.metadata.annotations` of any object are limited to 256KiB in total.

At the time of writing, after a `kubectl apply` our CRD is 857kB, and the annotations block within the CRD is 252KiB. This puts us at 85.7% and 98.4% of the limits, respectively. This proposal
outlines methods by which we might reduce these sizes, and the consequences of doing so.
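
The percentages above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the constant and function names are illustrative, and the 1MB limit is assumed to be decimal (1,000,000 bytes), since that is the reading that yields 85.7%.

```python
# Limits as quoted in this proposal. The 1MB figure is assumed to be decimal.
ETCD_OBJECT_LIMIT_BYTES = 1_000_000
ANNOTATIONS_LIMIT_BYTES = 256 * 1024


def percent_of_limit(size_bytes: int, limit_bytes: int) -> float:
    """Return how much of a limit is used, as a percentage."""
    return 100.0 * size_bytes / limit_bytes


crd_bytes = 857 * 1000          # 857kB, measured after `kubectl apply`
annotations_bytes = 252 * 1024  # 252KiB annotations block

crd_pct = percent_of_limit(crd_bytes, ETCD_OBJECT_LIMIT_BYTES)
annotations_pct = percent_of_limit(annotations_bytes, ANNOTATIONS_LIMIT_BYTES)

print(f"CRD: {crd_pct:.1f}% of the object limit")                  # CRD: 85.7% ...
print(f"annotations: {annotations_pct:.1f}% of the annotations limit")  # annotations: 98.4% ...
```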

## Motivation
As per the [Kubernetes Deprecation Policy](https://kubernetes.io/docs/reference/using-api/deprecation-policy/), we must maintain older versions of our CRD for some time
after we create a new API version. As it stands, a single version of the CRD constitutes a whopping 510KiB, or 522kB.

As a result, if we ever introduce a new version of the CRD, we instantly go over the 1MB limit in etcd, and install operations on the CRD will fail.
Similarly, any new fields will appear in the annotations field for the CRD (see [below](#why-is-our-crd-so-large)), which is already very close to being over the limit.
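
As a rough illustration of why a second version is a problem, the sketch below projects the CRD size for a given number of versions. It assumes every version is the same size as the current one, ignores any overhead shared between versions, and leaves out the `last-applied-configuration` annotation; the names are hypothetical.

```python
SINGLE_VERSION_BYTES = 522 * 1000    # one version of the CRD, as quoted above
ETCD_OBJECT_LIMIT_BYTES = 1_000_000  # the 1MB limit, assumed to be decimal


def projected_size(version_bytes: int, versions: int) -> int:
    """Rough size of a CRD carrying `versions` equally sized schemas."""
    return version_bytes * versions


def max_versions(version_bytes: int, limit_bytes: int) -> int:
    """Number of equally sized versions that fit under the limit."""
    return limit_bytes // version_bytes


two_versions = projected_size(SINGLE_VERSION_BYTES, 2)
print(two_versions, two_versions > ETCD_OBJECT_LIMIT_BYTES)  # 1044000 True
print(max_versions(SINGLE_VERSION_BYTES, ETCD_OBJECT_LIMIT_BYTES))  # 1
```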

### Why is our CRD so large?
Firstly, the Makefile generated by kubebuilder installs the CRD into a cluster by running `kubectl apply`. This results in an object with a `kubectl.kubernetes.io/last-applied-configuration`
annotation (see [Kubernetes Declarative Configuration](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/)). Because this annotation contains the entire config of the CRD, it
becomes quite large.

Secondly, our CRD embeds several large Core API types, such as [PodSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L214) and
[PersistentVolumeClaimSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L228). When generating the manifest for the CRD to install
on the cluster, controller-gen recursively includes the field names, descriptions, etc. of any nested objects. As we include several large core object types, this massively inflates
the size of our CRD.
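
To see how the annotation inflates an applied object, the sketch below measures a toy CRD before and after a simulated client-side apply. The helper names and the toy manifest are made up for illustration, the simulation only mimics the shape of what `kubectl apply` stores, and compact JSON is used as an approximation of the stored size.

```python
import copy
import json

LAST_APPLIED = "kubectl.kubernetes.io/last-applied-configuration"


def serialized_size(obj: dict) -> int:
    """Size in bytes of the object as compact JSON (an approximation)."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def annotations_size(obj: dict) -> int:
    """Total bytes of all annotation keys plus values."""
    annotations = obj.get("metadata", {}).get("annotations", {})
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in annotations.items())


def simulate_client_side_apply(obj: dict) -> dict:
    """Return a copy of obj carrying its own config in the annotation."""
    applied = copy.deepcopy(obj)
    annotations = applied.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[LAST_APPLIED] = json.dumps(obj, separators=(",", ":"))
    return applied


# Toy manifest: a long description stands in for the nested core types.
crd = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "examples.example.com"},
    "spec": {"versions": [{"name": "v1beta1", "schema": {"openAPIV3Schema": {
        "type": "object", "description": "x" * 1000}}}]},
}
applied = simulate_client_side_apply(crd)

print(serialized_size(crd), serialized_size(applied), annotations_size(applied))
```

Because the annotation holds a full copy of the manifest, the applied object comes out at more than twice the size of the original, and every schema byte counts against both limits.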

### Upstream discussions

The `kubectl.kubernetes.io/last-applied-configuration` annotation needs to be so large because the `kubectl apply` logic is largely client-side. The kubectl CLI uses this
annotation to calculate the patch to send to the API server (see [documentation](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#how-apply-calculates-differences-and-merges-changes)).
Others have hit this issue specifically with the annotation: https://github.com/kubernetes/kubectl/issues/712
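
The role of the annotation in that calculation can be illustrated with a heavily simplified sketch. It works on flat dictionaries only and is not kubectl's actual implementation; the function name is hypothetical. It follows the rules in the linked documentation: fields present in the last-applied config but missing from the new config are deleted, and fields in the new config that differ from the live object are set.

```python
def compute_patch(last_applied: dict, desired: dict, live: dict) -> dict:
    """Simplified three-way diff over flat dicts; None marks a deletion."""
    patch = {}
    # Fields we set previously but no longer declare must be removed.
    for key in last_applied:
        if key not in desired:
            patch[key] = None
    # Fields we declare that differ from the live object must be set.
    for key, value in desired.items():
        if key not in live or live[key] != value:
            patch[key] = value
    return patch


last_applied = {"replicas": 1, "image": "a"}
desired = {"replicas": 3}
live = {"replicas": 1, "image": "a", "status": "ok"}

patch = compute_patch(last_applied, desired, live)
print(patch)  # {'image': None, 'replicas': 3}
```

Note that `status` is left alone: it never appeared in the last-applied config, so the client cannot tell it apart from a field owned by someone else. Without the annotation, the client would have no way to know that `image` should be deleted.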

There is a Kubernetes enhancement (https://github.com/kubernetes/enhancements/issues/555) that has been in beta since August 2019, which changes `apply` to be server-side, handled by `apiserver` instead of `kubectl`.
Once this is released as GA, it will be possible to reduce the amount of information sent to the apiserver through annotations, since the declarative state management will be performed by the server.

There is also a k/k issue (https://github.com/kubernetes/kubernetes/issues/82292) which addresses the potential issue of CRDs going over the 1MB limit. It includes an interesting analysis
of existing CRD sizes, though ours seems far larger than any of the ones mentioned in the issue (ours is 614kB, if I've measured it correctly!).

## Proposal

### Wait for Server-side Apply
Once server-side apply is released as GA, the apiserver will be responsible for declarative state management of resources. This means that kubectl no longer needs
the last-applied-configuration annotation in order to calculate the scope of the patch request to the server on `kubectl apply`. This will significantly reduce the footprint of
our CRD once it has been applied.

### Strip CRD field descriptions
`controller-gen` supports the option `crd:maxDescLen`, which truncates the length of field descriptions in CRD manifests to the integer specified in the option.
If set to `0`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent
large CRDs from going past the annotations limit with `kubectl apply`.

Doing this would save 610kB in the CRD, and 190KiB in the annotations. However, this would effectively remove any useful API documentation from the CRD.
For example, the output of `kubectl explain` would contain no field descriptions, only the names of the fields.

We will implement this as a stop-gap solution until server-side apply or the kubebuilder enhancements are released.
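
The effect of removing descriptions entirely can be approximated with a short sketch that walks an OpenAPI v3 schema and drops every `description` keyword. This is not controller-gen's code; the function name and the toy schema are illustrative only.

```python
def strip_descriptions(schema: dict) -> dict:
    """Return a copy of an OpenAPI v3 schema with every description removed."""
    out = {}
    for key, value in schema.items():
        if key == "description":
            continue  # drop the schema keyword itself
        if key == "properties" and isinstance(value, dict):
            # Keys under `properties` are field names, not keywords, so a
            # field that happens to be called "description" must survive.
            out[key] = {name: strip_descriptions(sub) for name, sub in value.items()}
        elif isinstance(value, dict):
            out[key] = strip_descriptions(value)
        elif isinstance(value, list):
            out[key] = [strip_descriptions(v) if isinstance(v, dict) else v
                        for v in value]
        else:
            out[key] = value
    return out


schema = {
    "type": "object",
    "description": "Spec of the cluster",
    "properties": {
        "description": {"type": "string", "description": "Free-form text"},
        "replicas": {"type": "integer", "description": "Number of nodes"},
    },
}
stripped = strip_descriptions(schema)
print(stripped)
```

The field names and types remain, which is why `kubectl explain` would still list the fields but have nothing to say about them.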

### Contribute to kubebuilder
There is a [great idea](https://github.com/kubernetes/kubernetes/issues/82292#issuecomment-601851309) by a Kubernetes maintainer for an enhancement to kubebuilder
which has not been implemented yet. The `crd:maxDescLen` option in controller-gen is applied to an entire CRD, with no option to only trim descriptions of certain fields.

We will investigate & hopefully create a PR to enhance kubebuilder to allow for recursive stripping of descriptions in CRDs, similar to what is described in the linked comment.
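
One possible shape for such selective stripping is sketched below. It is an assumption about how the behaviour could look, not the design from the linked comment or an existing kubebuilder feature: descriptions are removed only for the named fields and everything nested beneath them, and fields are matched by name at any depth. The function name, field names, and toy schema are hypothetical.

```python
def strip_selected(schema: dict, fields: set, stripping: bool = False) -> dict:
    """Copy a schema, dropping descriptions for `fields` and their subtrees."""
    out = {}
    for key, value in schema.items():
        if key == "description" and stripping:
            continue  # inside a selected subtree: drop the keyword
        if key == "properties" and isinstance(value, dict):
            # Start stripping when a property name is selected, and keep
            # stripping for everything already inside a selected subtree.
            out[key] = {
                name: strip_selected(sub, fields, stripping or name in fields)
                for name, sub in value.items()
            }
        elif isinstance(value, dict):
            out[key] = strip_selected(value, fields, stripping)
        elif isinstance(value, list):
            out[key] = [strip_selected(v, fields, stripping) if isinstance(v, dict) else v
                        for v in value]
        else:
            out[key] = value
    return out


schema = {
    "type": "object",
    "properties": {
        "replicas": {"type": "integer", "description": "Number of nodes"},
        "override": {
            "type": "object",
            "description": "Overrides for generated resources",
            "properties": {
                "podSpec": {"type": "object", "description": "Embedded core type"},
            },
        },
    },
}
trimmed = strip_selected(schema, {"override"})
print(trimmed)
```

With this approach the descriptions we wrote ourselves (such as `replicas`) stay available to `kubectl explain`, while the bulky descriptions inherited from embedded core types are dropped.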

## Not implementing

### Only create / replace CRD, don't apply
It is possible to install & upgrade CRDs by using kubectl create/replace, rather than apply: https://community.spinnaker.io/t/halyard-with-v2-metadata-annotations-too-long/894/2

We believe this will result in operator pods being garbage collected once the CRD is deleted as part of `kubectl replace`, and so is unsuitable for our needs.

## Implementation History

- [x] 28/09/2020: Open draft proposal PR
- [x] 29/09/2020: Discussed in sync-up