Commit 175ac9b

authored Sep 30, 2020
Proposal to reduce size of CRD (#365)
* WIP: Proposal to reduce size of CRD
* Update 20200928-crd-size.md
* Update TOC & Implementation History
* Update sizing proposal to set maxDescLen to 0
1 parent 6e86482 commit 175ac9b

1 file changed: +107 -0 lines changed

Diff for: docs/proposals/20200928-crd-size.md

@@ -0,0 +1,107 @@
---
title: Proposal for reducing the size of the CRD
authors:
- "@coro"
reviewers:
- "unassigned"
creation-date: 2020-09-28
last-updated: 2020-09-29
status: implementable
---

# Proposal for reducing the size of the CRD

## Table of Contents

<!--ts-->
   * [Proposal for reducing the size of the CRD](#proposal-for-reducing-the-size-of-the-crd)
      * [Table of Contents](#table-of-contents)
      * [Summary](#summary)
      * [Motivation](#motivation)
         * [Why is our CRD so large?](#why-is-our-crd-so-large)
         * [Upstream discussions](#upstream-discussions)
      * [Proposal](#proposal)
         * [Wait for Server-side Apply](#wait-for-server-side-apply)
         * [Strip CRD field descriptions](#strip-crd-field-descriptions)
         * [Contribute to kubebuilder](#contribute-to-kubebuilder)
      * [Not implementing](#not-implementing)
         * [Only create / replace CRD, don't apply](#only-create--replace-crd-dont-apply)
      * [Implementation History](#implementation-history)

<!-- Added by: coro, at: Tue Sep 29 11:07:35 UTC 2020 -->

<!--te-->

## Summary
There are two size limits that I know of which we are close to exceeding.
Objects stored in etcd have a hard limit of 1MB in size. etcd does not (or cannot?) support larger objects, so our entire CRD must be smaller than 1MB.

There is also a 256 KiB size limit on an object's `.metadata.annotations`.

At the time of writing, after a `kubectl apply` our CRD is 857kB, and the annotations block within the CRD is 252KiB. This puts us at 85.7% and 98.4% of the limits, respectively. This proposal
outlines methods by which we might reduce these sizes, and the consequences of doing so.
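
The percentages above can be reproduced with a few lines of arithmetic. This is only a sketch: it assumes "1MB" means 1,000,000 bytes and "256 KiB" means 262,144 bytes, which is the reading that matches the quoted figures. The helper name `usage_percent` is made up for illustration.

```python
# Limits as this proposal reads them (assumptions, see the paragraph above).
ETCD_LIMIT_BYTES = 1_000_000          # "1MB", read as decimal megabytes
ANNOTATIONS_LIMIT_BYTES = 256 * 1024  # 256 KiB limit on .metadata.annotations

# Sizes quoted in the Summary.
crd_bytes = 857_000                   # CRD after `kubectl apply` (857kB)
annotations_bytes = 252 * 1024        # annotations block (252KiB)


def usage_percent(used: int, limit: int) -> float:
    """Return how much of a limit is consumed, as a percentage."""
    return 100 * used / limit


crd_usage = usage_percent(crd_bytes, ETCD_LIMIT_BYTES)
annotations_usage = usage_percent(annotations_bytes, ANNOTATIONS_LIMIT_BYTES)

print(f"CRD: {crd_usage:.1f}% of the etcd limit")
print(f"annotations: {annotations_usage:.1f}% of the annotations limit")
```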

## Motivation
As per the [Kubernetes Deprecation Policy](https://kubernetes.io/docs/reference/using-api/deprecation-policy/), we must maintain older versions of our CRD for some time
after we create a new API version. As it stands, a single version of the CRD takes up a whopping 510KiB, or 522kB.

As a result, if we ever introduce a new version of the CRD, we will instantly go over the 1MB limit in etcd, and install operations on the CRD will fail.
Similarly, any new fields will appear in the annotations field of the CRD (see [below](#why-is-our-crd-so-large)), which is already very close to the limit.
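
As a rough check of the claim that a second version would not fit, the sketch below assumes every additional version costs about as much as the current one (510KiB) and ignores any metadata shared between versions. The function name `estimated_crd_bytes` is hypothetical.

```python
# Assumptions: each served version adds roughly the same 510KiB of schema,
# and "1MB" means 1,000,000 bytes, as in the percentages quoted in the Summary.
SINGLE_VERSION_BYTES = 510 * 1024   # 522,240 bytes, i.e. about 522kB
ETCD_LIMIT_BYTES = 1_000_000


def estimated_crd_bytes(versions: int) -> int:
    """Estimate CRD size if every version costs as much as the current one."""
    return versions * SINGLE_VERSION_BYTES


for versions in (1, 2):
    size = estimated_crd_bytes(versions)
    verdict = "fits" if size <= ETCD_LIMIT_BYTES else "exceeds the limit"
    print(f"{versions} version(s): {size} bytes, {verdict}")
```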

### Why is our CRD so large?
Firstly, the Makefile generated by kubebuilder installs the CRD into a cluster by running `kubectl apply`. This results in an object with a `kubectl.kubernetes.io/last-applied-configuration`
annotation (see [Kubernetes Declarative Configuration](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/)). Because this annotation contains the entire configuration of the CRD, it
becomes quite large.

Secondly, our CRD embeds several large Core API types, such as [PodSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L214) and
[PersistentVolumeClaimSpec](https://github.com/rabbitmq/cluster-operator/blob/main/api/v1beta1/rabbitmqcluster_types.go#L228). When generating the manifest for the CRD to install
on the cluster, controller-gen recursively includes the field names, descriptions, etc. of any nested objects. As we include several large core object types, this massively inflates
the size of our CRD.
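
To illustrate the first cause, the sketch below stores a serialized copy of a toy manifest in the `kubectl.kubernetes.io/last-applied-configuration` annotation and compares sizes. It is an approximation for illustration, not kubectl's actual code path: the toy manifest, `with_last_applied` and `serialized_size` are all made up here.

```python
import json

ANNOTATION = "kubectl.kubernetes.io/last-applied-configuration"


def with_last_applied(manifest: dict) -> dict:
    """Return a copy of the manifest that also carries itself as an annotation.

    This roughly mimics the effect of a client-side `kubectl apply`.
    """
    applied = json.loads(json.dumps(manifest))  # cheap deep copy
    serialized = json.dumps(manifest, separators=(",", ":"))
    annotations = applied.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[ANNOTATION] = serialized
    return applied


def serialized_size(obj: dict) -> int:
    """Size in bytes of the compact JSON encoding of obj."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


# Toy stand-in for a CRD with long field descriptions.
manifest = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "examples.example.com"},
    "spec": {"description": "x" * 1000},
}

before = serialized_size(manifest)
after = serialized_size(with_last_applied(manifest))
print(f"before apply: {before} bytes, after apply: {after} bytes")
```

In this toy case the stored object more than doubles, because the annotation holds the whole manifest again plus JSON string escaping.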

### Upstream discussions

The `kubectl.kubernetes.io/last-applied-configuration` annotation needs to be so large because the `kubectl apply` logic is largely client-side. The kubectl CLI uses this
annotation to calculate the patch to send to the API server (see [documentation](https://kubernetes.io/docs/tasks/manage-kubernetes-objects/declarative-config/#how-apply-calculates-differences-and-merges-changes)).
Others have hit this issue specifically with the annotation: https://github.com/kubernetes/kubectl/issues/712

There is a Kubernetes enhancement (https://github.com/kubernetes/enhancements/issues/555) that has been in beta since August 2019, which changes `apply` to be server-side, handled by `apiserver` instead of `kubectl`.
Once this is released as GA, it will be possible to reduce the amount of information sent to the apiserver through annotations, since the declarative state management is then performed by the server.

There is also a k/k issue (https://github.com/kubernetes/kubernetes/issues/82292) which addresses the potential issue of CRDs going over the 1MB limit. It contains an interesting analysis
of the sizes of existing CRDs, though ours seems far larger than any of the ones mentioned in the issue (ours is 614kB, if I've measured it correctly!).

## Proposal

### Wait for Server-side Apply
Once server-side apply is released as GA, the apiserver will be responsible for declarative state management of resources. This means that kubectl will no longer need
the last-applied-configuration annotation in order to calculate the scope of the patch request to the server on `kubectl apply`. This will significantly reduce the footprint of
our CRD once it has been `kubectl apply`-ed.

### Strip CRD field descriptions
`controller-gen` supports the option `crd:maxDescLen`, which truncates field descriptions in CRD manifests to the length given by the integer specified in the option.
If set to `0`, this removes descriptions entirely from the CRD. This was introduced in https://github.com/kubernetes-sigs/kubebuilder/issues/906 in order to prevent
large CRDs from going past the annotations limit with `kubectl apply`.

Doing this would save 610kB in the CRD, and 190KiB in the annotations. However, this would effectively remove any useful API documentation from the CRD.
For example, the output of `kubectl explain` would contain no field descriptions, only the names of the fields.

We will implement this as a stop-gap solution until server-side apply or kubebuilder enhancements are released.
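
To make the effect of removing every description concrete, here is a minimal sketch that walks an OpenAPI-style schema held as plain Python data and drops every textual `description`. It illustrates the outcome only and is not how `controller-gen` implements `crd:maxDescLen`; the function name and the toy schema are assumptions.

```python
def strip_descriptions(node):
    """Return a copy of an OpenAPI-style schema without textual descriptions."""
    if isinstance(node, dict):
        return {
            key: strip_descriptions(value)
            for key, value in node.items()
            # Drop only textual descriptions. A *property* that happens to be
            # named "description" maps to a schema dict and is kept.
            if not (key == "description" and isinstance(value, str))
        }
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    return node


# Toy schema, standing in for a small part of a generated CRD.
schema = {
    "type": "object",
    "description": "Toy top-level schema.",
    "properties": {
        "replicas": {"type": "integer", "description": "Number of replicas."},
        "description": {"type": "string", "description": "A field named description."},
    },
}

stripped = strip_descriptions(schema)
print(stripped)
```

Field names and types survive, which matches the `kubectl explain` behaviour described above: the structure is still there, but the documentation is gone.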

### Contribute to kubebuilder
There is a [great idea](https://github.com/kubernetes/kubernetes/issues/82292#issuecomment-601851309) by a Kubernetes maintainer for an enhancement to kubebuilder
which has not been implemented yet. The `crd:maxDescLen` option in controller-gen is applied to an entire CRD, with no option to trim the descriptions of only certain fields.

We will investigate & hopefully create a PR to enhance kubebuilder to allow for recursive stripping of descriptions in CRDs, similar to what is described in the linked comment.
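
One possible shape for per-field trimming is sketched below: descriptions are stripped only inside the subtrees of named properties (for example an embedded core type), while our own fields keep their documentation. This is a guess at the behaviour for illustration, not the design from the linked comment and not an existing controller-gen option. The names `strip_all`, `strip_descriptions_under` and the `override` field are hypothetical.

```python
def strip_all(node):
    """Remove every textual description beneath node."""
    if isinstance(node, dict):
        return {
            key: strip_all(value)
            for key, value in node.items()
            if not (key == "description" and isinstance(value, str))
        }
    if isinstance(node, list):
        return [strip_all(item) for item in node]
    return node


def strip_descriptions_under(node, field_names):
    """Strip descriptions only inside properties whose name is in field_names."""
    if isinstance(node, dict):
        result = {}
        for key, value in node.items():
            if key == "properties" and isinstance(value, dict):
                # Keys at this level are field names, not schema keywords.
                result[key] = {
                    name: strip_all(sub)
                    if name in field_names
                    else strip_descriptions_under(sub, field_names)
                    for name, sub in value.items()
                }
            else:
                result[key] = strip_descriptions_under(value, field_names)
        return result
    if isinstance(node, list):
        return [strip_descriptions_under(item, field_names) for item in node]
    return node


# Toy schema: `override` stands in for a large embedded core type.
schema = {
    "type": "object",
    "properties": {
        "replicas": {"type": "integer", "description": "Number of replicas."},
        "override": {
            "type": "object",
            "description": "Embedded core type.",
            "properties": {
                "containers": {"type": "array", "description": "List of containers."},
            },
        },
    },
}

trimmed = strip_descriptions_under(schema, {"override"})
print(trimmed)
```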

## Not implementing

### Only create / replace CRD, don't apply
It is possible to install & upgrade CRDs by using `kubectl create` / `kubectl replace`, rather than `kubectl apply`: https://community.spinnaker.io/t/halyard-with-v2-metadata-annotations-too-long/894/2

We believe this would result in operator pods being garbage collected once the CRD is deleted as part of `kubectl replace`, and so it is unsuitable for our needs.

## Implementation History

- [x] 28/09/2020: Open draft proposal PR
- [x] 29/09/2020: Discussed in sync-up