Commit eda69e7

committed
Add cluster total CR limit, explain CR per CRD limits in more detail
1 parent 5d95aee commit eda69e7

1 file changed

+44
-17
lines changed

keps/sig-api-machinery/20180415-crds-to-ga.md

Lines changed: 44 additions & 17 deletions
@@ -170,13 +170,53 @@ The targets are defined by the below suggested maximum limits, which are organiz
 
 - Since custom resources can be arbitrarily large, we have broken down the limit by custom resource object size.
 
+**Custom Resource Definitions:**
+
+| Suggested Maximum Limit: scope=cluster |
+| --- |
+| 500 |
+
+_Note: The Custom Resource Definition suggested maximum limit was selected not
+due to the above SLI/SLOs, but instead due to the latency of OpenAPI publishing,
+which is a background process that occurs asynchronously each time a Custom
+Resource Definition schema is updated. For 500 Custom Resource Definitions it takes
+slightly over 35 seconds for a definition change to be visible via the OpenAPI
+spec endpoint._
+
+**Custom Resources, Cluster Wide:**
+
+Cluster wide limits for custom resources are storage bound and custom resources
+share the storage space with all other objects. While determining the
+appropriate storage limit for a cluster is out-of-scope for this document, once
+an etcd storage limit is selected, suggested maximum limits for custom resources
+are:
+
+| etcd storage limit | Suggested Maximum Limit: scope=cluster |
+| --- | --- |
+| 4GB | 40000 |
+| 8GB | 80000 |
+
+These limits aim to keep custom resource storage usage to less than half of the
+total cluster storage capacity for custom resources of 50kb or less in size.
+
 **Custom Resources per Definition:**
 
+For each custom resource definition, the limit on the number of custom resources
+can be found by taking the (median) object size of the custom resource and finding
+the matching row in this table:
+
 | Object size | Suggested Maximum Limit: scope=namespace (5s p99 SLO) | Suggested Maximum Limit: scope=cluster (30s p99 SLO) |
 | --- | --- | --- |
-| 10kb | 1500 | 10000 |
-| 25kb | 600 | 4000 |
-| 50kb | 300 | 2000 |
+| <=10kb | 1500 | 10000 |
+| (10kb - 25kb] | 600 | 4000 |
+| (25kb - 50kb] | 300 | 2000 |
+
+The cluster scope indicates the total number of custom resources for that
+definition allowed in the entire cluster.
+
+The namespace scope indicates the total number of custom resources for that
+definition allowed in any particular namespace. The cumulative count of the
+custom resource across all namespaces must not exceed the cluster limit.
 
 Since, in practice, custom resources scale farther without conversion webhooks
 within the SLI/SLOs (roughly 2x according to our scale tests), custom resource
@@ -190,19 +230,6 @@ and the scope=cluster suggested maximum limit indicates how many custom resource
 be in the cluster total. For custom resources of custom resource definitions using `scope: Cluster`: only
 the scope=cluster suggested maximum limit applies._
 
-**Custom Resource Definitions:**
-
-| Suggested Maximum Limit: scope=cluster |
-| --- |
-| 500 |
-
-_Note: The Custom Resource Definition suggested maximum limit was selected not
-due to the above SLI/SLOs, but instead due to the latency OpenAPI publishing,
-which is a background process that occurs asychroniously each time a Custom
-Resource Definition schema is updated. For 500 Custom Resource Definitions it takes
-slightly over 35 seconds for a definition change to be visible via the OpenAPI
-spec endpoint._
-
 **Conversion Webhooks:**
 
 Conversion Webhook SLOs are defined from the perspective of the conversion
@@ -211,7 +238,7 @@ making the request to the webhook, but it does include network latency.
 
 Given that the performance and scalability of conversion webhooks are the
 responsibility of their author, Custom resource scale targets are applied only for
-conversion webhooks that are within the follow latencies for the above suggested
+conversion webhooks that are within the following latencies for the above suggested
 maximum limits.
 
 | scope | Expected conversion Webhook SLO: p99 latency |

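The per-definition table and the cluster-wide storage limits added in this commit can be read as a simple lookup plus one piece of arithmetic. The sketch below is illustrative only and is not part of the KEP: the function and constant names are hypothetical, and it assumes "kb" means 1000 bytes and "GB" means 1000^3 bytes, which the document does not specify.

```python
# Hypothetical helpers; the numeric limits are copied from the tables in the diff.
KB = 1000        # assumption: the KEP does not say whether "kb" is 1000 or 1024 bytes
GB = 1000 ** 3   # assumption: same caveat for "GB"

# (inclusive upper bound on median object size, namespace limit, cluster limit)
PER_DEFINITION_LIMITS = [
    (10 * KB, 1500, 10000),  # <=10kb
    (25 * KB, 600, 4000),    # (10kb - 25kb]
    (50 * KB, 300, 2000),    # (25kb - 50kb]
]

# etcd storage limit in GB -> suggested maximum total custom resources in the cluster
CLUSTER_WIDE_LIMITS = {4: 40000, 8: 80000}


def per_definition_limits(median_object_size_bytes):
    """Return (namespace_limit, cluster_limit), or None if the size exceeds 50kb."""
    for upper_bound, namespace_limit, cluster_limit in PER_DEFINITION_LIMITS:
        if median_object_size_bytes <= upper_bound:
            return namespace_limit, cluster_limit
    return None  # the table gives no suggestion above 50kb


def worst_case_storage_fraction(etcd_limit_gb, object_size_bytes=50 * KB):
    """Fraction of etcd storage used if every custom resource has the given size."""
    total_bytes = CLUSTER_WIDE_LIMITS[etcd_limit_gb] * object_size_bytes
    return total_bytes / (etcd_limit_gb * GB)


if __name__ == "__main__":
    print(per_definition_limits(12 * KB))
    print(worst_case_storage_fraction(4))
```

Under these unit assumptions, 40000 objects of 50kb each fill half of a 4GB etcd limit, which matches the stated aim of keeping custom resource storage to about half of cluster capacity.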