Memory and CPU usage of operator in practice #1055
In 2e2f758 the resource requests and limits doubled the CPU and increased memory by 20x; the requests are currently 200m CPU and 500Mi. For larger clusters those numbers are probably a rounding error, but for a minimally sized cluster they could be a large fraction of the resources and could cause another node to be auto-provisioned, resulting in lower utilization. Is there any justification for the numbers being that high? I assume for CPU I can just cut it down to 10m and let it do things more slowly. Is it single-threaded, so it only processes one resource at a time? And is there any reason why the memory request is that high? When it is idle I'm seeing it use only 16Mi, and I think when creating a cluster it was only about 18Mi. What is the maximum resource usage seen so far for an operator instance?
Replies: 1 comment 1 reply
We changed the default resource requests/limits based on experimentation. Our goal was to be able to handle 50 RabbitMQ instances at any given time. With the initial resource requests, our Operator was getting killed because it exceeded the memory limit, and at some point the creation of RabbitMQ instances stalled. The current resource requests are able to handle this scenario.
The Operator is not single-threaded, however. The Reconcile() function can run in parallel, for example, when 2+ instances of RabbitmqCluster are applied at the same time. Another example of a parallel run is when a reconcile is requeued for some reason and another RabbitmqCluster object receives an update.
Back when we did this experiment, in October 2019, the Spec was a lot simpler, so it probably required fewer resources than today. Back then, the Operator was able to handle 40 instances of …
Feel free to tweak the numbers down to what they used to be if you are not expecting to handle more than ~10 rabbits at any given time.