Hello,
I've been running some load tests of an application developed by my company, which has recently started using RabbitMQ as the messaging backend. I'm currently co-hosting RabbitMQ with a few other services and have started running into problems with these competing for memory. I never got to the stage where the OOM killer would get involved, but I have had RabbitMQ shut itself down because it could not allocate memory. To alleviate this I tried restricting RabbitMQ's memory usage with the vm_memory_high_watermark config option, setting it to {absolute, "716MiB"}.
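For reference, this is set in the classic Erlang-term rabbitmq.config format that 3.6.x uses (assuming the config-file route rather than environment variables); trimmed to the relevant entry it looks like this:

%% rabbitmq.config - only the entry relevant to this report
[
  {rabbit, [
    {vm_memory_high_watermark, {absolute, "716MiB"}}
  ]}
].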
What I can see in the management web UI makes a lot of sense: as soon as the node's memory usage goes past 716MB, the node starts throttling traffic and, to the best of my understanding, flushes messages to disk until it eventually frees up enough memory to keep going. I've also verified that this value is picked up correctly by looking at the logs.

What doesn't make sense to me is the resulting total memory usage. I have read that the vm_memory_high_watermark setting does not cover all of RabbitMQ's memory needs, which can be twice that value (I believe this was on the mailing list, but I can't find the exact message now). To account for some overhead on top of that, I tried combining this setting with a 2g memory limit on the docker container RabbitMQ runs in. This was working fine until I ran a load test and RabbitMQ got killed by the OOM killer for allocating memory past the 2g limit (docker enforces this with cgroups, which RabbitMQ is not aware of, so it thinks more memory is available). This made me suspicious, so I switched to running the RabbitMQ container with unbounded memory while monitoring the total container memory usage. Here's what it looks like (the memory stats are collected by cAdvisor; the red area in the graph is memory used > 716MB):
The dip in memory usage around 19:00 is me restarting RabbitMQ after it crashed because Erlang was not able to grab a sufficiently big chunk of memory from the system.
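(Aside: besides the logs, the broker-side view of the limit can be checked with rabbitmqctl; the status output includes the configured watermark, the computed byte limit, and a per-category memory breakdown.)

rabbitmqctl status   # look for vm_memory_high_watermark, vm_memory_limit and the memory section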
Here are the message counts on all queues over the same time period (the graph is stacked):
Could this be a memory issue with RabbitMQ (e.g. I would expect the memory usage to go back down to its original level once the queues are cleared, but that has not been the case for me), or is it my misunderstanding of the configuration? If so, how can I ensure RabbitMQ runs healthily on a host which has other processes that might require more memory at times (i.e. the memory reported by free / procfs is not all available to RabbitMQ)?
I'm running: RabbitMQ 3.6.5, Erlang 17.3, Linux 3.13
RabbitMQ is running in docker containers (docker 1.12.1)
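For the 2g-limited run mentioned above, the containers were started along these lines (the image tag and exact flags here are illustrative of the setup rather than the literal command we use):

docker run -d --name rabbitmq --hostname rabbit1 --memory=2g rabbitmq:3.6.5-management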
This is a three-node deployment (no RAM nodes) with all the application queues being mirrored to all nodes. Here's the policy: {"vhost":"/","name":"ha-app","pattern":"^app\\.*","apply-to":"all","definition":{"ha-mode":"all","ha-sync-mode":"automatic"},"priority":1}
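(The same policy expressed as a rabbitmqctl command, which should be equivalent to the JSON above:)

rabbitmqctl set_policy -p / --apply-to all --priority 1 ha-app "^app\\.*" '{"ha-mode":"all","ha-sync-mode":"automatic"}'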
Please post questions to rabbitmq-users or Stack Overflow. RabbitMQ uses GitHub issues for specific actionable items engineers can work on, not questions. Thank you.