
Take Control of your RabbitMQ queues

2016-03-23 by Ayanda Dube

You’re a support engineer and your organisation uses a 3-node RabbitMQ cluster to manage its inter-application interactions across your network. On a regular basis, different departments in your organisation approach you with requests to integrate their new microservices into the network, for communication with other microservices via RabbitMQ.

With the organisation being so huge, and offices spread across the globe, the onus is on each department’s application developers to handle the integration with the RabbitMQ cluster, but only after you’ve approved it and given them the green light to do so. Along with approving integration requests, you also provide general conventions which you’ve adopted from prior experience. One of the conventions you enforce is that connecting microservices must create their own dedicated queues on integration with the cluster, as the best approach to isolating services and managing them easily, unless, of course, a microservice only needs to consume messages from already existing queues.

So, the average message rate across your cluster is almost stable at 1k/s, from both internal traffic and external traffic generated by some mobile apps publicised by the organisation. Everything is smooth sailing, until you get to the point where you realise that the total number of queues in your cluster is nearing the order of thousands, and one of the three servers seems to be overburdened, using more system resources than the rest. Memory utilisation on that server starts reaching alarming thresholds. At this point, you realise that things can only get worse; you still have pending requests to integrate more microservices into the cluster, but you can’t approve them without figuring out how to solve the growing imbalance in system resources across your deployment.

Fig 1. RabbitMQ cluster imbalance illustration

After digging into some RabbitMQ documentation, you come to realise that since you’re using HA queues, which you adopted to ensure availability of service, all your message operations only reference your master queues. Microservices have been creating queues on whichever nodes they pleased, meaning that the provisioning of queues has been random and unstructured across the cluster. In fact, how would this be structured anyway, when there are numerous microservices, managed by different personnel, across different geographical locations? The concentration of HA queue masters on one node significantly surpasses that on the other nodes, and as a result, with all message operations referencing master queues only, the server hosting the most queue masters is feeling the operational burden in comparison to the rest. You can’t afford to purge any of the queues to relieve the memory footprint on the burdened server, as most of the queued-up messages are crucial to the business operations transacting through the cluster. New and existing microservices also can’t keep creating and adding more queues to the cluster indefinitely until this is resolved. So what do you do?

Well, as of version 3.6.0, RabbitMQ has introduced a mechanism to grant its users more control in determining a queue master’s location in a cluster, on creation. This is based on predefined rules and strategies, configured prior to the queue declaration operations. If you can relate to the situation above, or would like to plan ahead and make the necessary amendments to your RabbitMQ installation before encountering similar problems, then read on and give this feature a go.

So how does it work?

Prior to the introduction of the queue master location mechanism, the declaration of queues had, by default, been characterised by the queue master being located on the local node on which the declare operation was executed. This is very limiting, and has been the main reason behind the imbalance of system resources on a RabbitMQ cluster when the number of queues becomes significantly large.

With this mechanism in place, the node on which the queue master will be located is first computed from a configurable strategy, prior to the queue being created.

Configurable strategy is key here, as it gives RabbitMQ users full control to dictate the distribution of queue masters across their cluster. There are three means by which a queue master location strategy may be configured;

  1. Queue declare arguments: This is at the AMQP level, where the queue master location strategy is defined as part of the queue’s declaration arguments.
  2. Policy: Here the strategy is defined as a RabbitMQ policy.
  3. Configuration file: The location strategy is defined in the rabbitmq.config file.

Once set, the internal execution order of declaring a queue would be as follows;

Fig 2. Queue master location execution flow

These are the three ways in which a queue master location strategy may be configured, and how the execution flow is ordered upon queue declaration. Next, you may be asking yourself the following question;

What are these strategies anyway?

Queue master location strategies are basically the rules which govern the selection of the node on which the queue master will reside, on declaration. If you’re from an Erlang background, you’d understand when I say these strategies are nothing but callback modules of a RabbitMQ behaviour known as rabbit_queue_master_locator. If you aren’t from an Erlang background, no worries; all you need to know is which strategies are available to you, and how to make use of them. Currently, there are three queue master location strategies available;

  1. Min-Masters: Selects the master node as the node hosting the fewest running queue masters. Configured as min-masters.

  2. Client-local: Like the previous default node selection behaviour, this strategy selects the queue master node as the local node on which the queue is being declared. Configured as client-local.

  3. Random: Selects the queue master node at random. Configured as random.

So in a nutshell, this is the general theory behind controlling and dictating the location of a queue master’s node. Syntax rules differ for each case, depending on whether the strategy is defined as part of the queue’s declare arguments, as a policy, or as part of the rabbitmq.config file.

NOTE: When both a queue master location strategy and an HA nodes policy have been configured, a conflict could arise in the resulting queue master node, for instance if the location strategy computes, as the queue master node, one of the nodes which the HA nodes policy designates as a slave. In such a scenario, the HA nodes policy would always take precedence over the queue master location strategy.
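To make the potential conflict concrete, here is a rough sketch of a single policy that combines an HA nodes policy with a queue master location strategy; the policy name, queue name pattern and node names below are made up purely for illustration. If min-masters were to compute a master node outside the ha-params list, the HA nodes policy would win;

rabbitmqctl set_policy ha-and-locator "^critical\." '{"ha-mode":"nodes","ha-params":["node1@host","node2@host"],"queue-master-locator":"min-masters"}' --apply-to queues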

With this knowledge at hand, the engineer in the situation mentioned above would simply enforce the use of the min-masters queue location strategy as part of the queue declaration arguments for all microservices connecting to the RabbitMQ cluster. Or even easier, he’d simply set the min-masters policy on the cluster nodes, using the match-all wildcard for the queue name match pattern. This would ensure that all newly created queues would be automatically distributed across the cluster until there’s a balance in the number of queue masters per node, and ultimately, a balance in the utilization of system resources across all three servers.

Going forward

At the moment, only three location strategies have been implemented; min-masters, client-local and random. More strategies are yet to be brewed up, and if you feel you’d like to contribute a rule by which the distribution of queues can be carried out to better improve the performance of a RabbitMQ cluster, please feel free to drop a comment. These will go through some rounds of review, and could possibly be implemented and included in future releases of RabbitMQ.

Quick experiment

I’ll illustrate how the queue master location strategy is put into effect with a simple experiment you can carry out on your local machine. We’re going to make things easy by using some direct commands, to avoid the whole business of creating connections, channels, and so forth.

  1. Download and install a RabbitMQ package specific to your platform. If you’re on a UNIX-based OS, you can just quickly download and extract the generic unix package, and navigate to the sbin directory.

  2. Create a 3-node cluster by executing the following;

    export RABBITMQ_NODE_PORT=5672 && export RABBITMQ_NODENAME=rabbit   && ./rabbitmq-server -detached
    export RABBITMQ_NODE_PORT=5673 && export RABBITMQ_NODENAME=rabbit_1 && ./rabbitmq-server -detached
    export RABBITMQ_NODE_PORT=5674 && export RABBITMQ_NODENAME=rabbit_2 && ./rabbitmq-server -detached

    ./rabbitmqctl -n rabbit_1@hostname stop_app
    ./rabbitmqctl -n rabbit_1@hostname join_cluster rabbit@hostname
    ./rabbitmqctl -n rabbit_1@hostname start_app

    ./rabbitmqctl -n rabbit_2@hostname stop_app
    ./rabbitmqctl -n rabbit_2@hostname join_cluster rabbit@hostname
    ./rabbitmqctl -n rabbit_2@hostname start_app
  3. You can also enable the rabbitmq_management plugin to keep track of proceedings from the WebUI.

    ./rabbitmq-plugins enable rabbitmq_management 
  4. Now verify the status of your cluster with;

    ./rabbitmqctl cluster_status 

    or from the Overview page of the WebUI.

  5. Next, declare a different number of queues on each node as follows;

    3 queues on the first node, rabbit@hostname,

    ./rabbitmqctl -n rabbit@hostname eval 'Queues = 3, L=[{_,_} = rabbit_amqqueue:declare(rabbit_misc:r(<<"/">>, queue, list_to_binary("rabbit.queue."++integer_to_list(N))), false, false, [], none) || N <- lists:seq(1, Queues)], {ok, {queues, length(L)}}.'  

    5 queues on the second node, rabbit_1@hostname,

    ./rabbitmqctl -n rabbit_1@hostname eval 'Queues = 5, L=[{_,_} = rabbit_amqqueue:declare(rabbit_misc:r(<<"/">>, queue, list_to_binary("rabbit_1.queue."++integer_to_list(N))), false, false, [], none) || N <- lists:seq(1, Queues)], {ok, {queues, length(L)}}.'  

    9 queues on the third node, rabbit_2@hostname,

    ./rabbitmqctl -n rabbit_2@hostname eval 'Queues = 9, L=[{_,_} = rabbit_amqqueue:declare(rabbit_misc:r(<<"/">>, queue, list_to_binary("rabbit_2.queue."++integer_to_list(N))), false, false, [], none) || N <- lists:seq(1, Queues)], {ok, {queues, length(L)}}.'   
  6. Verify the declared queues and their home node locations by executing the following command;

    ./rabbitmqctl list_queues -q name pid 

    The pid values will also be prefixed with the home node of each queue. For instance;

    rabbit_2.queue.3    <rabbit_2@hostname.1.7586.0>
    rabbit.queue.3      <rabbit@hostname.2.700.0>
    rabbit_2.queue.8    <rabbit_2@hostname.1.7606.0>
    rabbit.queue.2      <rabbit@hostname.2.696.0>
    rabbit_1.queue.2    <rabbit_1@hostname.1.1042.0>
    rabbit_2.queue.7    <rabbit_2@hostname.1.7602.0>
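    You can also get a quick per-node tally of queue masters from this listing; the following one-liner is a rough sketch which assumes node names contain no extra dots (e.g. rabbit_1@hostname), since it simply strips each pid down to its node prefix and counts occurrences;

    ./rabbitmqctl list_queues -q pid | awk -F'[<.]' '{print $2}' | sort | uniq -c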

    Or you can verify the home nodes of each queue from the management UI;

Fig 3. Declared queues

  7. Now we’re going to declare a queue with the min-masters queue master location strategy configured. We’ll carry this out on the node with the most queues, i.e. node rabbit_2@hostname (from step 5), and verify that the queue is created on the node with the minimum number of queue masters, node rabbit@hostname. We’ll call our queue "MinMasterQueue.1".

    First, let’s define the queue master location policy which will be applied to our queue, "MinMasterQueue.1". We’ll name this policy qml-policy, and set it to be applied to all queues whose names are prefixed with MinMasterQueue;

    ./rabbitmqctl set_policy qml-policy "^MinMasterQueue\." '{"queue-master-locator":"min-masters"}' --apply-to queues

    With our policy defined, we can now declare our "MinMasterQueue.1" queue as follows;

    ./rabbitmqctl -n rabbit_2@hostname eval 'QueueName = << "MinMasterQueue.1" >>, {_,_} = rabbit_amqqueue:declare(rabbit_misc:r(<<"/">>, queue, QueueName), false, false, [], none).' 
  8. The final step is to verify that the queue was created on the correct node by executing the following;

    ./rabbitmqctl list_queues -q name pid 

    You should see the "MinMasterQueue.1" queue listed along with the rest of the test queues, similar to the following;

    MinMasterQueue.1    <rabbit@hostname.2.1033.0>  
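    On a cluster with many queues, you can narrow the listing down to just this queue by filtering the output, for example;

    ./rabbitmqctl list_queues -q name pid | grep MinMasterQueue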

    Or from the queue listing on the management UI.

Fig 4. Min-master queue

The results are indeed correct, with the home node of "MinMasterQueue.1" being the one which had the least number of queue masters, rabbit@hostname.

You can repeatedly execute step 8, changing the QueueName variable on each run, to see the queue master location strategy in effect. The home node of the created queues will interchange from one node to another, depending on the queue master count per node, at each point in time.
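If you’d like to script the repetition, the following rough sketch (the loop bounds and queue numbering are purely illustrative) declares a few more MinMasterQueue queues from node rabbit_2@hostname, letting the min-masters strategy spread them out;

for N in 2 3 4 5; do
  ./rabbitmqctl -n rabbit_2@hostname eval "QueueName = list_to_binary(\"MinMasterQueue.$N\"), {_,_} = rabbit_amqqueue:declare(rabbit_misc:r(<<\"/\">>, queue, QueueName), false, false, [], none)."
done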

Fig 5. Min-master queues

This is a quick experimental illustration of the mechanism at work. There are other ways of configuring the location strategy, which I illustrate in the Examples section up next.

Examples

Following are some examples of how to configure queue master location strategies.

1. rabbitmq.config

Firstly, to set the location strategy from the rabbitmq.config file, simply add the following configuration entry;

{rabbit, [
          ...
          {queue_master_locator, <<"min-masters">>},
          ...
         ]},

NOTE: The strategy is configured as an Erlang binary data type, i.e. <<"min-masters">>.
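Once the broker has been restarted with this entry in place, you can sanity-check the setting from the command line; a small sketch using rabbitmqctl eval to read the rabbit application environment, which should report the configured binary value;

rabbitmqctl eval 'application:get_env(rabbit, queue_master_locator).'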

2. Policy

As already shown in the experiment, to set the strategy as a policy, in a UNIX environment for example, simply execute the following control command;

rabbitmqctl set_policy qml-policy ".*" '{"queue-master-locator":"min-masters"}' --apply-to queues 

This creates a min-masters queue location strategy policy, of name qml-policy , which, from the ".*" wildcard match pattern, will be applied to all queues created on the node/cluster.
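You can confirm that the policy has been applied, and inspect its pattern, definition and apply-to target, with;

rabbitmqctl list_policies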

3. Declare arguments

I illustrate setting the queue location strategy from declare arguments using three examples, in Erlang, Java and Python.

In Erlang, you’d simply specify the location strategy as part of the ‘queue.declare’ record as follows;

  Args = [{<<"x-queue-master-locator">>, <<"min-masters">>}],
  QueueDeclare = #'queue.declare'{queue       = <<"microservices.1.queue">>,
                                  auto_delete = true,
                                  durable     = false,
                                  arguments   = Args},
  #'queue.declare_ok'{} = amqp_channel:call(Channel, QueueDeclare),

In Java, just create an arguments map, define the queue master location strategy and declare the queue as follows;

Map<String, Object> args = new HashMap<String, Object>();
args.put("x-queue-master-locator", "min-masters");
channel.queueDeclare("microservice.1.queue", false, false, false, args);

Similarly, in Python, using the Pika AMQP library, you’d carry out something similar to the following;

channel = connection.channel()
args = {"x-queue-master-locator": "min-masters"}
channel.queue_declare(queue=queue_name, durable=True, arguments=args)

You can find some complete versions of these examples here. These are simplified, to illustrate the concepts. If your requirement is to implement something more complex and you need some assistance, don’t hesitate to get in touch!

This feature was my first contribution to RabbitMQ as part of the Erlang Solutions RabbitMQ Support team. Thank you everybody!
