Taking a Node Offline for Maintenance

Applies to: Verify Privilege Vault On-Premises only. Verify Privilege Vault Cloud and platform use Azure Service Bus and are not affected by this procedure.

This procedure describes how to temporarily remove a RabbitMQ cluster node from service for maintenance (for example, applying operating system patches, restarting the host server, or performing hardware maintenance) and then return it to service without altering cluster membership.

This procedure is for temporary maintenance only. It stops and later restarts the RabbitMQ application on the node. The node remains a member of the cluster throughout, and queue data is not affected.

If you want to permanently decommission a node, see Removing a Node from a Cluster. If you are using quorum queues, additional steps are required before permanent removal; see Clustering Prerequisites.

Prerequisites

  • Administrator access on the RabbitMQ node being taken offline.

  • The cluster must have at least three nodes and must retain quorum (a majority of nodes running) while the node is offline. For a three-node cluster, only one node may be offline at a time.

  • Access to the RabbitMQ management UI on a peer node (not the node being taken offline) to verify status.
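
The quorum prerequisite above comes down to simple arithmetic. The following is a minimal sketch of that check, not part of the product; the function names `majority` and `can_take_offline` are hypothetical and chosen only for illustration.

```python
def majority(total_nodes: int) -> int:
    """Smallest number of running nodes that forms a majority."""
    return total_nodes // 2 + 1


def can_take_offline(total_nodes: int, already_offline: int = 0) -> bool:
    """Return True if stopping one more node still leaves a majority running.

    Mirrors the prerequisite: at least three nodes in the cluster, and a
    majority of nodes still running while the node is offline.
    """
    if total_nodes < 3:
        return False
    running_after = total_nodes - already_offline - 1
    return running_after >= majority(total_nodes)
```

For a three-node cluster, `can_take_offline(3)` is true, while `can_take_offline(3, already_offline=1)` is false, which matches the rule that only one node may be offline at a time.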

Taking the Node Offline

  1. On the RabbitMQ node to be taken offline, open the Start menu and navigate to the RabbitMQ Server folder.

  2. Right-click RabbitMQ Command Prompt and select Run as Administrator.

  3. In the command prompt, run the following command:

    rabbitmqctl stop_app

    This stops the RabbitMQ application on the current node while leaving the Erlang runtime running. The node remains a cluster member in a stopped state.

  4. Perform the required maintenance work on the server.

Verifying the Node Is Offline

  1. From a different node in the cluster, open the RabbitMQ management UI (typically at http://<peer-node-hostname>:15672).

  2. Navigate to the Overview tab. The node you stopped should appear in the node list with a status indicating it is not running.

    Do not verify from the node you just stopped. Its management UI stops together with the RabbitMQ application and is unavailable until the node is brought back online.
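
The same check can be scripted against the management plugin's HTTP API on a peer node, which exposes the node list at `/api/nodes`. The sketch below assumes the management plugin is reachable over plain HTTP on port 15672 and that you have a management user; the host name, credentials, and node name shown in the usage note are placeholders, and the helper names are hypothetical.

```python
import base64
import json
import urllib.request


def fetch_nodes(base_url, user, password, timeout=10):
    """GET /api/nodes from the management plugin on a peer node."""
    request = urllib.request.Request(f"{base_url}/api/nodes")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def node_is_running(nodes, node_name):
    """Return the 'running' flag for node_name from an /api/nodes result."""
    for node in nodes:
        if node.get("name") == node_name:
            return bool(node.get("running", False))
    raise KeyError(f"node {node_name!r} is not listed as a cluster member")
```

Example usage, with placeholder values: `node_is_running(fetch_nodes("http://peer-node:15672", "monitor", "secret"), "rabbit@node2")` should return `False` while the node is stopped. A `KeyError` here would suggest the node name is wrong, since a stopped node remains a cluster member and should still be listed.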

Bringing the Node Back Online

  1. On the RabbitMQ node that was taken offline, open RabbitMQ Command Prompt as Administrator (as in steps 1–2 of Taking the Node Offline).

  2. Run the following command:

    rabbitmqctl start_app

    This restarts the RabbitMQ application and rejoins the node to the cluster automatically.

  3. Return to the RabbitMQ management UI on the peer node and refresh the Overview tab. The node should now appear as running.

  4. In Verify Privilege Vault, navigate to Admin > Distributed Engine > Site Connectors and verify that the site connector associated with this RabbitMQ cluster shows as healthy. If any distributed engines lost connectivity during the maintenance window, they should reconnect automatically within a few minutes.
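
Because the node and any distributed engines may take a short while to report as healthy, a scripted verification usually needs to poll rather than check once. The following is a minimal, generic polling sketch; `wait_until` and its `check` argument are hypothetical, and `check` stands for whatever status test you use (for example, a function that queries the management UI on a peer node). The timeout and interval defaults are illustrative, not product recommendations.

```python
import time


def wait_until(check, timeout_s=300.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() repeatedly until it returns True or the timeout expires.

    Returns True as soon as check() succeeds, or False once timeout_s
    seconds have passed without success. sleep and clock are injectable
    so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

If `wait_until` returns `False`, treat it as a prompt to investigate the node manually rather than as proof of a failure, since the appropriate timeout depends on your environment.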