Chapter 17. Replacing Controller nodes

In certain circumstances a Controller node in a high availability cluster might fail. In these situations, you must remove the node from the cluster and replace it with a new Controller node.

Complete the steps in this section to replace a Controller node. The Controller node replacement process involves running the openstack overcloud deploy command to update the overcloud with a request to replace a Controller node.

Important

The following procedure applies only to high availability environments. Do not use this procedure if you are using only one Controller node.

17.1. Preparing for Controller replacement

Before you replace an overcloud Controller node, it is important to check the current state of your Red Hat OpenStack Platform environment. Checking the current state can help avoid complications during the Controller replacement process. Use the following list of preliminary checks to determine if it is safe to perform a Controller node replacement. Run all commands for these checks on the undercloud.

Procedure

  1. Check the current status of the overcloud stack on the undercloud:

    $ source stackrc
    (undercloud)$ openstack stack list --nested

    The overcloud stack and its child stacks should have either a CREATE_COMPLETE or UPDATE_COMPLETE status.

  2. Install the database client tools:

    (undercloud)$ sudo dnf -y install mariadb
  3. Configure root user access to the database:

    (undercloud)$ sudo cp /var/lib/config-data/puppet-generated/mysql/root/.my.cnf /root/.
  4. Perform a backup of the undercloud databases:

    (undercloud)$ mkdir /home/stack/backup
    (undercloud)$ sudo mysqldump --all-databases --quick --single-transaction | gzip > /home/stack/backup/dump_db_undercloud.sql.gz
  5. Check that your undercloud contains 10 GB of free storage to accommodate image caching and conversion when you provision the new node:

    (undercloud)$ df -h
  6. If you are reusing the IP address for the new Controller node, ensure that you delete the port that the old Controller node used:

    (undercloud)$ openstack port delete <port>
  7. Check the status of Pacemaker on the running Controller nodes. For example, if 192.168.0.47 is the IP address of a running Controller node, use the following command to view the Pacemaker status:

    (undercloud)$ ssh heat-admin@192.168.0.47 'sudo pcs status'

    The output shows all services that are running on the existing nodes and that are stopped on the failed node.

  8. Check the following parameters on each node of the overcloud MariaDB cluster:

    • wsrep_local_state_comment: Synced
    • wsrep_cluster_size: 2

      Use the following command to check these parameters on each running Controller node. In this example, the Controller node IP addresses are 192.168.0.47 and 192.168.0.46:

      (undercloud)$ for i in 192.168.0.46 192.168.0.47 ; do echo "*** $i ***" ; ssh heat-admin@$i "sudo podman exec \$(sudo podman ps --filter name=galera-bundle -q) mysql -e \"SHOW STATUS LIKE 'wsrep_local_state_comment'; SHOW STATUS LIKE 'wsrep_cluster_size';\""; done
  9. Check the RabbitMQ status. For example, if 192.168.0.47 is the IP address of a running Controller node, use the following command to view the RabbitMQ status:

    (undercloud)$ ssh heat-admin@192.168.0.47 "sudo podman exec \$(sudo podman ps -f name=rabbitmq-bundle -q) rabbitmqctl cluster_status"

    The running_nodes key should show only the two available nodes and not the failed node.

  10. If fencing is enabled, disable it. For example, if 192.168.0.47 is the IP address of a running Controller node, use the following command to check the status of fencing:

    (undercloud)$ ssh heat-admin@192.168.0.47 "sudo pcs property show stonith-enabled"

    Run the following command to disable fencing:

    (undercloud)$ ssh heat-admin@192.168.0.47 "sudo pcs property set stonith-enabled=false"
  11. Check that the Compute services are active on the director node:

    (undercloud)$ openstack hypervisor list

    The output should show all non-maintenance mode nodes as up.

  12. Ensure all undercloud containers are running:

    (undercloud)$ sudo podman ps
  13. Stop all the nova_* containers running on the failed Controller node:

    [root@controller-0 ~]# systemctl stop tripleo_nova_api.service
    [root@controller-0 ~]# systemctl stop tripleo_nova_api_cron.service
    [root@controller-0 ~]# systemctl stop tripleo_nova_compute.service
    [root@controller-0 ~]# systemctl stop tripleo_nova_conductor.service
    [root@controller-0 ~]# systemctl stop tripleo_nova_metadata.service
    [root@controller-0 ~]# systemctl stop tripleo_nova_placement.service
    [root@controller-0 ~]# systemctl stop tripleo_nova_scheduler.service
  14. Optional: If you are using the Bare Metal Service (ironic) as the virt driver, you must manually update the service entries in your cell database for any bare metal instances whose instances.host is set to the controller that you are removing. Contact Red Hat Support for assistance.

    Note

    This manual update of the cell database when using Bare Metal Service (ironic) as the virt driver is a temporary workaround to ensure the nodes are rebalanced, until BZ2017980 is complete.
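
The Galera checks in step 8 can be scripted against captured output. The following sketch is illustrative, not part of the product tooling: it parses the two-column output that `mysql -e "SHOW STATUS LIKE ..."` prints by default and reports whether a node looks healthy, so you can test the logic without a live cluster.

```shell
# Health check for step 8, run against captured output of:
#   mysql -e "SHOW STATUS LIKE 'wsrep_local_state_comment';
#             SHOW STATUS LIKE 'wsrep_cluster_size';"
# Assumes the default two-column "Variable_name Value" format.
check_galera_health() {
  local output=$1 state size
  state=$(printf '%s\n' "$output" | awk '/wsrep_local_state_comment/ {print $2}')
  size=$(printf '%s\n' "$output" | awk '/wsrep_cluster_size/ {print $2}')
  if [ "$state" = "Synced" ] && [ "$size" -ge 2 ]; then
    echo "healthy"
  else
    echo "unhealthy: state=$state size=$size"
  fi
}
```

Feed it the output captured over ssh from each running Controller node and act only when every node reports healthy.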

17.2. Removing a Ceph Monitor daemon

If your Controller node is running a Ceph monitor service, complete the following steps to remove the ceph-mon daemon.

Note

Adding a new Controller node to the cluster also adds a new Ceph monitor daemon automatically.

Procedure

  1. Connect to the Controller node that you want to replace and become the root user:

    $ ssh heat-admin@192.168.0.47
    $ sudo su -
    Note

    If the Controller node is unreachable, skip steps 1 and 2 and continue the procedure at step 3 on any working Controller node.

  2. Stop the monitor:

    # systemctl stop ceph-mon@<monitor_hostname>

    For example:

    # systemctl stop ceph-mon@overcloud-controller-1
  3. Disconnect from the Controller node that you want to replace.
  4. Connect to one of the existing Controller nodes.

    $ ssh heat-admin@192.168.0.46
    $ sudo su -
  5. Remove the monitor from the cluster:

    # podman exec -it ceph-mon-controller-0 ceph mon remove overcloud-controller-1
  6. On all Controller nodes, remove the v1 and v2 monitor entries from /etc/ceph/ceph.conf. For example, if you remove controller-1, then remove the IPs and hostname for controller-1.

    Before:

    mon host = [v2:172.18.0.21:3300,v1:172.18.0.21:6789],[v2:172.18.0.22:3300,v1:172.18.0.22:6789],[v2:172.18.0.24:3300,v1:172.18.0.24:6789]
    mon initial members = overcloud-controller-2,overcloud-controller-1,overcloud-controller-0

    After:

    mon host = [v2:172.18.0.21:3300,v1:172.18.0.21:6789],[v2:172.18.0.24:3300,v1:172.18.0.24:6789]
    mon initial members = overcloud-controller-2,overcloud-controller-0
    Note

    Director updates the ceph.conf file on the relevant overcloud nodes when you add the replacement Controller node. Normally, director manages this configuration file exclusively and you should not edit the file manually. However, you can edit the file manually if you want to ensure consistency in case the other nodes restart before you add the new node.

  7. Optional: Archive the monitor data and save the archive on another server:

    # mv /var/lib/ceph/mon/<cluster>-<daemon_id> /var/lib/ceph/mon/removed-<cluster>-<daemon_id>
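
The ceph.conf edit in step 6 can be sketched as a small filter. This is an illustrative helper, not a supported tool: it assumes the `[v2:<ip>:3300,v1:<ip>:6789]` entry format shown above and strips one monitor's entries from the `mon host` and `mon initial members` lines. Verify the result before writing it back, because director normally manages this file.

```shell
# Remove one monitor's entries from ceph.conf-style lines on stdin.
# $1 = monitor IP, $2 = monitor hostname. Assumes the msgr v2/v1
# entry format shown in the "Before" example above.
remove_mon_entries() {
  local ip=$1 host=$2
  sed -E \
      -e "s/\[v2:${ip}:3300,v1:${ip}:6789\],?//" \
      -e "s/${host},?//" \
      -e 's/,$//'
}
```

The final expression tidies the trailing comma that remains when the removed entry was the last one in the list.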

17.3. Preparing the cluster for Controller node replacement

Before you replace the old node, you must ensure that Pacemaker is not running on the node and then remove that node from the Pacemaker cluster.

Procedure

  1. To view the list of IP addresses for the Controller nodes, run the following command:

    (undercloud) $ openstack server list -c Name -c Networks
    +------------------------+-----------------------+
    | Name                   | Networks              |
    +------------------------+-----------------------+
    | overcloud-compute-0    | ctlplane=192.168.0.44 |
    | overcloud-controller-0 | ctlplane=192.168.0.47 |
    | overcloud-controller-1 | ctlplane=192.168.0.45 |
    | overcloud-controller-2 | ctlplane=192.168.0.46 |
    +------------------------+-----------------------+
  2. If the old node is still reachable, log in to one of the remaining nodes and stop Pacemaker on the old node. For this example, stop Pacemaker on overcloud-controller-1:

    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs status | grep -w Online | grep -w overcloud-controller-1"
    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs cluster stop overcloud-controller-1"
    Note

    If the old node is physically unavailable or stopped, this operation is not necessary because Pacemaker is already stopped on that node.

  3. After you stop Pacemaker on the old node, delete the old node from the pacemaker cluster. The following example command logs in to overcloud-controller-0 to remove overcloud-controller-1:

    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs cluster node remove overcloud-controller-1"

    If the node that you want to replace is unreachable (for example, due to a hardware failure), run the pcs command with the additional --skip-offline and --force options to forcibly remove the node from the cluster:

    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs cluster node remove overcloud-controller-1 --skip-offline --force"
  4. After you remove the old node from the pacemaker cluster, remove the node from the list of known hosts in pacemaker:

    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs host deauth overcloud-controller-1"

    You can run this command whether the node is reachable or not.

  5. To ensure that the new Controller node uses the correct STONITH fencing device after the replacement, delete the old devices from the node by entering the following command:

    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs stonith delete <stonith_resource_name>"
    • Replace <stonith_resource_name> with the name of the STONITH resource that corresponds to the old node. The resource name uses the format <resource_agent>-<host_mac>. You can find the resource agent and the host MAC address in the FencingConfig section of the fencing.yaml file.
  6. The overcloud database must continue to run during the replacement procedure. To ensure that Pacemaker does not stop Galera during this procedure, select a running Controller node and run the following command on the undercloud with the IP address of the Controller node:

    (undercloud) $ ssh heat-admin@192.168.0.47 "sudo pcs resource unmanage galera-bundle"
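
The removal sequence in steps 2 to 4 can be captured as a plan that you review before anything runs. This hypothetical helper only prints the commands in order; pipe its output to `sh` when you are satisfied with it. The host and node names are this chapter's example values.

```shell
# Print the Pacemaker removal commands (steps 2-4) without running
# them. $1 = a reachable Controller IP, $2 = the node to remove.
pacemaker_removal_plan() {
  local alive=$1 old=$2
  cat <<EOF
ssh heat-admin@$alive "sudo pcs cluster stop $old"
ssh heat-admin@$alive "sudo pcs cluster node remove $old"
ssh heat-admin@$alive "sudo pcs host deauth $old"
EOF
}

# Review:  pacemaker_removal_plan 192.168.0.47 overcloud-controller-1
# Execute: pacemaker_removal_plan 192.168.0.47 overcloud-controller-1 | sh
```

If the old node is unreachable, edit the printed `node remove` line to add the --skip-offline and --force options before executing the plan.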

17.4. Replacing a Controller node

To replace a Controller node, identify the index of the node that you want to replace.

  • If the node is a virtual node, identify the node that contains the failed disk and restore the disk from a backup. Ensure that the MAC address of the NIC used for PXE boot on the failed server remains the same after disk replacement.
  • If the node is a bare metal node, replace the disk, prepare the new disk with your overcloud configuration, and perform a node introspection on the new hardware.
  • If the node is part of a high availability cluster with fencing, you might need to recover the Galera nodes separately. For more information, see the article How Galera works and how to rescue Galera clusters in the context of Red Hat OpenStack Platform.

Complete the following example steps to replace the overcloud-controller-1 node with the overcloud-controller-3 node. The overcloud-controller-3 node has the ID 75b25e9a-948d-424a-9b3b-f0ef70a6eacf.

Important

To replace the node with an existing bare metal node, enable maintenance mode on the outgoing node so that the director does not automatically reprovision the node.

Procedure

  1. Source the stackrc file:

    $ source ~/stackrc
  2. Identify the instance UUID of the overcloud-controller-1 node:

    $ INSTANCE=$(openstack server list --name overcloud-controller-1 -f value -c ID)
  3. Identify the bare metal node associated with the instance:

    $ NODE=$(openstack baremetal node list -f csv --quote minimal | grep $INSTANCE | cut -f1 -d,)
  4. Set the node to maintenance mode:

    $ openstack baremetal node maintenance set $NODE
  5. If the Controller node is a virtual node, run the following command on the Controller host to replace the virtual disk from a backup:

    $ cp <VIRTUAL_DISK_BACKUP> /var/lib/libvirt/images/<VIRTUAL_DISK>
    • Replace <VIRTUAL_DISK_BACKUP> with the path to the backup of the failed virtual disk, and replace <VIRTUAL_DISK> with the name of the virtual disk that you want to replace.

      If you do not have a backup of the outgoing node, you must use a new virtualized node.

      If the Controller node is a bare metal node, complete the following steps to replace the disk with a new bare metal disk:

      1. Replace the physical hard drive or solid state drive.
      2. Prepare the node with the same configuration as the failed node.
  6. List unassociated nodes and identify the ID of the new node:

    $ openstack baremetal node list --unassociated
  7. Tag the new node with the control profile:

    (undercloud) $ openstack baremetal node set --property capabilities='profile:control,boot_option:local' 75b25e9a-948d-424a-9b3b-f0ef70a6eacf
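
The instance-to-node lookup in step 3 can be factored into a reusable function and tested offline. The CSV sample in the test is illustrative, not real `openstack baremetal node list` output; the only assumption, which the `grep | cut` pipeline in step 3 already relies on, is that the node UUID is the first column and the instance UUID appears somewhere in the row.

```shell
# Step 3 as a function: read CSV rows from
# 'openstack baremetal node list -f csv --quote minimal' on stdin
# and print the node UUID (first column) of the row containing the
# given instance UUID.
node_for_instance() {
  grep "$1" | cut -f1 -d,
}
```

Usage: `openstack baremetal node list -f csv --quote minimal | node_for_instance "$INSTANCE"`.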

17.5. Replacing a bootstrap Controller node

If you want to replace the Controller node that you use for bootstrap operations and keep the node name, complete the following steps to set the name of the bootstrap Controller node after the replacement process.

Procedure

  1. Find the name of the bootstrap Controller node:

    $ ssh heat-admin@<controller_ip> "sudo hiera -c /etc/puppet/hiera.yaml pacemaker_short_bootstrap_node_name"
    • Replace <controller_ip> with the IP address of any active Controller node.
  2. Check whether your environment files include the ExtraConfig section. If the ExtraConfig parameter does not exist, create an environment file named ~/templates/bootstrap-controller.yaml with the following content:

    parameter_defaults:
      ExtraConfig:
        pacemaker_short_bootstrap_node_name: <node_name>
        mysql_short_bootstrap_node_name: <node_name>
    • Replace <node_name> with the name of an existing Controller node that you want to use in bootstrap operations after the replacement process.

      If your environment files already include the ExtraConfig parameter, add only the lines that set the pacemaker_short_bootstrap_node_name and mysql_short_bootstrap_node_name parameters.

  3. Follow the steps to trigger the Controller node replacement and include the environment files in the overcloud deploy command. For more information, see Triggering the Controller node replacement.

For information about troubleshooting the bootstrap Controller node replacement, see the article Replacement of the first Controller node fails at step 1 if the same hostname is used for a new node.
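
Writing the environment file can also be scripted from the hiera lookup in step 1. A minimal sketch, assuming the file layout shown in step 2; the helper function name is hypothetical.

```shell
# Generate the bootstrap environment file from a node name, using
# the parameter_defaults/ExtraConfig layout shown in step 2.
# $1 = bootstrap node name, $2 = output path.
write_bootstrap_env() {
  local node_name=$1 out=$2
  cat > "$out" <<EOF
parameter_defaults:
  ExtraConfig:
    pacemaker_short_bootstrap_node_name: $node_name
    mysql_short_bootstrap_node_name: $node_name
EOF
}

# Example: write_bootstrap_env overcloud-controller-0 ~/templates/bootstrap-controller.yaml
```

If your environment files already set ExtraConfig, merge the two parameters into the existing file instead of generating a new one.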

17.6. Preserving hostnames when replacing nodes that use predictable IP addresses and HostNameMap

If you configured your overcloud to use predictable IP addresses, and HostNameMap to map heat-based hostnames to the hostnames of pre-provisioned nodes, then you must configure your overcloud to map the new replacement node index to an IP address and hostname.

Procedure

  1. Log in to the undercloud as the stack user.
  2. Source the stackrc file:

    $ source ~/stackrc
  3. Retrieve the physical_resource_id and the removed_rsrc_list for the resource you want to replace:

    (undercloud)$ openstack stack resource show <stack> <role>
    • Replace <stack> with the name of the stack the resource belongs to, for example, overcloud.
    • Replace <role> with the name of the role that you want to replace the node for, for example, Compute.

      Example output:

      +------------------------+-----------------------------------------------------------+
      | Field                  | Value                                                     |
      +------------------------+-----------------------------------------------------------+
      | attributes             | {u'attributes': None, u'refs': None, u'refs_map': None,   |
      |                        | u'removed_rsrc_list': [u'2', u'3']}          | 1
      | creation_time          | 2017-09-05T09:10:42Z                                      |
      | description            |                                                           |
      | links                  | [{u'href': u'http://192.168.24.1:8004/v1/bd9e6da805594de9 |
      |                        | 8d4a1d3a3ee874dd/stacks/overcloud/1c7810c4-8a1e-          |
      |                        | 4d61-a5d8-9f964915d503/resources/Compute', u'rel':        |
      |                        | u'self'}, {u'href': u'http://192.168.24.1:8004/v1/bd9e6da |
      |                        | 805594de98d4a1d3a3ee874dd/stacks/overcloud/1c7810c4-8a1e- |
      |                        | 4d61-a5d8-9f964915d503', u'rel': u'stack'}, {u'href': u'h |
      |                        | ttp://192.168.24.1:8004/v1/bd9e6da805594de98d4a1d3a3ee874 |
      |                        | dd/stacks/overcloud-Compute-zkjccox63svg/7632fb0b-        |
      |                        | 80b1-42b3-9ea7-6114c89adc29', u'rel': u'nested'}]         |
      | logical_resource_id    | Compute                                                   |
      | physical_resource_id   | 7632fb0b-80b1-42b3-9ea7-6114c89adc29                      |
      | required_by            | [u'AllNodesDeploySteps',                                  |
      |                        | u'ComputeAllNodesValidationDeployment',                   |
      |                        | u'AllNodesExtraConfig', u'ComputeIpListMap',              |
      |                        | u'ComputeHostsDeployment', u'UpdateWorkflow',             |
      |                        | u'ComputeSshKnownHostsDeployment', u'hostsConfig',        |
      |                        | u'SshKnownHostsConfig', u'ComputeAllNodesDeployment']     |
      | resource_name          | Compute                                                   |
      | resource_status        | CREATE_COMPLETE                                           |
      | resource_status_reason | state changed                                             |
      | resource_type          | OS::Heat::ResourceGroup                                   |
      | updated_time           | 2017-09-05T09:10:42Z                                      |
      +------------------------+-----------------------------------------------------------+
      1
      The removed_rsrc_list lists the indexes of nodes that have already been removed for the resource.
  4. Retrieve the resource_name to determine the maximum index that heat has applied to a node for this resource:

    (undercloud)$ openstack stack resource list <physical_resource_id>
    • Replace <physical_resource_id> with the ID you retrieved in step 3.
  5. Use the resource_name and the removed_rsrc_list to determine the next index that heat will apply to a new node:

    • If removed_rsrc_list is empty, then the next index will be (current_maximum_index) + 1.
    • If removed_rsrc_list includes the value (current_maximum_index) + 1, then the next index will be the next available index.
  6. Retrieve the ID of the replacement bare-metal node:

    (undercloud)$ openstack baremetal node list
  7. Update the capability of the replacement node with the new index:

    (undercloud)$ openstack baremetal node set --property capabilities='node:<role>-<index>,boot_option:local' <node>
    • Replace <role> with the name of the role that you want to replace the node for, for example, compute.
    • Replace <index> with the index calculated in step 5.
    • Replace <node> with the ID of the bare metal node.

    The Compute scheduler uses the node capability to match the node on deployment.

  8. Assign a hostname to the new node by adding the index to the HostnameMap configuration, for example:

    parameter_defaults:
      ControllerSchedulerHints:
        'capabilities:node': 'controller-%index%'
      ComputeSchedulerHints:
        'capabilities:node': 'compute-%index%'
      HostnameMap:
        overcloud-controller-0: overcloud-controller-prod-123-0
        overcloud-controller-1: overcloud-controller-prod-456-0 1
        overcloud-controller-2: overcloud-controller-prod-789-0
        overcloud-controller-3: overcloud-controller-prod-456-0 2
        overcloud-compute-0: overcloud-compute-prod-abc-0
        overcloud-compute-3: overcloud-compute-prod-abc-3 3
        overcloud-compute-8: overcloud-compute-prod-abc-3 4
        ....
    1
    Node that you are removing and replacing with the new node.
    2
    New node.
    3
    Node that you are removing and replacing with the new node.
    4
    New node.
    Note

    Do not delete the mapping for the removed node from HostnameMap.

  9. Add the IP address for the replacement node to the end of each network IP address list in your network IP address mapping file, ips-from-pool-all.yaml. In the following example, the IP address for the new index, overcloud-controller-3, is added to the end of the IP address list for each ControllerIPs network, and is assigned the same IP address as overcloud-controller-1 because it replaces overcloud-controller-1. The IP address for the new index, overcloud-compute-8, is also added to the end of the IP address list for each ComputeIPs network, and is assigned the same IP address as the index it replaces, overcloud-compute-3:

    parameter_defaults:
      ControllerIPs:
        ...
        internal_api:
          - 192.168.1.10  1
          - 192.168.1.11  2
          - 192.168.1.12  3
          - 192.168.1.11  4
        ...
        storage:
          - 192.168.2.10
          - 192.168.2.11
          - 192.168.2.12
          - 192.168.2.11
        ...
    
      ComputeIPs:
        ...
        internal_api:
          - 172.17.0.10 5
          - 172.17.0.11 6
          - 172.17.0.11 7
        ...
        storage:
          - 172.17.0.10
          - 172.17.0.11
          - 172.17.0.11
        ...
    1
    IP address assigned to index 0, host name overcloud-controller-prod-123-0.
    2
    IP address assigned to index 1, host name overcloud-controller-prod-456-0. This node is replaced by index 3. Do not remove this entry.
    3
    IP address assigned to index 2, host name overcloud-controller-prod-789-0.
    4
    IP address assigned to index 3, host name overcloud-controller-prod-456-0. This is the new node that replaces index 1.
    5
    IP address assigned to index 0, host name overcloud-compute-0.
    6
    IP address assigned to index 1, host name overcloud-compute-3. This node is replaced by index 2. Do not remove this entry.
    7
    IP address assigned to index 2, host name overcloud-compute-8. This is the new node that replaces index 1.
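
The next-index rule in step 5 can be sketched as a small function: start at the current maximum index plus one and skip any index that appears in removed_rsrc_list. This mirrors the rule as stated in step 5; confirm the result against your actual stack before tagging a node in step 7.

```shell
# Compute the next index heat will apply (step 5 rule).
# $1 = current maximum index; remaining args = removed_rsrc_list.
next_index() {
  local max=$1; shift
  local candidate=$((max + 1))
  # Skip candidates that are already in removed_rsrc_list.
  while printf '%s\n' "$@" | grep -qx "$candidate"; do
    candidate=$((candidate + 1))
  done
  echo "$candidate"
}
```

For the example output in step 3, where removed_rsrc_list is ['2', '3'], a current maximum index of 2 gives a next index of 4.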

17.7. Triggering the Controller node replacement

Complete the following steps to remove the old Controller node and replace it with a new Controller node.

Procedure

  1. Determine the UUID of the Controller node that you want to remove and store it in the NODEID variable. Ensure that you replace <node_name> with the name of the node that you want to remove:

    (undercloud)[stack@director ~]$ NODEID=$(openstack server list -f value -c ID --name <node_name>)
  2. To identify the node index within the ControllerServers resource, enter the following command:

    (undercloud)[stack@director ~]$ openstack stack resource show overcloud ControllerServers -f json -c attributes | jq --arg NODEID "$NODEID" -c '.attributes.value | keys[] as $k | if .[$k] == $NODEID then "Node index \($k) for \(.[$k])" else empty end'
  3. Create the following environment file ~/templates/remove-controller.yaml and include the node index of the Controller node that you want to remove:

    parameters:
      ControllerRemovalPolicies:
        [{'resource_list': ['<node_index>']}]
  4. Enter the overcloud deployment command, and include the remove-controller.yaml environment file and any other environment files relevant to your environment:

    (undercloud) $ openstack overcloud deploy --templates \
        -e /home/stack/templates/remove-controller.yaml \
        [OTHER OPTIONS]
    Note
    • Include -e ~/templates/remove-controller.yaml only for this instance of the deployment command. Remove this environment file from subsequent deployment operations.
    • Include ~/templates/bootstrap-controller.yaml if you are replacing a bootstrap Controller node and want to keep the node name. For more information, see Replacing a bootstrap Controller node.
  5. Director removes the old node, creates a new node, and updates the overcloud stack. You can check the status of the overcloud stack with the following command:

    (undercloud)$ openstack stack list --nested
  6. When the deployment command completes, confirm that the old node is replaced with the new node:

    (undercloud) $ openstack server list -c Name -c Networks
    +------------------------+-----------------------+
    | Name                   | Networks              |
    +------------------------+-----------------------+
    | overcloud-compute-0    | ctlplane=192.168.0.44 |
    | overcloud-controller-0 | ctlplane=192.168.0.47 |
    | overcloud-controller-2 | ctlplane=192.168.0.46 |
    | overcloud-controller-3 | ctlplane=192.168.0.48 |
    +------------------------+-----------------------+

    The new node now hosts running control plane services.
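
The confirmation in step 6 can be automated against captured `openstack server list` output. A sketch, assuming only that node names appear verbatim in the listing.

```shell
# Verify step 6: the old node name must be absent and the new node
# name present in the captured server listing.
# $1 = old node name, $2 = new node name, $3 = listing text.
verify_replacement() {
  local old=$1 new=$2 listing=$3
  if printf '%s\n' "$listing" | grep -q "$old"; then
    echo "old node still present"
    return 1
  fi
  if printf '%s\n' "$listing" | grep -q "$new"; then
    echo "replacement complete"
  else
    echo "new node missing"
    return 1
  fi
}
```

Usage: `verify_replacement overcloud-controller-1 overcloud-controller-3 "$(openstack server list -c Name -c Networks)"`.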

17.8. Cleaning up after Controller node replacement

After you complete the node replacement, complete the following steps to finalize the Controller cluster.

Procedure

  1. Log in to a Controller node.
  2. Enable Pacemaker management of the Galera cluster and start Galera on the new node:

    [heat-admin@overcloud-controller-0 ~]$ sudo pcs resource refresh galera-bundle
    [heat-admin@overcloud-controller-0 ~]$ sudo pcs resource manage galera-bundle
  3. Perform a final status check to ensure that the services are running correctly:

    [heat-admin@overcloud-controller-0 ~]$ sudo pcs status
    Note

    If any services have failed, use the pcs resource refresh command to resolve and restart the failed services.

  4. Exit to director:

    [heat-admin@overcloud-controller-0 ~]$ exit
  5. Source the overcloudrc file so that you can interact with the overcloud:

    $ source ~/overcloudrc
  6. Check the network agents in your overcloud environment:

    (overcloud) $ openstack network agent list
  7. If any agents appear for the old node, remove them:

    (overcloud) $ for AGENT in $(openstack network agent list --host overcloud-controller-1.localdomain -c ID -f value) ; do openstack network agent delete $AGENT ; done
  8. If necessary, add your router to the L3 agent host on the new node. Use the following example command to add a router named r1 to the L3 agent using the UUID 2d1c1dc1-d9d4-4fa9-b2c8-f29cd1a649d4:

    (overcloud) $ openstack network agent add router --l3 2d1c1dc1-d9d4-4fa9-b2c8-f29cd1a649d4 r1
  9. Clean up the cinder services.

    1. List the cinder services:

      (overcloud) $ openstack volume service list
    2. Log in to a controller node, connect to the cinder-api container and use the cinder-manage service remove command to remove leftover services:

      [heat-admin@overcloud-controller-0 ~]$ sudo podman exec -it cinder_api cinder-manage service remove cinder-backup <host>
      [heat-admin@overcloud-controller-0 ~]$ sudo podman exec -it cinder_api cinder-manage service remove cinder-scheduler <host>
  10. Clean up the RabbitMQ cluster.

    1. Log in to a Controller node.
    2. Use the podman exec command to launch bash, and verify the status of the RabbitMQ cluster:

      [heat-admin@overcloud-controller-0 ~]$ podman exec -it rabbitmq-bundle-podman-0 bash
      [heat-admin@overcloud-controller-0 ~]$ rabbitmqctl cluster_status
    3. Use the rabbitmqctl command to forget the replaced controller node:

      [heat-admin@overcloud-controller-0 ~]$ rabbitmqctl forget_cluster_node <node_name>
  11. If you replaced a bootstrap Controller node, you must remove the environment file ~/templates/bootstrap-controller.yaml after the replacement process, or delete the pacemaker_short_bootstrap_node_name and mysql_short_bootstrap_node_name parameters from your existing environment file. This step prevents director from attempting to override the Controller node name in subsequent replacements. For more information, see Replacing a bootstrap controller node.
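
The agent cleanup in step 7 selects agent IDs by host; that selection can be factored out and tested against captured table output. The column layout (the ID in the second |-delimited field) is an assumption based on the default table formatter of `openstack network agent list`.

```shell
# Print the ID (second '|'-delimited field) of every agent row that
# mentions the given host, from table output on stdin.
agents_for_host() {
  awk -F'|' -v h="$1" '$0 ~ h { gsub(/ /, "", $2); print $2 }'
}
```

Usage: `openstack network agent list | agents_for_host overcloud-controller-1.localdomain` produces the same ID list that the delete loop in step 7 iterates over.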