Chapter 4. Performing maintenance on Compute nodes and Controller nodes with Instance HA
To perform maintenance on a Compute node or a Controller node with Instance HA, stop the node by setting it to standby mode and disabling the Pacemaker resources on the node. After you complete the maintenance work, start the node and check that the Pacemaker resources are healthy.
Prerequisites:
- A running overcloud with Instance HA enabled
Log in to a Controller node and stop the Compute or Controller node:
# pcs node standby <node UUID>
Important: You must log in to a different node from the node that you want to stop.
Disable the Pacemaker resources on the node:
# pcs resource disable <ocf::pacemaker:remote on the node>
- Perform any maintenance work on the node.
- Restore the IPMI connection and start the node. Wait until the node is ready before proceeding.
Enable the Pacemaker resources on the node and start the node:
# pcs resource enable <ocf::pacemaker:remote on the node>
# pcs node unstandby <node UUID>
If you set the node to maintenance mode, source the credential file for your overcloud and unset the node from maintenance mode:
# source stackrc
# openstack baremetal node maintenance unset <baremetal node UUID>
Check that the Pacemaker resources are active and healthy:
# pcs status
If any Pacemaker resources fail to start during the startup process, run the pcs resource cleanup command to reset the status and the fail count of the resource.
If you evacuated instances from a Compute node before you stopped the node, check that the instances are migrated to a different node:
# openstack server list --long
# nova migration-list
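
The Pacemaker commands in this procedure can be grouped into a small wrapper so that the stop and start sequences always run in the documented order. The following is only a minimal sketch and is not part of the product: the function names run, pre_maintenance, and post_maintenance and the DRY_RUN variable are hypothetical, and the two arguments stand in for the <node UUID> and <ocf::pacemaker:remote on the node> placeholders used above. By default the sketch only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: prints (or runs) the pcs commands from this chapter.
# DRY_RUN=1 (the default) prints each command; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1: value for <node UUID>, $2: value for <ocf::pacemaker:remote on the node>
pre_maintenance() {
    run pcs node standby "$1"
    run pcs resource disable "$2"
}

# Same arguments as pre_maintenance; run only after the node is ready again.
post_maintenance() {
    run pcs resource enable "$2"
    run pcs node unstandby "$1"
    run pcs status
}
```

After sourcing the file on a Controller node other than the one you want to stop, you might call pre_maintenance with your own values, perform the maintenance work, and then call post_maintenance. Review the printed commands before setting DRY_RUN=0.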