Chapter 10. Rebooting the Overcloud

Some situations require you to reboot nodes in the undercloud and overcloud. The following procedures show how to reboot each node type. Note the following:

  • If you reboot all nodes in one role, reboot each node individually so that the services for that role remain available during the reboot.
  • If you reboot all nodes in your OpenStack Platform environment, use the following list to guide the reboot order:

Recommended Node Reboot Order

  1. Reboot the director
  2. Reboot Controller nodes
  3. Reboot Ceph Storage nodes
  4. Reboot Compute nodes
  5. Reboot Object Storage nodes

10.1. Rebooting the Director

To reboot the director node, follow this process:

  1. Reboot the node:

    $ sudo reboot
  2. Wait until the node boots.

When the node boots, check the status of all services:

$ sudo systemctl list-units "openstack*" "neutron*" "openvswitch*"

Verify the existence of your Overcloud and its nodes:

$ source ~/stackrc
$ openstack server list
$ ironic node-list
$ openstack stack list
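
The service status listing above can be long, so it may help to filter it down to the units that actually failed. The following is a minimal sketch, not part of the official procedure: `failed_units` is a hypothetical helper name, and it assumes the input comes from `systemctl list-units --plain --no-legend`, where the columns are UNIT, LOAD, ACTIVE, SUB, and DESCRIPTION.

```shell
# Print the names of units whose ACTIVE column is "failed".
# Expects `systemctl list-units --plain --no-legend` output on stdin
# (assumed column order: UNIT LOAD ACTIVE SUB DESCRIPTION).
failed_units() {
    awk '$3 == "failed" { print $1 }'
}

# Example usage on the director (not run automatically):
#   sudo systemctl list-units --plain --no-legend \
#       "openstack*" "neutron*" "openvswitch*" | failed_units
```

If the helper prints nothing, none of the matched units are in the failed state. The same filter can be applied to the service checks on the other node types.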

10.2. Rebooting Controller Nodes

To reboot the Controller nodes, follow this process:

  1. Select a node to reboot. Log into it and reboot it:

    $ sudo reboot

    The remaining Controller Nodes in the cluster retain the high availability services during the reboot.

  2. Wait until the node boots.
  3. Log into the node and check the cluster status:

    $ sudo pcs status

    The node rejoins the cluster.

    Note

    If any services fail after the reboot, run sudo pcs resource cleanup, which cleans the errors and sets the state of each resource to Started. If any errors persist, contact Red Hat and request guidance and assistance.

  4. Check that all systemd services on the Controller node are active:

    $ sudo systemctl list-units "openstack*" "neutron*" "openvswitch*"
  5. Log out of the node, select the next Controller Node to reboot, and repeat this procedure until you have rebooted all Controller Nodes.
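
The "wait until the node boots" step in this procedure can be scripted as a polling loop. The following is a sketch under stated assumptions: `retry_until` is a hypothetical helper, and the user name `heat-admin` and host name `overcloud-controller-0` in the usage comment are placeholders for your own environment.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: retry_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 on the first success, 1 if every attempt fails.
retry_until() {
    _attempts=$1
    _delay=$2
    shift 2
    _i=1
    while [ "$_i" -le "$_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Do not sleep after the final failed attempt.
        if [ "$_i" -lt "$_attempts" ]; then
            sleep "$_delay"
        fi
        _i=$((_i + 1))
    done
    return 1
}

# Example (hypothetical user and host names): poll SSH every 10 seconds,
# for up to 60 attempts, until the node accepts logins again.
#   retry_until 60 10 ssh -o BatchMode=yes -o ConnectTimeout=5 \
#       heat-admin@overcloud-controller-0 true
```

Start polling only after the node has gone down. An SSH check issued immediately after `sudo reboot` can succeed against the system that is still shutting down.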

10.3. Rebooting Ceph Storage Nodes

To reboot the Ceph Storage nodes, follow this process:

  1. Select the first Ceph Storage node to reboot and log into it.
  2. Disable Ceph Storage cluster rebalancing temporarily:

    $ sudo ceph osd set noout
    $ sudo ceph osd set norebalance
  3. Reboot the node:

    $ sudo reboot
  4. Wait until the node boots.
  5. Log into the node and check the cluster status:

    $ sudo ceph -s

    Check that the pgmap reports all pgs as normal (active+clean).

  6. Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.
  7. When complete, enable cluster rebalancing again:

    $ sudo ceph osd unset noout
    $ sudo ceph osd unset norebalance
  8. Perform a final status check to make sure the cluster reports HEALTH_OK:

    $ sudo ceph status
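
The final health check can be automated by testing the output of `ceph health`, which prints the health status as its first word. This is a minimal sketch; `ceph_health_ok` is a hypothetical helper name. While the `noout` and `norebalance` flags are set, the cluster typically reports HEALTH_WARN, so this check is only meaningful after the flags are unset.

```shell
# Succeed only if the first word read from stdin is HEALTH_OK.
# Intended input: the output of `ceph health`.
ceph_health_ok() {
    # An empty input leaves _status empty, so the comparison below fails.
    read -r _status _rest || true
    [ "$_status" = "HEALTH_OK" ]
}

# Example usage on a Ceph Storage node (not run automatically):
#   if sudo ceph health | ceph_health_ok; then
#       echo "Cluster is healthy"
#   else
#       echo "Cluster is not healthy; inspect 'sudo ceph status'" >&2
#   fi
```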

10.4. Rebooting Compute Nodes

Reboot each Compute node individually to avoid downtime for instances in your OpenStack Platform environment. This involves the following workflow:

  1. Select a Compute node to reboot
  2. Migrate its instances to another Compute node
  3. Reboot the empty Compute node

List all Compute nodes and their UUIDs:

$ nova list | grep "compute"

Select a Compute node to reboot and first migrate its instances using the following process:

  1. From the undercloud, select a Compute Node to reboot and disable it:

    $ source ~/overcloudrc
    $ openstack compute service list
    $ openstack compute service set [hostname] nova-compute --disable
  2. List all instances on the Compute node:

    $ openstack server list --host [hostname]
  3. Select a second Compute node to act as the target host for migrating instances. This host needs enough resources to host the migrated instances. From the undercloud, migrate each instance from the disabled host to the target host:

    $ nova live-migration [instance-name] [target-hostname]
    $ nova migration-list
    $ nova resize-confirm [instance-name]
  4. Repeat the previous step until you have migrated all instances from the Compute node.

Important

For full instructions on configuring and migrating instances, see Section 8.9, “Migrating VMs from an Overcloud Compute Node”.
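
When a Compute node hosts many instances, the per-instance migration step can be wrapped in a loop. The following is a sketch only, not the documented procedure: `migrate_instances` and the `DRY_RUN` variable are hypothetical, and the loop does not wait for or verify each migration, so check progress with `nova migration-list` as described above.

```shell
# Live-migrate every instance ID read from stdin to the given target host.
# With DRY_RUN=1 the commands are printed instead of executed.
migrate_instances() {
    _target=$1
    while read -r _id; do
        # Skip blank lines in the input.
        [ -n "$_id" ] || continue
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "nova live-migration $_id $_target"
        else
            # Redirect stdin so the client cannot consume the ID list.
            nova live-migration "$_id" "$_target" </dev/null
        fi
    done
}

# Example usage from the undercloud (not run automatically):
#   source ~/overcloudrc
#   openstack server list --host [hostname] --all-projects -f value -c ID \
#       | DRY_RUN=1 migrate_instances [target-hostname]
```

Running with `DRY_RUN=1` first makes it possible to review the exact commands before any instance is moved.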

Reboot the Compute node using the following process:

  1. Log into the Compute Node and reboot it:

    $ sudo reboot
  2. Wait until the node boots.
  3. Enable the Compute Node again:

    $ source ~/overcloudrc
    $ openstack compute service set [hostname] nova-compute --enable
  4. Select the next node to reboot.

10.5. Rebooting Object Storage Nodes

To reboot the Object Storage nodes, follow this process:

  1. Select an Object Storage node to reboot. Log into it and reboot it:

    $ sudo reboot
  2. Wait until the node boots.
  3. Log into the node and check the status:

    $ sudo systemctl list-units "openstack-swift*"
  4. Log out of the node and repeat this process on the next Object Storage node.