Chapter 4. Rebooting the overcloud

After you perform a minor version update, reboot your overcloud so that the nodes pick up any new kernel or new system-level components.

4.1. Rebooting Controller and composable nodes

Complete the following steps to reboot Controller nodes and standalone nodes based on composable roles, excluding Compute nodes and Ceph Storage nodes.

Procedure

Log in to the node that you want to reboot.
Optional: If the node uses Pacemaker resources, stop the cluster:
```
[heat-admin@overcloud-controller-0 ~]$ sudo pcs cluster stop
```

Reboot the node:

[heat-admin@overcloud-controller-0 ~]$ sudo reboot

Wait until the node boots.
Check the services. For example:
1. If the node uses Pacemaker services, check that the node has rejoined the cluster:
```
[heat-admin@overcloud-controller-0 ~]$ sudo pcs status
```
2. If the node uses systemd services, check that all services are enabled:
```
[heat-admin@overcloud-controller-0 ~]$ sudo systemctl status
```
3. If the node uses containerized services, check that all containers on the node are active:
```
[heat-admin@overcloud-controller-0 ~]$ sudo podman ps
```
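
The "wait until the node boots" step can be scripted with a small polling helper. The following is a minimal sketch and not part of the documented procedure: `wait_until` is a hypothetical helper name, and any timeout or interval you pass to it is your own choice.

```shell
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
wait_until() {
    local timeout_s=$1
    local interval_s=$2
    shift 2
    local elapsed=0
    # Run the command; on failure, sleep and retry until the time budget is spent.
    until "$@"; do
        elapsed=$((elapsed + interval_s))
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
    done
    return 0
}
```

For example, `wait_until 600 10 ssh -o ConnectTimeout=5 heat-admin@[node-address] true` would poll until SSH answers, where `[node-address]` is a placeholder for the address of the rebooted node. You can then run the `pcs status`, `systemctl status`, or `podman ps` checks listed above.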

4.2. Rebooting a Ceph Storage (OSD) cluster

Complete the following steps to reboot a cluster of Ceph Storage (OSD) nodes.

Procedure

Log in to a Ceph MON or Controller node and disable Ceph Storage cluster rebalancing temporarily:

$ sudo podman exec -it ceph-mon-controller-0 ceph osd set noout
$ sudo podman exec -it ceph-mon-controller-0 ceph osd set norebalance

Select the first Ceph Storage node to reboot and log in to the node.
Reboot the node:
```
$ sudo reboot
```
Wait until the node boots.
Log in to the node and check the cluster status:
```
$ sudo podman exec -it ceph-mon-controller-0 ceph status
```
Check that the pgmap reports all placement groups (PGs) as normal (active+clean).
Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.

When complete, log in to a Ceph MON or Controller node and enable cluster rebalancing again:

$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset noout
$ sudo podman exec -it ceph-mon-controller-0 ceph osd unset norebalance

Perform a final status check to verify the cluster reports HEALTH_OK:
```
$ sudo podman exec -it ceph-mon-controller-0 ceph status
```
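
The active+clean check between reboots can be automated by parsing the one-line summary that `ceph pg stat` prints. The following is a minimal sketch: `all_pgs_clean` is a hypothetical helper name, and the input format shown in the comment is an assumption about the summary line, so verify it against the output of your Ceph release before relying on it.

```shell
# Hypothetical helper: succeed only when every PG in a "ceph pg stat"
# summary line is active+clean. Reads one line from stdin.
# Assumed input format, for example:
#   64 pgs: 64 active+clean; 0 B data, 1.0 GiB used, 29 GiB / 30 GiB avail
all_pgs_clean() {
    local line total states
    IFS= read -r line || [ -n "$line" ] || return 1
    # The total PG count is the text before " pgs:".
    total=${line%% pgs:*}
    # The per-state breakdown sits between "pgs: " and the first ";".
    states=${line#*pgs: }
    states=${states%%;*}
    # Clean means the breakdown is exactly "<total> active+clean".
    [ "$states" = "$total active+clean" ]
}

# Example use (omit -it so that the output can be piped):
#   sudo podman exec ceph-mon-controller-0 ceph pg stat | all_pgs_clean
```

The helper treats any other state in the breakdown, including transient ones, as not clean. This is deliberately strict for a reboot gate.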

4.3. Rebooting Compute nodes

Complete the following steps to reboot Compute nodes. To ensure minimal downtime of instances in your OpenStack Platform environment, this procedure also includes instructions about migrating instances from the Compute node you want to reboot. This involves the following workflow:

Decide whether to migrate instances to another Compute node before rebooting the node
Select and disable the Compute node you want to reboot so that it does not provision new instances
Migrate the instances to another Compute node
Reboot the empty Compute node
Enable the empty Compute node

Prerequisites

Before you reboot the Compute node, you must decide whether to migrate instances to another Compute node while the node is rebooting.

If you cannot or do not want to migrate the instances, you can set the following core template parameters to control the state of the instances after the Compute node reboots:

NovaResumeGuestsStateOnHostBoot: Determines whether to return instances to the same state on the Compute node after reboot. When set to False, the instances remain down and you must start them manually. Default value: False
NovaResumeGuestsShutdownTimeout: Number of seconds to wait for an instance to shut down before rebooting. Setting this value to 0 is not recommended. Default value: 300

For general information about overcloud parameters and their usage, see Overcloud Parameters.
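
Overcloud parameters such as these are usually set under `parameter_defaults` in a Heat environment file that you include in your overcloud deployment command. The following is a minimal sketch that writes such a file: `write_resume_guests_env` is a hypothetical helper name, the file path is up to you, and the values are examples only.

```shell
# Hypothetical helper: write a Heat environment file that sets both parameters.
# $1 is the path of the file to create; the values shown are examples only.
write_resume_guests_env() {
    cat > "$1" <<'EOF'
parameter_defaults:
  # Return instances to their previous state when the Compute node boots.
  NovaResumeGuestsStateOnHostBoot: True
  # Seconds to wait for an instance to shut down (the documented default).
  NovaResumeGuestsShutdownTimeout: 300
EOF
}

# Example use:
#   write_resume_guests_env ~/templates/nova-resume-guests.yaml
```

The path `~/templates/nova-resume-guests.yaml` is an assumption for illustration. The parameters take effect only after you include the file in an overcloud deployment.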

Procedure

Log in to the undercloud as the stack user.
List all Compute nodes and their UUIDs:
```
$ source ~/stackrc
(undercloud) $ openstack server list --name compute
```
Identify the UUID of the Compute node you want to reboot.

From the undercloud, select a Compute node and disable it:

$ source ~/overcloudrc
(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set [hostname] nova-compute --disable

List all instances on the Compute node:

(overcloud) $ openstack server list --host [hostname] --all-projects

If you decided not to migrate instances, skip to the step that reboots the Compute node.
If you decided to migrate the instances to another Compute node, use one of the following commands:
1. Migrate the instance to a different host:
```
(overcloud) $ openstack server migrate [instance-id] --live [target-host] --wait
```
2. Let nova-scheduler automatically select the target host:
```
(overcloud) $ nova live-migration [instance-id]
```
3. Live migrate all instances at once:
```
$ nova host-evacuate-live [hostname]
```
  Note
  The nova command might cause some deprecation warnings, which are safe to ignore.
Wait until migration completes.

Confirm the migration was successful:

(overcloud) $ openstack server list --host [hostname] --all-projects

Continue migrating instances until none remain on the chosen Compute node.

Log in to the Compute node and reboot it:

[heat-admin@overcloud-compute-0 ~]$ sudo reboot

Wait until the node boots.

Enable the Compute node again:

$ source ~/overcloudrc
(overcloud) $ openstack compute service set [hostname] nova-compute --enable

Check whether the Compute node is enabled:

(overcloud) $ openstack compute service list
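
The "continue migrating until none remain" check in this procedure can be scripted around the same `openstack server list` command. The following is a minimal sketch: `instances_on_host` and `host_is_empty` are hypothetical helper names, and the sketch assumes that you have already sourced `~/overcloudrc`.

```shell
# Hypothetical helper: list the IDs of instances still on a Compute host,
# one per line. "-f value -c ID" limits the output to the ID column.
instances_on_host() {
    openstack server list --host "$1" --all-projects -f value -c ID
}

# Hypothetical helper: return 0 when no instances remain on the host,
# 1 when some remain, and 2 when the client call itself fails.
host_is_empty() {
    local ids
    ids=$(instances_on_host "$1") || return 2
    [ -z "$ids" ]
}

# Example use before rebooting:
#   host_is_empty [hostname] && echo "no instances remain on the host"
```

Returning a distinct status for a failed client call prevents an authentication or network error from being mistaken for an empty host.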

4.4. Rebooting HCI Compute nodes

The following procedure reboots Compute hyperconverged infrastructure (HCI) nodes.

Procedure

Log in to a Ceph MON or Controller node and identify the name of the Ceph MON container:

$ sudo podman ps | grep -i ceph | grep -i mon

45fe68d340e5  docker-registry.upshift.redhat.com/ceph/rhceph-4.0-rhel8:latest

Set the CEPH_MON_CONTAINER variable to the name of the container:
```
$ CEPH_MON_CONTAINER=ceph-mon-controller-0
```
Verify that you can use the CEPH_MON_CONTAINER variable to run Ceph commands:
```
$ sudo podman exec $CEPH_MON_CONTAINER ceph -s
```

From the Ceph MON or Controller node, disable Ceph Storage cluster rebalancing temporarily:

$ sudo podman exec $CEPH_MON_CONTAINER ceph osd set noout
$ sudo podman exec $CEPH_MON_CONTAINER ceph osd set norebalance

Log in to the undercloud as the stack user.
List all Compute nodes and their UUIDs:
```
$ source ~/stackrc
(undercloud) $ openstack server list --name compute
```
Identify the UUID of the Compute node you want to reboot.

From the undercloud, select a Compute node and disable it:

$ source ~/overcloudrc
(overcloud) $ openstack compute service list
(overcloud) $ openstack compute service set [hostname] nova-compute --disable

List all instances on the Compute node:

(overcloud) $ openstack server list --host [hostname] --all-projects

Use one of the following commands to migrate your instances:
1. Migrate the instance to a specific host of your choice:
```
(overcloud) $ openstack server migrate [instance-id] --live [target-host] --wait
```
2. Let nova-scheduler automatically select the target host:
```
(overcloud) $ nova live-migration [instance-id]
```
3. Live migrate all instances at once:
```
$ nova host-evacuate-live [hostname]
```
  Note
  The nova command might cause some deprecation warnings, which are safe to ignore.
Wait until the migration completes.

Confirm that the migration was successful:

(overcloud) $ openstack server list --host [hostname] --all-projects

Continue migrating instances until none remain on the chosen Compute node.
Log in to a Ceph MON or a Controller node and check the cluster status:
```
$ sudo podman exec $CEPH_MON_CONTAINER ceph -s
```
Check that the pgmap reports all pgs as normal (active+clean).
Reboot the Compute HCI node:
```
$ sudo reboot
```
Wait until the node boots.

Enable the Compute node again:

$ source ~/overcloudrc
(overcloud) $ openstack compute service set [hostname] nova-compute --enable

Verify that the Compute node is enabled:

(overcloud) $ openstack compute service list

Log out of the node, reboot the next node, and check its status. Repeat this process until you have rebooted all Ceph Storage nodes.

When complete, log in to a Ceph MON or Controller node and enable cluster rebalancing again:

$ sudo podman exec $CEPH_MON_CONTAINER ceph osd unset noout
$ sudo podman exec $CEPH_MON_CONTAINER ceph osd unset norebalance

Perform a final status check to verify the cluster reports HEALTH_OK:
```
$ sudo podman exec $CEPH_MON_CONTAINER ceph status
```
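
The container lookup and the flag handling in this procedure can be wrapped in two small functions. The following is a minimal sketch: `find_ceph_mon_container` and `ceph_maintenance_flags` are hypothetical helper names, and the sketch assumes that the Ceph MON container name contains `ceph-mon`, as in the `ceph-mon-controller-0` example above.

```shell
# Hypothetical helper: print the name of the first running container whose
# name matches "ceph-mon" (an assumption about the naming scheme).
find_ceph_mon_container() {
    sudo podman ps --filter name=ceph-mon --format '{{.Names}}' | head -n 1
}

# Hypothetical helper: set or unset the two flags used in this procedure.
# $1 is "set" or "unset"; $2 is the Ceph MON container name.
ceph_maintenance_flags() {
    case $1 in
        set|unset) ;;
        *)
            echo "usage: ceph_maintenance_flags set|unset <container>" >&2
            return 2
            ;;
    esac
    sudo podman exec "$2" ceph osd "$1" noout &&
        sudo podman exec "$2" ceph osd "$1" norebalance
}

# Example use:
#   CEPH_MON_CONTAINER=$(find_ceph_mon_container)
#   ceph_maintenance_flags set "$CEPH_MON_CONTAINER"
#   ...reboot the nodes one at a time...
#   ceph_maintenance_flags unset "$CEPH_MON_CONTAINER"
```

Keeping both flags in one function makes it harder to unset one flag and forget the other at the end of the maintenance window.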
