Chapter 6. Non-Director Environments: Upgrading Individual OpenStack Services (Live Compute) in a High Availability Environment
This chapter describes the steps you should follow to upgrade your cloud deployment by updating one service at a time with live compute in a High Availability (HA) environment. This scenario upgrades from Red Hat OpenStack Platform 9 to Red Hat OpenStack Platform 10 in environments that do not use the director.
A live Compute upgrade minimizes interruptions to your Compute service, with only a few minutes for the smaller services, and a longer migration interval for the workloads moving to newly-upgraded Compute hosts. Existing workloads can run indefinitely, and you do not need to wait for a database migration.
Due to certain package dependencies, upgrading the packages for one OpenStack service might cause Python libraries to upgrade before other OpenStack services upgrade. This might cause certain services to fail prematurely. In this situation, continue upgrading the remaining services. All services should be operational upon completion of this scenario.
This method may require additional hardware resources to bring up the Compute nodes.
The procedures in this chapter follow the architectural naming convention used by all Red Hat OpenStack Platform documentation. If you are unfamiliar with this convention, refer to the Architecture Guide, available in the Red Hat OpenStack Platform Documentation Suite, before proceeding.
6.1. Pre-Upgrade Tasks
On each node, change to the Red Hat OpenStack Platform 10 repository using the subscription-manager command:
# subscription-manager repos --disable=rhel-7-server-openstack-9-rpms
# subscription-manager repos --enable=rhel-7-server-openstack-10-rpms
Upgrade the openstack-selinux package:
# yum upgrade openstack-selinux
This is necessary to ensure that the upgraded services will run correctly on a system with SELinux enabled.
6.2. Upgrading MariaDB
Perform the following steps on each host running MariaDB. Complete the steps on one host before starting the process on another host.
Stop the service from running on the local node:
# pcs resource ban galera-master $(crm_node -n)
Wait until pcs status shows that the service is no longer running on the local node. This may take a few minutes. The local node transitions to slave mode:
Master/Slave Set: galera-master [galera]
    Masters: [ overcloud-controller-1 overcloud-controller-2 ]
    Slaves: [ overcloud-controller-0 ]
The node eventually transitions to stopped:
Master/Slave Set: galera-master [galera]
    Masters: [ overcloud-controller-1 overcloud-controller-2 ]
    Stopped: [ overcloud-controller-0 ]
Upgrade the relevant packages:
# yum upgrade '*mariadb*' '*galera*'
Allow Pacemaker to schedule the galera resource on the local node:
# pcs resource clear galera-master
Wait until pcs status shows that the galera resource is running on the local node as a master. The pcs status command should provide output similar to the following:
Master/Slave Set: galera-master [galera]
    Masters: [ overcloud-controller-0 overcloud-controller-1 overcloud-controller-2 ]
Perform this procedure on each node individually until the MariaDB cluster completes a full upgrade.
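The wait steps in this procedure can be scripted. The following is a minimal sketch rather than a supported tool: the function name node_is_galera_master is hypothetical, and the parsing assumes that pcs status prints the Masters line inside the galera-master block in the same layout as the sample output above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): reads `pcs status` output
# on stdin and succeeds if NODE is listed on the galera Masters line.
node_is_galera_master() {
    awk -v node="$1" '
        # Any "... Set:" header starts a new resource block.
        /Set:/ { in_galera = ($0 ~ /galera-master/); next }
        in_galera && /Masters:/ {
            for (i = 1; i <= NF; i++) if ($i == node) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Example polling loop (run on the node being upgraded):
#   until pcs status | node_is_galera_master "$(crm_node -n)"; do sleep 10; done
```

Polling on the parsed output avoids starting the next host while the current one is still rejoining the cluster.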
6.3. Upgrading MongoDB
This procedure upgrades MongoDB, which acts as the backend database for the OpenStack Telemetry service.
Remove the mongod resource from Pacemaker's control:
# pcs resource unmanage mongod-clone
Stop the service on all Controller nodes. On each Controller node, run the following:
# systemctl stop mongod
Upgrade the relevant packages:
# yum upgrade 'mongodb*' 'python-pymongo*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Restart the mongod service on your controllers by running, on each controller:
# systemctl start mongod
Clean up the resource:
# pcs resource cleanup mongod-clone
Return the resource to Pacemaker control:
# pcs resource manage mongod-clone
Wait until the output of pcs status shows that the above resources are running.
6.4. Upgrading WSGI Services
This procedure upgrades the packages for the WSGI services on all Controller nodes simultaneously. This includes OpenStack Identity (keystone) and OpenStack Dashboard (horizon).
Remove the service from Pacemaker’s control:
# pcs resource unmanage httpd-clone
Stop the httpd service by running the following on each Controller node:
# systemctl stop httpd
Upgrade the relevant packages:
# yum -d1 -y upgrade \*keystone\*
# yum -y upgrade \*horizon\* \*openstack-dashboard\* httpd
# yum -d1 -y upgrade \*horizon\* \*python-django\*
Reload systemd to account for updated unit files on each Controller node:
# systemctl daemon-reload
Earlier versions of the installer may not have configured your system to automatically purge expired Keystone tokens, so your token table might contain a large number of expired entries. This can dramatically increase the time it takes to complete the database schema upgrade.
Flush expired tokens from the database to alleviate the problem. Run the keystone-manage command before running the Identity database upgrade:
# keystone-manage token_flush
This flushes expired tokens from the database. You can arrange to run this command periodically (for example, daily) using cron.
Update the Identity service database schema:
# su -s /bin/sh -c "keystone-manage db_sync" keystone
Restart the service by running the following on each Controller node:
# systemctl start httpd
Clean up the Identity service using Pacemaker:
# pcs resource cleanup httpd-clone
Return the resource to Pacemaker control:
# pcs resource manage httpd-clone
Wait until the output of pcs status shows that the above resources are running.
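The periodic token flush mentioned in this procedure could be scheduled with a cron.d entry. The sketch below is illustrative only: the function name install_token_flush_cron, the default file name, the 03:00 schedule, and the /usr/bin/keystone-manage path are assumptions and do not come from this guide.

```shell
#!/bin/sh
# Hypothetical helper: write a cron.d entry that flushes expired tokens
# daily at 03:00 as the keystone user. The schedule and the default file
# name are illustrative choices, not values taken from this guide.
install_token_flush_cron() {
    cron_file="${1:-/etc/cron.d/keystone-token-flush}"
    cat > "$cron_file" <<'EOF'
0 3 * * * keystone /usr/bin/keystone-manage token_flush >/dev/null 2>&1
EOF
}
```

Run the function once as root on each Controller node, then adjust the schedule to a quiet period for your deployment.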
6.5. Upgrading Image service (glance)
This procedure upgrades the packages for the Image service on all Controller nodes simultaneously.
Stop the Image service resources in Pacemaker:
# pcs resource disable openstack-glance-registry-clone
# pcs resource disable openstack-glance-api-clone
Wait until the output of pcs status shows that both services have stopped running.
Upgrade the relevant packages:
# yum upgrade '*glance*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Update the Image service database schema:
# su -s /bin/sh -c "glance-manage db_sync" glance
Clean up the Image service using Pacemaker:
# pcs resource cleanup openstack-glance-api-clone
# pcs resource cleanup openstack-glance-registry-clone
Restart Image service resources in Pacemaker:
# pcs resource enable openstack-glance-api-clone
# pcs resource enable openstack-glance-registry-clone
Wait until the output of pcs status shows that the above resources are running.
6.6. Upgrading Block Storage service (cinder)
This procedure upgrades the packages for the Block Storage service on all Controller nodes simultaneously.
Stop all Block Storage service resources in Pacemaker:
# pcs resource disable openstack-cinder-api-clone
# pcs resource disable openstack-cinder-scheduler-clone
# pcs resource disable openstack-cinder-volume
Wait until the output of pcs status shows that the above services have stopped running.
Upgrade the relevant packages:
# yum upgrade '*cinder*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Update the Block Storage service database schema:
# su -s /bin/sh -c "cinder-manage db sync" cinder
Clean up the Block Storage service using Pacemaker:
# pcs resource cleanup openstack-cinder-volume
# pcs resource cleanup openstack-cinder-scheduler-clone
# pcs resource cleanup openstack-cinder-api-clone
Restart all Block Storage service resources in Pacemaker:
# pcs resource enable openstack-cinder-volume
# pcs resource enable openstack-cinder-scheduler-clone
# pcs resource enable openstack-cinder-api-clone
Wait until the output of pcs status shows that the above resources are running.
6.7. Upgrading Orchestration (heat)
This procedure upgrades the packages for the Orchestration service on all Controller nodes simultaneously.
Stop Orchestration resources in Pacemaker:
# pcs resource disable openstack-heat-api-clone
# pcs resource disable openstack-heat-api-cfn-clone
# pcs resource disable openstack-heat-api-cloudwatch-clone
# pcs resource disable openstack-heat-engine-clone
Wait until the output of pcs status shows that the above services have stopped running.
Upgrade the relevant packages:
# yum upgrade '*heat*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Update the Orchestration database schema:
# su -s /bin/sh -c "heat-manage db_sync" heat
Clean up the Orchestration service using Pacemaker:
# pcs resource cleanup openstack-heat-engine-clone
# pcs resource cleanup openstack-heat-api-cloudwatch-clone
# pcs resource cleanup openstack-heat-api-cfn-clone
# pcs resource cleanup openstack-heat-api-clone
Restart Orchestration resources in Pacemaker:
# pcs resource enable openstack-heat-engine-clone
# pcs resource enable openstack-heat-api-cloudwatch-clone
# pcs resource enable openstack-heat-api-cfn-clone
# pcs resource enable openstack-heat-api-clone
Wait until the output of pcs status shows that the above resources are running.
6.8. Upgrading Telemetry (ceilometer)
This procedure upgrades the packages for the Telemetry service on all Controller nodes simultaneously.
This component has some additional upgrade procedures detailed in Chapter 7, Additional Procedures for Non-Director Environments. These additional procedures are optional for manual environments but help align with the current OpenStack Platform recommendations.
Stop all Telemetry resources in Pacemaker:
# pcs resource disable openstack-ceilometer-api-clone
# pcs resource disable openstack-ceilometer-collector-clone
# pcs resource disable openstack-ceilometer-notification-clone
# pcs resource disable openstack-ceilometer-central-clone
# pcs resource disable openstack-aodh-evaluator-clone
# pcs resource disable openstack-aodh-listener-clone
# pcs resource disable openstack-aodh-notifier-clone
# pcs resource disable openstack-gnocchi-metricd-clone
# pcs resource disable openstack-gnocchi-statsd-clone
# pcs resource disable delay-clone
Wait until the output of pcs status shows that the above services have stopped running.
Upgrade the relevant packages:
# yum upgrade '*ceilometer*' '*aodh*' '*gnocchi*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Update the Telemetry database schemas:
# ceilometer-dbsync
# aodh-dbsync
# gnocchi-upgrade
Clean up the Telemetry service using Pacemaker:
# pcs resource cleanup delay-clone
# pcs resource cleanup openstack-ceilometer-api-clone
# pcs resource cleanup openstack-ceilometer-collector-clone
# pcs resource cleanup openstack-ceilometer-notification-clone
# pcs resource cleanup openstack-ceilometer-central-clone
# pcs resource cleanup openstack-aodh-evaluator-clone
# pcs resource cleanup openstack-aodh-listener-clone
# pcs resource cleanup openstack-aodh-notifier-clone
# pcs resource cleanup openstack-gnocchi-metricd-clone
# pcs resource cleanup openstack-gnocchi-statsd-clone
Restart all Telemetry resources in Pacemaker:
# pcs resource enable delay-clone
# pcs resource enable openstack-ceilometer-api-clone
# pcs resource enable openstack-ceilometer-collector-clone
# pcs resource enable openstack-ceilometer-notification-clone
# pcs resource enable openstack-ceilometer-central-clone
# pcs resource enable openstack-aodh-evaluator-clone
# pcs resource enable openstack-aodh-listener-clone
# pcs resource enable openstack-aodh-notifier-clone
# pcs resource enable openstack-gnocchi-metricd-clone
# pcs resource enable openstack-gnocchi-statsd-clone
Wait until the output of pcs status shows that the above resources are running.
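Long resource lists such as the Telemetry ones are easy to mistype when entered one command at a time. As a hedged sketch, a small wrapper can apply a single pcs action to a list of resources. The function name pcs_each and the DRY_RUN variable are hypothetical and are not part of pcs.

```shell
#!/bin/sh
# Hypothetical helper: apply one pcs resource action to several resources.
# With DRY_RUN=1 the commands are printed instead of executed, which allows
# the list to be reviewed before anything is changed.
pcs_each() {
    action="$1"
    shift
    for res in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "pcs resource $action $res"
        else
            pcs resource "$action" "$res" || return 1
        fi
    done
}

# Example (resource names as listed in this section):
#   DRY_RUN=1
#   pcs_each disable openstack-aodh-evaluator-clone openstack-aodh-listener-clone
```

Stopping at the first failing command keeps a partially applied list easy to diagnose with pcs status.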
Previous versions of the Telemetry service used a value for the rpc_backend parameter that is now deprecated. Check that the rpc_backend parameter in the /etc/ceilometer/ceilometer.conf file is set to the following:
rpc_backend=rabbit
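The check can be done with crudini, which this chapter already uses for nova.conf. This is a sketch under one assumption that the text above does not state: that rpc_backend lives in the [DEFAULT] section of ceilometer.conf. The function name check_rpc_backend is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if rpc_backend is set to "rabbit".
# Assumes the option is in the [DEFAULT] section of the given file.
check_rpc_backend() {
    conf="${1:-/etc/ceilometer/ceilometer.conf}"
    value=$(crudini --get "$conf" DEFAULT rpc_backend 2>/dev/null)
    [ "$value" = "rabbit" ]
}

# Example:
#   check_rpc_backend || echo "rpc_backend is not set to rabbit"
```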
6.9. Upgrading the Compute service (nova) on Controller nodes
This procedure upgrades the packages for the Compute service on all Controller nodes simultaneously.
Stop all Compute resources in Pacemaker:
# pcs resource disable openstack-nova-novncproxy-clone
# pcs resource disable openstack-nova-consoleauth-clone
# pcs resource disable openstack-nova-conductor-clone
# pcs resource disable openstack-nova-api-clone
# pcs resource disable openstack-nova-scheduler-clone
Wait until the output of pcs status shows that the above services have stopped running.
Upgrade the relevant packages:
# yum upgrade '*nova*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Update the Compute database schema:
# su -s /bin/sh -c "nova-manage api_db sync" nova
# su -s /bin/sh -c "nova-manage db sync" nova
If you are performing a rolling upgrade of your Compute hosts, you need to set explicit API version limits to ensure compatibility between your Mitaka and Newton environments.
Before starting Compute services on Controller or Compute nodes, set the compute option in the [upgrade_levels] section of nova.conf to the previous Red Hat OpenStack Platform version (mitaka):
# crudini --set /etc/nova/nova.conf upgrade_levels compute mitaka
This ensures the Controller node can still communicate with the Compute nodes, which are still using the previous version.
You will need to first unmanage the Compute resources by running pcs resource unmanage on one Controller node:
# pcs resource unmanage openstack-nova-novncproxy-clone
# pcs resource unmanage openstack-nova-consoleauth-clone
# pcs resource unmanage openstack-nova-conductor-clone
# pcs resource unmanage openstack-nova-api-clone
# pcs resource unmanage openstack-nova-scheduler-clone
Restart all the services on all controllers:
# openstack-service restart nova
Return control to Pacemaker after upgrading all of your Compute hosts to Red Hat OpenStack Platform 10:
# pcs resource manage openstack-nova-scheduler-clone
# pcs resource manage openstack-nova-api-clone
# pcs resource manage openstack-nova-conductor-clone
# pcs resource manage openstack-nova-consoleauth-clone
# pcs resource manage openstack-nova-novncproxy-clone
Clean up all Compute resources in Pacemaker:
# pcs resource cleanup openstack-nova-scheduler-clone
# pcs resource cleanup openstack-nova-api-clone
# pcs resource cleanup openstack-nova-conductor-clone
# pcs resource cleanup openstack-nova-consoleauth-clone
# pcs resource cleanup openstack-nova-novncproxy-clone
Restart all Compute resources in Pacemaker:
# pcs resource enable openstack-nova-scheduler-clone
# pcs resource enable openstack-nova-api-clone
# pcs resource enable openstack-nova-conductor-clone
# pcs resource enable openstack-nova-consoleauth-clone
# pcs resource enable openstack-nova-novncproxy-clone
Wait until the output of pcs status shows that the above resources are running.
6.10. Upgrading Clustering service (sahara)
This procedure upgrades the packages for the Clustering service on all Controller nodes simultaneously.
Stop all Clustering service resources in Pacemaker:
# pcs resource disable openstack-sahara-api-clone
# pcs resource disable openstack-sahara-engine-clone
Wait until the output of pcs status shows that the above services have stopped running.
Upgrade the relevant packages:
# yum upgrade '*sahara*'
Reload systemd to account for updated unit files:
# systemctl daemon-reload
Update the Clustering service database schema:
# su -s /bin/sh -c "sahara-db-manage upgrade heads" sahara
Clean up the Clustering service using Pacemaker:
# pcs resource cleanup openstack-sahara-api-clone
# pcs resource cleanup openstack-sahara-engine-clone
Restart all Clustering service resources in Pacemaker:
# pcs resource enable openstack-sahara-api-clone
# pcs resource enable openstack-sahara-engine-clone
Wait until the output of pcs status shows that the above resources are running.
6.11. Upgrading OpenStack Networking (neutron)
This procedure upgrades the packages for the Networking service on all Controller nodes simultaneously.
Prevent Pacemaker from triggering the OpenStack Networking cleanup scripts:
# pcs resource unmanage neutron-ovs-cleanup-clone
# pcs resource unmanage neutron-netns-cleanup-clone
Stop OpenStack Networking resources in Pacemaker:
# pcs resource disable neutron-server-clone
# pcs resource disable neutron-openvswitch-agent-clone
# pcs resource disable neutron-dhcp-agent-clone
# pcs resource disable neutron-l3-agent-clone
# pcs resource disable neutron-metadata-agent-clone
Upgrade the relevant packages:
# yum upgrade 'openstack-neutron*' 'python-neutron*'
Update the OpenStack Networking database schema:
# su -s /bin/sh -c "neutron-db-manage upgrade heads" neutron
Clean up OpenStack Networking resources in Pacemaker:
# pcs resource cleanup neutron-metadata-agent-clone
# pcs resource cleanup neutron-l3-agent-clone
# pcs resource cleanup neutron-dhcp-agent-clone
# pcs resource cleanup neutron-openvswitch-agent-clone
# pcs resource cleanup neutron-server-clone
Restart OpenStack Networking resources in Pacemaker:
# pcs resource enable neutron-metadata-agent-clone
# pcs resource enable neutron-l3-agent-clone
# pcs resource enable neutron-dhcp-agent-clone
# pcs resource enable neutron-openvswitch-agent-clone
# pcs resource enable neutron-server-clone
Return the cleanup agents to Pacemaker control:
# pcs resource manage neutron-ovs-cleanup-clone
# pcs resource manage neutron-netns-cleanup-clone
Wait until the output of pcs status shows that the above resources are running.
6.12. Upgrading Compute (nova) Nodes
This procedure upgrades the packages on a single Compute node. Run this procedure on each Compute node individually.
If you are performing a rolling upgrade of your Compute hosts, you need to set explicit API version limits to ensure compatibility between your Mitaka and Newton environments.
Before starting Compute services on Controller or Compute nodes, set the compute option in the [upgrade_levels] section of nova.conf to the previous Red Hat OpenStack Platform version (mitaka):
# crudini --set /etc/nova/nova.conf upgrade_levels compute mitaka
This ensures the Controller node can still communicate with the Compute nodes, which are still using the previous version.
Before updating, take a systemd snapshot of the OpenStack Platform services:
# systemctl snapshot openstack-services
Stop all OpenStack services on the host:
# systemctl stop 'openstack*' '*nova*'
Upgrade all packages:
# yum upgrade
Start all OpenStack services on the host:
# openstack-service start
After you have upgraded all of your hosts, remove the API limits configured in the previous step. On all of your hosts:
# crudini --del /etc/nova/nova.conf upgrade_levels compute
Restart all OpenStack services on the host:
# systemctl isolate openstack-services.snapshot
6.13. Post-Upgrade Tasks
After completing all of your individual service upgrades, you should perform a complete package upgrade on all nodes:
# yum upgrade
This will ensure that all packages are up-to-date. You may want to schedule a restart of your OpenStack hosts at a future date in order to ensure that all running processes are using updated versions of the underlying binaries.
Review the resulting configuration files. The upgraded packages will have installed .rpmnew files appropriate to the Red Hat OpenStack Platform 10 version of the service.
New versions of OpenStack services may deprecate certain configuration options. You should also review your OpenStack logs for any deprecation warnings, because these may cause problems during a future upgrade. For more information on the new, updated, and deprecated configuration options for each service, see the Configuration Reference, available in the Red Hat OpenStack Platform Documentation Suite.
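To make the configuration review easier, the .rpmnew files can be located and compared with the live files. The following is a minimal sketch: the function names list_rpmnew and diff_rpmnew are hypothetical, and searching under /etc by default is an assumption about where the service configuration lives.

```shell
#!/bin/sh
# Hypothetical helper: list .rpmnew files under a directory (default /etc).
list_rpmnew() {
    find "${1:-/etc}" -type f -name '*.rpmnew' 2>/dev/null | sort
}

# Hypothetical helper: show how each live file differs from its .rpmnew copy.
diff_rpmnew() {
    list_rpmnew "$1" | while IFS= read -r new; do
        # diff exits non-zero when files differ, which is expected here.
        diff -u "${new%.rpmnew}" "$new" || true
    done
}

# Example:
#   diff_rpmnew /etc/nova
```

Lines prefixed with + in the output are settings that exist only in the packaged Red Hat OpenStack Platform 10 file.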
