4.2. RHEA-2018:2331 — Red Hat OpenStack Platform 12.0 Enhancement Advisory August 2018

The bugs contained in this section are addressed by advisory RHEA-2018:2331. Further information about this advisory is available at https://access.redhat.com/errata/RHEA-2018:2331.


Additional non-controller upgrade attempts after a failed upgrade can fail during service validation if services are not running. To prevent such upgrade failures, skip service validation by passing the option "--skip-tags validation" to the Ansible invocation.

For example:
upgrade-non-controller.sh --upgrade compute-0 --ansible-opts "--skip-tags validation"

TripleO uses ceph-ansible to configure Ceph clients and servers.
To reduce the undercloud memory requirement when deploying a large number of Compute nodes, the TripleO ceph-ansible fork count default was reduced from 50 to 25.

One result of the lower fork count is a reduction in the number of hosts that can be configured in parallel.

You can use a Heat environment file to override the default fork count. The following example sets the fork count to 10.
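For example, an environment file along the following lines lowers the fork count to 10 (a sketch; `CephAnsibleEnvironmentVariables` is assumed here to be the TripleO parameter for passing environment variables to the ceph-ansible run, so confirm the name against your templates):

```yaml
parameter_defaults:
  # Assumed parameter name; limits ceph-ansible to 10 parallel forks
  CephAnsibleEnvironmentVariables:
    ANSIBLE_FORKS: '10'
```

Include the file in the deployment command with the -e option.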

The TripleO Derived Parameters feature searches for overcloud nodes associated with each TripleO role. The workflow now searches for nodes in either the 'active' or 'available' state.

Previously, the search was limited to nodes in the 'available' state.

After the initial deployment, when nodes are typically in the 'active' state, stack updates failed because the Derived Parameters workflow did not find any nodes in the 'available' state.

The Derived Parameters workflow now supports the use of SchedulerHints to identify overcloud nodes.

Previously, the workflow could not use SchedulerHints parameters to identify overcloud nodes associated with the corresponding TripleO overcloud role. This caused the overcloud deployment to fail.

SchedulerHints support prevents these failures.
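As an illustration, role-specific SchedulerHints such as the following let the workflow match nodes to the Compute role (a sketch; `ComputeSchedulerHints` and the `capabilities:node` hint follow the usual TripleO convention, so verify them against your templates):

```yaml
parameter_defaults:
  ComputeSchedulerHints:
    'capabilities:node': 'compute-%index%'
```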


Connectivity problems that occurred after OSP11-to-OSP12 upgrades have been resolved by the removal of an obsolete network configuration file.

The file was /usr/libexec/os-apply-config/templates/etc/os-net-config/config.json. Its presence on post-upgrade systems caused connectivity problems after a reboot on any overcloud node. Interfaces set under OVS bridges had no connectivity. For example, controller nodes were unable to rejoin the pacemaker cluster.

The upgrade process now removes the file and prevents the connectivity problems.
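After the upgrade, you can confirm the obsolete file is gone with a quick check (a hypothetical verification step, not part of the upgrade tooling):

```shell
# Succeeds quietly when the obsolete template has been removed
FILE=/usr/libexec/os-apply-config/templates/etc/os-net-config/config.json
if [ -f "$FILE" ]; then
    echo "stale file present"
else
    echo "ok"
fi
```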

The file driver for Gnocchi now works as expected in containerized installations. Previously, the host directory was not mounted in the container.

Database credentials are no longer logged when a transient container initializes the MySQL database on disk during a fresh overcloud deployment.

Logging verbosity was limited to prevent the logging of database credentials in the container's logs and in the journal.

A change in the libvirtd live-migration port range prevents live migration failures.

Previously, libvirtd live-migration used ports 49152 to 49215, as specified in the qemu.conf file.
On Linux, this range is a subset of the ephemeral port range 32768 to 61000. Any port in the ephemeral range can be consumed by any other service as well. 

As a result, live-migration failed with the error:
Live Migration failure: internal error: Unable to find an unused
port in range 'migration' (49152-49215)

The new libvirtd live-migration range of 61152-61215 is not in the ephemeral range. The related failures no longer occur.

This completes the port change work started in BZ1573791.
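The new range corresponds to the following settings in /etc/libvirt/qemu.conf (shown for reference; these are the standard libvirt options for the migration port range):

```ini
migration_port_min = 61152
migration_port_max = 61215
```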

An error in the NovaSchedulerLoggingSource variable in the puppet/services/nova-conductor.yaml file has been corrected so that the correct logs are tailed during fluentd configuration.

Previously, nova-scheduler.log was tailed twice and nova-conductor.log was not tailed at all.
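The corrected mapping in puppet/services/nova-conductor.yaml is along these lines (a sketch; the key and structure are assumed to follow the usual tripleo-heat-templates LoggingSource pattern):

```yaml
NovaConductorLoggingSource:
  tag: openstack.nova.conductor
  path: /var/log/nova/nova-conductor.log
```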

To prevent failures caused by a gnocchi-upgrade race condition, gnocchi-upgrade is now called from the bootstrap node instead of from multiple nodes.

Previously, gnocchi-upgrade was called from each node where gnocchi-api is part of the role. This sometimes resulted in failures with the error shown in the following example:
2018-03-14 12:39:39,683 [1] ERROR oslo_db.sqlalchemy.exc_filters: DBAPIError exception wrapped from (pymysql.err.InternalError) (1050, u"Table 'archive_policy' already exists")

Prior to this update, when removing the ceph-osd RPM from overcloud nodes that do not require the package, the corresponding Ceph OSD product key was not removed. Consequently, subscription-manager incorrectly reported that the Ceph OSD product was still installed.

With this update, the script that handles removal of the ceph-osd RPM also removes the Ceph OSD product key. As a result, after removing the ceph-osd RPM, subscription-manager no longer erroneously reports that the Ceph OSD product is installed.

Note: The script that removes the RPM and product key executes only during the overcloud update procedure.

OpenStack Director 13 can now successfully deploy an overcloud together with Ceph, using OpenStack 12 templates.

Prior to this update, Ceph deployment would fail during overcloud deployment step 2 because OpenStack Director failed to set the correct version of Ceph. Now OpenStack Director 12 templates always deploy the Ceph Jewel release.

This update adds the environment file /usr/share/openstack-tripleo-heat-templates/environments/ovs-dpdk-permissions.yaml for OVS-DPDK deployments (for new installations and minor updates).

Note: This environment file updates the parameter only for the ComputeOvsDpdk role. If any other custom role is used with OVS-DPDK, extend the environment file to cover those custom roles as well.
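For a hypothetical custom role named ComputeOvsDpdkCustom, the extension might look like the following (a sketch; mirror whatever role-specific parameters ovs-dpdk-permissions.yaml sets for ComputeOvsDpdk, assumed here to be VhostuserSocketGroup):

```yaml
parameter_defaults:
  # Assumed role-specific parameter for a custom OVS-DPDK role
  ComputeOvsDpdkCustomParameters:
    VhostuserSocketGroup: "hugetlbfs"
```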

This update helps operators locate log files after an upgrade from a non-containerized to a containerized deployment.

If old log files are present when the upgrade begins, a readme.txt file is placed in the old file location. The file points to the new log file location.

For example, if a /var/log/nova directory exists, a /var/log/nova/readme.txt file is created, advising the reader to look in the /var/log/containers/nova directory instead.

This update adds the service OS::TripleO::Services::NovaMigrationTarget to the service list of the ComputeOvsDpdk role in the roles_data.yaml file. Prior to this update, the omission of the service caused Nova live migration to fail on ComputeOvsDpdk roles.

Before starting a minor update, ensure the service is present in the ComputeOvsDpdk role of the roles_data.yaml file.
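When checking, the entry should appear in the role's service list, for example (abbreviated; surrounding services omitted):

```yaml
- name: ComputeOvsDpdk
  ServicesDefault:
    # ... other services ...
    - OS::TripleO::Services::NovaMigrationTarget
```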

This change allows TripleO to deploy Cinder with a Dell EMC VNX backend.

The TripleO environment files used for deploying Cinder's Netapp backend have been updated in this release so that the backend now deploys successfully.

Prior to this update, obsolete data caused the overcloud deployment to fail.

The default age for purging deleted database records has been corrected so that deleted records are purged from Cinder's database.

Previously, the CinderCronDbPurgeAge value for Cinder's purge cron job used the wrong value and deleted records were not purged from Cinder's database when they reached the desired default age.
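If you need a different purge age, the parameter can be overridden in an environment file (a sketch; the value is the record age in days):

```yaml
parameter_defaults:
  CinderCronDbPurgeAge: '30'
```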

To enable the neutron-lbaas dashboard:

1. Enable the dashboard in the Horizon configuration file in all controller nodes:
File: /var/lib/config-data/puppet-generated/horizon/etc/openstack-dashboard/local_settings

'enable_distributed_router': False,
'enable_firewall': False,
'enable_ha_router': False,
'enable_lb': True, <----------

2. Restart the horizon container:
# docker restart horizon

A new "Load Balancers" tab will appear under the "Network" menu. The URL is http://<controller-vip>/dashboard/project/ngloadbalancersv2


Nova's libvirt driver now allows the specification of granular CPU feature flags when configuring CPU models.  

One benefit of this change is the alleviation of a performance degradation that has been experienced on guests running with certain Intel-based virtual CPU models after application of the "Meltdown" CVE fixes. This guest performance impact is reduced by exposing the CPU feature flag 'PCID' ("Process-Context ID") to the *guest* CPU, assuming that the PCID flag is available in the physical hardware itself.

For usage details, refer to the documentation of ``[libvirt]/cpu_model_extra_flags`` in the ``nova.conf`` file.
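For example, nova.conf can expose PCID with a custom CPU model along these lines (this mirrors the documented usage of the option; the model name is illustrative):

```ini
[libvirt]
cpu_mode = custom
cpu_model = IvyBridge
cpu_model_extra_flags = pcid
```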


Prior to this update, running a "stack update" operation on an existing stack to reassess the state of Heat resources caused a failure in container docker-puppet-rabbitmq. This failure prevented users from running stack update operations.

This update fixes the issue by changing the way puppet configuration is done in the rabbitmq container docker-puppet-rabbitmq.

This update allows a non-containerized OpenStack service to connect to the Ceph cluster.

Prior to this update, any non-containerized OpenStack service failed to connect to the Ceph cluster because the file ACL mask set on the CephX keyrings blocked read permissions for non-containerized OpenStack services.

Puppet now sets the file ACL mask on the CephX keyrings so that read permissions can be granted to specific users.

This fix prevents a potential failure of ceilometer-upgrade during an OpenStack upgrade.

Prior to this fix, the ceilometer-upgrade sometimes failed during an OSP11-OSP12 upgrade because it ran before gnocchi-upgrade.

If you upgraded to OSP12 without this fix and ceilometer-upgrade failed, delete the /etc/gnocchi/gnocchi.conf file from the bootstrap node and re-run the upgrade process with the fixed package.

This update fixes an issue that prevented users from configuring Netapp NFS mount options through the CinderNetappNfsMountOptions TripleO Heat parameter.

Prior to this update, the Cinder Netapp backend ignored the CinderNetappNfsMountOptions parameter, so the Netapp NFS mount options could not be configured.

The code responsible for handling Cinder's Netapp configuration no longer ignores the parameter, which now correctly configures Cinder's Netapp NFS mount options.
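For example, the mount options can now be set in an environment file (the option string is illustrative):

```yaml
parameter_defaults:
  CinderNetappNfsMountOptions: 'lookupcache=pos'
```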

During a version upgrade, Cinder's database synchronization is now executed only on the bootstrap node. This prevents the synchronization and upgrade failures that occurred when it was executed on all Controller nodes.


OS-Brick FC host bus adapter (HBA) scans have been limited to prevent the addition of unwanted devices.

Previously, the OS-Brick FC code always scanned all present HBAs.

Now the following limits apply:
- If an initiator map is present, only the mapped HBAs are scanned.
- If there is a single WWNN for all ports, only the connected HBAs are scanned.
- Otherwise, all HBAs are scanned with wildcards.
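The selection rules above can be sketched as follows (an illustrative model, not the actual OS-Brick code; the HBA names and the `connected` flag are hypothetical):

```python
def hbas_to_scan(hbas, initiator_map=None, single_wwnn=False):
    """Return the subset of HBAs to scan, per the rules above."""
    if initiator_map:
        # Only HBAs that appear in the initiator map are scanned
        return [hba for hba in hbas if hba["name"] in initiator_map]
    if single_wwnn:
        # With a single WWNN for all ports, only connected HBAs are scanned
        return [hba for hba in hbas if hba["connected"]]
    # Fallback: scan every HBA (wildcard scan)
    return hbas

hbas = [
    {"name": "host0", "connected": True},
    {"name": "host1", "connected": False},
]
print([h["name"] for h in hbas_to_scan(hbas, initiator_map={"host1"})])  # ['host1']
print([h["name"] for h in hbas_to_scan(hbas, single_wwnn=True)])         # ['host0']
print(len(hbas_to_scan(hbas)))                                           # 2
```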