Chapter 4. Technical notes

This chapter supplements the information contained in the text of Red Hat OpenStack Platform "Train" errata advisories released through the Content Delivery Network.

4.1. RHBA-2020:3148 — Red Hat OpenStack Platform 16.1 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2020:3148. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2020:3148.html.

Changes to the ansible-role-atos-hsm component:

  • With this enhancement, you can deploy ATOS HSM in high availability (HA) mode. (BZ#1676989)

Changes to the openstack-cinder component:

  • With this enhancement, you can revert Block Storage (cinder) volumes to the most recent snapshot, if supported by the driver. This method of reverting a volume is more efficient than cloning from a snapshot and attaching a new volume. (BZ#1686001)
  • Director can now deploy the Block Storage Service in an active/active mode. This deployment scenario is supported only for Edge use cases. (BZ#1700402)
  • This update includes the following enhancements:

    • Support for revert-to-snapshot in VxFlex OS driver
    • Support for volume migration in VxFlex OS driver
    • Support for OpenStack volume replication v2.1 in VxFlex OS driver
    • Support for VxFlex OS 3.5 in the VxFlex OS driver

Changes to the openstack-designate component:

  • DNS-as-a-Service (designate) returns to technology preview status in Red Hat OpenStack Platform 16.1. (BZ#1603440)

Changes to the openstack-glance component:

  • The Image Service (glance) now supports multi stores with the Ceph RBD driver. (BZ#1225775)
  • In Red Hat OpenStack Platform 16.1, you can use the Image Service (glance) to copy existing image data into multiple stores with a single command. This removes the need for the operator to copy data manually and update image locations. (BZ#1758416, BZ#1758420)
  • With this update, when using Image Service (glance) multi stores, the image owner can delete an image copy from a specific store. (BZ#1758424)

Changes to the openstack-ironic component:

  • A regression was introduced in ipmitool-1.8.18-11 that caused IPMI access to take over 2 minutes for certain BMCs that did not support the "Get Cipher Suites" command. As a result, introspection could fail and deployments could take much longer than before.

    With this update, ipmitool retries are handled differently, introspection passes, and deployments succeed.

    Note

    This issue with ipmitool is resolved in ipmitool-1.8.18-17. (BZ#1831893)

Changes to the openstack-ironic-python-agent component:

  • Before this update, there were no retries and no timeout when downloading a final instance image with the direct deploy interface in ironic. As a result, the deployment could fail if the server that hosts the image failed to respond.

    With this update, the image download process attempts 2 retries and has a connection timeout of 60 seconds. (BZ#1827721)
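The retry behavior described above can be sketched as follows. This is a minimal illustration, not the ironic-python-agent implementation: the function name download_image, its parameters, and the injectable opener argument are hypothetical. Only the values of 2 retries and a 60-second connection timeout come from the note above, and reading "2 retries" as one initial attempt plus two more is an assumption.

```python
import urllib.request

RETRIES = 2           # retries after the first attempt (value from the note above)
TIMEOUT_SECONDS = 60  # connection timeout in seconds (value from the note above)


def download_image(url, opener=urllib.request.urlopen,
                   retries=RETRIES, timeout=TIMEOUT_SECONDS):
    """Return the image bytes, retrying when the server does not respond.

    Hypothetical helper: makes one initial attempt plus `retries` retries.
    """
    last_error = None
    for _attempt in range(1 + retries):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as exc:
            # URLError, socket timeouts, and refused connections are all
            # OSError subclasses; remember the failure and try again.
            last_error = exc
    raise RuntimeError(
        f"download failed after {1 + retries} attempts") from last_error
```

With these defaults, a server that never responds causes three attempts in total before the function gives up.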

Changes to the openstack-neutron component:

  • Before this update, it was not possible to deploy the overcloud in a Distributed Compute Node (DCN) or spine-leaf configuration with stateless IPv6 on the control plane. Deployments in this scenario failed during ironic node server provisioning. With this update, you can now deploy successfully with stateless IPv6 on the control plane. (BZ#1803989)

Changes to the openstack-tripleo-common component:

  • When you update or upgrade python3-tripleoclient, Ansible does not receive the update or upgrade and Ansible or ceph-ansible tasks fail.

    When you update or upgrade, ensure that Ansible also receives the update so that playbook tasks can run successfully. (BZ#1852801)

  • With this update, the Red Hat Ceph Storage dashboard uses Ceph 4.1 and a Grafana container based on ceph4-rhel8. (BZ#1814166)
  • Before this update, during Red Hat Ceph Storage (RHCS) deployment, Red Hat OpenStack Platform (RHOSP) director used the Python uuid1() function to generate the CephClusterFSID that it passed to ceph-ansible. With this update, director uses the Python uuid4() function, which generates UUIDs more randomly. (BZ#1784640)
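The difference between the two functions mentioned in BZ#1784640 can be seen with the Python standard library. This is a small illustration only; the helper name new_fsid is hypothetical and does not reflect how director generates the value internally.

```python
import uuid

# uuid1() derives the UUID from the host's node ID (often the MAC address),
# a clock sequence, and the current timestamp.
time_based = uuid.uuid1()

# uuid4() derives the UUID from random bits only.
random_based = uuid.uuid4()

print(time_based.version)    # version field of a time-based UUID
print(random_based.version)  # version field of a random UUID


def new_fsid():
    """Hypothetical helper: return a random UUID string usable as an FSID."""
    return str(uuid.uuid4())
```

Because uuid1() values generated on the same host share the same node field, they are more predictable than uuid4() values.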

Changes to the openstack-tripleo-heat-templates component:

  • There is an incomplete definition for TLS in the Orchestration service (heat) when you update from 16.0 to 16.1, and the update fails.

    To prevent this failure, you must set the following parameter and value: InternalTLSCAFile: ''. (BZ#1840640)

  • With this enhancement, you can configure Red Hat OpenStack Platform to use an external, pre-existing Ceph RadosGW cluster. You can manage this cluster externally as an object-store for OpenStack guests. (BZ#1440926)
  • With this enhancement, you can use director to deploy the Image Service (glance) with multiple image stores. For example, in a Distributed Compute Node (DCN) or Edge deployment, you can store images at each site. (BZ#1598716)
  • With this enhancement, HTTP traffic that travels from the HAProxy load balancer to Red Hat Ceph Storage RadosGW instances is encrypted. (BZ#1701416)
  • With this update, you can deploy pre-provisioned nodes with TLS-e using the new 'tripleo-ipa' method. (BZ#1740946)
  • Before this update, in deployments with an IPv6 internal API network, the Block Storage Service (cinder) and Compute Service (nova) were configured with a malformed glance-api endpoint URI. As a result, cinder and nova services located in a DCN or Edge deployment could not access the Image Service (glance).

    With this update, the IPv6 addresses in the glance-api endpoint URI are correct and the cinder and nova services at Edge sites can access the Image Service successfully. (BZ#1815928)

  • With this enhancement, FreeIPA has DNS entries for the undercloud and overcloud nodes. DNS PTR records are necessary to generate certain types of certificates, particularly certificates for cinder active/active environments with etcd. You can disable this functionality with the IdMModifyDNS parameter in an environment file. (BZ#1823932)
  • In this release of Red Hat OpenStack Platform, you can no longer customize the Red Hat Ceph Storage cluster admin keyring secret. Instead, the admin keyring secret is generated randomly during initial deployment. (BZ#1832405)
  • Before this update, stale neutron-haproxy-qdhcp-* containers remained after you deleted the related network. With this update, all related containers are cleaned correctly when you delete a network. (BZ#1832720)
  • Before this update, the ExtraConfigPre per_node script was not compatible with Python 3. As a result, the overcloud deployment failed at the step TASK [Run deployment NodeSpecificDeployment] with the message SyntaxError: invalid syntax.

    With this update, the ExtraConfigPre per_node script is compatible with Python 3 and you can provision custom per_node hieradata. (BZ#1832920)

  • With this update, the swift_rsync container runs in unprivileged mode. This makes the swift_rsync container more secure. (BZ#1807841)
  • PowerMax configuration options have changed since Newton. This update includes the latest PowerMax configuration options and supports both iSCSI and FC drivers.

    The CinderPowermaxBackend parameter also supports multiple back ends. CinderPowermaxBackendName supports a list of back ends, and you can use the new CinderPowermaxMultiConfig parameter to specify parameter values for each back end. For example syntax, see environments/cinder-dellemc-powermax-config.yaml. (BZ#1813393)

  • Support for Xtremio Cinder Backend

    The Xtremio cinder back end is updated to support both iSCSI and FC drivers. It is also enhanced to support multiple back ends. (BZ#1852082)

  • Red Hat OpenStack Platform 16.1 includes tripleo-heat-templates support for VXFlexOS Volume Backend. (BZ#1852084)
  • Red Hat OpenStack Platform 16.1 includes support for SC Cinder Backend. The SC Cinder back end now supports both iSCSI and FC drivers, and can also support multiple back ends. You can use the CinderScBackendName parameter to list back ends, and the CinderScMultiConfig parameter to specify parameter values for each back end. For an example configuration file, see environments/cinder-dellemc-sc-config.yaml. (BZ#1852087)
  • PowerMax configuration options have changed since Newton. This update includes the latest PowerMax configuration options and supports both iSCSI and FC drivers.

    The CinderPowermaxBackend parameter also supports multiple back ends. CinderPowermaxBackendName supports a list of back ends, and you can use the new CinderPowermaxMultiConfig parameter to specify parameter values for each back end. For example syntax, see environments/cinder-dellemc-powermax-config.yaml. (BZ#1852088)

Changes to the openstack-tripleo-validations component:

  • Before this update, a change in the data structure format that the ceph osd stat -f json command returns broke the validation that stops the deployment unless a certain percentage of Red Hat Ceph Storage (RHCS) OSDs are running. As a result, the validation stopped the deployment regardless of how many OSDs were running.

    With this update, the new version of openstack-tripleo-validations computes the percentage of running RHCS OSDs correctly and the deployment stops early only if fewer than the required percentage of RHCS OSDs are running. You can use the parameter CephOsdPercentageMin to customize the percentage of RHCS OSDs that must be running. The default value is 66%. Set this parameter to 0 to disable the validation. (BZ#1845079)
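The percentage check described above can be sketched as follows. This is not the openstack-tripleo-validations code. The function names are hypothetical, and the JSON field names (num_osds, num_up_osds, and an optional osdmap wrapper) are assumptions about the output of ceph osd stat -f json that you should verify against your Ceph release. Treating "exactly at the minimum" as passing is also an assumption.

```python
import json


def percent_osds_up(osd_stat_json):
    """Return the percentage of OSDs that are up.

    Assumption: the counters are named num_osds and num_up_osds, and they
    appear either at the top level or nested under an "osdmap" key.
    """
    data = json.loads(osd_stat_json)
    stats = data.get("osdmap", data)
    total = stats["num_osds"]
    if total == 0:
        return 0.0
    return 100.0 * stats["num_up_osds"] / total


def should_stop_deployment(osd_stat_json, ceph_osd_percentage_min=66):
    """Return True when too few OSDs are running; a minimum of 0 disables the check."""
    if ceph_osd_percentage_min == 0:
        return False
    return percent_osds_up(osd_stat_json) < ceph_osd_percentage_min
```

Handling both the flat and the nested shape in one place is what keeps a format change in the command output from silently breaking the comparison.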

Changes to the puppet-tripleo component:

  • Before this update, the etcd service was not configured properly to run in a container. As a result, an error occurred when the service tried to create the TLS certificate. With this update, the etcd service runs in a container and can create the TLS certificate. (BZ#1804079)

Changes to the python-cinderclient component:

  • Before this update, the latest volume attributes were not updated during poll, and the volume data was incorrect on the display screen. With this update, volume attributes update correctly during poll and the correct volume data appears on the display screen. (BZ#1594033)

Changes to the python-tripleoclient component:

  • With this enhancement, you can use the --limit, --skip-tags, and --tags Ansible options in the openstack overcloud deploy command. This is particularly useful when you want to run the deployment on specific nodes, for example, during scale-up operations. (BZ#1767581)
  • With this enhancement, there are new options in the openstack tripleo container image push command that you can use to provide credentials for the source registry. The new options are --source-username and --source-password.

    Before this update, you could not provide credentials when pushing a container image from a source registry that requires authentication. Instead, the only mechanism to push the container was to pull the image manually and push from the local system. (BZ#1811490)

  • With this update, the container_images_file parameter is now a required option in the undercloud.conf file. You must set this parameter before you install the undercloud.

    With the recent move to use registry.redhat.io as the container source, you must authenticate when you fetch containers. For the undercloud, the container_images_file is the recommended option to provide the credentials when you perform the installation. Before this update, if this parameter was not set, the deployment failed with authentication errors when trying to fetch containers. (BZ#1819016)

4.2. RHBA-2020:3542 — Red Hat OpenStack Platform 16.1.1 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2020:3542. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2020:3542.html.

Changes to the openstack-tripleo component:

  • The overcloud deployment steps included an older Ansible syntax that tagged the tripleo-bootstrap and tripleo-ssh-known-hosts roles as common_roles. This older syntax caused Ansible to run tasks tagged with common_roles even when Ansible did not use the common_roles tag. This syntax resulted in errors during the 13 to 16.1 system_upgrade process.

    This update uses a newer syntax to tag the tripleo-bootstrap and tripleo-ssh-known-hosts roles as common_roles. Errors do not appear during the 13 to 16.1 system_upgrade process and you no longer need to include the --playbook upgrade_steps_playbook.yaml option in the system_upgrade process as a workaround. (BZ#1851914)

Changes to the openstack-tripleo-heat-templates component:

  • This update fixes a GRUB parameter naming convention that led to unpredictable behaviors on compute nodes during Leapp upgrades.

    Previously, the presence of the obsolete "TRIPLEO" prefix on GRUB parameters caused problems.

    The file /etc/default/grub has been updated to use a GRUB prefix for the tripleo kernel args parameter so that Leapp can upgrade it correctly. This is done by adding "upgrade_tasks" to the service "OS::TripleO::Services::BootParams", which is a new service added to all roles in the roles_data.yaml file. (BZ#1858673)

  • This update fixes a problem that caused bare metal nodes to become non-responsive during Leapp upgrades.

    Previously, Leapp did not process transient interfaces like SR-IOV virtual functions (VF) during migration. As a result, Leapp did not find the VF interfaces during the upgrade, and nodes entered an unrecoverable state.

    Now the service "OS::TripleO::Services::NeutronSriovAgent" sets the physical function (PF) to remove all VFs, and migrates workloads before the upgrade. After the successful Leapp upgrade, os-net-config runs again with the "--no-activate" flag to re-establish the VFs. (BZ#1866372)

  • This director enhancement automatically installs the Leapp utility on overcloud nodes to prepare for OpenStack upgrades. This enhancement includes two new Heat parameters: LeappRepoInitCommand and LeappInitCommand. In addition, if you have the following repository defaults, you do not need to pass UpgradeLeappCommandOptions values.

    --enablerepo rhel-8-for-x86_64-baseos-eus-rpms --enablerepo rhel-8-for-x86_64-appstream-eus-rpms --enablerepo rhel-8-for-x86_64-highavailability-eus-rpms --enablerepo advanced-virt-for-rhel-8-x86_64-rpms --enablerepo ansible-2.9-for-rhel-8-x86_64-rpms --enablerepo fast-datapath-for-rhel-8-x86_64-rpms

    (BZ#1845726)

  • If you do not set the UpgradeLevelNovaCompute parameter to '', live migrations are not possible when you upgrade from RHOSP 13 to RHOSP 16. (BZ#1849235)
  • This update fixes a bug that prevented the successful deployment of transport layer security (TLS) everywhere with public TLS certificates. (BZ#1852620)
  • Before this update, director did not set the noout flag on Red Hat Ceph Storage OSDs before running a Leapp upgrade. As a result, additional time was required for the OSDs to rebalance after the upgrade.

    With this update, director sets the noout flag before the Leapp upgrade, which accelerates the upgrade process. Director also unsets the noout flag after the Leapp upgrade. (BZ#1853275)

  • Before this update, the Leapp upgrade could fail if you had any NFS shares mounted. Specifically, the nodes that run the Compute Service (nova) or the Image Service (glance) hung if they used an NFS mount.

    With this update, before the Leapp upgrade, director unmounts /var/lib/nova/instances, /var/lib/glance/images, and any Image Service staging area that you define with the GlanceNodeStagingUri parameter. (BZ#1853433)

Changes to the openstack-tripleo-validations component:

  • This update fixes a Red Hat Ceph Storage (RHCS) version compatibility issue that caused failures during upgrades from Red Hat OpenStack Platform 13 to 16.1. Before this fix, validations performed during the upgrade worked with RHCS3 clusters but not RHCS4 clusters. Now the validation works with both RHCS3 and RHCS4 clusters. (BZ#1852868)

Changes to the puppet-tripleo component:

  • Before this update, the Red Hat Ceph Storage dashboard listener was created in the HAProxy configuration, even if the dashboard was disabled. As a result, upgrades of OpenStack with Ceph could fail.

    With this update, the service definition has been updated to distinguish the Ceph MGR service from the dashboard service so that the dashboard service is not configured if it is not enabled and upgrades are successful. (BZ#1850991)

4.3. RHSA-2020:4283 — Red Hat OpenStack Platform 16.1.2 general availability advisory

The bugs contained in this section are addressed by advisory RHSA-2020:4283. Further information about this advisory is available at https://access.redhat.com/errata/RHSA-2020:4283.html.

Bug Fix(es):

  • This update includes the following bug fix patches related to fully qualified domain names (FQDN).

    • Kaminario Fix unique_fqdn_network option

      Previously, the Kaminario driver accepted the unique_fqdn_network configuration option in the specific driver section. When this option was moved, a regression was introduced: the parameter was now only used if it was defined in the shared configuration group.

      This patch fixes the regression and makes it possible to define the option in the shared configuration group as well as the driver specific section.

    • HPE 3PAR Support duplicated FQDN in network

      The 3PAR driver uses the FQDN of the node that is doing the attach as a unique identifier to map the volume.

      Because the FQDN is not always unique, in some environments the same FQDN can be found in different systems. In those cases, if both try to attach volumes, the second system will fail.

      For example, this could happen in a QA environment where VMs share names like controller-0.localdomain and compute-0.localdomain.

      This patch adds the unique_fqdn_network configuration option to the 3PAR driver to prevent failures caused by name duplication between systems. (BZ#1721361)

  • This update makes it possible to run the Brocade FCZM driver in RHOSP 16.

    The Brocade FCZM vendor chose not to update the driver for Python 3, and discontinued support of the driver past the Train release of OpenStack [1]. Red Hat OpenStack Platform (RHOSP) 16 uses Python 3.6.

    The upstream Cinder community assumed the maintenance of the Brocade FCZM driver on a best-effort basis, and the bugs that prevented the Brocade FCZM from running in a Python 3 environment (and hence in RHOSP 16) have been fixed.

    [1] https://docs.broadcom.com/doc/12397527 (BZ#1848420)

  • This update fixes a problem that caused volume attachments to fail on a VxFlexOS cinder backend.

    Previously, attempts to attach a volume on a VxFlexOS cinder backend failed because the cinder driver for the VxFlexOS back end did not include all of the information required to connect to the volume.

    The VxFlexOS cinder driver has been updated to include all the information required in order to connect to a volume. The attachments now work correctly. (BZ#1862213)

  • This enhancement introduces support for the revert-to-snapshot feature with the Block Storage (cinder) RBD driver. (BZ#1702234)
  • Red Hat OpenStack Platform 16.1 includes the following PowerMax Driver updates:

    Feature updates:

    • PowerMax Driver - Unisphere storage group/array tagging support
    • PowerMax Driver - Short host name and port group name override
    • PowerMax Driver - SRDF Enhancement
    • PowerMax Driver - Support of Multiple Replication

    Bug fixes:

    • PowerMax Driver - Debug Metadata Fix
    • PowerMax Driver - Volume group delete failure
    • PowerMax Driver - Setting minimum Unisphere version to 9.1.0.5
    • PowerMax Driver - Unmanage Snapshot Delete Fix
    • PowerMax Driver - RDF clean snapvx target fix
    • PowerMax Driver - Get Manageable Volumes Fix
    • PowerMax Driver - Print extend volume info
    • PowerMax Driver - Legacy volume not found
    • PowerMax Driver - Safeguarding retype to some in-use replicated modes
    • PowerMax Driver - Replication array serial check
    • PowerMax Driver - Support of Multiple Replication
    • PowerMax Driver - Update single underscores
    • PowerMax Driver - SRDF Replication Fixes
    • PowerMax Driver - Replication Metadata Fix
    • PowerMax Driver - Limit replication devices
    • PowerMax Driver - Allowing for default volume type in group
    • PowerMax Driver - Version comparison correction
    • PowerMax Driver - Detach RepConfig logging & Retype rename remote fix
    • PowerMax Driver - Manage volume emulation check
    • PowerMax Driver - Deletion of group with volumes
    • PowerMax Driver - PowerMax Pools Fix
    • PowerMax Driver - RDF status validation
    • PowerMax Driver - Concurrent live migrations failure
    • PowerMax Driver - Live migrate remove rep vol from sg
    • PowerMax Driver - U4P failover lock not released on exception
    • PowerMax Driver - Compression Change Bug Fix (BZ#1808583)
  • Before this update, the Block Storage service (cinder) assigned the default volume type in a volume create request, ignoring alternative methods of specifying the volume type.

    With this update, the Block Storage service performs as expected:

    • If you specify a source_volid in the request, the volume type that the Block Storage service sets is the volume type of the source volume.
    • If you specify a snapshot_id in the request, the volume type is inferred from the volume type of the snapshot.
    • If you specify an imageRef in the request, and the image has a cinder_img_volume_type image property, the volume type is inferred from the value of the image property.

    Otherwise, the Block Storage service sets the volume type to the default volume type that you configure. If you do not configure a default volume type, the Block Storage service uses the system default volume type, DEFAULT.

    When you specify a volume type explicitly in the volume create request, the Block Storage service uses the type that you specify. (BZ#1826741)

  • Before this update, when you created a volume from a snapshot, the operation could fail because the Block Storage service (cinder) would try to assign the default volume type to the new volume instead of inferring the correct volume type from the snapshot. With this update, you no longer have to specify the volume type when you create a volume from a snapshot. (BZ#1843789)
  • This enhancement adds a new driver for the Dell EMC PowerStore to support Block Storage service back end servers. The new driver supports the FC and iSCSI protocols, and includes these features:

    • Volume create and delete
    • Volume attach and detach
    • Snapshot create and delete
    • Create volume from snapshot
    • Get statistics on volumes
    • Copy images to volumes
    • Copy volumes to images
    • Clone volumes
    • Extend volumes
    • Revert volumes to snapshots (BZ#1862541)
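The volume type selection rules described for BZ#1826741 in this advisory can be sketched as follows. This is an illustration of the precedence only, not the Block Storage service code: the function name, the lookup dictionaries, and the order in which source_volid, snapshot_id, and imageRef are checked relative to each other are assumptions.

```python
SYSTEM_DEFAULT_TYPE = "DEFAULT"


def resolve_volume_type(request, volume_types, snapshot_types,
                        image_properties, configured_default=None):
    """Return the volume type for a volume create request (hypothetical helper).

    volume_types and snapshot_types map IDs to volume type names;
    image_properties maps image IDs to their property dictionaries.
    """
    # An explicitly requested type always wins.
    if request.get("volume_type"):
        return request["volume_type"]
    # Otherwise inherit the type from the source volume or snapshot.
    if request.get("source_volid"):
        return volume_types[request["source_volid"]]
    if request.get("snapshot_id"):
        return snapshot_types[request["snapshot_id"]]
    # Otherwise use the image property, if the image defines one.
    if request.get("imageRef"):
        props = image_properties.get(request["imageRef"], {})
        if props.get("cinder_img_volume_type"):
            return props["cinder_img_volume_type"]
    # Fall back to the configured default, then to the system default.
    return configured_default or SYSTEM_DEFAULT_TYPE
```

For example, a request that contains only a snapshot_id resolves to the volume type of that snapshot, which is the case that BZ#1843789 also addresses.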

4.4. RHEA-2020:4284 — Red Hat OpenStack Platform 16.1.2 general availability advisory

The bugs contained in this section are addressed by advisory RHEA-2020:4284. Further information about this advisory is available at https://access.redhat.com/errata/RHEA-2020:4284.html.

Changes to the openstack-nova component:

  • This bug fix enables you to boot an instance from an encrypted volume when that volume was created from an image that in turn was created by uploading an encrypted volume to the Image Service as an image. (BZ#1879190)

Changes to the openstack-octavia component:

  • The keepalived instance in the Red Hat OpenStack Platform Load-balancing service (octavia) instance (amphora) can abnormally terminate and interrupt UDP traffic. The cause of this issue is that the timeout value for the UDP health monitor is too small.

    Workaround: specify a new timeout value that is greater than two seconds:

    $ openstack loadbalancer healthmonitor set --timeout 3 <health_monitor_id>

    For more information, search for "loadbalancer healthmonitor" in the Command Line Interface Reference. (BZ#1837316)

Changes to the openstack-tripleo-heat-templates component:

  • A known issue causes the migration of Ceph OSDs from Filestore to Bluestore to fail. In use cases where the osd_objectstore parameter was not set explicitly when you deployed OSP13 with RHCS3, the migration exits without converting any OSDs and falsely reports that the OSDs are already using Bluestore. For more information about the known issue, see https://bugzilla.redhat.com/show_bug.cgi?id=1875777

    As a workaround, perform the following steps:

    1. Include the following content in an environment file:

      parameter_defaults:
        CephAnsibleExtraConfig:
          osd_objectstore: filestore

    2. Perform a stack update with the overcloud deploy --stack-only command, and include the new or existing environment file that contains the osd_objectstore parameter. In the following example, this environment file is <osd_objectstore_environment_file>. Also include any other environment files that you included during the converge step of the upgrade:

      $ openstack overcloud deploy --stack-only \
        -e <osd_objectstore_environment_file> \
        -e <converge_step_environment_files>

    3. Proceed with the FileStore to BlueStore migration by using the existing documentation. See https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/framework_for_upgrades_13_to_16.1/OSD-migration-from-filestore-to-bluestore

    Result: The Filestore to Bluestore playbook triggers the conversion process, and removes and re-creates the OSDs successfully. (BZ#1733577)

  • Inadequate timeout values can cause an overcloud deployment to fail after four hours. To prevent these timeout failures, set the following undercloud and overcloud timeout parameters:

    • Undercloud timeouts (seconds):

      Example

      parameter_defaults:
        TokenExpiration: 86400
        ZaqarWsTimeout: 86400

    • Overcloud deploy timeouts (minutes):

      Example

      $ openstack overcloud deploy --timeout 1440

    The timeouts are now set. (BZ#1792500)

  • Currently, you cannot scale down or delete compute nodes if Red Hat OpenStack Platform is deployed with TLS-e using tripleo-ipa. This is because the cleanup role, traditionally delegated to the undercloud as localhost, is now being invoked from the mistral container.

    For more information, see https://access.redhat.com/solutions/5336241 (BZ#1866562)

  • This update fixes a bug that prevented the distributed compute nodes (DCN) compute service from accessing the glance service.

    Previously, distributed compute nodes were configured with a glance endpoint URI that specified an IP address, even when deployed with internal transport layer security (TLS). Because TLS requires the endpoint URI to specify a fully qualified domain name (FQDN), the compute service could not access the glance service.

    Now, when deployed with internal TLS, DCN services are configured with a glance endpoint URI that specifies an FQDN, and the DCN compute service can access the glance service. (BZ#1873329)

  • This update introduces support for TLS everywhere on distributed compute nodes (DCN) with tripleo-ipa. (BZ#1874847)
  • This update introduces support for Neutron routed provider networks with RHOSP distributed compute nodes (DCN). (BZ#1874863)
  • This update adds support for encrypted volumes and images on distributed compute nodes (DCN).

    DCN nodes can now access the Key Manager service (barbican) running in the central control plane.

    Note

    This feature adds a new Key Manager client service to all DCN roles. To implement the feature, regenerate the roles.yaml file used for the DCN site’s deployment.

    For example:

    $ openstack overcloud roles generate DistributedComputeHCI DistributedComputeHCIScaleOut -o ~/dcn0/roles_data.yaml

    Use the appropriate path to the roles data file. (BZ#1852851)

  • Before this update, to successfully run a Leapp upgrade during the fast forward upgrade (FFU) from RHOSP 13 to RHOSP 16.1, the node where the Red Hat Enterprise Linux upgrade was occurring had to have the PermitRootLogin field defined in the ssh config file (/etc/ssh/sshd_config).

    With this update, the Orchestration service (heat) no longer requires you to modify /etc/ssh/sshd_config with the PermitRootLogin field. (BZ#1855751)

  • This enhancement adds a new driver for the Dell EMC PowerStore to support Block Storage service back end servers. (BZ#1862547)

Changes to the openstack-tripleo-validations component:

  • This update safeguards against potential package content conflict after content was moved from openstack-tripleo-validations to another package. (BZ#1877688)

Changes to the puppet-cinder component:

  • This release adds support for the Dell EMC PowerStore Cinder Backend Driver. (BZ#1862545)

Changes to the puppet-tripleo component:

  • This enhancement adds a new driver for the Dell EMC PowerStore to support Block Storage service back end servers. (BZ#1862546)
  • This update fixes incorrect parameter names in Dell EMC Storage Templates. (BZ#1868620)

Changes to the python-networking-ovn component:

  • Transmission of jumbo UDP frames on ML2/OVN routers depends on a kernel release that is not yet available.

    After receiving a jumbo UDP frame that exceeds the maximum transmission unit of the external network, ML2/OVN routers can return ICMP "fragmentation needed" packets back to the sending VM, where the sending application can break the payload into smaller packets. To determine the packet size, this feature depends on discovery of MTU limits along the south-to-north path.

    South-to-north path MTU discovery requires kernel-4.18.0-193.20.1.el8_2, which is scheduled for availability in a future release. To track availability of the kernel version, see https://bugzilla.redhat.com/show_bug.cgi?id=1860169. (BZ#1547074)

Changes to the python-os-brick component:

  • This update modifies get_device_info to use lsscsi to get [H:C:T:L] values, making it possible to support more than 255 logical unit numbers (LUNs) and host logical unit (HLU) ID values.

    Previously, get_device_info used sg_scan to get these values, with a limit of 255.

    You can get two device types with get_device_info:

    • /dev/disk/by-path/xxx, which is a symlink to /dev/sdX
    • /dev/sdX

    sg_scan can process any device name, but lsscsi only shows /dev/sdX names.

    If the device is a symlink, get_device_info uses the device name that the device links to. Otherwise get_device_info uses the device name directly.

    Then get_device_info gets the device info '[H:C:T:L]' by comparing the device name with the last column of lsscsi output. (BZ#1872211)

  • This update fixes an incompatibility that caused VxFlex volume detachment attempts to fail.

    A recent change in VxFlex cinder volume credentialing methods was not backward compatible with pre-existing volume attachments. If a VxFlex volume attachment was made before the credentialing method change, attempts to detach the volume failed.

    Now the detachments do not fail. (BZ#1869346)
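The get_device_info lookup described for BZ#1872211 above can be sketched as follows. This is not the os-brick implementation: the function name get_hctl, the injectable resolve argument, and the sample lsscsi lines used to exercise it are hypothetical. The sketch assumes that each lsscsi output line starts with the bracketed [H:C:T:L] field and ends with the device node.

```python
import os


def get_hctl(device, lsscsi_output, resolve=os.path.realpath):
    """Return (host, channel, target, lun) for a device, or None if not listed.

    `device` can be /dev/sdX or a symlink such as /dev/disk/by-path/xxx.
    `lsscsi_output` is the text printed by the lsscsi command.
    """
    # Follow a symlink to the /dev/sdX name, because lsscsi only shows those.
    real_name = resolve(device)
    for line in lsscsi_output.splitlines():
        columns = line.split()
        # Compare the device name with the last column of the lsscsi output.
        if columns and columns[-1] == real_name:
            hctl = columns[0].strip("[]")
            return tuple(int(part) for part in hctl.split(":"))
    return None
```

Because the LUN is parsed from text rather than read from sg_scan, values above 255 are returned unchanged.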

Changes to the python-tripleoclient component:

  • The entry in /etc/hosts for the undercloud is duplicated any time the Compute stack is updated on the undercloud and overcloud nodes. This occurs for split-stack deployments where the Controllers and Compute nodes are divided into multiple stacks.

    Other indications of this problem are the following:

    • mysql reporting errors about packets exceeding their maximum size.
    • The Orchestration service (heat) warning that templates are exceeding their maximum size.
    • The Workflow service (mistral) warning that fields are exceeding their maximum size.

    As a workaround, in the file generated by running the openstack overcloud export command that is included in the Compute stack, under ExtraHostFileEntries, remove the erroneous entry for the undercloud. (BZ#1876153)

Changes to the tripleo-ansible component:

  • This update increases the speed of stack updates in certain cases.

    Previously, stack update performance was degraded when the Ansible --limit option was not passed to ceph-ansible. During a stack update, ceph-ansible sometimes made idempotent updates on nodes even if the --limit argument was used.

    Now director intercepts the Ansible --limit option and passes it to the ceph-ansible execution. The --limit option passed to commands starting with 'openstack overcloud deploy' is passed to the ceph-ansible execution to reduce the time required for stack updates.

    Important

    Always include the undercloud in the limit list when using this feature with ceph-ansible. (BZ#1855112)
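
As a small illustration of the Important note for BZ#1855112, the following Python sketch builds a --limit value that always contains the undercloud. The helper name build_limit and the host names are hypothetical, and the sketch assumes that the undercloud appears in the inventory under the name undercloud.

```python
def build_limit(nodes, undercloud="undercloud"):
    """Return a comma-separated --limit value that always includes the undercloud."""
    ordered = []
    for node in [undercloud, *nodes]:
        # Keep the first occurrence of each name so nothing is listed twice.
        if node not in ordered:
            ordered.append(node)
    return ",".join(ordered)
```

The resulting string could then be supplied as the value of the --limit option.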

4.5. RHBA-2021:0817 — Red Hat OpenStack Platform 16.1.4 director bug fix advisory

The bugs contained in this section are addressed by advisory RHBA-2021:0817. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2021:0817.html.

Changes to the openstack-cinder component:

  • Before this update, cloned encrypted volumes were inaccessible when using the Block Storage (cinder) service with the Key Manager (barbican) service. With this update, cloned encrypted volumes are now accessible when using the Block Storage service with the Key Manager service. (BZ#1889228)
  • The 'all_tenants' key passed with a volume transfer request is removed because the database is unable to parse it. Removing this key allows the user to show the detail of a specific volume transfer by using the transfer name. Before this update, the 'all_tenants' key was removed only for admin users, which meant that non-admin users were unable to show volume transfers by using the transfer name. With this update, the 'all_tenants' key is now also removed for non-admins, allowing non-admins to show volume transfers by using the transfer name. (BZ#1847907)
  • Before this update, the Block Storage (cinder) NEC back end driver occasionally returned invalid data when initializing a volume connection, which could cause live migration to fail. With this update, the NEC driver has been fixed to reliably return valid connection data. Live migration no longer fails due to invalid volume connection data. (BZ#1910854)
  • Before this update, the Block Storage (cinder) service would always assign newly created volumes with the default volume type, even when the volume was created from another source, such as an image, snapshot or another volume. This resulted in volumes created from another source having a different volume type from the volume type of the source.

    With this update, the default volume type is assigned only after determining whether it should be assigned based on the volume type of the source. The volume type of a volume created from another source now matches the volume type of the source. (BZ#1921735)

  • Before this update, the --server option was being ignored when passed with the cinder service-get-log command, which resulted in the logs for all hosts being returned instead of just the logs for a specific host. With this update, using the --server option correctly filters the logs for the specified host. (BZ#1728142)

Changes to the openstack-tripleo-common component:

  • The virt-admin tool is now available for you to use to capture logs for reporting RHOSP bugs. This tool is useful for troubleshooting all libvirt and QEMU problems, as the logs provide the communications between libvirt and QEMU on the Compute nodes. You can use virt-admin to set the libvirt and QEMU debug log filters dynamically, without having to restart the nova_libvirt container.

    Perform the following steps to enable libvirt and QEMU log filters on a Compute node:

    1. Log in to the nova_libvirt container on the Compute node:

      $ sudo podman exec -it nova_libvirt /bin/bash
    2. Specify the name and location of the log file to send virt-admin output to:

      $ virt-admin daemon-log-outputs "1:file:/var/log/libvirt/libvirtd.log"
    3. Configure the filters you want to collect logs for:

      $ virt-admin daemon-log-filters \
       "1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
      Note

      When debugging issues with live migration, you must configure these filters on all source and destination Compute nodes.

    4. Repeat your test. After debugging is complete, upload the libvirtd.log to a bug.
    5. Disable the libvirt and QEMU log filters on the Compute nodes:

      $ virt-admin daemon-log-filters ""
    6. To confirm that the filters are removed, enter the following command:

      $ virt-admin daemon-log-filters

      This command returns an empty list when you have successfully removed the filters.

(BZ#1870199)
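
The filter string passed to virt-admin daemon-log-filters in the procedure for BZ#1870199 is a space-separated list of level:match pairs. The following Python sketch assembles such a string from a mapping. The helper name build_log_filters is hypothetical, and the sketch only builds the string; it does not call virt-admin.

```python
def build_log_filters(levels):
    """Build a daemon-log-filters string such as "1:libvirt 3:event".

    `levels` maps a filter match string to a libvirt log level
    (1=debug, 2=info, 3=warning, 4=error).
    """
    for name, level in levels.items():
        if level not in (1, 2, 3, 4):
            raise ValueError(f"invalid log level {level!r} for {name!r}")
    # Dictionaries keep insertion order, so the output order is predictable.
    return " ".join(f"{level}:{name}" for name, level in levels.items())
```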

Changes to the openstack-tripleo-heat-templates component:

  • Before this update, in-place upgrades from Red Hat OpenStack Platform 13 to 16.1 in a TLS everywhere environment used an incorrect rabbitmq password for the novajoin container. This caused the novajoin container on the undercloud to function incorrectly, which caused any overcloud node that ran an upgrade to fail with the following error:

    2020-11-24 20:01:31.569 7 ERROR join   File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 639, in _on_close
    2020-11-24 20:01:31.569 7 ERROR join     (class_id, method_id), ConnectionError)
    2020-11-24 20:01:31.569 7 ERROR join amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For detail see the broker logfile.

    With this update, the upgrade from RHOSP 13 to 16.1 uses the correct rabbitmq password in a TLS everywhere environment so that the framework for upgrades can complete successfully. (BZ#1901157)

  • With this enhancement, you can deploy the Red Hat Ceph Storage (RHCS) Dashboard on edge sites in a distributed compute node (DCN) architecture. (BZ#1793595)
  • With this enhancement, you can manage vPMEM with two new parameters NovaPMEMMappings and NovaPMEMNamespaces.

    • Use NovaPMEMMappings to set the nova configuration option pmem_namespaces that reflects mappings between vPMEM and physical PMEM namespaces.
    • Use NovaPMEMNamespaces to create and manage physical PMEM namespaces that you use as a back end for vPMEM. (BZ#1834185)
  • There is currently a known issue with the mechanism that ensures the subscribed environments have the right DNF module stream set. The Advanced Virtualization repository is not always available in the subscription that the Ceph nodes use, which causes the upgrade or update of a Ceph node to fail when you try to enable virt:8.2.

    Workaround:

    Override the DnfStreams parameter in the upgrade or update environment file to prevent the Ceph upgrade from failing:

    parameter_defaults:
      ...
      DnfStreams: [{'module':'container-tools', 'stream':'2.0'}]
    Note

    The Advanced Virtualization DNF stream is not enforced when you use this workaround.

    For more information, see https://bugzilla.redhat.com/show_bug.cgi?id=1923887. (BZ#1866479)

  • This enhancement adds support for heterogeneous storage configurations at the edge. Operators can now deploy edge sites with storage and sites without storage within the same DCN deployment. (BZ#1882058)
  • The Block Storage backup service sometimes needs access to files on the host that would otherwise not be available in the container running the service. This enhancement adds the CinderBackupOptVolumes parameter, which you can use to specify additional container volume mounts for the Block Storage backup service. (BZ#1891828)
  • Before this update, TLS-E on pre-provisioned nodes failed with the message: "--server cannot be used without providing --domain". With this update, the IDM domain name is detected by first resolving "ipa-ca" through DNS, then doing a reverse DNS lookup on the resulting IP address. It might be necessary to add the PTR record, which is required for the reverse lookup, manually. (BZ#1874936)
  • Before this update, you were required to use the openstack overcloud external-upgrade run --tags online_upgrade command to perform online database updates when upgrading from RHOSP 15 to RHOSP 16.1. With this update, you can now use the openstack overcloud external-update run --tags online_upgrade command. (BZ#1884556)
  • Before this update, if you had NovaComputeEnableKsm enabled and you were using Red Hat Subscription Management to register the overcloud Compute nodes, the qemu-kvm-common package failed to install. This was because the configuration was sometimes applied before the Compute nodes were registered to the required repositories.

    With this update, NovaComputeEnableKsm is enabled only after the Compute nodes are registered to the required repositories by using Red Hat Subscription Management, which ensures that the qemu-kvm-common package is successfully installed. (BZ#1895894)

  • Before this update, the connection data created by an iSCSI/LVM Block Storage back end was not stored persistently, which resulted in volumes not being accessible after a reboot. With this update, the connection data is stored persistently, and the volumes are accessible after a system reboot. (BZ#1898484)
  • Before this update, when deployed at an edge site the Image (glance) service was not configured to access the Key Manager (barbican) service running on the central site’s control plane. This resulted in the Image services running on edge sites being unable to access encryption keys stored in the Key Manager service.

    With this update, Image services running on edge sites are now configured to access the encryption keys stored in the Key Manager service. (BZ#1899761)
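
The IdM domain detection described for BZ#1874936 (resolve "ipa-ca", then perform a reverse lookup on the resulting address) can be sketched in Python as follows. This is an illustration under stated assumptions and not the director implementation: the function name detect_idm_domain is hypothetical, and deriving the domain by dropping the first label of the returned host name is an assumption made for the example.

```python
import socket


def detect_idm_domain(resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Derive the IdM domain from the ipa-ca DNS record.

    Raises LookupError when the PTR record needed for the reverse lookup is missing.
    """
    address = resolve("ipa-ca")
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        fqdn = reverse(address)[0]
    except OSError as exc:  # socket.herror and socket.gaierror are OSError subclasses
        raise LookupError(f"no PTR record for {address}") from exc
    # Assumption: the domain is everything after the first label.
    _host, _, domain = fqdn.partition(".")
    if not domain:
        raise LookupError(f"{fqdn} is not fully qualified")
    return domain
```

The LookupError branch corresponds to the case where the PTR record has to be added manually.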

Changes to the puppet-collectd component:

  • With this enhancement, you can configure the format of the plugin instance for the collectd virt plugin by using the ExtraConfig parameter collectd::plugin::virt::plugin_instance_format. This allows more granular metadata to be exposed in the metrics label for virtual machine instances, such as on which host the instance is running. (BZ#1878191)
  • Before this update, when you configured the collectd::plugin::virt::hostname_format parameter with multiple values, director wrapped the values in double quotes. This caused the virt plugin to fail to load. With this update, when configuring collectd::plugin::virt::hostname_format, director no longer wraps multiple values in double quotes. (BZ#1902142)

Changes to the python-network-runner component:

Changes to the python-networking-ovn component:

  • With this enhancement, you can control multicast over the external networks and avoid cluster autoforming over external networks instead of only the internal networks. (BZ#1575512)
  • Before this update, the OVN mechanism driver did not correctly merge its agent list with those stored in the Networking (neutron) service database. With this update, the results from the OVN and Networking service database are merged before the API returns the result. (BZ#1828889)
  • This enhancement adds support for vlan transparency in the ML2/OVN mechanism driver with vlan and geneve network type drivers.

    With vlan transparency, you can manage vlan tags by using instances on Networking (neutron) service networks. You can create vlan interfaces on an instance and use any vlan tag without affecting other networks. The Networking service is not aware of these vlan tags.

    Note

    • When using vlan transparency on a vlan type network, the inner and outer ethertype of the packets is 802.1Q (0x8100).
    • The ML2/OVN mechanism driver does not support vlan transparency on flat provider networks.

(BZ#1846019)

Changes to the python-os-brick component:

  • Before this update, instances that were created on a RHOSP 13 environment with PowerFlex, VxFlex and ScaleIO volume attachments failed to restart after an upgrade to RHOSP 16.x. This was because the RHOSP 16.x Compute service uses a new PowerFlex driver connection property to access volume attachments, which is not present in the connection properties of volumes attached to instances running on a RHOSP 13 environment. With this update, the error is no longer thrown if this connection property is missing, and instances with PowerFlex volume attachments created on a RHOSP 13 environment continue to function correctly after upgrading to RHOSP 16.x.

Changes to the python-paunch component:

  • Before this update, if a user configured the ContainerImagePrepare parameter to use a custom tag, such as 'tag: "latest"' or 'tag: "16.1"', instead of the standard 'tag_from_label: "{version}-{release}"', the containers did not update to the latest container images.

    With this update, the container images are always fetched anytime a user runs a deployment action, including updates, and the image ID is checked against the running container to see if it needs to be rebuilt to consume the latest image. Containers are now always refreshed during deployment actions and restarted if they are updated.

    Note

    This is a change from previous versions where the deployment checked only that the image existed rather than always fetching the image. If a user is reusing tags, for example, "latest", the containers might be updated on nodes if you perform actions such as scaling out. It is not recommended to use "latest" unless you are controlling container tags by using a Satellite server deployment.

    (BZ#1881476)

Changes to the python-tripleoclient component:

  • Before this update, live migration failed when upgrading a TLS everywhere environment with local ephemeral storage and UseTLSTransportForNbd set to "False". This occurred because the default value of the UseTLSTransportForNbd configuration had changed from "False" in RHOSP 13 to "True" in RHOSP 16.x, which resulted in the correct certificates not being included in the QEMU process containers.

    With this update, director checks the configuration of the previously deployed environment for global_config_settings and uses it to ensure that the UseTLSTransportForNbd state stays the same in the upgrade as in the previous deployment. If global_config_settings exists in the configuration file, then director checks the configuration of the use_tls_for_nbd key. If global_config_settings does not exist, director evaluates the hieradata key nova::compute::libvirt::qemu::nbd_tls. Keeping the UseTLSTransportForNbd state the same in the upgraded deployment as in the previous deployment ensures that live migration works. (BZ#1906698)
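
The lookup order described for BZ#1906698 can be sketched in Python as follows. This is not the director code: the function name previous_nbd_tls_state is hypothetical, both inputs are modeled as plain dictionaries, and treating a missing key as "False" is an assumption made for the example.

```python
def previous_nbd_tls_state(global_config_settings, hieradata):
    """Return the UseTLSTransportForNbd state of the previous deployment."""
    if global_config_settings is not None:
        # global_config_settings exists: use its use_tls_for_nbd key.
        return bool(global_config_settings.get("use_tls_for_nbd", False))
    # Otherwise fall back to the hieradata key.
    return bool(hieradata.get("nova::compute::libvirt::qemu::nbd_tls", False))
```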

4.6. RHBA-2021:2097 — Red Hat OpenStack Platform 16.1.6 director bug fix advisory

Changes to the openstack-cinder component:

  • In prior releases, the SolidFire driver created a duplicate volume whenever it retried an API request. This led to unexpected behavior due to the accumulation of unused volumes.

    With this update, the Block Storage service (cinder) checks for existing volume names before it creates a volume. When Block Storage service detects a read timeout, it immediately checks for volume creation to prevent invalid API calls. This update also adds the sf_volume_create_timeout option for the SolidFire driver so that you can set an appropriate timeout value for your environment. (BZ#1939398)

  • This update fixes a bug that prevented cinder list from listing volumes when multiple filters were passed. (BZ#1843788)
  • This update adds CHAP support to the Dell EMC PowerStore driver. (BZ#1905231)
  • In prior releases, cinder NEC driver backups failed when the object was a snapshot. This occurred because the snapshot argument does not have the volume_attachment attribute. With this update, backups no longer refer to the volume_attachment attribute when the argument is snapshot. (BZ#1910855)
  • This update fixes an issue that caused some API calls, such as create snapshot, to fail with an xNotPrimary error during workload re-balancing operations.

    When SolidFire is under heavy load or being upgraded, the SolidFire cluster might re-balance cluster workload by automatically moving connections from primary to secondary nodes. Previously, some API calls failed with an xNotPrimary error during these workload balance operations and were not retried.

    This update fixes the issue by adding the xNotPrimary exception to the SolidFire driver list of retryable exceptions. (BZ#1947474)
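
The retry behavior described for BZ#1947474 can be illustrated with a generic retry loop. The following Python sketch is not the SolidFire driver code: the SolidFireAPIError class, the RETRYABLE set, and the call_with_retry helper are hypothetical names, and the real driver's list of retryable exceptions may contain other entries.

```python
import time


class SolidFireAPIError(Exception):
    """Hypothetical error type that carries the error name returned by the cluster."""

    def __init__(self, name):
        super().__init__(name)
        self.name = name


# Assumption: only xNotPrimary is listed here for the purpose of the example.
RETRYABLE = {"xNotPrimary"}


def call_with_retry(request, attempts=3, delay=0.0):
    """Call request() and retry while it fails with a retryable error name."""
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except SolidFireAPIError as exc:
            # Give up on non-retryable errors and on the final attempt.
            if exc.name not in RETRYABLE or attempt == attempts:
                raise
            time.sleep(delay)
```

With xNotPrimary in the retryable set, a call that fails while connections move between nodes is attempted again instead of failing immediately.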

Changes to the openstack-heat component:

  • This update makes it possible to use OS::Heat::Delay resources in heat templates. Previously, a variable naming conflict caused an assertion error during attempted completion of an OS::Heat::Delay resource. A variable was renamed to eliminate the conflict. (BZ#1868543)

Changes to the openstack-nova component:

  • When an instance is created, the Compute (nova) service sanitizes the instance display name to generate a valid host name when DNS integration is enabled in the Networking (neutron) service.

    Before this update, the sanitization did not replace periods ('.') in instance names, for example, 'rhel-8.4'. This could result in display names being recognized as Fully Qualified Domain Names (FQDNs), which produced invalid host names. When instance names contained periods and DNS integration was enabled in the Networking service, the Networking service rejected the invalid host name, which resulted in a failure to create the instance and an HTTP 500 server error from the Compute service.

    With this update, periods are now replaced by hyphens in instance names to prevent host names being parsed as FQDNs. You can continue to use free-form strings for instance display names. (BZ#1872314)
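
A simplified Python sketch of this kind of sanitization is shown below. It is not the Compute service implementation: the only behavior taken from the note above is that periods become hyphens, and the handling of spaces, underscores, other characters, and letter case is an assumption made for the example.

```python
import re


def sanitize_hostname(display_name):
    """Turn a free-form display name into a host name without periods."""
    # Replace spaces, underscores, and periods with hyphens.
    hostname = re.sub(r"[ _.]", "-", display_name)
    # Drop characters that are not ASCII letters, digits, or hyphens.
    hostname = re.sub(r"[^A-Za-z0-9-]", "", hostname)
    return hostname.lower().strip("-")
```

For example, the display name 'rhel-8.4' becomes 'rhel-8-4', which can no longer be parsed as an FQDN.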

Changes to the openstack-tripleo-common component:

  • This update modifies the registry metadata creator to handle containers with and without namespaces in their URI. On the undercloud you can now manage containers that comply with the following formats:

    undercloud_host:port/namespace/container:tag
    undercloud_host:port/container:tag

    Red Hat does not support more complex namespaces, such as undercloud_host:port/name/space/container:tag, when pushing to the undercloud. (BZ#1919445)
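
The two supported formats can be told apart from the unsupported multi-level namespace with a single regular expression. The following Python sketch is an illustration only and is not the registry metadata creator: the IMAGE_RE pattern, the parse_image helper, and the example host name are hypothetical.

```python
import re

# host:port, an optional single namespace segment, then container:tag
IMAGE_RE = re.compile(
    r"^(?P<host>[^/:]+):(?P<port>\d+)/"
    r"(?:(?P<namespace>[^/:]+)/)?"
    r"(?P<container>[^/:]+):(?P<tag>[^/:]+)$"
)


def parse_image(uri):
    """Return the URI parts as a dict, or None for unsupported formats."""
    match = IMAGE_RE.match(uri)
    return match.groupdict() if match else None
```

A URI without a namespace yields a namespace value of None, and a URI with more than one namespace segment does not match at all.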

Changes to the openstack-tripleo-heat-templates component:

  • After upgrading with the Leapp utility, Compute with OVS-DPDK workload does not function properly. Choose one of the following workaround options:

    • Remove /etc/modules-load.d/vfio-pci.conf before the Compute upgrade.
    • Restart Compute OVS after the Compute upgrade. (BZ#1895887)
  • This update fixes a configuration problem that caused Leapp upgrades to stop and fail while executing on a CephStorage node.

    Previously, CephStorage nodes were incorrectly configured to consume OpenStack highavailability, advanced-virt, and fast-datapath repos during Leapp upgrades.

    Now UpgradeLeappCommand options are configurable on a per-node basis and use the correct default for CephStorage nodes, so Leapp upgrades succeed for CephStorage nodes. (BZ#1936419)

Changes to the validations-common component:

  • This update fixes a bug that sometimes caused validations to fail before openstack undercloud upgrade. Before this update, a lack of the permissions needed to access the requested logging directory sometimes resulted in the following failures:

    • Failure to log validation results
    • Failure of the validation run
    • Failure of artifacts collection from validation.

      This update adds a fallback logging directory. Validation results are logged and artifacts collected. (BZ#1895045)
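
The fallback idea can be sketched in Python as follows. This is not the validations-common code: the function name pick_log_dir is hypothetical, and the sketch assumes that a directory is usable when it can be created and is writable by the current user.

```python
import os


def pick_log_dir(requested, fallback):
    """Return the first directory that can be created and written to."""
    for candidate in (requested, fallback):
        try:
            os.makedirs(candidate, exist_ok=True)
        except OSError:
            continue  # cannot create this directory, try the next one
        if os.access(candidate, os.W_OK | os.X_OK):
            return candidate
    raise PermissionError("no writable logging directory available")
```

Results are then written to whichever directory the function returns, so a permission problem with the requested directory no longer stops the run.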

4.7. RHBA-2021:3762 — Red Hat OpenStack Platform 16.1.7 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2021:3762. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2021:3762.html.

Changes to the diskimage-builder component:

  • Before this update, the appstream and baseos repositories were always added to the repositories enabled by Red Hat Subscription Manager, with no way to override them. With this update, when you define the $REG_REPOS variable, no base repositories are added. You can control which repositories are added, but you must now include all repositories, including the equivalent repository for baseos, and appstream when required. (BZ#1906162)

Changes to the openstack-cinder component:

  • Before this update, creating a volume from a snapshot of an encrypted volume could result in an unusable volume. When the destination volume is the same size as the source volume, creating an encrypted volume from a snapshot of an encrypted volume truncated the data in the new volume, which caused a size discrepancy.

    With this update, the RBD back end accounts for the encryption header and does not truncate the data so that creating a volume from a snapshot of an encrypted volume does not cause the error. (BZ#1987104)

  • In previous releases, in Red Hat OpenStack Platform (RHOSP) deployments that use the Dell EMC XtremIO driver, attach volume operations waited for a timeout if iSCSI or FC targets were not connected to a RHOSP host. This caused attach volume operations to fail.

    This release adds port filtering support for the Dell EMC XtremIO driver to allow iSCSI or FC ports that are not in use to be ignored. (BZ#1930255)

  • In previous releases, if Dell EMC PowerStore ports were configured for multiple purposes, such as iSCSI and replication, incorrect REST filtering caused the cinder driver to report that no accessible iSCSI targets were found.

    This release fixes the Dell EMC PowerStore REST filter functionality. (BZ#1945306)

  • Before this update, a failure occurred when users wanted to delete the DEFAULT volume type.

    With this update, you can delete the DEFAULT volume type when it is not set as the value of the default_volume_type parameter in the cinder.conf file. The default value of the default_volume_type parameter is DEFAULT so you must set it to an appropriate volume type, for example 'tripleo', so that you can delete the DEFAULT volume type. (BZ#1947415)

Changes to the openstack-manila-ui component:

  • Before this update, the Shared File Systems service (manila) dashboard had dynamic form elements whose names could potentially cause the forms to become unresponsive. This meant that the creation of share groups, share networks, and shares within share networks did not function.

    With this update, dynamic elements whose names might be problematic are encoded. The creation of share groups, share networks, and shares within share networks functions normally. (BZ#1938212)

Changes to the openstack-neutron component:

  • The logic to detect the hypervisor hostname has been fixed and now returns a result that is consistent with the libvirt driver in the Compute service (nova). With this fix, you no longer need to specify the resource_provider_hypervisors option when you use the guaranteed minimum bandwidth QoS feature.

    With this update, a new option, resource_provider_default_hypervisor, has been added to the Modular Layer 2 with the Open Virtual Network mechanism driver (ML2/OVN) to replace the default hypervisor name. The option locates the root resource provider without giving a complete list of interfaces or bridges in the resource_provider_hypervisors option in case it has to be customized by the user. This new option is located in the [ovs] ini-section for the ovs-agent, and in the [sriov_nic] ini-section for the sriov-agent. (BZ#1900500)

Changes to the openstack-octavia component:

  • This update resolves a problem that prevented the RHOSP Load-balancing service (octavia) from failing over load balancers with multiple failed amphorae. (BZ#1974831)
  • Before this update, when a configuration change to a Load-balancing service amphora caused an haproxy reload, the process consumed a lot of memory that could lead to memory allocation errors. The issue was caused by the lo interface not being configured in the amphora-haproxy namespace in the amphora. With this update, the namespace issue has been corrected and the problem is resolved. (BZ#1975790)

Changes to the openstack-tripleo-heat-templates component:

  • Before this update, upgrading a Red Hat OpenStack Platform (RHOSP) 13 environment that was deployed with ML2/OVN to RHOSP 16.1 caused the upgrade process to fail on the Controller nodes due to an SELinux denial issue. With this update, the correct SELinux label is applied to OVN, which resolves the issue. For more information, see the Red Hat Knowledgebase solution OVN fails to configure after reboot during OSP-13 → OSP-16.1 FFU. (BZ#1997351)
  • Before this update, if your environment was deployed with a TLS-Everywhere architecture and it used the deprecated authconfig utility to configure authentication on your system, you had to configure your RHEL 8 system with the authselect utility. Without performing this action, the leapp process failed with the inhibitor named Missing required answers in the answer file. The workaround was to add sudo leapp answer --section authselect_check.confirm=True --add in the LeappInitCommand in the upgrades environment file. With this update, the configuration entry is no longer needed, and the upgrade now completes without intervention. (BZ#1952574)
  • Before this update, the Red Hat Enterprise Linux (RHEL) in-place upgrade tool, LEAPP, stalled because it encountered loaded kernel modules that are no longer provided in RHEL 8. Also, LEAPP upgraded RHEL to a version that is not supported by Red Hat OpenStack Platform (RHOSP). With this update, the manual configurations that you had to perform to work around these two issues are no longer required. For more information, see BZ#1962365. (BZ#1962365)
  • With this update, the memory limit for the collectd container has been increased to 512 MB. When this limit is exceeded, the container restarts. (BZ#1969895)
  • Before this update, removal of the python2 packages for the Red Hat Enterprise Linux (RHEL) in-place upgrade tool, LEAPP, was unsuccessful. This failure was caused by a DNF exclude option that retained the LEAPP packages. With this update, automation has now been included to ensure that the necessary LEAPP packages are successfully removed. (BZ#2008976)
  • Before this update, an upgradable mariadb-server package in the RHEL repository caused the package manager to upgrade the mariadb-server package on the host, which interfered with the containerized mariadb-server that pre-exists on the same host. With this update, the Red Hat OpenStack Platform (RHOSP) director removes the mariadb-server package from any hosts that also have the containerized MariaDB, and the RHOSP FFU process continues. (BZ#2015325)
  • This enhancement adds the new CinderRpcResponseTimeout and CinderApiWsgiTimeout parameters to support tuning RPC and API WSGI timeouts in the Block Storage service (cinder). Default timeout values might not be adequate for large deployments and in situations where transactions might be delayed due to system load.

    It is now possible to tune the RPC and API WSGI timeouts to prevent transactions prematurely timing out. (BZ#1930806)

Changes to the puppet-collectd component:

  • Previously, the PluginInstanceFormat parameter for collectd accepted only one of the following values: 'none', 'name', 'uuid', or 'metadata'. With this update, you can now specify more than one value for the PluginInstanceFormat parameter, resulting in more information being sent in the plugin_instance label of collectd metrics. (BZ#1956887)

Changes to the python-networking-ovn component:

  • Currently, there is a known issue where it is not possible to simulate certain real-life scenarios when the MAC-IP addresses of a port are unknown. The RHOSP Networking service (neutron) directly specifies the MAC-IP of a port even if DHCP or security groups are not configured.

    Workaround: upgrade to RHOSP 16.1.7 and install ML2/OVN v21.03. If DHCP and port security are disabled, then the addresses field of a port does not include its MAC-IP address pairs, and ML2/OVN can use the MAC learning capabilities to send traffic only to the desired port. (BZ#1898198)

Changes to the python-os-brick component:

  • Before this update, there were unhandled exceptions during connection to iSCSI portals, for example, failures in iscsiadm -m session. This occurred because the _connect_vol threads could abort unexpectedly in some failure patterns, and this abort caused a hang in subsequent steps that waited for results from the _connect_vol threads.

    With this update, any exceptions during connection to iSCSI portals are handled correctly in the _connect_vol method, which avoids unexpected aborts that leave the thread results without an update. (BZ#1977792)
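
The general pattern behind this fix, where every worker thread records an outcome even when it fails, can be sketched in Python as follows. This is not the os-brick code: the connect_all helper, the shape of the results, and the portal names are hypothetical.

```python
import threading


def connect_all(portals, connect):
    """Try every portal in its own thread and always record an outcome."""
    results = {}
    lock = threading.Lock()

    def worker(portal):
        try:
            outcome = ("connected", connect(portal))
        except Exception as exc:  # record the failure instead of aborting silently
            outcome = ("failed", str(exc))
        with lock:
            results[portal] = outcome

    threads = [threading.Thread(target=worker, args=(p,)) for p in portals]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because each thread stores either a success or a failure, code that waits for the results has an entry for every portal and does not wait for a result that never arrives.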

Changes to the python-tripleoclient component:

  • With this update, the tripleo validator command now accepts variables and environment variables in a key-value pair format. In past releases, only JSON dictionaries allowed environment variables.

    openstack tripleo validator run \
    [--extra-vars key1=<val1>[,key2=val2 --extra-vars key3=<val3>] \
    | --extra-vars-file EXTRA_VARS_FILE] \
    [--extra-env-vars key1=<val1>[,key2=val2 --extra-env-vars key3=<val3>]] \
    (--validation <validation_id>[,<validation_id>,...] | --group <group>[,<group>,...])

    Example

    $ openstack tripleo validator run --validation check-cpu,check-ram --extra-vars minimal_ram_gb=8 --extra-vars minimal_cpu_count=2

    For the complete list of supported options, run:

    $ openstack tripleo validator run --help

    (BZ#1959492)

  • Before this update, during a tripleo validation on an OpenStack component, the following exception error occurred:

    Unhandled exception during validation run.

    This error occurred because a variable in the code was referenced, but never assigned.

    With this update, this problem has been fixed and validations run without this error. (BZ#1959866)
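
The key-value format accepted by --extra-vars and --extra-env-vars (BZ#1959492) can be illustrated with a short Python sketch that merges repeated option values into one dictionary. This is not the python-tripleoclient parser: the function name parse_extra_vars is hypothetical, and keeping all values as strings is an assumption made for the example.

```python
def parse_extra_vars(values):
    """Merge repeated KEY=VALUE[,KEY=VALUE...] option values into one dict."""
    merged = {}
    for value in values:
        for pair in value.split(","):
            key, sep, val = pair.partition("=")
            if not sep or not key:
                raise ValueError(f"expected key=value, got {pair!r}")
            merged[key] = val
    return merged
```

The list passed to the function corresponds to the values of each occurrence of the option on the command line.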

Changes to the tripleo-ansible component:

  • Before this update, an optional feature of the RHOSP Load-balancing service (octavia), log offloading, was not correctly configured during deployment. As a result of this problem, the Load-balancing service was not receiving logs from the amphorae. This update resolves the issue. (BZ#1981652)
  • Before this update, changes to KernelArgs parameters caused errors in the Red Hat OpenStack Platform (RHOSP) fast forward upgrade (FFU) process for version 13 to version 16:

    • Duplicate entries appeared in /etc/default/grub.
    • Duplicate entries appeared in the kernel command line.
    • Nodes rebooted during the RHOSP upgrade.

    These errors were caused when the KernelArgs parameter, or the order of values in the string, changed or when a KernelArgs parameter was added.

    With this update, TripleO has added upgrade tasks in kernel-boot-params-baremetal-ansible.yaml to migrate from TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS to GRUB_TRIPLEO_HEAT_TEMPLATE_KERNEL_ARGS.

    This change was made to accommodate the Red Hat Enterprise Linux (RHEL) in-place upgrade tool, LEAPP, which is used to upgrade RHEL from version 7 to version 8, during the RHOSP version 13 to version 16 FFU process. LEAPP understands GRUB parameters only when the parameters start with GRUB_ in /etc/default/grub.

    Despite this update, you must manually inspect each KernelArgs value to ensure that it matches the value for all hosts in the corresponding role.

    The KernelArgs value may come from the PreNetworkConfig implementation from either the default tripleo-heat-templates or third-party heat templates.

    If you find any mismatches, change the value of the KernelArgs parameter in the corresponding role to match the value of KernelArgs on the hosts. Perform these checks before running the openstack overcloud upgrade prepare command.

    You can use the following script to check KernelArgs values:

    tripleo-ansible-inventory --static-yaml-inventory inventory.yaml
    KernelArgs='<KernelArgs_FROM_THT>'
    ansible -i inventory.yaml ComputeSriov -m shell -b -a "cat /proc/cmdline | grep '${KernelArgs}'"

    (BZ#1980829)
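
The KernelArgs check above can also be sketched in Python for cases where you have already collected the contents of /proc/cmdline from each host. This is a minimal sketch, not part of the update: the function names are hypothetical, and it compares whole whitespace-separated tokens, which is slightly stricter than the substring match that grep performs. Because the order of values matters, kernel_args_match requires the values to appear as one contiguous run in the same order.

```python
from typing import List


def missing_kernel_args(cmdline: str, kernel_args: str) -> List[str]:
    """Return the KernelArgs values that do not appear in the kernel command line."""
    present = set(cmdline.split())
    return [arg for arg in kernel_args.split() if arg not in present]


def kernel_args_match(cmdline: str, kernel_args: str) -> bool:
    """Return True if KernelArgs appears in cmdline as one contiguous, ordered run."""
    have = cmdline.split()
    want = kernel_args.split()
    if not want:
        # An empty KernelArgs string cannot confirm anything about the host.
        return False
    last_start = len(have) - len(want)
    return any(have[i:i + len(want)] == want for i in range(last_start + 1))


if __name__ == "__main__":
    # Hypothetical command line, as read from /proc/cmdline on one host.
    sample = "BOOT_IMAGE=/vmlinuz root=UUID=abc ro intel_iommu=on iommu=pt hugepages=64"
    print(kernel_args_match(sample, "intel_iommu=on iommu=pt"))
    print(missing_kernel_args(sample, "iommu=pt isolcpus=2-5"))
```

A host for which kernel_args_match returns False is a candidate for the manual correction described above, and missing_kernel_args shows which values are absent rather than only reordered.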

4.8. RHBA-2022:0986 — Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory

The bugs contained in this section are addressed by advisory RHBA-2022:0986. For more information, see link: https://access.redhat.com/errata/RHBA-2022:0986.html.

Changes to the openstack-cinder component:

  • Before this update, the GPFS SpectrumScale driver in the Block Storage service (cinder) did not correctly detect whether the storage back end supported copy-on-write (COW) mode. As a result, the driver disabled COW features, such as the ability to rapidly create volumes from an image. Sometimes, this caused some instances to time out when booting multiple instances simultaneously from an image.

    With this update, the GPFS SpectrumScale driver properly detects COW support for storage back ends. (BZ#1960639)

  • Before this update, when you created a snapshot with PowerMaxOS 5978.711, a change in the REST payload response altered the format of the device label. The underlying data from Solutions Enabler changed and no longer contained a colon character (:). This resulted in an IndexError exception in the PowerMax driver:

    IndexError: list index out of range

    With this update, the problem is resolved in PowerMaxOS 5978.711 and later. (BZ#1992159)

  • Before this update, the OpenStack NFS driver blocked attempts to delete snapshots in an error state when snapshot support was disabled. New or existing snapshots are placed in an error state when snapshot support is disabled, but users could not remove these failed snapshots. With this update, users can remove NFS snapshots that are in an error state. (BZ#1741453)
  • Before this update, the PowerMax driver used a mechanism for storing and maintaining information on shared volume connections that did not work with previously created legacy volumes. This caused live migration to fail for volumes that were created before the PowerMax migration code was introduced. Now, the PowerMax live migration code is updated to work with legacy volumes so that live migrations do not fail. (BZ#1987957)
  • This update fixes a bug that omitted details from the output of the openstack volume backup list command when the output exceeded 1000 lines. (BZ#1999634)

Changes to the openstack-tripleo-common component:

  • With this update, the telemetry healthchecks have been made more robust and the way the healthchecks are parsed has been simplified.

    To get verbose mode when you run the healthcheck directly, run the command sudo podman exec -u root -e "HEALTHCHECK_DEBUG=1" <container> /openstack/healthcheck (BZ#1910939)

Changes to the openstack-tripleo-heat-templates component:

  • As of this release, the Red Hat supported method of updating OVN is aligned to the upstream OVN upgrade steps. (BZ#2052411)
  • Before this release, the collectd container failed to start on Compute nodes because a dpdk-telemetry collectd configuration file was being automatically created despite there being no dpdk-telemetry plugin installed.

    As of this release, dpdk_telemetry configuration files have been removed from the collectd container. (BZ#1996865)

  • Enable the experimental rsyslog reopenOnTruncate setting to ensure that rsyslog immediately recognizes when a log rotation happens on a file. The setting affects every service configured to work with rsyslog.

    With rsyslog reopenOnTruncate disabled, rsyslog waits for a log file to fill to its original capacity before consuming any additional logs. (BZ#1939964)

  • With this update, the CollectdContainerAdditionalCapAdd variable is added to the deployment tool. This variable is a comma-separated list of additional collectd container capabilities. (BZ#1984095)
  • With this update, the LeappActorsToRemove heat parameter is introduced so that you can remove specific actors from the Leapp process if those actors inhibit the upgrade. The LeappActorsToRemove heat parameter is role-specific for flexibility. (BZ#1984873)

Changes to the puppet-tripleo component:

  • This enhancement prepares your environment for an update of the metrics_qdr service to a newer AMQ Interconnect release, which requires importing the CA certificate contents from the Service Telemetry Framework (STF) deployment. Administrators do not yet need to make changes when deploying or updating Red Hat OpenStack Platform (RHOSP) because the metrics_qdr service has not yet been updated. This functionality is in preparation for the metrics_qdr service update in a future release.

    The following procedure will be required once https://bugzilla.redhat.com/show_bug.cgi?id=1949169 has shipped.

    This update provides a new Orchestration service (heat) parameter, MetricsQdrSSLProfiles, that you use to import the CA certificate contents.

    To obtain a Red Hat OpenShift TLS certificate, run the following commands:

    $ oc get secrets
    $ oc get secret/default-interconnect-selfsigned -o jsonpath='{.data.ca\.crt}' | base64 -d

    Add the MetricsQdrSSLProfiles parameter with the contents of your Red Hat OpenShift TLS certificate to a custom environment file:

    MetricsQdrSSLProfiles:
        -   name: sslProfile
            caCertFileContent: |
               -----BEGIN CERTIFICATE-----
               ...
               TOpbgNlPcz0sIoNK3Be0jUcYHVMPKGMR2kk=
               -----END CERTIFICATE-----

    Then, redeploy your overcloud with the openstack overcloud deploy command. (BZ#1949168)

  • This update corrects an error that prevented the proper use of the Cinder powermax_port_groups parameter. (BZ#2029608)
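
The custom environment file for the MetricsQdrSSLProfiles parameter, described in the first item of this list, could be generated from the certificate text that the oc command returns. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and wrapping the parameter in a parameter_defaults key follows the usual layout of heat environment files rather than anything shown in this advisory. It only builds text; it does not validate the certificate.

```python
def render_metrics_qdr_env(ca_cert_pem: str, profile_name: str = "sslProfile") -> str:
    """Render heat environment file text that sets MetricsQdrSSLProfiles.

    ca_cert_pem is the decoded PEM certificate text. Each certificate line is
    re-indented so that it sits inside the YAML block scalar.
    """
    indent = " " * 8
    cert_lines = [indent + line.strip() for line in ca_cert_pem.strip().splitlines()]
    return (
        "parameter_defaults:\n"          # assumed wrapper key, see the note above
        "  MetricsQdrSSLProfiles:\n"
        f"    - name: {profile_name}\n"
        "      caCertFileContent: |\n"
        + "\n".join(cert_lines)
        + "\n"
    )


if __name__ == "__main__":
    # Placeholder certificate body; substitute the real output of the oc command.
    pem = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n"
    print(render_metrics_qdr_env(pem), end="")
```

Write the returned text to a file and include that file with the -e option when you run the openstack overcloud deploy command.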

Changes to the python-os-brick component:

  • Before this update, os-brick did not include a [global] section to contain the options it sets in a temporary configuration file, which is a requirement with Octopus (release 15.2.0+). As a result, connection information could not be found when using os-brick and a Ceph Octopus or later client, and a connection to the Ceph storage backend could not be established. Now, the connection options are included under a '[global]' section in the temporary configuration file. This fix is backward compatible to the Hammer release (0.94.0+) of Ceph. (BZ#2023413)
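
To illustrate the shape of the fix, the following sketch builds a temporary Ceph client configuration with the connection options under a [global] section. It is not the os-brick implementation: the function name, the mon_host and keyring option names, and the [client.<user>] section are assumptions chosen for illustration. Python's configparser stands in for the Ceph client here only to show that the options can be found under the section name.

```python
import configparser
from typing import List


def build_temp_ceph_conf(monitors: List[str], user: str, keyring_path: str) -> str:
    """Return temporary Ceph configuration text with options under [global]."""
    lines = [
        "[global]",                          # section header that the fix adds
        f"mon_host = {','.join(monitors)}",  # assumed option name
        f"[client.{user}]",                  # assumed per-client section
        f"keyring = {keyring_path}",         # assumed option name
    ]
    return "\n".join(lines) + "\n"


def read_mon_host(conf_text: str) -> str:
    """Read mon_host back from the [global] section of the generated text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return parser["global"]["mon_host"]


if __name__ == "__main__":
    conf = build_temp_ceph_conf(["192.0.2.10:6789", "192.0.2.11:6789"], "cinder", "/tmp/keyring")
    print(conf, end="")
    print(read_mon_host(conf))
```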

4.9. RHBA-2022:8795 — Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory

The bugs contained in this section are addressed by advisory RHBA-2022:8795. For more information, see link: https://access.redhat.com/errata/RHBA-2022:8795.html.

Changes to the openstack-cinder component:

  • Before this update, a race condition occurred when the Compute service (nova) requested the Block Storage service (cinder) to detach a volume and there was an external request to delete the volume. The race condition resulted in the volume failing to detach, the volume being deleted, and the Compute service being unable to remove the non-existent volume. With this update, the race condition is resolved. (BZ#1977322)
  • Before this update, if you imported a backup record for a backup ID that currently existed, the import operation would correctly fail, but the existing backup record would incorrectly be deleted. With this update, the existing backup record is not deleted under this scenario. (BZ#1802263)
  • Before this update, NetApp ONTAP Block Storage (cinder) driver QoS policy groups were deleted when the associated volume was moved. With this update, QoS policy groups are associated permanently to the LUN or file that represents the volume. (BZ#1951485)
  • Before this update, a do_sync_check operation could result in the incorrect deletion of non-temporary snapshots from a volume because there was no check for non-temporary snapshots deletion during the do_sync_check operation. With this update, there is a check to determine if a snapshot must be deleted. The do_sync_check operation does not perform unnecessary non-temporary snapshot deletions.

    Before this update, there was a case mismatch in the conditional while checking if a storage group was a child of a parent storage group. While modifying the storage group, errors indicated that the parent storage group already contained the child storage group. With this update, the patterns used in the conditional are not case-sensitive and you can modify the storage group successfully. (BZ#2129310)

Changes to the openstack-ironic component:

  • Before this update, if there were repeated transient connectivity issues between the ironic-conductor service and a remote Baseboard Management Controller (BMC) using the Redfish hardware type when session authentication was used, the intermittent loss of connectivity could collide with a point where authentication was retried due to the in-memory credentials expiring. If this collision occurred, there was a loss of overall connectivity, which persisted due to the internal session cache built into the openstack-ironic-conductor service. With this update, support to detect and renegotiate in cases of this error was added to the Python DMTF Redfish library, sushy, and the openstack-ironic service. Intermittent connectivity failures colliding with session credential re-authentication no longer result in a complete loss of ability to communicate with the BMC until the openstack-ironic-conductor service is restarted. (BZ#2027544)

Changes to the openstack-manila component:

  • Before this update, the API that the Shared File Systems service (manila) uses to provision storage on NetApp ONTAP All Flash Fabric-Attached (AFF) storage systems caused Shared File Systems service shares to be thinly provisioned. The API did not enforce space guarantees, even when requested through the Shared File Systems service share type. With this update, the driver sets appropriate parameters for the NetApp ONTAP 9 API to work with AFF storage as well as traditional FAS storage systems. The API enforces space guarantees on NetApp ONTAP storage through the Shared File Systems service share types. (BZ#1968228)

Changes to the openstack-nova component:

  • There is currently a known issue when live migrating instances that have CPUs that are incompatible with the destination host CPUs.

    Workaround: Add the following configuration in the nova.conf file of each affected Compute node to skip CPU comparison on the destination host:

    [workarounds]
    skip_cpu_compare_on_dest = True

    (BZ#2076884)

  • Before this update, block device mapping updates by the libvirt driver on the destination host were not persisted during live migration. With specific storage back ends or configurations, for example, when using the [workarounds]/rbd_volume_local_attach=True config option, certain operations on volume attachments, for example detaching, after a live migration did not work. With this update, you can correctly persist any block device mapping updates done by the libvirt driver on the destination host. Operations on affected volumes, such as detaching, succeed after live migration. (BZ#2089382)
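
The nova.conf workaround for BZ#2076884 in this list can be applied to configuration text programmatically. The following is a minimal sketch with a hypothetical function name, using Python's configparser as a stand-in for an INI editor. It assumes each option appears once per section: comments are dropped and repeated (multi-valued) options collapse to their last value, so review the result before you use it on a real nova.conf file.

```python
import configparser
import io


def enable_skip_cpu_compare(nova_conf_text: str) -> str:
    """Return nova.conf text with [workarounds]/skip_cpu_compare_on_dest = True."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(nova_conf_text)
    if not parser.has_section("workarounds"):
        parser.add_section("workarounds")
    parser.set("workarounds", "skip_cpu_compare_on_dest", "True")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    original = "[DEFAULT]\ndebug = False\n\n[workarounds]\nrbd_volume_local_attach = True\n"
    print(enable_skip_cpu_compare(original), end="")
```

Existing options in the [workarounds] section are preserved, and the section is created if it does not exist.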

Changes to the openstack-octavia component:

  • Before this update, the Virtual IP (VIP) address of UDP-only load balancers in active-standby mode was not reachable. With this update, the issue is fixed. (BZ#2078377)
  • Before this update, Conntrack was enabled in the Amphora VM for any type of packet, but it is only required for the User Datagram Protocol (UDP) and Stream Control Transmission Protocol (SCTP). With this update, Conntrack is now disabled for Transmission Control Protocol (TCP) flows, preventing some performance issues when a user generates a lot of connections that fill the Conntrack table. (BZ#2123225)
  • Before this update, members in the ERROR operating status might have been updated briefly to ONLINE during a Load Balancer configuration change. With this update, the issue is fixed. (BZ#1996756)
  • Before this update, the provisioning status of a load balancer was set to ERROR too early when an error occurred, making the load balancer mutable before the execution of the tasks for these resources was finished. With this update, the issue is fixed. (BZ#2040697)
  • Before this update, a SELinux issue triggered errors when using the ICMP monitor in the Load-balancing service (octavia) amphora driver. With this update, the SELinux issue is fixed. (BZ#2096387)

Changes to the openstack-tripleo-common component:

  • RHSA-2022:6969 introduced the process to clean up files in the /var/lib/mistral directory in the undercloud but the process consistently failed when the Load-balancing service (octavia) or Red Hat Ceph Storage was enabled because these services created additional directories, which the cleanup process could not properly remove. Some deployment actions, such as scale out, consistently failed if the Load-balancing service or Ceph Storage was enabled. With this update, Mistral no longer executes the cleanup. Users must manually delete files if they want to enforce the reduced permission of the files in the /var/lib/mistral directory. Deployment actions no longer fail because of a permission error. (BZ#2138184)

Changes to the puppet-rsyslog component:

  • With this update, the Rsyslog environment configuration supports an array of Elasticsearch targets. In previous releases, you could only specify a single target. You can now specify multiple Elasticsearch targets as a list of endpoints to send logs. (BZ#1945334)

Changes to the python-dogpile-cache component:

  • Before this update, dogpile.cache support for dead_retry and socket_timeout was not implemented for the memcached back end. The oslo.cache mechanism filled the arguments dictionary with values for dead_retry and socket_timeout, but dogpile.cache ignored the values so the defaults of 30s for dead_retry and 3s for socket_timeout were used. When using dogpile.cache.memcached as the cache back end on the Identity service (keystone), and then taking down one of the memcached instances, the memcache server objects set their deaduntil value to 30 seconds in the future. When a request came in to an API server with two memcached servers configured, one of which was unroutable, it took approximately 15 seconds for it to try each of those servers in each thread it created and reach the three-second socket timeout limit every time it encountered the one that was down. By the time the user issued another request, the deaduntil value was reached and the whole cycle was repeated. With this update, dogpile.cache consumes dead_retry and socket_timeout arguments passed by oslo.cache. (BZ#2100879)

Changes to the python-networking-ovn component:

  • This update in RHOSP 16.1.9 fixes a bug that causes the Networking service (neutron) to fail to start after an update to RHOSP 16.1.8 and also causes OVN database instability after updates to RHOSP 16.1.8.

    Instead of updating to RHOSP 16.1.8, update directly to RHOSP 16.1.9. (BZ#2125824)

  • With this update, you can now migrate an ML2/OVS deployment with the iptables_hybrid firewall driver to ML2/OVN. (BZ#2022040)
  • When a load balancer is created in a tenant network with a Virtual IP (VIP) and members, and the tenant network is connected to a router that is connected to the provider network, the Open Virtual Network (OVN) load balancer is associated with the OVN logical router. If the 'router' option was used for nat-addresses, ovn-controller sent GARP packets for that VIP on the provider network. As there was nothing to prevent different tenants in OpenStack from creating a subnet with the same Classless Inter-Domain Routing (CIDR) number and a load balancer with the same VIP, there could be several ovn-controllers generating GARP packets on the provider network for the same IP, each one with the MAC of the logical router port belonging to each tenant. This setup could be an issue for the physical network infrastructure. With this update, a new option (exclude-lb-vips-from-garp) is added in OVN on the router gateway port. This flag ensures that no GARP packets are sent for the load balancer VIPs. (BZ#2064709)
  • Before this update, it was possible to add members without stating which subnet they belonged to, although members must be in the same subnet as the Virtual IP (VIP) port. If the subnet of a member was different from the VIP subnet, the member was created but incorrectly configured because there was no connectivity to it. With this update, members without a subnet are accepted only if the IP address of the member belongs to the Classless Inter-Domain Routing (CIDR) number of the VIP subnet, because the VIP subnet is the subnet associated with the load balancer and is used for members that do not specify one. Member creation without a subnet is rejected if the IP address of the member does not belong to the VIP subnet CIDR. (BZ#2122925)
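
The acceptance rule for members without a subnet (BZ#2122925) amounts to a CIDR membership test. The following sketch illustrates that rule with the Python standard library. The function name is hypothetical and this is not the networking-ovn code.

```python
import ipaddress


def member_allowed_without_subnet(member_ip: str, vip_subnet_cidr: str) -> bool:
    """Return True if a member that has no subnet falls inside the VIP subnet CIDR."""
    address = ipaddress.ip_address(member_ip)
    network = ipaddress.ip_network(vip_subnet_cidr)
    # Membership is False when the address and network differ in IP version.
    return address in network


if __name__ == "__main__":
    # Documentation address ranges, used here as placeholder values.
    print(member_allowed_without_subnet("192.0.2.25", "192.0.2.0/24"))
    print(member_allowed_without_subnet("198.51.100.7", "192.0.2.0/24"))
```

A member whose address falls outside the VIP subnet CIDR must be created with an explicit subnet.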

Changes to the python-octaviaclient component:

  • Before this update, python-octaviaclient did not display the full list of load balancers when the user had more than 1,000 load balancers. With this update, the OpenStack Load-balancing service (octavia) displays all load balancers. (BZ#1996088)

Changes to the tripleo-ansible component:

  • Before this update, the Load-balancing services (octavia) were restarted many times during deployments or updates. With this update, the services are restarted only when required, preventing potential interruptions of the control plane. (BZ#2057604)
  • Before this update, a nonexistent gateway address was configured on the load-balancing management network. This caused excessive Address Resolution Protocol (ARP) requests on the load-balancing management network. With this update, the issue is fixed. (BZ#1961162)
  • With this update, the port_security parameter of the Load-balancing service (octavia) management network is now enabled. (BZ#1982268)