Chapter 4. Technical Notes

This chapter supplements the information contained in the text of Red Hat OpenStack Platform "Train" errata advisories released through the Content Delivery Network.

4.1. RHBA-2020:3148 — Red Hat OpenStack Platform 16.1 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2020:3148. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2020:3148.html.

Changes to the ansible-role-atos-hsm component:

  • With this enhancement, you can use ATOS HSM deployment with HA mode. (BZ#1676989)

Changes to the openstack-cinder component:

  • With this enhancement, you can revert Block Storage (cinder) volumes to the most recent snapshot, if supported by the driver. This method of reverting a volume is more efficient than cloning from a snapshot and attaching a new volume. (BZ#1686001)
  • Director can now deploy the Block Storage Service in an active/active mode. This deployment scenario is supported only for Edge use cases. (BZ#1700402)
  • This update includes the following enhancements:

    • Support for revert-to-snapshot in VxFlex OS driver
    • Support for volume migration in VxFlex OS driver
    • Support for OpenStack volume replication v2.1 in VxFlex OS driver
    • Support for VxFlex OS 3.5 in the VxFlex OS driver

Changes to the openstack-designate component:

  • DNS-as-a-Service (designate) returns to technology preview status in Red Hat OpenStack Platform 16.1. (BZ#1603440)

Changes to the openstack-glance component:

  • The Image Service (glance) now supports multiple stores with the Ceph RBD driver. (BZ#1225775)
  • In Red Hat OpenStack Platform 16.1, you can use the Image Service (glance) to copy existing image data into multiple stores with a single command. This removes the need for the operator to copy data manually and update image locations. (BZ#1758416)
  • In Red Hat OpenStack Platform 16.1, you can use the Image Service (glance) to copy existing image data into multiple stores with a single command. This removes the need for the operator to copy data manually and update image locations. (BZ#1758420)
  • With this update, when you use multiple stores with the Image Service (glance), the image owner can delete an image copy from a specific store. (BZ#1758424)

Changes to the openstack-ironic component:

  • A regression was introduced in ipmitool-1.8.18-11 that caused IPMI access to take over 2 minutes for certain BMCs that did not support the "Get Cipher Suites" command. As a result, introspection could fail and deployments could take much longer than previously.

    With this update, ipmitool retries are handled differently, introspection passes, and deployments succeed.

    Note

    This issue with ipmitool is resolved in ipmitool-1.8.18-17. (BZ#1831893)

Changes to the openstack-ironic-python-agent component:

  • Before this update, there were no retries and no timeout when downloading a final instance image with the direct deploy interface in ironic. As a result, the deployment could fail if the server that hosted the image failed to respond.

    With this update, the image download process attempts 2 retries and has a connection timeout of 60 seconds. (BZ#1827721)
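
The retry and timeout behavior described for BZ#1827721 can be illustrated with a minimal Python sketch. This is not the ironic-python-agent code: the function name download_with_retries and the fetch callable are hypothetical, and only the numbers (2 retries, a 60-second connection timeout) come from the note above.

```python
import time


def download_with_retries(fetch, retries=2, timeout=60.0, delay=0.0):
    """Call fetch(timeout) once, then retry up to `retries` times on failure.

    `fetch` is a hypothetical callable that downloads the image and raises
    ConnectionError or TimeoutError when the image server does not respond.
    """
    last_error = None
    for attempt in range(1 + retries):
        try:
            return fetch(timeout)
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(delay)  # optional pause before the next attempt
    raise last_error


# Usage: a fake server that fails twice and then responds.
calls = []


def flaky_fetch(timeout):
    calls.append(timeout)
    if len(calls) < 3:
        raise ConnectionError("image server did not respond")
    return b"image-bytes"


image = download_with_retries(flaky_fetch)
```

With these defaults, a server that never responds is tried three times in total (one initial attempt plus 2 retries) before the error is raised.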

Changes to the openstack-neutron component:

  • Before this update, it was not possible to deploy the overcloud in a Distributed Compute Node (DCN) or spine-leaf configuration with stateless IPv6 on the control plane. Deployments in this scenario failed during ironic node server provisioning. With this update, you can now deploy successfully with stateless IPv6 on the control plane. (BZ#1803989)

Changes to the openstack-tripleo-common component:

  • When you update or upgrade python3-tripleoclient, Ansible does not receive the update or upgrade and Ansible or ceph-ansible tasks fail.

    When you update or upgrade, ensure that Ansible also receives the update so that playbook tasks can run successfully. (BZ#1852801)

  • With this update, the Red Hat Ceph Storage dashboard uses Ceph 4.1 and a Grafana container based on ceph4-rhel8. (BZ#1814166)
  • Before this update, during Red Hat Ceph Storage (RHCS) deployment, Red Hat OpenStack Platform (RHOSP) director generated the CephClusterFSID by passing the desired FSID to ceph-ansible and used the Python uuid1() function. With this update, director uses the Python uuid4() function, which generates UUIDs more randomly. (BZ#1784640)
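
The difference between the two Python functions named in BZ#1784640 can be seen with the standard uuid module. The following is a minimal sketch, not director code; the helper name generate_fsid is hypothetical. uuid1() builds the value from the host node ID and the current time, whereas uuid4() builds it from random bits.

```python
import uuid


def generate_fsid():
    """Return a random (version 4) UUID string, usable as a cluster FSID."""
    return str(uuid.uuid4())


time_based = uuid.uuid1()                   # derived from node ID and timestamp
random_based = uuid.UUID(generate_fsid())   # derived from random bits

print("uuid1 version:", time_based.version)
print("uuid4 version:", random_based.version)
```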

Changes to the openstack-tripleo-heat-templates component:

  • There is an incomplete definition for TLS in the Orchestration service (heat) when you update from 16.0 to 16.1, and the update fails.

    To prevent this failure, you must set the following parameter and value: InternalTLSCAFile: ''. (BZ#1840640)

  • With this enhancement, you can configure Red Hat OpenStack Platform to use an external, pre-existing Ceph RadosGW cluster. You can manage this cluster externally as an object-store for OpenStack guests. (BZ#1440926)
  • With this enhancement, you can use director to deploy the Image Service (glance) with multiple image stores. For example, in a Distributed Compute Node (DCN) or Edge deployment, you can store images at each site. (BZ#1598716)
  • With this enhancement, HTTP traffic that travels from the HAProxy load balancer to Red Hat Ceph Storage RadosGW instances is encrypted. (BZ#1701416)
  • With this update, you can deploy pre-provisioned nodes with TLS everywhere (TLS-e) by using the new 'tripleo-ipa' method. (BZ#1740946)
  • Before this update, in deployments with an IPv6 internal API network, the Block Storage Service (cinder) and Compute Service (nova) were configured with a malformed glance-api endpoint URI. As a result, cinder and nova services located in a DCN or Edge deployment could not access the Image Service (glance).

    With this update, the IPv6 addresses in the glance-api endpoint URI are correct and the cinder and nova services at Edge sites can access the Image Service successfully. (BZ#1815928)

  • With this enhancement, FreeIPA has DNS entries for the undercloud and overcloud nodes. DNS PTR records are necessary to generate certain types of certificates, particularly certificates for cinder active/active environments with etcd. You can disable this functionality with the IdMModifyDNS parameter in an environment file. (BZ#1823932)
  • In this release of Red Hat OpenStack Platform, you can no longer customize the Red Hat Ceph Storage cluster admin keyring secret. Instead, the admin keyring secret is generated randomly during initial deployment. (BZ#1832405)
  • Before this update, stale neutron-haproxy-qdhcp-* containers remained after you deleted the related network. With this update, all related containers are cleaned correctly when you delete a network. (BZ#1832720)
  • Before this update, the ExtraConfigPre per_node script was not compatible with Python 3. As a result, the overcloud deployment failed at the step TASK [Run deployment NodeSpecificDeployment] with the message SyntaxError: invalid syntax.

    With this update, the ExtraConfigPre per_node script is compatible with Python 3 and you can provision custom per_node hieradata. (BZ#1832920)

  • With this update, the swift_rsync container runs in unprivileged mode. This makes the swift_rsync container more secure. (BZ#1807841)
  • PowerMax configuration options have changed since Newton. This update includes the latest PowerMax configuration options and supports both iSCSI and FC drivers.

    The CinderPowermaxBackend parameter also supports multiple back ends. CinderPowermaxBackendName supports a list of back ends, and you can use the new CinderPowermaxMultiConfig parameter to specify parameter values for each back end. For example syntax, see environments/cinder-dellemc-powermax-config.yaml. (BZ#1813393)

  • Support for XtremIO Cinder back end

    The XtremIO Cinder back end is updated to support both iSCSI and FC drivers. It is also enhanced to support multiple back ends. (BZ#1852082)

  • Red Hat OpenStack Platform 16.1 includes tripleo-heat-templates support for VXFlexOS Volume Backend. (BZ#1852084)
  • Red Hat OpenStack Platform 16.1 includes support for SC Cinder Backend. The SC Cinder back end now supports both iSCSI and FC drivers, and can also support multiple back ends. You can use the CinderScBackendName parameter to list back ends, and the CinderScMultiConfig parameter to specify parameter values for each back end. For an example configuration file, see environments/cinder-dellemc-sc-config.yaml. (BZ#1852087)
  • PowerMax configuration options have changed since Newton. This update includes the latest PowerMax configuration options and supports both iSCSI and FC drivers.

    The CinderPowermaxBackend parameter also supports multiple back ends. CinderPowermaxBackendName supports a list of back ends, and you can use the new CinderPowermaxMultiConfig parameter to specify parameter values for each back end. For example syntax, see environments/cinder-dellemc-powermax-config.yaml. (BZ#1852088)
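
For the IPv6 glance-api endpoint fix (BZ#1815928), the following Python sketch shows one way to build a well-formed endpoint URI. The note does not say how the URI was malformed; this example assumes the common case of an IPv6 literal that is not enclosed in square brackets, which URI syntax requires. The function name glance_endpoint and the sample addresses are hypothetical, and 9292 is the usual glance-api port.

```python
import ipaddress


def glance_endpoint(host, port=9292, scheme="http"):
    """Build an endpoint URI, bracketing the host if it is an IPv6 literal."""
    try:
        is_ipv6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_ipv6 = False  # not an IP literal, so treat it as a host name
    netloc = "[{}]".format(host) if is_ipv6 else host
    return "{}://{}:{}".format(scheme, netloc, port)


print(glance_endpoint("fd00:fd00:fd00:2000::5"))
print(glance_endpoint("192.0.2.10"))
```

Without the brackets, the colons in the address are indistinguishable from the separator that precedes the port number.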

Changes to the openstack-tripleo-validations component:

  • Before this update, the data structure format that the ceph osd stat -f json command returns changed. As a result, the validation to stop the deployment unless a certain percentage of Red Hat Ceph Storage (RHCS) OSDs are running did not function correctly, and stopped the deployment regardless of how many OSDs were running.

    With this update, the new version of openstack-tripleo-validations computes the percentage of running RHCS OSDs correctly and the deployment stops early if the percentage of running RHCS OSDs is below the required minimum. You can use the parameter CephOsdPercentageMin to customize the percentage of RHCS OSDs that must be running. The default value is 66%. Set this parameter to 0 to disable the validation. (BZ#1845079)
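
The percentage check can be sketched in Python as follows. This is an illustration, not the openstack-tripleo-validations code. It assumes that the JSON output contains num_osds and num_up_osds counters, either at the top level or nested under an osdmap key; those key names and the two layouts are assumptions, as are the function names.

```python
import json


def running_osd_percentage(osd_stat_json):
    """Return the percentage of OSDs that are up, from `ceph osd stat -f json` output."""
    data = json.loads(osd_stat_json)
    stats = data.get("osdmap", data)  # assumption: accept a nested or a flat layout
    total = stats["num_osds"]
    if total == 0:
        return 0.0
    return 100.0 * stats["num_up_osds"] / total


def enough_osds_running(osd_stat_json, minimum=66.0):
    """Mirror the CephOsdPercentageMin semantics: 0 disables the check."""
    if minimum == 0:
        return True
    return running_osd_percentage(osd_stat_json) >= minimum


flat = '{"num_osds": 3, "num_up_osds": 2}'
nested = '{"osdmap": {"num_osds": 3, "num_up_osds": 1}}'
print(enough_osds_running(flat), enough_osds_running(nested))
```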

Changes to the puppet-tripleo component:

  • Before this update, the etcd service was not configured properly to run in a container. As a result, an error occurred when the service tried to create the TLS certificate. With this update, the etcd service runs in a container and can create the TLS certificate. (BZ#1804079)

Changes to the python-cinderclient component:

  • Before this update, the latest volume attributes were not updated during poll, and the volume data was incorrect on the display screen. With this update, volume attributes update correctly during poll and the correct volume data appears on the display screen. (BZ#1594033)

Changes to the python-tripleoclient component:

  • With this enhancement, you can use the --limit, --skip-tags, and --tags Ansible options in the openstack overcloud deploy command. This is particularly useful when you want to run the deployment on specific nodes, for example, during scale-up operations. (BZ#1767581)
  • With this enhancement, there are new options in the openstack tripleo container image push command that you can use to provide credentials for the source registry. The new options are --source-username and --source-password.

    Before this update, you could not provide credentials when pushing a container image from a source registry that requires authentication. Instead, the only mechanism to push the container was to pull the image manually and push from the local system. (BZ#1811490)

  • With this update, the container_images_file parameter is now a required option in the undercloud.conf file. You must set this parameter before you install the undercloud.

    With the recent move to use registry.redhat.io as the container source, you must authenticate when you fetch containers. For the undercloud, the container_images_file is the recommended option to provide the credentials when you perform the installation. Before this update, if this parameter was not set, the deployment failed with authentication errors when trying to fetch containers. (BZ#1819016)

4.2. RHBA-2020:3542 — Red Hat OpenStack Platform 16.1.1 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2020:3542. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2020:3542.html.

Changes to the openstack-tripleo component:

  • The overcloud deployment steps included an older Ansible syntax that tagged the tripleo-bootstrap and tripleo-ssh-known-hosts roles as common_roles. This older syntax caused Ansible to run tasks tagged with common_roles even when Ansible did not use the common_roles tag. This syntax resulted in errors during the 13 to 16.1 system_upgrade process.

    This update uses a newer syntax to tag the tripleo-bootstrap and tripleo-ssh-known-hosts roles as common_roles. Errors do not appear during the 13 to 16.1 system_upgrade process and you no longer need to include the --playbook upgrade_steps_playbook.yaml option in the system_upgrade process as a workaround. (BZ#1851914)

Changes to the openstack-tripleo-heat-templates component:

  • This update fixes a GRUB parameter naming convention that led to unpredictable behaviors on compute nodes during leapp upgrades.

    Previously, the presence of the obsolete "TRIPLEO" prefix on GRUB parameters caused problems.

    The file /etc/default/grub has been updated with GRUB for the tripleo kernel args parameter so that leapp can upgrade it correctly. This is done by adding "upgrade_tasks" to the service "OS::TripleO::Services::BootParams", which is a new service added to all roles in the roles_data.yaml file. (BZ#1858673)

  • This update fixes a problem that caused baremetal nodes to become non-responsive during Leapp upgrades.

    Previously, Leapp did not process transient interfaces like SR-IOV virtual functions (VF) during migration. As a result, Leapp did not find the VF interfaces during the upgrade, and nodes entered an unrecoverable state.

    Now the service "OS::TripleO::Services::NeutronSriovAgent" sets the physical function (PF) to remove all VFs, and migrates workloads before the upgrade. After the successful Leapp upgrade, os-net-config runs again with the "--no-activate" flag to re-establish the VFs. (BZ#1866372)

  • This director enhancement automatically installs the Leapp utility on overcloud nodes to prepare for OpenStack upgrades. This enhancement includes two new Heat parameters: LeappRepoInitCommand and LeappInitCommand. In addition, if you have the following repository defaults, you do not need to pass UpgradeLeappCommandOptions values.

    --enablerepo rhel-8-for-x86_64-baseos-eus-rpms --enablerepo rhel-8-for-x86_64-appstream-eus-rpms --enablerepo rhel-8-for-x86_64-highavailability-eus-rpms --enablerepo advanced-virt-for-rhel-8-x86_64-rpms --enablerepo ansible-2.9-for-rhel-8-x86_64-rpms --enablerepo fast-datapath-for-rhel-8-x86_64-rpms

    (BZ#1845726)

  • If you do not set the UpgradeLevelNovaCompute parameter to '', live migrations are not possible when you upgrade from RHOSP 13 to RHOSP 16. (BZ#1849235)
  • This update fixes a bug that prevented the successful deployment of transport layer security (TLS) everywhere with public TLS certificates. (BZ#1852620)
  • Before this update, director did not set the noout flag on Red Hat Ceph Storage OSDs before running a Leapp upgrade. As a result, additional time was required for the OSDs to rebalance after the upgrade.

    With this update, director sets the noout flag before the Leapp upgrade, which accelerates the upgrade process. Director also unsets the noout flag after the Leapp upgrade. (BZ#1853275)

  • Before this update, the Leapp upgrade could fail if you had any NFS shares mounted. Specifically, the nodes that run the Compute Service (nova) or the Image Service (glance) hung if they used an NFS mount.

    With this update, before the Leapp upgrade, director unmounts /var/lib/nova/instances, /var/lib/glance/images, and any Image Service staging area that you define with the GlanceNodeStagingUri parameter. (BZ#1853433)

Changes to the openstack-tripleo-validations component:

  • This update fixes a Red Hat Ceph Storage (RHCS) version compatibility issue that caused failures during upgrades from Red Hat OpenStack Platform 13 to 16.1. Before this fix, validations performed during the upgrade worked with RHCS3 clusters but not RHCS4 clusters. Now the validation works with both RHCS3 and RHCS4 clusters. (BZ#1852868)

Changes to the puppet-tripleo component:

  • Before this update, the Red Hat Ceph Storage dashboard listener was created in the HAProxy configuration, even if the dashboard was disabled. As a result, upgrades of OpenStack with Ceph could fail.

    With this update, the service definition has been updated to distinguish the Ceph MGR service from the dashboard service so that the dashboard service is not configured if it is not enabled and upgrades are successful. (BZ#1850991)

4.3. RHSA-2020:4283 — Red Hat OpenStack Platform 16.1.2 general availability advisory

The bugs contained in this section are addressed by advisory RHSA-2020:4283. Further information about this advisory is available at https://access.redhat.com/errata/RHSA-2020:4283.html.

Bug Fix(es):

  • This update includes the following bug fix patches related to fully qualified domain names (FQDN).

    • Kaminario Fix unique_fqdn_network option

      Previously, the Kaminario driver accepted the unique_fqdn_network configuration option in the specific driver section. When this option was moved, a regression was introduced: the parameter was now only used if it was defined in the shared configuration group.

      This patch fixes the regression and makes it possible to define the option in the shared configuration group as well as the driver specific section.

    • HPE 3PAR Support duplicated FQDN in network

      The 3PAR driver uses the FQDN of the node that is doing the attach as a unique identifier to map the volume.

      Because the FQDN is not always unique, in some environments the same FQDN can be found in different systems. In those cases, if both try to attach volumes, the second system will fail.

      For example, this could happen in a QA environment where VMs share names like controller-.localdomain and compute-0.localdomain.

      This patch adds the unique_fqdn_network configuration option to the 3PAR driver to prevent failures caused by name duplication between systems. (BZ#1721361)

  • This update makes it possible to run the Brocade FCZM driver in RHOSP 16.

    The Brocade FCZM vendor chose not to update the driver for Python 3, and discontinued support of the driver past the Train release of OpenStack [1]. Red Hat OpenStack Platform (RHOSP) 16 uses Python 3.6.

    The upstream Cinder community assumed the maintenance of the Brocade FCZM driver on a best-effort basis, and the bugs that prevented the Brocade FCZM from running in a Python 3 environment (and hence in RHOSP 16) have been fixed.

    [1] https://docs.broadcom.com/doc/12397527 (BZ#1848420)

  • This update fixes a problem that caused volume attachments to fail on a VxFlexOS cinder backend.

    Previously, attempts to attach a volume on a VxFlexOS cinder backend failed because the cinder driver for the VxFlexOS back end did not include all of the information required to connect to the volume.

    The VxFlexOS cinder driver has been updated to include all the information required in order to connect to a volume. The attachments now work correctly. (BZ#1862213)

  • This enhancement introduces support for the revert-to-snapshot feature with the Block Storage (cinder) RBD driver. (BZ#1702234)
  • Red Hat OpenStack Platform 16.1 includes the following PowerMax Driver updates:

    Feature updates:

    • PowerMax Driver - Unisphere storage group/array tagging support
    • PowerMax Driver - Short host name and port group name override
    • PowerMax Driver - SRDF Enhancement
    • PowerMax Driver - Support of Multiple Replication

    Bug fixes:

    • PowerMax Driver - Debug Metadata Fix
    • PowerMax Driver - Volume group delete failure
    • PowerMax Driver - Setting minimum Unisphere version to 9.1.0.5
    • PowerMax Driver - Unmanage Snapshot Delete Fix
    • PowerMax Driver - RDF clean snapvx target fix
    • PowerMax Driver - Get Manageable Volumes Fix
    • PowerMax Driver - Print extend volume info
    • PowerMax Driver - Legacy volume not found
    • PowerMax Driver - Safeguarding retype to some in-use replicated modes
    • PowerMax Driver - Replication array serial check
    • PowerMax Driver - Support of Multiple Replication
    • PowerMax Driver - Update single underscores
    • PowerMax Driver - SRDF Replication Fixes
    • PowerMax Driver - Replication Metadata Fix
    • PowerMax Driver - Limit replication devices
    • PowerMax Driver - Allowing for default volume type in group
    • PowerMax Driver - Version comparison correction
    • PowerMax Driver - Detach RepConfig logging & Retype rename remote fix
    • PowerMax Driver - Manage volume emulation check
    • PowerMax Driver - Deletion of group with volumes
    • PowerMax Driver - PowerMax Pools Fix
    • PowerMax Driver - RDF status validation
    • PowerMax Driver - Concurrent live migrations failure
    • PowerMax Driver - Live migrate remove rep vol from sg
    • PowerMax Driver - U4P failover lock not released on exception
    • PowerMax Driver - Compression Change Bug Fix (BZ#1808583)

  • Before this update, the Block Storage service (cinder) assigned the default volume type in a volume create request, ignoring alternative methods of specifying the volume type.

    With this update, the Block Storage service performs as expected:

    • If you specify a source_volid in the request, the volume type that the Block Storage service sets is the volume type of the source volume.
    • If you specify a snapshot_id in the request, the volume type is inferred from the volume type of the snapshot.
    • If you specify an imageRef in the request, and the image has a cinder_img_volume_type image property, the volume type is inferred from the value of the image property.

    Otherwise, the Block Storage service sets the volume type to the default volume type that you configure. If you do not configure a volume type, the Block Storage service uses the system default volume type, __DEFAULT__.

    When you specify a volume type explicitly in the volume create request, the Block Storage service uses the type that you specify. (BZ#1826741)

  • Before this update, when you created a volume from a snapshot, the operation could fail because the Block Storage service (cinder) would try to assign the default volume type to the new volume instead of inferring the correct volume type from the snapshot. With this update, you no longer have to specify the volume type when you create a volume. (BZ#1843789)
  • This enhancement adds a new driver for the Dell EMC PowerStore to support Block Storage service back end servers. The new driver supports the FC and iSCSI protocols, and includes these features:

    • Volume create and delete
    • Volume attach and detach
    • Snapshot create and delete
    • Create volume from snapshot
    • Get statistics on volumes
    • Copy images to volumes
    • Copy volumes to images
    • Clone volumes
    • Extend volumes
    • Revert volumes to snapshots (BZ#1862541)
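
The volume type selection order described for BZ#1826741 can be sketched in Python as follows. This is an illustration of the documented order, not the cinder implementation. The function name, the lookup dictionaries, and the sample IDs are hypothetical, and the name of the system default volume type is an assumption.

```python
SYSTEM_DEFAULT_TYPE = "__DEFAULT__"  # assumption: name of the system default volume type


def resolve_volume_type(request, volume_types, snapshot_types, image_properties,
                        configured_default=None):
    """Pick the volume type for a volume create request.

    volume_types and snapshot_types map IDs to volume type names;
    image_properties maps image IDs to dictionaries of image properties.
    """
    if request.get("volume_type"):       # explicit type always wins
        return request["volume_type"]
    if request.get("source_volid"):      # inherit from the source volume
        return volume_types[request["source_volid"]]
    if request.get("snapshot_id"):       # inherit from the snapshot
        return snapshot_types[request["snapshot_id"]]
    if request.get("imageRef"):          # use the image property, if present
        props = image_properties.get(request["imageRef"], {})
        if props.get("cinder_img_volume_type"):
            return props["cinder_img_volume_type"]
    return configured_default or SYSTEM_DEFAULT_TYPE


snapshots = {"snap-1": "fast"}
images = {"img-1": {"cinder_img_volume_type": "encrypted"}, "img-2": {}}

print(resolve_volume_type({"snapshot_id": "snap-1"}, {}, snapshots, images, "tripleo"))
print(resolve_volume_type({"imageRef": "img-2"}, {}, snapshots, images, "tripleo"))
```

The second call falls through to the configured default because the image has no cinder_img_volume_type property.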

4.4. RHEA-2020:4284 — Red Hat OpenStack Platform 16.1.2 general availability advisory

The bugs contained in this section are addressed by advisory RHEA-2020:4284. Further information about this advisory is available at https://access.redhat.com/errata/RHEA-2020:4284.html.

Changes to the openstack-nova component:

  • This bug fix enables you to boot an instance from an encrypted volume when that volume was created from an image that in turn was created by uploading an encrypted volume to the Image Service as an image. (BZ#1879190)

Changes to the openstack-octavia component:

  • The keepalived instance in the Red Hat OpenStack Platform Load-balancing service (octavia) instance (amphora) can abnormally terminate and interrupt UDP traffic. The cause of this issue is that the timeout value for the UDP health monitor is too small.

    Workaround: specify a new timeout value that is greater than two seconds:

    $ openstack loadbalancer healthmonitor set --timeout 3 <health_monitor_id>

    For more information, search for "loadbalancer healthmonitor" in the Command Line Interface Reference. (BZ#1837316)

Changes to the openstack-tripleo-heat-templates component:

  • A known issue causes the migration of Ceph OSDs from Filestore to Bluestore to fail. In use cases where the osd_objectstore parameter was not set explicitly when you deployed OSP13 with RHCS3, the migration exits without converting any OSDs and falsely reports that the OSDs are already using Bluestore. For more information about the known issue, see https://bugzilla.redhat.com/show_bug.cgi?id=1875777

    As a workaround, perform the following steps:

    1. Include the following content in an environment file:

      parameter_defaults:
        CephAnsibleExtraConfig:
          osd_objectstore: filestore

    2. Perform a stack update with the overcloud deploy --stack-only command, and include the new or existing environment file that contains the osd_objectstore parameter. In the following example, this environment file is <osd_objectstore_environment_file>. Also include any other environment files that you included during the converge step of the upgrade:

      $ openstack overcloud deploy --stack-only \
        -e <osd_objectstore_environment_file> \
        -e <converge_step_environment_files>

    3. Proceed with the FileStore to BlueStore migration by using the existing documentation. See https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html/framework_for_upgrades_13_to_16.1/OSD-migration-from-filestore-to-bluestore

    Result: The Filestore to Bluestore playbook triggers the conversion process, and removes and re-creates the OSDs successfully. (BZ#1733577)

  • Inadequate timeout values can cause an overcloud deployment to fail after four hours. To prevent these timeout failures, set the following undercloud and overcloud timeout parameters:

    • Undercloud timeouts (seconds):

      Example

      parameter_defaults:
        TokenExpiration: 86400
        ZaqarWsTimeout: 86400

    • Overcloud deploy timeouts (minutes):

      Example

      $ openstack overcloud deploy --timeout 1440

    The timeouts are now set. (BZ#1792500)

  • Currently, you cannot scale down or delete compute nodes if Red Hat OpenStack Platform is deployed with TLS-e using tripleo-ipa. This is because the cleanup role, traditionally delegated to the undercloud as localhost, is now being invoked from the mistral container.

    For more information, see https://access.redhat.com/solutions/5336241 (BZ#1866562)

  • This update fixes a bug that prevented the distributed compute node (DCN) compute service from accessing the glance service.

    Previously, distributed compute nodes were configured with a glance endpoint URI that specified an IP address, even when deployed with internal transport layer security (TLS). Because TLS requires the endpoint URI to specify a fully qualified domain name (FQDN), the compute service could not access the glance service.

    Now, when deployed with internal TLS, DCN services are configured with a glance endpoint URI that specifies an FQDN, and the DCN compute service can access the glance service. (BZ#1873329)

  • This update introduces support for TLS everywhere with tripleo-ipa on Distributed Compute Nodes. (BZ#1874847)
  • This update introduces support for Neutron routed provider networks with RHOSP Distributed Compute Nodes. (BZ#1874863)
  • This update adds support for encrypted volumes and images on distributed compute nodes (DCN).

    DCN nodes can now access the Key Manager service (barbican) running in the central control plane.

    Note

    This feature adds a new Key Manager client service to all DCN roles. To implement the feature, regenerate the roles data file used for the DCN site’s deployment.

    For example:

    $ openstack overcloud roles generate DistributedComputeHCI DistributedComputeHCIScaleOut -o ~/dcn0/roles_data.yaml

    Use the appropriate path to the roles data file. (BZ#1852851)

  • Before this update, to successfully run a Leapp upgrade during the fast forward upgrade (FFU) from RHOSP 13 to RHOSP 16.1, the node where the Red Hat Enterprise Linux upgrade was occurring had to have the PermitRootLogin field defined in the ssh config file (/etc/ssh/sshd_config).

    With this update, the Orchestration service (heat) no longer requires you to modify /etc/ssh/sshd_config with the PermitRootLogin field. (BZ#1855751)

  • This enhancement adds a new driver for the Dell EMC PowerStore to support Block Storage service back end servers. (BZ#1862547)
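
The timeout values for BZ#1792500 use different units: the undercloud parameters are in seconds and the overcloud deploy timeout is in minutes. The following Python sketch, with a hypothetical helper name, shows that the example values describe the same 24-hour window.

```python
def seconds_to_minutes(seconds):
    """Convert a whole number of seconds to minutes, rejecting partial minutes."""
    minutes, remainder = divmod(seconds, 60)
    if remainder:
        raise ValueError("value is not a whole number of minutes")
    return minutes


token_expiration_seconds = 86400  # TokenExpiration and ZaqarWsTimeout in the example
deploy_timeout_minutes = seconds_to_minutes(token_expiration_seconds)
deploy_timeout_hours = deploy_timeout_minutes // 60

print(deploy_timeout_minutes, deploy_timeout_hours)
```

If you choose a different window, keep the two values consistent so that the token does not expire before the deployment timeout is reached.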

Changes to the openstack-tripleo-validations component:

  • This update safeguards against potential package content conflict after content was moved from openstack-tripleo-validations to another package. (BZ#1877688)

Changes to the puppet-cinder component:

  • This release adds support for the Dell EMC PowerStore Cinder Backend Driver. (BZ#1862545)

Changes to the puppet-tripleo component:

  • This enhancement adds a new driver for the Dell EMC PowerStore to support Block Storage service back end servers. (BZ#1862546)
  • This update fixes incorrect parameter names in Dell EMC Storage Templates. (BZ#1868620)

Changes to the python-networking-ovn component:

  • Transmission of jumbo UDP frames on ML2/OVN routers depends on a kernel release that is not yet available.

    After receiving a jumbo UDP frame that exceeds the maximum transmission unit of the external network, ML2/OVN routers can return ICMP "fragmentation needed" packets back to the sending VM, where the sending application can break the payload into smaller packets. To determine the packet size, this feature depends on discovery of MTU limits along the south-to-north path.

    South-to-north path MTU discovery requires kernel-4.18.0-193.20.1.el8_2, which is scheduled for availability in a future release. To track availability of the kernel version, see https://bugzilla.redhat.com/show_bug.cgi?id=1860169. (BZ#1547074)

Changes to the python-os-brick component:

  • This update modifies get_device_info to use lsscsi to get [H:C:T:L] values, making it possible to support more than 255 logical unit numbers (LUNs) and host logical unit (HLU) ID values.

    Previously, get_device_info used sg_scan to get these values, with a limit of 255.

    You can get two device types with get_device_info:

    • /dev/disk/by-path/xxx, which is a symlink to /dev/sdX
    • /dev/sdX

    sg_scan can process any device name, but lsscsi only shows /dev/sdX names.

    If the device is a symlink, get_device_info uses the device name that the device links to. Otherwise get_device_info uses the device name directly.

    Then get_device_info gets the device info '[H:C:T:L]' by comparing the device name with the last column of lsscsi output. (BZ#1872211)

  • This update fixes an incompatibility that caused VxFlex volume detachment attempts to fail.

    A recent change in VxFlex cinder volume credentialing methods was not backward compatible with pre-existing volume attachments. If a VxFlex volume attachment was made before the credentialing method change, attempts to detach the volume failed.

    Now the detachments do not fail. (BZ#1869346)
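
The lookup described for get_device_info (BZ#1872211) can be sketched in Python as follows. This is not the os-brick code: the function name hctl_from_lsscsi, the resolve parameter, and the sample lsscsi output with its vendor and model strings are hypothetical. The sketch resolves a symlink to the device that it points to, then matches that name against the last column of the lsscsi output.

```python
import os


def hctl_from_lsscsi(device, lsscsi_output, resolve=os.path.realpath):
    """Return (host, channel, target, lun) for a device, or None if not listed.

    `resolve` follows a /dev/disk/by-path symlink to its /dev/sdX target and
    returns a plain /dev/sdX name unchanged.
    """
    real_device = resolve(device)
    for line in lsscsi_output.splitlines():
        fields = line.split()
        if fields and fields[-1] == real_device:
            # The first column looks like [H:C:T:L]
            h, c, t, l = fields[0].strip("[]").split(":")
            return int(h), int(c), int(t), int(l)
    return None


SAMPLE_LSSCSI = """\
[2:0:0:0]    disk    VendorA  ModelA  0001  /dev/sda
[3:0:0:300]  disk    VendorB  ModelB  0001  /dev/sdb
"""

# A LUN above 255 is reported correctly because the value is parsed as text.
print(hctl_from_lsscsi("/dev/sdb", SAMPLE_LSSCSI, resolve=lambda path: path))
```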

Changes to the python-tripleoclient component:

  • The entry in /etc/hosts for the undercloud is duplicated each time the Compute stack is updated on the undercloud and overcloud nodes. This occurs for split-stack deployments where the Controllers and Compute nodes are divided into multiple stacks.

    Other indications of this problem are the following:

    • mysql reporting errors about packets exceeding their maximum size.
    • The Orchestration service (heat) warning that templates are exceeding their maximum size.
    • The Workflow service (mistral) warning that fields are exceeding their maximum size.

    As a workaround, in the file generated by running the openstack overcloud export command that is included in the Compute stack, under ExtraHostFileEntries, remove the erroneous entry for the undercloud. (BZ#1876153)

Changes to the tripleo-ansible component:

  • This update increases the speed of stack updates in certain cases.

    Previously, stack update performance was degraded when the Ansible --limit option was not passed to ceph-ansible. During a stack update, ceph-ansible sometimes made idempotent updates on nodes even if the --limit argument was used.

    Now director intercepts the Ansible --limit option and passes it to the ceph-ansible execution. When you pass the --limit option to commands that start with 'openstack overcloud deploy', director forwards the option to the ceph-ansible execution, which reduces the time required for stack updates.

    Important

    Always include the undercloud in the limit list when using this feature with ceph-ansible. (BZ#1855112)

4.5. RHBA-2021:0817 — Red Hat OpenStack Platform 16.1.4 director bug fix advisory

The bugs contained in this section are addressed by advisory RHBA-2021:0817. Further information about this advisory is available at link: https://access.redhat.com/errata/RHBA-2021:0817.html.

Changes to the openstack-cinder component:

  • Before this update, cloned encrypted volumes were inaccessible when using the Block Storage (cinder) service with the Key Manager (barbican) service. With this update, cloned encrypted volumes are now accessible when using the Block Storage service with the Key Manager service. (BZ#1889228)
  • The 'all_tenants' key passed with a volume transfer request is removed because the database is unable to parse it. Removing this key allows the user to show the detail of a specific volume transfer by using the transfer name. Before this update, the 'all_tenants' key was removed only for admin users, which meant that non-admin users were unable to show volume transfers by using the transfer name. With this update, the 'all_tenants' key is now also removed for non-admins, allowing non-admins to show volume transfers by using the transfer name. (BZ#1847907)
  • Before this update, the Block Storage (cinder) NEC back end driver occasionally returned invalid data when initializing a volume connection, which could cause live migration to fail. With this update, the NEC driver has been fixed to reliably return valid connection data. Live migration no longer fails due to invalid volume connection data. (BZ#1910854)
  • Before this update, the Block Storage (cinder) service would always assign newly created volumes with the default volume type, even when the volume was created from another source, such as an image, snapshot or another volume. This resulted in volumes created from another source having a different volume type from the volume type of the source.

    With this update, the default volume type is assigned only after determining whether it should be assigned based on the volume type of the source. The volume type of a volume created from another source now matches the volume type of the source. (BZ#1921735)
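
The ordering described above can be illustrated with a short Python sketch. It is not the cinder code: the function name resolve_volume_type and its three parameters are hypothetical, and the sketch assumes that an explicitly requested type takes precedence over the type of the source.

```python
def resolve_volume_type(requested_type, source_type, default_type):
    """Pick the volume type for a new volume (illustrative only)."""
    # Assumption: an explicitly requested type always wins.
    if requested_type is not None:
        return requested_type
    # A volume created from a source that has a type inherits that type.
    if source_type is not None:
        return source_type
    # Fall back to the default type only when nothing else applies.
    return default_type


print(resolve_volume_type(None, "fast-ssd", "default"))
```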

  • Before this update, the --server option was being ignored when passed with the cinder service-get-log command, which resulted in the logs for all hosts being returned instead of just the logs for a specific host. With this update, using the --server option correctly filters the logs for the specified host. (BZ#1728142)

Changes to the openstack-tripleo-common component:

  • The virt-admin tool is now available for you to use to capture logs for reporting RHOSP bugs. This tool is useful for troubleshooting all libvirt and QEMU problems, as the logs provide the communications between libvirt and QEMU on the Compute nodes. You can use virt-admin to set the libvirt and QEMU debug log filters dynamically, without having to restart the nova_libvirt container.

    Perform the following steps to enable libvirt and QEMU log filters on a Compute node:

    1. Log in to the nova_libvirt container on the Compute node:

      $ sudo podman exec -it nova_libvirt /bin/bash
    2. Specify the name and location of the log file to send virt-admin output to:

      $ virt-admin daemon-log-outputs "1:file:/var/log/libvirt/libvirtd.log"
    3. Configure the filters you want to collect logs for:

      $ virt-admin daemon-log-filters \
       "1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
      Note

      When debugging issues with live migration, you must configure these filters on all source and destination Compute nodes.

    4. Repeat your test. After debugging is complete, upload the libvirtd.log to a bug.
    5. Disable the libvirt and QEMU log filters on the Compute nodes:

      $ virt-admin daemon-log-filters ""
    6. To confirm that the filters are removed, enter the following command:

      $ virt-admin daemon-log-filters

      This command returns an empty list when you have successfully removed the filters.

(BZ#1870199)

Changes to the openstack-tripleo-heat-templates component:

  • Before this update, in-place upgrades from Red Hat OpenStack Platform 13 to 16.1 in a TLS everywhere environment used an incorrect rabbitmq password for the novajoin container. This caused the novajoin container on the undercloud to function incorrectly, which caused any overcloud node that ran an upgrade to fail with the following error:

    2020-11-24 20:01:31.569 7 ERROR join   File "/usr/lib/python3.6/site-packages/amqp/connection.py", line 639, in _on_close
    2020-11-24 20:01:31.569 7 ERROR join     (class_id, method_id), ConnectionError)
    2020-11-24 20:01:31.569 7 ERROR join amqp.exceptions.AccessRefused: (0, 0): (403) ACCESS_REFUSED - Login was refused using authentication mechanism AMQPLAIN. For detail see the broker logfile.

    With this update, the upgrade from RHOSP 13 to 16.1 uses the correct rabbitmq password in a TLS everywhere environment so that the framework for upgrades can complete successfully. (BZ#1901157)

  • With this enhancement, you can deploy the Red Hat Ceph Storage (RHCS) Dashboard on edge sites in a distributed compute node (DCN) architecture. (BZ#1793595)
  • With this enhancement, you can manage vPMEM with two new parameters NovaPMEMMappings and NovaPMEMNamespaces.

    • Use NovaPMEMMappings to set the nova configuration option pmem_namespaces that reflects mappings between vPMEM and physical PMEM namespaces.
    • Use NovaPMEMNamespaces to create and manage physical PMEM namespaces that you use as a back end for vPMEM. (BZ#1834185)
  • There is currently a known issue with the mechanism that ensures the subscribed environments have the right DNF module stream set. The Advanced Virtualization repository is not always available in the subscription that the Ceph nodes use, which causes the upgrade or update of a Ceph node to fail when you try to enable virt:8.2.

    Workaround:

    Override the DnfStreams parameter in the upgrade or update environment file to prevent the Ceph upgrade from failing:

    parameter_defaults:
      ...
      DnfStreams: [{'module':'container-tools', 'stream':'2.0'}]
    Note

    The Advanced Virtualization DNF stream is not enforced when you use this workaround.

    For more information, see https://bugzilla.redhat.com/show_bug.cgi?id=1923887. (BZ#1866479)

  • This enhancement adds support for heterogeneous storage configurations at the edge. Operators can now deploy edge sites with storage and sites without storage within the same DCN deployment. (BZ#1882058)
  • The Block Storage backup service sometimes needs access to files on the host that would otherwise not be available in the container running the service. This enhancement adds the CinderBackupOptVolumes parameter, which you can use to specify additional container volume mounts for the Block Storage backup service. (BZ#1891828)
  • Before this update, TLS-E on pre-provisioned nodes failed with the message: "--server cannot be used without providing --domain". With this update, the IDM domain name is detected by first resolving "ipa-ca" through DNS, then doing a reverse DNS lookup on the resulting IP address. It might be necessary to add the PTR record, which is required for the reverse lookup, manually. (BZ#1874936)
  • Before this update, you were required to use the openstack overcloud external-upgrade run --tags online_upgrade command to perform online database updates when upgrading from RHOSP 15 to RHOSP 16.1. With this update, you can now use the openstack overcloud external-update run --tags online_upgrade command. (BZ#1884556)
  • Before this update, if you had NovaComputeEnableKsm enabled and you were using Red Hat Subscription Management to register the overcloud Compute nodes, the qemu-kvm-common package failed to install. This was because the configuration was sometimes applied before the Compute nodes were registered to the required repositories.

    With this update, NovaComputeEnableKsm is enabled only after the Compute nodes are registered to the required repositories by using Red Hat Subscription Management, which ensures that the qemu-kvm-common package is successfully installed. (BZ#1895894)

  • Before this update, the connection data created by an iSCSI/LVM Block Storage back end was not stored persistently, which resulted in volumes not being accessible after a reboot. With this update, the connection data is stored persistently, and the volumes are accessible after a system reboot. (BZ#1898484)
  • Before this update, when deployed at an edge site the Image (glance) service was not configured to access the Key Manager (barbican) service running on the central site’s control plane. This resulted in the Image services running on edge sites being unable to access encryption keys stored in the Key Manager service.

    With this update, Image services running on edge sites are now configured to access the encryption keys stored in the Key Manager service. (BZ#1899761)
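
For the TLS-E fix on pre-provisioned nodes in this section (BZ#1874936), the two-step lookup can be sketched in Python as follows. This is an illustration under stated assumptions, not the director code: the function name detect_idm_domain is hypothetical, and the sketch assumes that the domain is the part of the returned host name after the first label.

```python
import socket


def detect_idm_domain(resolve=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Guess the IdM domain from the host name behind the "ipa-ca" record."""
    # Step 1: forward lookup of the "ipa-ca" name through DNS.
    address = resolve("ipa-ca")
    # Step 2: reverse lookup; this needs a PTR record for the address.
    fqdn = reverse(address)[0]
    # Assumption: the domain is everything after the first label of the FQDN.
    _host, _, domain = fqdn.partition(".")
    if not domain:
        raise ValueError("reverse lookup did not return an FQDN: %r" % fqdn)
    return domain


# Example with stand-in resolvers so that the sketch runs without DNS access.
print(detect_idm_domain(
    resolve=lambda name: "192.0.2.10",
    reverse=lambda addr: ("idm1.example.test", [], [addr]),
))
```

If the PTR record is missing, the reverse lookup step fails, which is why the record might need to be added manually.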

Changes to the puppet-collectd component:

  • With this enhancement, you can configure the format of the plugin instance for the collectd virt plugin by using the ExtraConfig parameter collectd::plugin::virt::plugin_instance_format. This allows more granular metadata to be exposed in the metrics label for virtual machine instances, such as on which host the instance is running. (BZ#1878191)
  • Before this update, when you configured the collectd::plugin::virt::hostname_format parameter with multiple values, director wrapped the values in double quotes. This caused the virt plugin to fail to load. With this update, when configuring collectd::plugin::virt::hostname_format, director no longer wraps multiple values in double quotes. (BZ#1902142)

Changes to the python-network-runner component:

Changes to the python-networking-ovn component:

  • With this enhancement, you can control multicast over the external networks and avoid cluster autoforming over external networks instead of only the internal networks. (BZ#1575512)
  • Before this update, the OVN mechanism driver did not correctly merge its agent list with those stored in the Networking (neutron) service database. With this update, the results from the OVN and Networking service database are merged before the API returns the result. (BZ#1828889)
  • This enhancement adds support for vlan transparency in the ML2/OVN mechanism driver with vlan and geneve network type drivers.

    With vlan transparency, you can manage vlan tags by using instances on Networking (neutron) service networks. You can create vlan interfaces on an instance and use any vlan tag without affecting other networks. The Networking service is not aware of these vlan tags.

    Note

    • When using vlan transparency on a vlan type network, the inner and outer ethertype of the packets is 802.1Q (0x8100).
    • The ML2/OVN mechanism driver does not support vlan transparency on flat provider networks.

    (BZ#1846019)

Changes to the python-os-brick component:

  • Before this update, instances that were created on a RHOSP 13 environment with PowerFlex, VxFlex, and ScaleIO volume attachments failed to restart after an upgrade to RHOSP 16.x. This was because the RHOSP 16.x Compute service uses a new PowerFlex driver connection property to access volume attachments, which is not present in the connection properties of volumes attached to instances running on a RHOSP 13 environment. With this update, the error is no longer thrown if this connection property is missing, and instances with PowerFlex volume attachments created on a RHOSP 13 environment continue to function correctly after upgrading to RHOSP 16.x.

Changes to the python-paunch component:

  • Before this update, if a user configured the ContainerImagePrepare parameter to use a custom tag, such as 'tag: "latest"' or 'tag: "16.1"', instead of the standard 'tag_from_label: "{version}-{release}"', the containers did not update to the latest container images.

    With this update, the container images are always fetched anytime a user runs a deployment action, including updates, and the image ID is checked against the running container to see if it needs to be rebuilt to consume the latest image. Containers are now always refreshed during deployment actions and restarted if they are updated.

    Note

    This is a change from previous versions where the deployment checked only that the image existed rather than always fetching the image. If a user is reusing tags, for example, "latest", the containers might be updated on nodes if you perform actions such as scaling out. It is not recommended to use "latest" unless you are controlling container tags by using a Satellite server deployment.

    (BZ#1881476)
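
The image ID check described above can be sketched in Python as follows. This is not the paunch implementation: the function name containers_to_rebuild and the two dictionary layouts are hypothetical and exist only to show the comparison.

```python
def containers_to_rebuild(running, fetched):
    """Return names of containers whose image ID differs from the fetched image.

    running: {container_name: (image_reference, image_id)}
    fetched: {image_reference: image_id}, as seen after fetching each image
    """
    stale = []
    for name, (image_ref, image_id) in sorted(running.items()):
        # A container is stale when its tag now points to a different image ID.
        if fetched.get(image_ref, image_id) != image_id:
            stale.append(name)
    return stale


# Hypothetical data: the "latest" tag for nova moved to a new image.
running = {
    "nova_compute": ("registry.example/nova:latest", "sha256:aaa"),
    "cinder_api": ("registry.example/cinder:latest", "sha256:bbb"),
}
fetched = {
    "registry.example/nova:latest": "sha256:ccc",
    "registry.example/cinder:latest": "sha256:bbb",
}
print(containers_to_rebuild(running, fetched))
```

Comparing image IDs rather than tags is what makes reused tags such as "latest" trigger a rebuild.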

Changes to the python-tripleoclient component:

  • Before this update, live migration failed when upgrading a TLS everywhere environment with local ephemeral storage and UseTLSTransportForNbd set to "False". This occurred because the default value of the UseTLSTransportForNbd configuration had changed from "False" in RHOSP 13 to "True" in RHOSP 16.x, which resulted in the correct certificates not being included in the QEMU process containers.

    With this update, director checks the configuration of the previously deployed environment for global_config_settings and uses it to ensure that the UseTLSTransportForNbd state stays the same in the upgrade as in the previous deployment. If global_config_settings exists in the configuration file, director checks the configuration of the use_tls_for_nbd key. If global_config_settings does not exist, director evaluates the hieradata key nova::compute::libvirt::qemu::nbd_tls. Keeping the UseTLSTransportForNbd state the same in the upgraded deployment as in the previous deployment ensures that live migration works. (BZ#1906698)
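
The decision order described above can be sketched in Python as follows. This is a simplified illustration, not the director code: the function name previous_nbd_tls_state and the plain dictionaries are hypothetical, and the sketch assumes that a missing key means "False", which matches the RHOSP 13 default.

```python
def previous_nbd_tls_state(config, hieradata):
    """Return the NBD TLS setting of the previously deployed environment."""
    global_settings = config.get("global_config_settings")
    if global_settings is not None:
        # Preferred source: the use_tls_for_nbd key in global_config_settings.
        # Assumption: a missing key is treated as False.
        return bool(global_settings.get("use_tls_for_nbd", False))
    # Fallback when global_config_settings does not exist: the hieradata key.
    return bool(hieradata.get("nova::compute::libvirt::qemu::nbd_tls", False))


print(previous_nbd_tls_state({"global_config_settings": {"use_tls_for_nbd": True}}, {}))
```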

4.6. RHBA-2021:2097 — Red Hat OpenStack Platform 16.1.6 director bug fix advisory

Changes to the openstack-cinder component:

  • In prior releases, the SolidFire driver created a duplicate volume whenever it retried an API request. This led to unexpected behavior due to the accumulation of unused volumes.

    With this update, the Block Storage service (cinder) checks for existing volume names before it creates a volume. When Block Storage service detects a read timeout, it immediately checks for volume creation to prevent invalid API calls. This update also adds the sf_volume_create_timeout option for the SolidFire driver so that you can set an appropriate timeout value for your environment. (BZ#1939398)
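
The check-before-create pattern described above can be sketched in Python as follows. This is not the SolidFire driver code: the ReadTimeout and FakeBackend classes and the create_volume_once function are hypothetical stand-ins that simulate a create request that succeeds on the back end but times out on the client.

```python
class ReadTimeout(Exception):
    """Stand-in for the timeout raised when the back end answers too slowly."""


class FakeBackend:
    """Hypothetical back end whose first create call succeeds but times out."""

    def __init__(self):
        self.volumes = set()
        self.calls = 0

    def list_volume_names(self):
        return set(self.volumes)

    def create_volume(self, name):
        self.calls += 1
        self.volumes.add(name)
        if self.calls == 1:
            raise ReadTimeout(name)


def create_volume_once(backend, name, retries=3):
    """Create a volume without duplicating it when a request is retried."""
    for _attempt in range(retries):
        # Check for an existing volume name before issuing the create call.
        if name in backend.list_volume_names():
            return name
        try:
            backend.create_volume(name)
            return name
        except ReadTimeout:
            # The request may have succeeded despite the timeout, so loop
            # back to the existence check instead of blindly retrying.
            continue
    # The last attempt may also have succeeded despite timing out.
    if name in backend.list_volume_names():
        return name
    raise RuntimeError("could not create volume %s" % name)


backend = FakeBackend()
created = create_volume_once(backend, "vol-1")
print(created, backend.calls)
```

In this simulation only one create request reaches the back end, so no duplicate volume accumulates.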

  • This update fixes a bug that prevented cinder list from listing volumes when multiple filters were passed. (BZ#1843788)
  • This update adds CHAP support to the Dell EMC PowerStore driver. (BZ#1905231)
  • In prior releases, cinder NEC driver backups failed when the object was a snapshot. This occurred because the snapshot argument does not have the volume_attachment attribute. With this update, backups no longer refer to the volume_attachment attribute when the argument is snapshot. (BZ#1910855)
  • This update fixes an issue that caused some API calls, such as create snapshot, to fail with an xNotPrimary error during workload re-balancing operations.

    When SolidFire is under heavy load or being upgraded, the SolidFire cluster might re-balance cluster workload by automatically moving connections from primary to secondary nodes. Previously, some API calls failed with an xNotPrimary error during these workload balance operations and were not retried.

    This update fixes the issue by adding the xNotPrimary exception to the SolidFire driver list of retryable exceptions. (BZ#1947474)
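
A list of retryable exceptions of this kind can be sketched in Python as follows. This is an illustration, not the SolidFire driver code: the SolidFireAPIError class, the RETRYABLE_FAULTS set, and the retry_on_faults decorator are hypothetical, and xNotPrimary is the only fault name taken from the text.

```python
import functools
import time


class SolidFireAPIError(Exception):
    """Hypothetical error type carrying the fault name returned by the API."""

    def __init__(self, name):
        super().__init__(name)
        self.name = name


# Hypothetical list; the only entry taken from the text is xNotPrimary.
RETRYABLE_FAULTS = {"xNotPrimary"}


def retry_on_faults(tries=3, delay=0.0):
    """Retry the wrapped call when it fails with a retryable fault."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except SolidFireAPIError as exc:
                    # Re-raise non-retryable faults and the final failure.
                    if exc.name not in RETRYABLE_FAULTS or attempt == tries:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


attempts = {"count": 0}


@retry_on_faults(tries=3)
def create_snapshot():
    # Simulate two failures during re-balancing, then success.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise SolidFireAPIError("xNotPrimary")
    return "snapshot-created"


result = create_snapshot()
print(result, attempts["count"])
```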

Changes to the openstack-heat component:

  • This update makes it possible to use OS::Heat::Delay resources in heat templates. Previously, a variable naming conflict caused an assertion error during attempted completion of an OS::Heat::Delay resource. A variable was renamed to eliminate the conflict. (BZ#1868543)

Changes to the openstack-nova component:

  • When an instance is created, the Compute (nova) service sanitizes the instance display name to generate a valid host name when DNS integration is enabled in the Networking (neutron) service.

    Before this update, the sanitization did not replace periods ('.') in instance names, for example, 'rhel-8.4'. This could result in display names being recognized as Fully Qualified Domain Names (FQDNs), which produced invalid host names. When instance names contained periods and DNS integration was enabled in the Networking service, the Networking service rejected the invalid host name, which resulted in a failure to create the instance and an HTTP 500 server error from the Compute service.

    With this update, periods are now replaced by hyphens in instance names to prevent host names being parsed as FQDNs. You can continue to use free-form strings for instance display names. (BZ#1872314)
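
The sanitization described above can be sketched in Python as follows. This is a minimal approximation, not the nova code: the function name sanitize_hostname and the exact set of replaced and dropped characters are assumptions, and only the period-to-hyphen replacement is taken from the text.

```python
import re


def sanitize_hostname(display_name):
    """Turn a free-form display name into a single DNS label (illustrative)."""
    name = display_name.lower()
    # Replace spaces, underscores, and periods with hyphens. Replacing the
    # periods keeps the result from being parsed as an FQDN.
    name = re.sub(r"[ _.]", "-", name)
    # Drop anything that is not a lowercase ASCII letter, digit, or hyphen.
    name = re.sub(r"[^a-z0-9-]", "", name)
    # A label cannot start or end with a hyphen; 63 is the DNS label limit.
    return name.strip("-")[:63].rstrip("-")


print(sanitize_hostname("rhel-8.4"))
```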

Changes to the openstack-tripleo-common component:

  • This update modifies the registry metadata creator to handle containers with and without namespaces in their URI. On the undercloud you can now manage containers that comply with the following formats:

    undercloud_host:port/namespace/container:tag
    undercloud_host:port/container:tag

    Red Hat does not support more complex namespaces, such as undercloud_host:port/name/space/container:tag, when pushing to the undercloud. (BZ#1919445)
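
The two supported formats can be told apart with a short Python sketch such as the following. This is an illustration, not the tripleo-common code: the regular expression, the function name parse_image, and the example host name are assumptions.

```python
import re

# host:port/[namespace/]container:tag, with at most one namespace component.
IMAGE_RE = re.compile(
    r"^(?P<host>[^/:]+):(?P<port>\d+)/"
    r"(?:(?P<namespace>[^/:]+)/)?"
    r"(?P<container>[^/:]+):(?P<tag>[^/:]+)$"
)


def parse_image(uri):
    """Split an image reference into its parts, or raise ValueError."""
    match = IMAGE_RE.match(uri)
    if match is None:
        raise ValueError("unsupported image reference: %s" % uri)
    return match.groupdict()


# Hypothetical undercloud host name and image names.
print(parse_image("undercloud.example:8787/rh-osbs/nova-api:16.1"))
print(parse_image("undercloud.example:8787/nova-api:16.1"))
```

A reference with a nested namespace, such as undercloud_host:port/name/space/container:tag, does not match the pattern and raises ValueError in this sketch.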

Changes to the openstack-tripleo-heat-templates component:

  • After upgrading with the Leapp utility, Compute nodes with OVS-DPDK workloads do not function properly. Choose one of the following workaround options:

    • Remove /etc/modules-load.d/vfio-pci.conf before the Compute upgrade.
    • Restart OVS on the Compute node after the Compute upgrade. (BZ#1895887)
  • This update fixes a configuration problem that caused Leapp upgrades to stop and fail while executing on a CephStorage node.

    Previously, CephStorage nodes were incorrectly configured to consume OpenStack highavailability, advanced-virt, and fast-datapath repos during Leapp upgrades.

    Now UpgradeLeappCommand options are configurable on a per-node basis and use the correct default for CephStorage nodes, so Leapp upgrades succeed for CephStorage nodes. (BZ#1936419)

Changes to the validations-common component:

  • This update fixes a bug that caused validations to fail before openstack undercloud upgrade in some cases. Before this update, a lack of permissions needed to access the requested logging directory sometimes resulted in the following failures:

    • Failure to log validation results
    • Failure of the validation run
    • Failure of artifacts collection from validation.

    This update adds a fallback logging directory. Validation results are logged and artifacts collected. (BZ#1895045)
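
A fallback of this kind can be sketched in Python as follows. This is not the validations-common code: the function name pick_log_dir and the directory names are hypothetical, and the example simulates an unusable requested directory by placing it under a regular file.

```python
import os
import tempfile


def pick_log_dir(requested, fallback):
    """Return the first directory that can be created or used for writing."""
    for candidate in (requested, fallback):
        try:
            os.makedirs(candidate, exist_ok=True)
        except OSError:
            # The directory cannot be created, for example due to permissions.
            continue
        if os.access(candidate, os.W_OK | os.X_OK):
            return candidate
    raise PermissionError("no writable logging directory available")


# Simulate an unusable requested directory: its parent is a regular file.
base = tempfile.mkdtemp()
blocker = os.path.join(base, "blocker")
open(blocker, "w").close()
bad = os.path.join(blocker, "logs")
good = os.path.join(base, "fallback")

chosen = pick_log_dir(bad, good)
print(chosen == good)
```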