Chapter 4. Technical Notes

This chapter supplements the information contained in the text of Red Hat OpenStack Platform "Train" errata advisories released through the Content Delivery Network.

4.1. RHEA-2020:3148 Red Hat OpenStack Platform 16.1 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2020:3148. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2020:3148.html.

Changes to the ansible-role-atos-hsm component:

  • With this enhancement, you can use ATOS HSM deployment with HA mode. (BZ#1676989)

Changes to the collectd component:

Changes to the openstack-cinder component:

  • With this enhancement, you can revert Block Storage (cinder) volumes to the most recent snapshot, if supported by the driver. This method of reverting a volume is more efficient than cloning from a snapshot and attaching a new volume. (BZ#1686001)
  • Director can now deploy the Block Storage Service in an active/active mode. This deployment scenario is supported only for Edge use cases. (BZ#1700402)
  • This update includes the following enhancements:

    • Support for revert-to-snapshot in VxFlex OS driver
    • Support for volume migration in VxFlex OS driver
    • Support for OpenStack volume replication v2.1 in VxFlex OS driver
    • Support for VxFlex OS 3.5 in the VxFlex OS driver

Changes to the openstack-designate component:

  • DNS-as-a-Service (designate) returns to technology preview status in Red Hat OpenStack Platform 16.1. (BZ#1603440)

Changes to the openstack-glance component:

  • The Image Service (glance) now supports multi stores with the Ceph RBD driver. (BZ#1225775)
  • In Red Hat OpenStack Platform 16.1, you can use the Image Service (glance) to copy existing image data into multiple stores with a single command. This removes the need for the operator to copy data manually and update image locations. (BZ#1758416)
  • In Red Hat OpenStack Platform 16.1, you can use the Image Service (glance) to copy existing image data into multiple stores with a single command. This removes the need for the operator to copy data manually and update image locations. (BZ#1758420)
  • With this update, when using Image Service (glance) multi stores, the image owner can delete an image copy from a specific store. (BZ#1758424)

Changes to the openstack-ironic component:

  • A regression was introduced in ipmitool-1.8.18-11 that caused IPMI access to take over 2 minutes for certain BMCs that did not support the "Get Cipher Suites" command. As a result, introspection could fail and deployments could take much longer than previously.

    With this update, ipmitool retries are handled differently, introspection passes, and deployments succeed.

    Note

    This issue with ipmitool is resolved in ipmitool-1.8.18-17. (BZ#1831893)

Changes to the openstack-ironic-python-agent component:

  • Before this update, there were no retries and no timeout when downloading a final instance image with the direct deploy interface in ironic. As a result, the deployment could fail if the server that hosted the image failed to respond.

    With this update, the image download process attempts 2 retries and has a connection timeout of 60 seconds. (BZ#1827721)
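
The retry behavior described above can be sketched in Python as follows. This is a minimal illustration, not the ironic-python-agent implementation: the function name download_with_retries and the injected fetch callable are hypothetical, and only the values of 2 retries and a 60-second timeout come from the note above.

```python
import time
from typing import Callable, Optional


def download_with_retries(
    fetch: Callable[[float], bytes],
    retries: int = 2,       # extra attempts after the first one fails
    timeout: float = 60.0,  # connection timeout passed to each attempt
    delay: float = 0.0,     # pause between attempts (assumed, not in the note)
) -> bytes:
    """Call fetch(timeout); on an OSError, retry up to `retries` more times."""
    last_error: Optional[OSError] = None
    for attempt in range(1 + retries):
        try:
            return fetch(timeout)
        except OSError as exc:  # covers connection errors and timeouts
            last_error = exc
            if attempt < retries:
                time.sleep(delay)
    assert last_error is not None
    raise last_error
```

In a real download, fetch could wrap a call such as urllib.request.urlopen(url, timeout=timeout).read(), whose errors are also subclasses of OSError.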

Changes to the openstack-neutron component:

  • Before this update, it was not possible to deploy the overcloud in a Distributed Compute Node (DCN) or spine-leaf configuration with stateless IPv6 on the control plane. Deployments in this scenario failed during ironic node server provisioning. With this update, you can now deploy successfully with stateless IPv6 on the control plane. (BZ#1803989)

Changes to the openstack-tripleo-common component:

  • When you update or upgrade python3-tripleoclient, Ansible does not receive the update or upgrade and Ansible or ceph-ansible tasks fail.

    When you update or upgrade, ensure that Ansible also receives the update so that playbook tasks can run successfully. (BZ#1852801)

  • With this update, the Red Hat Ceph Storage dashboard uses Ceph 4.1 and a Grafana container based on ceph4-rhel8. (BZ#1814166)
  • Before this update, during Red Hat Ceph Storage (RHCS) deployment, Red Hat OpenStack Platform (RHOSP) director generated the CephClusterFSID by passing the desired FSID to ceph-ansible and used the Python uuid1() function. With this update, director uses the Python uuid4() function, which generates UUIDs more randomly. (BZ#1784640)
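
The difference between the two UUID functions mentioned in BZ#1784640 can be seen with the Python standard library. The sketch below is only an illustration: generate_fsid is a hypothetical helper name and does not reflect how director is actually structured.

```python
import uuid


def generate_fsid() -> str:
    """Return a random (version 4) UUID string, as uuid4() produces."""
    return str(uuid.uuid4())


# uuid1() derives its value from a host identifier and the current time,
# so parts of it are predictable; uuid4() is built from random bits.
time_based = uuid.uuid1()
random_based = uuid.UUID(generate_fsid())
print(time_based.version, random_based.version)  # prints "1 4"
```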

Changes to the openstack-tripleo-heat-templates component:

  • There is an incomplete definition for TLS in the Orchestration service (heat) when you update from 16.0 to 16.1, and the update fails.

    To prevent this failure, you must set the following parameter and value: InternalTLSCAFile: ''. (BZ#1840640)

  • With this enhancement, you can configure Red Hat OpenStack Platform to use an external, pre-existing Ceph RadosGW cluster. You can manage this cluster externally as an object-store for OpenStack guests. (BZ#1440926)
  • With this enhancement, you can use director to deploy the Image Service (glance) with multiple image stores. For example, in a Distributed Compute Node (DCN) or Edge deployment, you can store images at each site. (BZ#1598716)
  • With this enhancement, HTTP traffic that travels from the HAProxy load balancer to Red Hat Ceph Storage RadosGW instances is encrypted. (BZ#1701416)
  • With this update, you can deploy pre-provisioned nodes with TLSe using the new 'tripleo-ipa' method. (BZ#1740946)
  • Before this update, in deployments with an IPv6 internal API network, the Block Storage Service (cinder) and Compute Service (nova) were configured with a malformed glance-api endpoint URI. As a result, cinder and nova services located in a DCN or Edge deployment could not access the Image Service (glance).

    With this update, the IPv6 addresses in the glance-api endpoint URI are correct and the cinder and nova services at Edge sites can access the Image Service successfully. (BZ#1815928)

  • With this enhancement, FreeIPA has DNS entries for the undercloud and overcloud nodes. DNS PTR records are necessary to generate certain types of certificates, particularly certificates for cinder active/active environments with etcd. You can disable this functionality with the IdMModifyDNS parameter in an environment file. (BZ#1823932)
  • In this release of Red Hat OpenStack Platform, you can no longer customize the Red Hat Ceph Storage cluster admin keyring secret. Instead, the admin keyring secret is generated randomly during initial deployment. (BZ#1832405)
  • Before this update, stale neutron-haproxy-qdhcp-* containers remained after you deleted the related network. With this update, all related containers are cleaned correctly when you delete a network. (BZ#1832720)
  • Before this update, the ExtraConfigPre per_node script was not compatible with Python 3. As a result, the overcloud deployment failed at the step TASK [Run deployment NodeSpecificDeployment] with the message SyntaxError: invalid syntax.

    With this update, the ExtraConfigPre per_node script is compatible with Python 3 and you can provision custom per_node hieradata. (BZ#1832920)

  • With this update, the swift_rsync container runs in unprivileged mode. This makes the swift_rsync container more secure. (BZ#1807841)
  • PowerMax configuration options have changed since Newton. This update includes the latest PowerMax configuration options and supports both iSCSI and FC drivers.

    The CinderPowermaxBackend parameter also supports multiple back ends. CinderPowermaxBackendName supports a list of back ends, and you can use the new CinderPowermaxMultiConfig parameter to specify parameter values for each back end. For example syntax, see environments/cinder-dellemc-powermax-config.yaml. (BZ#1813393)

  • Support for Xtremio Cinder Backend

    The Xtremio cinder back end is updated to support both iSCSI and FC drivers. It is also enhanced to support multiple back ends. (BZ#1852082)

  • Red Hat OpenStack Platform 16.1 includes tripleo-heat-templates support for VXFlexOS Volume Backend. (BZ#1852084)
  • Red Hat OpenStack Platform 16.1 includes support for SC Cinder Backend. The SC Cinder back end now supports both iSCSI and FC drivers, and can also support multiple back ends. You can use the CinderScBackendName parameter to list back ends, and the CinderScMultiConfig parameter to specify parameter values for each back end. For an example configuration file, see environments/cinder-dellemc-sc-config.yaml. (BZ#1852087)
  • PowerMax configuration options have changed since Newton. This update includes the latest PowerMax configuration options and supports both iSCSI and FC drivers.

    The CinderPowermaxBackend parameter also supports multiple back ends. CinderPowermaxBackendName supports a list of back ends, and you can use the new CinderPowermaxMultiConfig parameter to specify parameter values for each back end. For example syntax, see environments/cinder-dellemc-powermax-config.yaml. (BZ#1852088)
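
For the IPv6 glance-api endpoint fix in this list (BZ#1815928), the note does not say how the URI was malformed. A common problem with IPv6 literals in URIs is missing square brackets, so the following Python sketch assumes that case purely for illustration. The function name format_endpoint and the sample addresses are hypothetical.

```python
import ipaddress


def format_endpoint(host: str, port: int, scheme: str = "http") -> str:
    """Build scheme://host:port, bracketing IPv6 literals as URIs require."""
    try:
        if ipaddress.ip_address(host).version == 6:
            host = f"[{host}]"
    except ValueError:
        pass  # not an IP literal, for example a hostname
    return f"{scheme}://{host}:{port}"


print(format_endpoint("fd00:fd00:fd00:2000::5", 9292))
# prints "http://[fd00:fd00:fd00:2000::5]:9292"
```

Without the brackets, the colons in the address cannot be told apart from the colon that separates the host from the port.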

Changes to the openstack-tripleo-validations component:

  • Before this update, the data structure format that the ceph osd stat -f json command returns changed. As a result, the validation to stop the deployment unless a certain percentage of Red Hat Ceph Storage (RHCS) OSDs are running did not function correctly, and stopped the deployment regardless of how many OSDs were running.

    With this update, the new version of openstack-tripleo-validations computes the percentage of running RHCS OSDs correctly and the deployment stops early if fewer than the required percentage of RHCS OSDs are running. You can use the CephOsdPercentageMin parameter to customize the percentage of RHCS OSDs that must be running. The default value is 66%. Set this parameter to 0 to disable the validation. (BZ#1845079)
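
The percentage check described above can be sketched in Python. This is an assumption-laden illustration, not the validation code itself: the function name enough_osds_running is hypothetical, the OSD counts are passed in as plain integers instead of being parsed from the ceph osd stat output, and the use of a greater-than-or-equal comparison is assumed.

```python
def enough_osds_running(num_up: int, num_total: int,
                        min_percent: float = 66.0) -> bool:
    """Return True if the share of running OSDs meets the minimum."""
    if min_percent == 0:
        return True   # a value of 0 disables the validation
    if num_total == 0:
        return False  # no OSDs at all cannot satisfy a non-zero minimum
    return 100.0 * num_up / num_total >= min_percent


print(enough_osds_running(2, 3))  # prints "True"  (about 66.7% running)
print(enough_osds_running(1, 3))  # prints "False" (about 33.3% running)
```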

Changes to the puppet-cinder component:

Changes to the puppet-tripleo component:

  • Before this update, the etcd service was not configured properly to run in a container. As a result, an error occurred when the service tried to create the TLS certificate. With this update, the etcd service runs in a container and can create the TLS certificate. (BZ#1804079)

Changes to the python-cinderclient component:

  • Before this update, the latest volume attributes were not updated during poll, and the volume data was incorrect on the display screen. With this update, volume attributes update correctly during poll and the correct volume data appears on the display screen. (BZ#1594033)

Changes to the python-networking-ovn component:

  • Because of a core OVN bug, virtual machines with floating IP (FIP) addresses cannot route to other networks in an ML2/OVN deployment with distributed virtual routing (DVR) enabled. Core OVN sets a bad next hop when routing SNAT IPv4 traffic from a VM with a floating IP with DVR enabled. Instead of the gateway IP, OVN sets the destination IP. As a result, the router sends an ARP request for an unknown IP instead of routing the request to the gateway.

    Workaround: Before you deploy a new overcloud with ML2/OVN, disable DVR by setting NeutronEnableDVR: false in an environment file. If you have ML2/OVN in an existing deployment, complete the following steps:

    1) Set enable_distributed_floating_ip to 'False' in the [ovn] section of the ml2_conf.ini file:

    (undercloud) [stack@undercloud-0 ~]$ ansible -i /usr/bin/tripleo-ansible-inventory -m shell -b -a "crudini --set /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/ml2_conf.ini ovn enable_distributed_floating_ip False" Controller

    2) Restart neutron server containers:

    (undercloud) [stack@undercloud-0 ~]$ ansible -i /usr/bin/tripleo-ansible-inventory -m shell -b -a "podman restart neutron_api" Controller

    3) Centralize all of the FIP traffic through gateway nodes. Run the following command on any overcloud node:

    $ export NB=$(sudo ovs-vsctl get open . external_ids:ovn-remote | sed -e 's/\"//g' | sed -e 's/6642/6641/g')
    $ alias ovn-nbctl='sudo podman exec ovn_controller ovn-nbctl --db=$NB'
    $ for fip in $(ovn-nbctl --bare --columns _uuid find nat type=dnat_and_snat); do ovn-nbctl clear NAT $fip external_mac; done

    When the fix is available in RHOSP 16.1.1, you can re-enable distributed FIP traffic:

    1) Set enable_distributed_floating_ip back to 'True' in the [ovn] section of the ml2_conf.ini file:

    (undercloud) [stack@undercloud-0 ~]$ ansible -i /usr/bin/tripleo-ansible-inventory -m shell -b -a "crudini --set /var/lib/config-data/puppet-generated/neutron/etc/neutron/plugins/ml2/ml2_conf.ini ovn enable_distributed_floating_ip True" Controller

    2) Restart neutron server containers:

    (undercloud) [stack@undercloud-0 ~]$ ansible -i /usr/bin/tripleo-ansible-inventory -m shell -b -a "podman restart neutron_api" Controller

    3) Trigger the update in all of the FIPs. Run the following command on any overcloud node:

    $ export NB=$(sudo ovs-vsctl get open . external_ids:ovn-remote | sed -e 's/\"//g' | sed -e 's/6642/6641/g')
    $ alias ovn-nbctl='sudo podman exec ovn_controller ovn-nbctl --db=$NB'
    $ for i in $(ovn-nbctl --bare --columns logical_port find nat type=dnat_and_snat); do ovn-nbctl set logical_switch_port $i up=false; done

    Note

    Disabling DVR causes traffic to be centralized. All L3 traffic travels through the Controller/Networker nodes. This might affect scale, data plane performance, and throughput. (BZ#1836963)
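
The sed pipeline used in step 3 of both procedures takes the OVN southbound database address that Open vSwitch stores in ovn-remote, strips the quotation marks, and replaces port 6642 with the northbound database port 6641. The same string transformation is sketched below in Python for clarity. The function name nb_from_sb and the sample address are made up for this illustration.

```python
def nb_from_sb(ovn_remote: str) -> str:
    """Mirror the sed pipeline: drop double quotes, swap port 6642 for 6641."""
    return ovn_remote.replace('"', "").replace("6642", "6641")


print(nb_from_sb('"tcp:172.17.1.10:6642"'))  # prints "tcp:172.17.1.10:6641"
```

Like the sed expression, this replaces every occurrence of 6642 in the value, not only the port field.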

Changes to the python-tripleoclient component:

  • With this enhancement, you can use the --limit, --skip-tags, and --tags Ansible options in the openstack overcloud deploy command. This is particularly useful when you want to run the deployment on specific nodes, for example, during scale-up operations. (BZ#1767581)
  • With this enhancement, there are new options in the openstack tripleo container image push command that you can use to provide credentials for the source registry. The new options are --source-username and --source-password.

    Before this update, you could not provide credentials when pushing a container image from a source registry that requires authentication. Instead, the only mechanism to push the container was to pull the image manually and push from the local system. (BZ#1811490)

  • With this update, the container_images_file parameter is now a required option in the undercloud.conf file. You must set this parameter before you install the undercloud.

    With the recent move to use registry.redhat.io as the container source, you must authenticate when you fetch containers. For the undercloud, the container_images_file is the recommended option to provide the credentials when you perform the installation. Before this update, if this parameter was not set, the deployment failed with authentication errors when trying to fetch containers. (BZ#1819016)

4.2. RHBA-2020:3542 Red Hat OpenStack Platform 16.1.1 general availability advisory

The bugs contained in this section are addressed by advisory RHBA-2020:3542. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2020:3542.html.

Changes to the openstack-tripleo component:

  • The overcloud deployment steps included an older Ansible syntax that tagged the tripleo-bootstrap and tripleo-ssh-known-hosts roles as common_roles. This older syntax caused Ansible to run tasks tagged with the common_roles when Ansible did not use the common_roles tag. This syntax resulted in errors during the 13 to 16.1 system_upgrade process.

    This update uses a newer syntax to tag the tripleo-bootstrap and tripleo-ssh-known-hosts roles as common_roles. Errors do not appear during the 13 to 16.1 system_upgrade process and you no longer need to include the --playbook upgrade_steps_playbook.yaml option in the system_upgrade process as a workaround. (BZ#1851914)

Changes to the openstack-tripleo-heat-templates component:

  • This update fixes a GRUB parameter naming convention that led to unpredictable behaviors on Compute nodes during Leapp upgrades.

    Previously, the presence of the obsolete "TRIPLEO" prefix on GRUB parameters caused problems.

    The file /etc/default/grub has been updated with the GRUB prefix for the tripleo kernel args parameter so that Leapp can upgrade it correctly. This is done by adding "upgrade_tasks" to the service "OS::TripleO::Services::BootParams", which is a new service added to all roles in the roles_data.yaml file. (BZ#1858673)

  • This update fixes a problem that caused bare metal nodes to become non-responsive during Leapp upgrades.

    Previously, Leapp did not process transient interfaces like SR-IOV virtual functions (VF) during migration. As a result, Leapp did not find the VF interfaces during the upgrade, and nodes entered an unrecoverable state.

    Now the service "OS::TripleO::Services::NeutronSriovAgent" sets the physical function (PF) to remove all VFs, and migrates workloads before the upgrade. After the successful Leapp upgrade, os-net-config runs again with the "--no-activate" flag to re-establish the VFs. (BZ#1866372)

  • This director enhancement automatically installs the Leapp utility on overcloud nodes to prepare for OpenStack upgrades. This enhancement includes two new Heat parameters: LeappRepoInitCommand and LeappInitCommand. In addition, if you have the following repository defaults, you do not need to pass UpgradeLeappCommandOptions values.

    --enablerepo rhel-8-for-x86_64-baseos-eus-rpms --enablerepo rhel-8-for-x86_64-appstream-eus-rpms --enablerepo rhel-8-for-x86_64-highavailability-eus-rpms --enablerepo advanced-virt-for-rhel-8-x86_64-rpms --enablerepo ansible-2.9-for-rhel-8-x86_64-rpms --enablerepo fast-datapath-for-rhel-8-x86_64-rpms

    (BZ#1845726)

  • If you do not set the UpgradeLevelNovaCompute parameter to '', live migrations are not possible when you upgrade from RHOSP 13 to RHOSP 16. (BZ#1849235)
  • This update fixes a bug that prevented the successful deployment of transport layer security (TLS) everywhere with public TLS certificates. (BZ#1852620)
  • Before this update, director did not set the noout flag on Red Hat Ceph Storage OSDs before running a Leapp upgrade. As a result, additional time was required for the OSDs to rebalance after the upgrade.

    With this update, director sets the noout flag before the Leapp upgrade, which accelerates the upgrade process. Director also unsets the noout flag after the Leapp upgrade. (BZ#1853275)

  • Before this update, the Leapp upgrade could fail if you had any NFS shares mounted. Specifically, the nodes that run the Compute Service (nova) or the Image Service (glance) hung if they used an NFS mount.

    With this update, before the Leapp upgrade, director unmounts /var/lib/nova/instances, /var/lib/glance/images, and any Image Service staging area that you define with the GlanceNodeStagingUri parameter. (BZ#1853433)

Changes to the openstack-tripleo-validations component:

  • This update fixes a Red Hat Ceph Storage (RHCS) version compatibility issue that caused failures during upgrades from Red Hat OpenStack Platform 13 to 16.1. Before this fix, validations performed during the upgrade worked with RHCS3 clusters but not RHCS4 clusters. Now the validation works with both RHCS3 and RHCS4 clusters. (BZ#1852868)

Changes to the puppet-tripleo component:

  • Before this update, the Red Hat Ceph Storage dashboard listener was created in the HAProxy configuration, even if the dashboard was disabled. As a result, upgrades of OpenStack with Ceph could fail.

    With this update, the service definition has been updated to distinguish the Ceph MGR service from the dashboard service so that the dashboard service is not configured if it is not enabled and upgrades are successful. (BZ#1850991)