Chapter 4. Technical Notes

This chapter supplements the information contained in the text of Red Hat OpenStack Platform "Queens" errata advisories released through the Content Delivery Network.

4.1. RHEA-2018:2086 — Red Hat OpenStack Platform 13.0 Enhancement Advisory

The bugs contained in this section are addressed by advisory RHEA-2018:2086. Further information about this advisory is available at https://access.redhat.com/errata/RHEA-2018:2086.

ceph-ansible

The ceph-ansible utility does not always remove the ceph-create-keys container from the same node where it was created.

Because of this, the deployment may fail with the message "Error response from daemon: No such container: ceph-create-keys." This may affect any ceph-ansible run, including fresh deployments, that has either of the following:

  • multiple compute nodes, or
  • a custom role acting as a Ceph client that also hosts a service consuming Ceph.

gnocchi

The openstack-gnocchi packages have been renamed to gnocchi. The openstack- prefix was removed because of an upstream project scoping change. Gnocchi has been moved out of the OpenStack umbrella and is maintained as a stand-alone project.

opendaylight

Connecting to an external IP fails when a floating IP is associated with an instance and then disassociated. This situation happens in a tenant VLAN network when:

  • a VM spawned on a non-NAPT switch is associated with a floating IP, and
  • the floating IP is then removed.

This sporadically results in a missing flow in the FIB table of the NAPT switch.

Due to the missing FIB table entry, the VM loses connectivity to the public network.

Associating a floating IP with the VM restores connectivity to the public network. As long as the floating IP is associated with the VM, the VM can connect to the internet. However, this consumes a public/floating IP address from the external network.

openstack-cinder

Previously, the cinder service had to be restarted twice when performing an offline upgrade because of the rolling upgrade mechanism.

The double restart can now be skipped by passing the new optional --bump-versions parameter to the cinder-manage db sync command.
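
As a sketch of how this fits into the offline upgrade flow (the service names are illustrative and depend on your deployment), the database sync is run with the new flag before the services are started again:

  # cinder-manage db sync --bump-versions
  # systemctl start openstack-cinder-api openstack-cinder-scheduler openstack-cinder-volume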

The Block Storage service (cinder) uses a synchronization lock to prevent duplicate entries in the volume image cache. The scope of the lock was too broad and caused simultaneous requests to create a volume from an image to compete for the lock, even when the image cache was not enabled.

These simultaneous requests to create a volume from an image would be serialized and not run in parallel.

With this update, the synchronization lock has been narrowed in scope and takes effect only when the volume image cache is enabled.

Now, simultaneous requests to create a volume from an image run in parallel when the volume image cache is disabled. When the volume image cache is enabled, locking is minimized to ensure only a single entry is created in the cache.

openstack-manila

The Shared File System service (manila) now provides IPv6 access rule support with NetApp ONTAP cDOT driver, which lets you use manila with IPv6 environments.

As a result, the Shared File System service now supports exporting shares backed by NetApp ONTAP back ends over IPv6 networks. Access to the exported shares is controlled by IPv6 client addresses.

The Shared File System service (manila) supports mounting shared file systems backed by a Ceph File System (CephFS) via the NFSv4 protocol. NFS-Ganesha servers operating on Controller nodes are used to export CephFS to tenants with high availability (HA). Tenants are isolated from one another and may only access CephFS through the provided NFS gateway interface. This new feature is fully integrated into director, enabling CephFS back end deployment and configuration for the Shared File System service.

openstack-neutron

When an interface is added or removed to or from a router and isolated metadata is enabled on the DHCP Agent, the metadata proxy for that network is not updated.

As a result, instances were not able to fetch metadata if they were on a network that is not connected to a router.

With this fix, metadata proxies are updated when a router interface is added or removed. Instances can then fetch metadata from the DHCP namespace when their networks become isolated.

openstack-selinux

Previously, the virtlogd service logged redundant AVC denial errors when a guest virtual machine was started. With this update, the virtlogd service no longer attempts to send shutdown inhibition calls to systemd, which prevents the described errors from occurring.

openstack-swift

The Object Store service (swift) can now integrate with Barbican to transparently encrypt and decrypt your stored (at-rest) objects. At-rest encryption is distinct from in-transit encryption and refers to the objects being encrypted while being stored on disk.

Swift objects are stored as clear text on disk. These disks can pose a security risk if not properly disposed of when they reach end-of-life. Encrypting the objects mitigates that risk.

Swift performs these encryption tasks transparently, with the objects being automatically encrypted when uploaded to swift, then automatically decrypted when served to a user. This encryption and decryption is done using the same (symmetric) key, which is stored in Barbican.

openstack-tripleo-common

Octavia does not scale to practical workloads because the default quotas configured for the "service" project limit the number of Octavia load balancers that can be created in the overcloud.

To mitigate this problem, as the overcloud admin user, set the required quotas to unlimited or some sufficiently large value. For example, run the following commands on the undercloud:

# source ~/overcloudrc
# openstack quota set --cores -1 --ram -1 --ports -1 --instances -1 --secgroups -1 service

The tripleo.plan_management.v1.update_roles workflow did not pass the overcloud plan name (swift container name) or zaqar queue name to the sub-workflow it triggered. This caused incorrect behavior when using an overcloud plan name other than the default ('overcloud'). This fix correctly passes these parameters and restores the correct behavior.

The 'docker kill' command does not exit if the container is set to automatically restart. If a user attempts to run 'docker kill <container>', it may hang indefinitely. In this case, CTRL+C will stop the command.

To avoid the problem, use 'docker stop' (instead of 'docker kill') to stop a containerized service.

Cause: The "openstack overcloud node configure" command would only take image names not image IDs for "deploy-kernel" and "deploy-ramdisk" parameters. Image IDs are now accepted after this fix.

openstack-tripleo-heat-templates

This enhancement adds support for deploying real-time (RT) enabled compute nodes from director alongside "regular" compute nodes.

  1. Based on tripleo-heat-templates/environments/compute-real-time-example.yaml, create a compute-real-time.yaml environment file that sets the parameters for the ComputeRealTime role, with at least the correct values for the following (see the example environment file after this procedure):

    • IsolCpusList and NovaVcpuPinSet: a list of CPU cores that should be reserved for real-time workloads. This depends on your CPU hardware on your real-time compute nodes.
    • KernelArgs: set to "default_hugepagesz=1G hugepagesz=1G hugepages=X" with X depending on the number of guests and how much memory they will have.
  2. Build and upload the overcloud-realtime-compute image:

    • Prepare the repos (for CentOS):

    • openstack overcloud image build --image-name overcloud-realtime-compute --config-file /usr/share/openstack-tripleo-common/image-yaml/overcloud-realtime-compute.yaml --config-file /usr/share/openstack-tripleo-common/image-yaml/overcloud-realtime-compute-centos7.yaml
    • openstack overcloud image upload --update-existing --os-image-name overcloud-realtime-compute.qcow2
  3. Create roles_data.yaml with ComputeRealTime and all other required roles, for example: openstack overcloud roles generate -o ~/rt_roles_data.yaml Controller ComputeRealTime … and assign the ComputeRealTime role to the real-time nodes in one of the usual ways. See https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/custom_roles.html
  4. Deploy the overcloud:

    openstack overcloud deploy --templates -r ~/rt_roles_data.yaml -e ./tripleo-heat-templates/environments/host-config-and-reboot.yaml -e ./compute-real-time.yaml [...]
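
The following is a minimal sketch of the compute-real-time.yaml environment file from step 1. The CPU lists and hugepage count are placeholders that must match the hardware of your real-time compute nodes, and the ComputeRealTimeParameters role-parameter mapping is assumed from the upstream example file:

  parameter_defaults:
    ComputeRealTimeParameters:
      IsolCpusList: "4-15"
      NovaVcpuPinSet: "4-15"
      KernelArgs: "default_hugepagesz=1G hugepagesz=1G hugepages=32"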

The glance-direct method requires a shared staging area when used in an HA configuration. Image uploads using the 'glance-direct' method may fail in an HA environment if a common staging area is not present. Incoming requests are distributed across the available controller nodes, so one controller may handle the first step and a different controller the next step, with each controller writing the image to a different staging area. The second controller then does not have access to the staging area used by the controller that handled the first step.

Glance supports multiple image import methods, including the 'glance-direct' method. This method uses a three-step approach: creating an image record, uploading the image to a staging area, and then transferring the image from the staging area to the storage backend so the image becomes available. In an HA setup (i.e., with 3 controller nodes), the glance-direct method requires a common staging area using a shared file system across the controller nodes.

The list of enabled Glance import methods can now be configured. The default configuration does not enable the 'glance-direct' method (web-download is enabled by default). To avoid the issue and reliably import images to Glance in an HA environment, do not enable the 'glance-direct' method.
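
For reference, the underlying glance-api option that controls this list is enabled_import_methods. A sketch of a configuration that leaves only web-download enabled follows; the file location is illustrative, and in director-based deployments the value should be managed through the service configuration rather than edited by hand:

  # glance-api.conf
  [DEFAULT]
  enabled_import_methods = [web-download]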

The openvswitch systemd script deletes the /run/openvswitch folder when the service is stopped on the host. The /run/openvswitch path inside the ovn-controller container then becomes a stale directory. When the service is started again, it recreates the folder. For ovn-controller to access this folder again, the folder must be remounted or the ovn-controller container restarted.

A new CinderRbdExtraPools Heat parameter has been added which specifies a list of Ceph pools for use with RBD backends for Cinder. An extra Cinder RBD backend driver is created for each pool in the list. This is in addition to the standard RBD backend driver associated with the CinderRbdPoolName. The new parameter is optional and defaults to an empty list. All of the pools are associated with a single Ceph cluster.
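
A minimal sketch of how the parameter might be used in an environment file (the pool names are placeholders and the pools must already exist in the Ceph cluster):

  parameter_defaults:
    CinderRbdPoolName: volumes
    CinderRbdExtraPools:
      - volumes-fast
      - volumes-archive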

Redis is unable to correctly replicate data across nodes in an HA deployment with TLS enabled. Redis follower nodes will not contain any data from the leader node. It is recommended to disable TLS for Redis deployments.

This enhancement adds support for sending metrics data to a Gnocchi DB instance.

The following new parameters were added for the collectd composable service. If CollectdGnocchiAuthMode is set to 'simple', then CollectdGnocchiProtocol, CollectdGnocchiServer, CollectdGnocchiPort, and CollectdGnocchiUser are taken into account for configuration.

If CollectdGnocchiAuthMode is set to 'keystone', then CollectdGnocchiKeystone* parameters are taken into account for configuration.

The following is a detailed description of the added parameters; an example environment file follows the parameter list:

CollectdGnocchiAuthMode

type: string

description: Type of authentication Gnocchi server is using. Supported values are 'simple' and 'keystone'.

default: 'simple'

CollectdGnocchiProtocol

type: string

description: API protocol Gnocchi server is using.

default: 'http'

CollectdGnocchiServer

type: string

description: The name or address of a gnocchi endpoint to which we should send metrics.

default: nil

CollectdGnocchiPort

type: number

description: The port to which we will connect on the Gnocchi server.

default: 8041

CollectdGnocchiUser

type: string

description: Username for authenticating to the remote Gnocchi server using simple authentication.

default: nil

CollectdGnocchiKeystoneAuthUrl

type: string

description: Keystone endpoint URL to authenticate to.

default: nil

CollectdGnocchiKeystoneUserName

type: string

description: Username for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneUserId

type: string

description: User ID for authenticating to Keystone.

default: nil

CollectdGnocchiKeystonePassword

type: string

description: Password for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneProjectId

type: string

description: Project ID for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneProjectName

type: string

description: Project name for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneUserDomainId

type: string

description: User domain ID for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneUserDomainName

type: string

description: User domain name for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneProjectDomainId

type: string

description: Project domain ID for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneProjectDomainName

type: string

description: Project domain name for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneRegionName

type: string

description: Region name for authenticating to Keystone.

default: nil

CollectdGnocchiKeystoneInterface

type: string

description: Type of Keystone endpoint to authenticate to.

default: nil

CollectdGnocchiKeystoneEndpoint

type: string

description: Explicitly state the Gnocchi server URL if you want to override the Keystone value.

default: nil

CollectdGnocchiResourceType

type: string

description: Default resource type created by the collectd-gnocchi plugin in Gnocchi to store hosts.

default: 'collectd'

CollectdGnocchiBatchSize

type: number

description: Minimum number of values Gnocchi should batch.

default: 10
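
As an illustration of the simple authentication mode, an environment file using these parameters might look like the following sketch (the server name and user are placeholders):

  parameter_defaults:
    CollectdGnocchiAuthMode: 'simple'
    CollectdGnocchiProtocol: 'http'
    CollectdGnocchiServer: gnocchi.example.com
    CollectdGnocchiPort: 8041
    CollectdGnocchiUser: collectd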

BZ#1566376

The OVN metadata service was not being deployed in DVR-based environments. Therefore, instances were not able to fetch metadata such as the instance name, public keys, and so on.

This patch enables the OVN metadata service so that any booted instance can fetch its metadata.

The Heat templates for Cinder backend services were triggering Puppet to deploy the cinder-volume service on the overcloud host, regardless of whether the service is meant to be deployed in a container. This caused the cinder-volume service to be deployed twice: in a container as well as on the host.

Because of this, the OpenStack volume operations (such as creating and attaching a volume) would occasionally fail when the operation was handled by the rogue cinder-volume service running on the host.

With this update, the Cinder backend heat templates no longer deploy a second instance of the cinder-volume service.

A poorly performing Swift cluster used as a Gnocchi back end can generate 503 errors in the collectd log and "ConnectionError: ('Connection aborted.', CannotSendRequest())" errors in gnocchi-metricd.conf. To mitigate the problem, increase the value of the CollectdDefaultPollingInterval parameter or improve the Swift cluster performance.

The manila-share service fails to initialize because changes to ceph-ansible’s complex ceph-keys processing generate incorrect content in the /etc/ceph/ceph.client.manila.keyring file.

To allow the manila-share service to initialize:

1. Make a copy of /usr/share/openstack-tripleo-heat-templates to use for the overcloud deploy.

2. Edit the …/tripleo-heat-templates/docker/services/ceph-ansible/ceph-base.yaml file in the copy and change all triple backslashes in line 295 to single backslashes.

  Before:
    mon_cap: 'allow r, allow command \\\"auth del\\\", allow command \\\"auth caps\\\", allow command \\\"auth get\\\", allow command \\\"auth get-or-create\\\"'
  After:
    mon_cap: 'allow r, allow command \"auth del\", allow command \"auth caps\", allow command \"auth get\", allow command \"auth get-or-create\"'

3. Deploy the overcloud, substituting the path to the copy of tripleo-heat-templates wherever /usr/share/openstack-tripleo-heat-templates occurred in your original overcloud deploy command.

The /etc/ceph/ceph.client.manila.keyring file will then have the proper contents, and the manila-share service will initialize properly.

When configuring the cinder-volume service for HA, cinder’s DEFAULT/host configuration was set to "hostgroup". Other cinder services (cinder-api, cinder-scheduler, cinder-backup) would use "hostgroup" for their configuration, regardless of which overcloud node was running the service. Log messages from these services looked like they all originated from the same "hostgroup" host, which made it difficult to know which node generated the message.

When deploying for HA, cinder-volume’s backend_host is set to "hostgroup" instead of setting DEFAULT/host to that value. This ensures each node’s DEFAULT/host value is unique.

Consequently, log messages from cinder-api, cinder-scheduler, and cinder-backup are correctly associated with the node that generated the message.

After upgrading to a new release, Block Storage services (cinder) were stuck using the old RPC versions from the prior release. Because of this, all cinder API requests requiring the latest RPC versions failed.

When upgrading to a new release, all cinder RPC versions are updated to match the latest release.

python-cryptography

Since version 2.1, python-cryptography checks that the DNS names used in certificates are compliant with IDN standards. If the names found do not follow this specification, python-cryptography fails to validate the certificate, and different errors may occur when using the OpenStack command-line interface or in OpenStack service logs.

The initial python-cryptography build imported from RDO was missing Obsoletes entries. The RHEL 7 build of this package is correct and has the right Obsoletes entries.

This fix adds the Obsoletes for python-cryptography.

python-ironic-tests-tempest

A tempest plugin (-tests) RPM installed before the upgrade fails after the upgrade to OSP 13. The initial upgrade packaging did not include the epoch handling needed to obsolete the old RPM. The -tests sub-RPM is not shipped in OSP 13, and the Obsoletes in the new plugin RPM did not correctly obsolete the right RPM.

To fix the issue, correct the Obsoletes, or manually uninstall the old -tests RPM and manually install the replacement plugin python2-*-tests-tempest.

python-networking-ovn

To help maintain consistency between the neutron and OVN databases, configuration changes are internally compared and verified in the backend. Each configuration change is assigned a revision number, and a scheduled task validates all create, update, and delete operations made to the databases.

This version introduces support for internal DNS resolution in networking-ovn. Note that there are two known limitations; one is bz#1581332, which prevents proper resolution of internal FQDN requests via the internal DNS.

Please note that the extension is not configured by default by tripleo on the GA release. See bz#1577592 for a workaround.

Previously, when a subnet was created without a gateway, no DHCP options were added, and instances on such subnets were not able to obtain an address through DHCP.

The metadata/DHCP port is now used instead for this purpose so that instances can obtain an IP address. You must enable the metadata service. Instances on subnets without an external gateway are now able to obtain their IP addresses through DHCP via the OVN metadata/DHCP port.

The current L3 HA scheduler was not taking the priorities of the nodes into consideration. Therefore, all gateways were being hosted by the same node and the load was not distributed across candidates.

This fix implements an algorithm to select the least loaded node when scheduling a gateway router. Gateway ports are now being scheduled on the least loaded network node distributing the load evenly across them.

When a subport was reassigned to a different trunk on another hypervisor, it did not get its binding info updated and the subport did not transition to ACTIVE.

This fix clears the binding info when the subport is removed from the trunk. The subport now transitions to ACTIVE when it is reassigned to another trunk port that resides on a different hypervisor.

python-os-brick

When using iSCSI discovery, the node startup configuration was reset from "automatic" to "default", which caused the services not to be started on reboot. This issue is fixed by restoring all startup values after each discovery.

python-zaqar-tests-tempest

Upgrades had dependency issues because the tempest plugins were extracted from the openstack-*-tests RPM subpackages during the Queens cycle. However, not all of the packaging had the right combination of Provides and Obsoletes. OSP 13 does not ship the -tests (unit test) sub-RPMs.

Attempting to upgrade with -tests packages installed from the prior release caused failures due to these dependency issues.

To correct this issue, the Obsoletes for the older versions of the -tests RPMs they were extracted from have been added back.

4.2. RHSA-2018:2214 — Important: openstack-tripleo-heat-templates security update

The bugs contained in this section are addressed by advisory RHSA-2018:2214. Further information about this advisory is available at https://access.redhat.com/errata/RHSA-2018:2214.html.

openstack-tripleo-common

Logs from Ansible playbooks now include timestamps that provide information about the timing of actions during deployment, updates, and upgrades.

openstack-tripleo-heat-templates

Previously, overcloud updates failed due to stale cache in OpenDaylight. With this update, OpenDaylight is stopped and the stale cache is removed before upgrading to a new version. Level 1 updates work with OpenDaylight deployments. Level 2 updates are currently unsupported.

Enabling Octavia on an existing overcloud deployment reports as a success, but the Octavia API endpoints are not reachable because the firewall rules on the Controller nodes are misconfigured.

Workaround:

On all controller nodes, add firewall rules and make sure they are inserted before the DROP rule:

IPv4:
  # iptables -A INPUT -p tcp -m multiport --dports 9876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy ipv4" -j ACCEPT
  # iptables -A INPUT -p tcp -m multiport --dports 13876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy_ssl ipv4" -j ACCEPT
  # iptables -A INPUT -p tcp -m multiport --dports 9876,13876 -m state --state NEW -m comment --comment "120 octavia_api ipv4" -j ACCEPT

IPv6:
  # ip6tables -A INPUT -p tcp -m multiport --dports 9876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy ipv6" -j ACCEPT
  # ip6tables -A INPUT -p tcp -m multiport --dports 13876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy_ssl ipv6" -j ACCEPT
  # ip6tables -A INPUT -p tcp -m multiport --dports 9876,13876 -m state --state NEW -m comment --comment "120 octavia_api ipv6" -j ACCEPT

Restart HAProxy:

  # docker restart haproxy-bundle-docker-0

OpenDaylight logging might be missing earlier logs. This is a known issue with journald logging of OpenDaylight (using the “docker logs opendaylight_api” command). The current workaround is to switch OpenDaylight logging to the “file” mechanism, which logs inside the container to /opt/opendaylight/data/logs/karaf.log. To do this, configure the following heat parameter: OpenDaylightLogMechanism: ‘file’.

Rerunning an overcloud deploy command against an existing overcloud failed to trigger a restart of any pacemaker managed resource. For example, when adding a new service to haproxy, haproxy would not restart, rendering the newly configured service unavailable until a manual restart of the haproxy pacemaker resource.

With this update, a configuration change of any pacemaker resource is detected, and the pacemaker resource automatically restarts. Any changes in the configuration of pacemaker managed resources is then reflected in the overcloud.

Service deployment tasks within the minor-update workflow were run twice because of superfluous entries in the list of playbooks. This update removes the superfluous playbook entries and includes host preparation tasks directly in the updated playbook. Actions in minor version updates now run once, in the desired order.

Previously, the UpgradeInitCommonCommand parameter was not present in heat templates used to deploy the overcloud on pre-provisioned servers. The ‘openstack overcloud upgrade prepare’ command would not perform all of the necessary operations, which caused issues during upgrades in some environments.

This update adds UpgradeInitCommonCommand to the templates used for pre-provisioned servers, allowing the ‘openstack overcloud upgrade prepare’ command to perform the necessary actions.

To enhance security, the default OpenDaylightPassword “admin” is now replaced by a randomly generated 16-digit number. You can overwrite the randomly generated password by specifying a password in a heat template:

$ cat odl_password.yaml
parameter_defaults:
  OpenDaylightPassword: admin

And then pass the file to the overcloud deploy command:

openstack overcloud deploy <other env files> -e odl_password.yaml

puppet-opendaylight

Previously, the Karaf shell (the management shell for OpenDaylight) was not bound to a specific IP on port 8101, causing the Karaf shell to listen on the public-facing, external network. This created a security vulnerability, because the external network could be used to access OpenDaylight on the port.

This update binds the Karaf shell to the internal API network IP during deployment, which makes the Karaf shell only accessible on the private internal API network.

4.3. RHBA-2018:2215 — openstack-neutron bug fix advisory

The bugs contained in this section are addressed by advisory RHBA-2018:2215. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2018:2215.html.

opendaylight

Layer 3 connectivity between nova instances across multiple subnets may fail when an instance without a floating IP tries to reach another instance that has a floating IP on another router. This occurs when nova instances are spread across multiple compute nodes. There is no suitable workaround for this issue.

During deployment, one or more OpenDaylight instances may fail to start correctly due to a feature loading bug. This may lead to a deployment or functional failure.

Only two of the three OpenDaylight instances must be functional for the deployment to succeed, so a deployment can pass even though the third OpenDaylight instance started incorrectly. Check the health status of each container with the docker ps command. If a container is unhealthy, restart it with docker restart opendaylight_api.

When a deployment fails, the only option is to restart the deployment. For TLS-based deployments, all OpenDaylight instances must boot correctly or deployment will fail.

Missing parameters from createFibEntry generate a Null Pointer Exception (NPE) during NAT setup. This bug may result in missing FIB entries from the routing table, causing NAT or routing to fail. This update adds the proper parameters to the RPC call. NPE is no longer seen in the OpenDaylight log, and NAT and routing function correctly.

When the NAPT switch is selected on a node without any port in a VLAN network, not all of the required flows are programmed. External connectivity fails for all VMs in the network that do not have floating IP addresses. This update adds a pseudo port to create a VLAN footprint in the NAPT switch for VLANs that are part of the router. External connectivity now works for VMs without floating IP addresses.

A race condition causes Open vSwitch to not connect to the OpenDaylight openflowplugin. A fix is currently being implemented for a 13.z release of this product.

When the router gateway is cleared, the Layer 3 flows related to learned IP addresses are not removed. The learned IP addresses include the PNF and external gateway IP addresses. This leaves stale flows, but does not cause any functional issue, because the external gateway and IP address do not change frequently. The stale flows are removed when the external network is deleted.

openstack-neutron

A new configuration option called bridge_mac_table_size has been added for the neutron OVS agent. This value is set as the "other_config:mac-table-size" option on each bridge managed by the openvswitch-neutron-agent. The value controls the maximum number of MAC addresses that can be learned on a bridge. The default value for this new option is 50,000, which should be enough for most systems. Values outside a reasonable range (10 to 1,000,000) are forced into that range by OVS.
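
A minimal sketch of the resulting OVS agent setting (the file name and section are assumptions; with director, set the value through the corresponding heat parameter rather than editing the file directly):

  # /etc/neutron/plugins/ml2/openvswitch_agent.ini
  [ovs]
  bridge_mac_table_size = 50000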

python-networking-odl

Neutron may issue an error claiming that the quota has been exceeded for Neutron router creation. This is a known issue where multiple router resources are created in the Neutron DB for a single create request because of a bug in networking-odl. The workaround for this issue is to delete the duplicated routers using the OpenStack Neutron CLI and create the router again, which results in a single instance.

python-networking-ovn

When the OVSDB server fails over to a different controller node, a reconnection from neutron-server/metadata-agent does not take place because they are not detecting this condition.

As a result, booting VMs may not work, because the metadata-agent will not provision new metadata namespaces, and clustering does not behave as expected.

A possible workaround is to restart the ovn_metadata_agent container on all compute nodes after a new controller has been promoted as master for the OVN databases. Also, increase ovsdb_probe_interval in plugin.ini to a value of 600000 milliseconds.
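
A sketch of the suggested probe interval setting; the [ovn] section is an assumption about where networking-ovn reads this option in plugin.ini:

  # plugin.ini
  [ovn]
  ovsdb_probe_interval = 600000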

If the 'dns_nameservers' field is not set for a subnet, the VMs attached to the subnet have an empty /etc/resolv.conf. With this fix, neutron-server gets the DNS resolver from the /etc/resolv.conf of the host on which it runs and uses it as the default dns_nameservers for the tenant VMs.

4.4. RHBA-2018:2573 — openstack platform 13 bug fix and enhancement advisory

The bugs contained in this section are addressed by advisory RHBA-2018:2573. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2018:2573

openstack-kuryr-kubernetes

The controller does not support Nodeport services, and users should not create them. Nonetheless, Nodeport services are present in some configurations, and their presence has caused the controller to crash. To safeguard against such crashes, the controller now ignores Nodeport services.

openstack-manila

This update adds support for use of Manila IPv6 export locations and access rules with Dell-EMC Unity and VNX back ends.

openstack-manila-ui

Configuration files for the manila-ui plugin were not being copied. As a result, the manila panel did not show up in the dashboard. The instructions for copying all of the configuration files for manila-ui to the required locations are now present. The manila panel is visible when the user enables the dashboard.

openvswitch

The creation time of OVN ports grew linearly as ports were created. The creation time now remains constant, regardless of the number of ports in the cloud.

python-eventlet

There was an issue in python-eventlet UDP address handling that resulted in some IPv6 addresses being handled incorrectly in some cases. As a result, when receiving DNS responses via UDP, python-eventlet ignored the response and stalled for several seconds, severely impacting performance. This issue is now resolved.

Due to a bug in eventlet, systems that did not configure any nameservers (or in which the nameservers were unreachable) and that relied only on the hosts file for name resolution hit a delay when booting instances. This is because of an attempt to resolve the IPv6 entry even when only an IPv4 host was specified. With this fix, eventlet returns immediately without attempting network resolution if at least one of the entries is present in the hosts file.

python-oslo-policy

Previously, every time a policy check was made in neutron, the policy file was reloaded and re-evaluated. The re-evaluation of the policy file slowed down API operations substantially for non-admin users. With this update, the state of the policy file is saved so the file only reloads if the rules have changed. Neutron API operations for non-admin users are resolved quickly.

python-proliantutils

Because of issues with multiple Sushy object creation, HPE Gen10 servers were not providing a consistent response when accessing the system with the ID /redfish/v1/Systems/1. Instead of using session-based authentication, which is the default authentication method in Sushy, basic authentication is now used at the time of Sushy object creation. This resolves power request issues.

When the ironic-dbsync utility tried to load the ironic drivers, and a driver imported the proliantutils.ilo client module, the proliantutils library tried to load all of the pysnmp MIBs. If the ironic-dbsync process resided in an unreadable CWD, pysnmp failed when trying to search for MIBs in the CWD. This resulted in the following error messages in ironic-dbsync.log on deployment:

  Unable to load classic driver fake_drac: MIB file pysnmp_mibs/CPQIDA-MIB.pyc access error: [Errno 13] Permission denied: 'pysnmp_mibs': MibLoadError: MIB file pysnmp_mibs/CPQIDA-MIB.pyc access error: [Errno 13] Permission denied: 'pysnmp_mibs'

An update to proliantutils ensures that pysnmp does not load all MIBs on module import. This avoids attempting an MIB search before it is explicitly requested by the application.

rhosp-release

When removing older image packages, the post scriptlets sometimes incorrectly updated the symlinks for image packages. The scriptlets have been updated to call a script that can be used to fix the symlinks.

4.5. RHBA-2018:2574 — openstack director bug fix advisory

The bugs contained in this section are addressed by advisory RHBA-2018:2574. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2018:2574

instack-undercloud

The Red Hat OpenStack Platform undercloud upgrade failed when the overcloud was in a failed state. It failed very late, with a cryptic error, when trying to migrate the overcloud stack to the convergence architecture in the post-configuration step of the upgrade process. Now it fails fast and does not allow the undercloud upgrade to proceed. The user receives an error at the beginning of the undercloud upgrade and must ensure that the overcloud is in a *_COMPLETE state before proceeding with the undercloud upgrade.

Previously, the undercloud installation failed when the local_mtu parameter in undercloud.conf was set to a value greater than 1500 (for example, 1900). The installation now sets global_physnet_mtu to local_mtu, and the undercloud installation succeeds when the value of local_mtu is greater than 1500.

Sometimes an undercloud that has SSL enabled failed during installation with the following error: ERROR: epmd error. The failure occurred because the VIP matching the hostname was configured by keepalived after rabbitmq. Configuring keepalived before rabbitmq prevents this undercloud installation failure.

openstack-tripleo

The procedures for upgrading from RHOSP 10 to RHOSP 13 with NFV deployed have been retested and updated for DPDK and SR-IOV environments.

openstack-tripleo-common

The 'openstack undercloud backup' command did not capture extended attributes. This caused metadata loss from the undercloud Swift storage object, rendering them unusable. This fix adds the '--xattrs' flag when creating the backup archive. Undercloud Swift storage objects now retain their extended attributes during backup.

When the undercloud imported bare metal nodes from the instackenv.json file and while the UCS driver was being configured, ironic nodes that only differ in pm_service_profile (or ucs_service_profile) fields overrode one another in ironic configuration. This resulted in just one of such ironic nodes ending up in the ironic configuration. An update to openstack-tripleo-common ensures that ironic nodes that only differ in pm_service_profile (or ucs_service_profile) fields are still considered distinct. All of the ironic nodes that only differ in pm_service_profile or ucs_service_profile fields get imported into ironic.

It is possible to create the stonith resources for the cluster automatically before the overcloud deployment. Before the start of the deployment, run the following command:

  openstack overcloud generate fencing --ipmi-lanplus --output /home/stack/fencing.yaml /home/stack/instackenv.json

Then pass '-e /home/stack/fencing.yaml' in the list of arguments to the deploy command. This creates the necessary stonith resources for the cluster automatically.

The Derived Parameters workflow now supports the use of SchedulerHints to identify overcloud nodes. Previously, the workflow could not use SchedulerHints to identify the overcloud nodes associated with the corresponding TripleO overcloud role, which caused the overcloud deployment to fail. SchedulerHints support prevents these failures.

The docker healthcheck for OpenDaylight ensured only that the REST interface and the neutron NB component were healthy in OpenDaylight. The healthcheck did not include all loaded OpenDaylight components and therefore was not accurate. The healthcheck now uses the diagstatus URI to check all of the loaded OpenDaylight components. The OpenDaylight docker container health status is now more accurate.

openstack-tripleo-heat-templates

The manila-share service container failed to bind-mount PKI trust stores from the controller host. As a result, connections from the manila-share service to the storage back end could not be encrypted using SSL. The PKI trust stores are now bind-mounted from the controller host into the manila-share service container, so connections from the manila-share service to the storage back end can be encrypted using SSL.

A change in the libvirtd live-migration port range prevents live-migration failures. Previously, libvirtd live-migration used ports 49152 to 49215, as specified in the qemu.conf file. On Linux, this range is a subset of the ephemeral port range 32768 to 61000. Any port in the ephemeral range can be consumed by any other service as well. As a result, live-migration failed with the error: Live Migration failure: internal error: Unable to find an unused port in range 'migration' (49152-49215). The new libvirtd live-migration range of 61152-61215 is not in the ephemeral range. The related failures no longer occur.

Previously, when removing the ceph-osd package from the overcloud nodes, the corresponding Ceph product key was not removed. Therefore, the subscription-manager incorrectly reported that the ceph-osd package was still installed. The script that handles the removal of the ceph-osd package now also removes the corresponding Ceph product key. The script that removes the ceph-osd package and product key executes only during the overcloud update procedure. As a result, subscription-manager list no longer reports that the Ceph OSD is installed.

Containers are now the default deployment method. There is still a way to deploy the baremetal services in environments/baremetal-services.yaml, but this is expected to eventually disappear.

Environment files with resource registries that reference environments/services-docker must be changed to the environments/services paths. If you need to retain any baremetal-deployed services, update the references to environments/services-baremetal instead of the original environments/services.

Previously, the code that supports the Fast Forward Upgrade path for Sahara was missing. As a result, not all of the required changes were applied to Sahara services after a Fast Forward Upgrade from 10 to 13. With this update, the issue has been resolved and Sahara services work correctly after a Fast Forward Upgrade.

A README has been added to /var/log/opendaylight, stating the correct OpenDaylight log path.

In CephFS-NFS driver deployments, the NFS-Ganesha server, backed by CephFS, performs dentry, inode, and attribute caching that is also performed by the libcephfs clients. The NFS-Ganesha server’s redundant caching led to a large memory footprint and also affected cache coherency. With this update, the NFS-Ganesha server’s inode, dentry, and attribute caching is turned off. This reduces the memory footprint of the NFS-Ganesha server, and cache coherency issues are less probable.

TripleO’s capabilities-map.yaml referenced Cinder’s Netapp backend in an incorrect file location. The UI uses the capabilities map and was unable to access Cinder’s Netapp configuration file. The capabilities-map.yaml has been updated to specify the correct location for Cinder’s Netapp configuration. The UI’s properties tab for the Cinder Netapp backend functions correctly.

Manila configuration manifests for Dell-EMC storage systems (VNX, Unity, and VMAX) had incorrect configuration options. As a result, the overcloud deployment of manila-share service with Dell Storage systems failed. The Manila configuration manifests for Dell-EMC storage systems (VNX, Unity, and VMAX) have now been fixed. The overcloud deployment of manila-share service with Dell storage systems completes successfully.

If Telemetry is manually enabled on the undercloud, hardware.* metrics do not work due to a misconfiguration of the firewall on each of the nodes. As a workaround, you need to manually set the snmpd subnet to the control plane network by adding an extra environment file to the undercloud deployment, as follows:

  parameter_defaults:
    SnmpdIpSubnet: 192.168.24.0/24

On rare occasions, a deployment failed with the following error log from a container: standard_init_linux.go:178: exec user process caused "text file busy". With this fix, the docker-puppet.sh file is no longer written out multiple times concurrently, which avoids the race condition and the deployment failure.

When the KernelDisableIPv6 parameter is set to true to disable IPv6, the deployment fails with rabbitmq errors because the Erlang Port Mapper Daemon requires that at least the loopback interface supports IPv6 in order to initialize correctly. To ensure a successful deployment when disabling IPv6, do not disable IPv6 on the loopback interface.

The journald logging backend used by Docker rolls over logs based on size. This resulted in the deletion of some older OpenDaylight logs. This issue has been resolved by logging to a file instead of the console, where log file size and rollover can be managed by OpenDaylight. As a result, older logs persist longer than before.

If you used a non-standard port for the RabbitMQ instance used for monitoring, the sensu-client container reported an unhealthy state because the port value was not reflected in the container health check. The port value is now reflected in the container health check.

The default age for purging deleted database records has been corrected so that deleted records are purged from Cinder’s database. Previously, the CinderCronDbPurgeAge value for Cinder’s purge cron job used the wrong value and deleted records were not purged from Cinder’s DB when they reached the required default age.

The single-nic-vlans network templates in TripleO Heat Templates in OSP 13 contained an incorrect bridge name for Ceph nodes. If the single-nic-vlans templates were used in a previous deployment, upgrades to OSP 13 failed on the Ceph nodes. The bridge name br-storage is now used on Ceph nodes in the single-nic-vlans templates, which matches the bridge name from previous versions. Upgrades to OSP 13 on environments using the single-nic-vlans templates are now successful on Ceph nodes.

In previous versions, the *NetName parameters (e.g. InternalApiNetName) changed the names of the default networks. This is no longer supported. To change the names of the default networks, use a custom composable network file (network_data.yaml) and include it with your 'openstack overcloud deploy' command using the '-n' option. In this file, set the "name_lower" field to the custom net name for the network you want to change. For more information, see "Using Composable Networks" in the Advanced Overcloud Customization guide. In addition, you need to add a local parameter for the ServiceNetMap table to network_environment.yaml and override all the default values for the old network name to the new custom name. You can find the default values in /usr/share/openstack-tripleo-heat-templates/network/service_net_map.j2.yaml. This requirement to modify ServiceNetMap will not be necessary in future OSP-13 releases.
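
For illustration, a custom network entry in network_data.yaml might look like the following sketch (the network name, name_lower value, and subnet values are placeholders):

  - name: InternalApi
    name_lower: internal_api_cloud_0
    vip: true
    ip_subnet: '172.16.2.0/24'
    allocation_pools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}]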

yaml-nic-config-2-script.py required interactive user input. The script could not be called in a non-interactive manner for automation purposes. A --yes option has been added. yaml-nic-config-2-script.py can now be called with --yes option and the user is not asked for interactive input.

Previously, some versions of the tripleo-heat-templates contained an error in a setting for the Redis VIP port in the environment file fixed-ips-v6.yaml. If the file fixed-ips-v6.yaml was included on the deployment command line after network-isolation-v6.yaml, the Redis service was placed on the Control Plane network rather than the correct IPv6 network. With this update, the file environments/fixed-ips-v6.yaml contains the correct reference to network/ports/vip_v6.yaml, instead of network/ports/vip.yaml. The fixed-ips-v6.yaml environment file contains the correct resource registry entries and the Redis VIP will be created with an IPv6 address, regardless of the order of the included environment files.

TripleO’s BlockStorage role was not updated when Cinder services migrated from running on the host to running in containers. As a result, the cinder-volume service was deployed directly on the BlockStorage host. The BlockStorage role has been updated to deploy the cinder-volume service in a container, and the cinder-volume service now runs correctly in a container.

An overcloud update with Manila configuration changes failed to deploy those changes to the containerized Manila share-service. With this fix, the deployment of the changes is now successful.

With shared storage for /var/lib/nova/instances, such as NFS, restarting nova_compute on any Compute node resulted in an owner/group change of the instances' virtual ephemeral disks and console.log files. As a result, instances lost access to their virtual ephemeral disks and stopped working. The scripts that modify the ownership of the instance files in /var/lib/nova/instances have been improved, and access to the instance files is no longer lost during a restart of nova_compute.

The TripleO environment files used for deploying Cinder’s Netapp backend were out of date and contained incorrect data. This resulted in failed overcloud deployment. The Cinder Netapp environment files have been updated and are now correct. You can now deploy an overcloud with a Cinder Netapp backend.

Previously, libvirtd live-migration used ports 49152 to 49215, as specified in the qemu.conf file. On Linux, this range is a subset of the ephemeral port range 32768 to 61000. Any port in the ephemeral range can be consumed by any other service as well. As a result, live-migration failed with the error: Live Migration failure: internal error: Unable to find an unused port in range 'migration' (49152-49215). The new libvirtd live-migration range of 61152 to 61215 is not in the ephemeral range.

Previously, if a nic config template contained a blank line followed by a line starting with a comma, the yaml-nic-config-2-script.py did not reset the starting column of the next row. The nic config template converted by the script was invalid and caused a deployment failure. With this update, the script correctly sets the value for the column when the blank line is detected. Scripts that have a blank line followed by a line with a comma are converted correctly.

puppet-nova

Nova’s libvirt driver now allows the specification of granular CPU feature flags when configuring CPU models. One benefit of this is the alleviation of a performance degradation experienced on guests running with certain Intel-based virtual CPU models after application of the "Meltdown" CVE fixes. This guest performance impact is reduced by exposing the CPU feature flag 'PCID' ("Process-Context ID") to the guest CPU, assuming that the PCID flag is available in the physical hardware itself. This change removes the restriction of having only 'PCID' as the only CPU feature flag and allows for the addition and removal of multiple CPU flags, making way for other use cases. For more information, refer to the documentation of [libvirt]/cpu_model_extra_flags in nova.conf.
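
As an illustrative sketch, the resulting settings in nova.conf on a compute node might look like the following (the CPU model is a placeholder, and 'PCID' can only be exposed when the physical hardware provides it):

  [libvirt]
  cpu_mode = custom
  cpu_model = IvyBridge
  cpu_model_extra_flags = pcid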

puppet-opendaylight

OpenDaylight polls OpenFlow (OF) statistics periodically. These statistics are not currently used anywhere, and polling them affects OpenDaylight performance. You can disable polling of OF statistics to increase OpenDaylight performance.

puppet-tripleo

Instance HA deployments failed due to a race condition, generating an error: Error: unable to get cib. The race was a result of pacemaker properties being set on the compute nodes before the pacemaker cluster was fully up and hence failing with the 'unable to get cib' error. This fix results in no errors in the deployment when using IHA.

Previously, if you used uppercase letters in the stack name, the deployment failed. This update ensures that a stack name with uppercase letters leads to a successful deployment. Specifically, the bootstrap_host scripts inside the containers now convert strings to lowercase and the same happens for pacemaker properties.

The compress option has been added to the containerized logrotate service to compress rotated logs by default. The delaycompress option ensures that the first rotation of a log file remains uncompressed.

Previously, configuring empty string values for some deprecated parameters for Cinder’s Netapp backend resulted in an invalid configuration for the Cinder driver, causing Cinder’s Netapp backend driver to fail during initialization. As of this update, empty string values for the deprecated Netapp parameters are converted to a valid Netapp driver configuration. As a result, Cinder’s Netapp backend driver successfully initializes.

Previously, the Cinder Netapp backend ignored the CinderNetappNfsMountOptions TripleO Heat parameter, which prevented configuration of the Netapp NFS mount options via that parameter. The code responsible for handling Cinder’s Netapp configuration no longer ignores the CinderNetappNfsMountOptions parameter, which now correctly configures Cinder’s Netapp NFS mount options.

During a version upgrade, Cinder’s database synchronization is now executed only on the bootstrap node. This prevents database synchronization and upgrade failures that occurred when database synchronization was executed on all Controller nodes.

4.6. RHBA-2018:3587 — Red Hat OpenStack Platform 13.0 director Bug Fix Advisory

The bugs contained in this section are addressed by advisory RHBA-2018:3587. Further information about this advisory is available at https://access.redhat.com/errata/RHBA-2018:3587

instack-undercloud

Some hardware changes boot device ordering in an unexpected way when receiving an IPMI bootdev command. This may prevent nodes from booting from the correct NIC or prevent PXE from booting at all. This release introduces a new "noop" management interface for the "ipmi" driver. When it is used, bootdev commands are not issued, and the current boot order is used. Nodes must be configured to try PXE booting from the correct NIC, and then fall back to the local hard drive. This change ensures a pre-configured boot order is kept with the new management interface.
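
A sketch of switching an existing node to the new interface, assuming the Bare Metal CLI in this release accepts the --management-interface option on node set (the node name is a placeholder):

  $ openstack baremetal node set --management-interface noop <node>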

In prior versions, undercloud hieradata overrides could be used to tune some service configurations using the <service>::config options similar to the overcloud. However, this functionality was not available for all deployed OpenStack services. With this version, any configuration values not currently available can be updated via the <service>::config hieradata.
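
For illustration, an undercloud hieradata override entry using this pattern might look like the following sketch (the service, option, and value are placeholders chosen only to show the <service>::config shape):

  nova::config::nova_config:
    DEFAULT/reserved_host_memory_mb:
      value: 2048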

openstack-tripleo-common

When upgrading from Red Hat OpenStack Platform 12 to 13 the ceph-osd package is removed. The package removal stopped the running OSDs even though they were running in containers and shouldn’t have required the package. This release removes the playbook that removes the package during the upgrade and Ceph OSDs are not unintentionally stopped during upgrade.

Director uploads the latest amphora image to glance when OpenStack is updated and/or upgraded. The latest amphora image ensures amphora instances run with the latest general bug and security fixes, not only for Octavia agent fixes, but also for operating system fixes.

With this release, newly created and recreated amphora instances are made with the latest amphora image. Previous amphora images will remain stored in glance and be renamed to include the timestamp in the suffix.

openstack-tripleo-heat-templates

One of the instance HA scripts connected to the publicURL keystone endpoint. This has now been moved to the internalURL endpoint by default. Additionally, an operator can override this via the '[placement]/valid_interfaces' configuration entry point in nova.conf.

In prior releases, triggers for online data migrations were missing. Online data migrations for nova, cinder, and ironic in the overcloud did not run automatically after upgrading to OSP 13, which forced a manual workaround. This release adds trigger logic for online data migrations. Online data migrations are triggered during the openstack overcloud upgrade converge command when upgrading to OSP 13.

In prior releases, you could set the RX/TX queue size via nova::compute::libvirt::rx_queue_size and nova::compute::libvirt::tx_queue_size. However, there was no dedicated TripleO heat template parameter. With this release, the RX/TX queue size can be set per role, as follows:

  parameter_defaults:
    ComputeParameters:
      NovaLibvirtRxQueueSize: 1024
      NovaLibvirtTxQueueSize: 1024

As a result, rx_queue_size and tx_queue_size are set using the new parameters.

To set the MTU as part of director (OSPD), this release adds the NeutronML2PhysicalNetworkMtus heat parameter, which sets neutron::plugins::ml2::physical_network_mtus to enable MTU configuration in the ML2 plugin. neutron::plugins::ml2::physical_network_mtus is set based on values from the TripleO heat template.
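
A sketch of setting the new parameter in an environment file (the physical network names and MTU values are placeholders):

  parameter_defaults:
    NeutronML2PhysicalNetworkMtus: 'physnet1:1500,physnet2:9000'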

In prior versions, the conditions for checking whether the Docker daemon requires a restart were too strict. As a result, the Docker daemon and all containers were restarted whenever the Docker configuration changed or when the Docker RPM was updated. With this release, the conditions are relaxed to prevent unnecessary container restarts: the "live restore" functionality is used for configuration changes, so the Docker daemon and all containers are restarted when the Docker RPM is updated, but not when the Docker configuration is changed.

During a redeployment, a number of containers can be restarted needlessly, even in the absence of any configuration change. This was due to including too many unneeded files in the md5 calculation of the config files. With this release, no spurious container restarts are triggered by a redeploy.

The TripleO CinderNetappBackendName parameter did not correctly override the default value for cinder’s Netapp back end. As a result, the name associated with cinder’s Netapp back end could not be overridden. With this release, the CinderNetappBackendName parameter correctly overrides the default back end name.

puppet-cinder

Several configuration settings were removed from cinder, but the corresponding parameters were not removed from the TripleO Puppet module responsible for setting cinder’s configuration settings. As a result, invalid cinder configuration settings were added to cinder.conf. With this release, the Puppet module has been updated to prevent obsolete settings from being added to cinder.conf.

Note

The updated Puppet module will not remove any obsolete settings that were previously added to cinder.conf. Obsolete settings must be manually removed.

puppet-tripleo

A faulty interaction between rhel-plugin-push.service and the Docker service occurred during system shutdown, which caused the controller reboot to take a long time. With this release, the correct shutdown ordering is enforced for these two services. Rebooting a controller now takes less time.

During deployment, an OVS switch may be configured with the incorrect OpenFlow controller port (6640, instead of 6653) for two out of the three controllers. This causes either a deployment failure, or a functional failure later on, where incorrect flows are programmed into the switch. This release correctly sets all of the OpenFlow controller ports to 6653 for each OVS switch. All of the OVS switches now have the correct OpenFlow controller configuration, which consists of three URIs, one to each OpenDaylight instance using port 6653.

When a single OpenDaylight instance was removed from a cluster, this moved the instance into an isolated state, meaning it no longer acted on incoming requests. HA Proxy still load-balanced requests to the isolated OpenDaylight instance, which potentially resulted in OpenStack network commands failing or not working correctly. HA Proxy now detects the isolated OpenDaylight instance as in an unhealthy state. HA Proxy does not forward requests to the isolated OpenDaylight.

python-os-brick

Under certain circumstances, the os-brick code responsible for scanning FibreChannel HBA hosts could return an invalid value. The invalid value would cause services such as cinder and nova to fail. With this release, the FibreChannel HBA scan code always returns a valid value. Cinder and nova no longer crash when scanning FibreChannel HBA hosts.

On multipath connections, devices are individually flushed for all paths upon disconnect. In certain cases, a failure on an individual device flush incorrectly prevents disconnection. With this release, individual paths are no longer flushed because flushing the multipath already ensures buffered data is written on the remote device. Now a disconnection only fails when it would actually lose data.

In some cases, the multipathd show status command does not return an error code as it should. As a workaround, stdout is now also checked to properly detect when multipathd is in an error state.

In prior releases, volume migration failed with a VolumePathNotRemoved error when a single iSCSI path failed within a narrow window of time during migration initiation. This release fixes the issue by extending the timeout period for verification of volume removal.

iSCSI device detection checked for the presence of devices based on the re-scan time. Devices becoming available between scans went undetected. With this release, searching and rescanning are independent operations working at different cadences with checks happening every second.

python-tripleoclient

In prior releases, if you used a custom plan—​done via the '-p' option of the deploy command line—​a number of passwords (such as mysql, horizon, pcsd, and so forth) were reset to new values during redeployment of an existing overcloud. This caused the redeployment to fail. With this release, a custom plan does not trigger setting new passwords.

4.7. RHBA-2019:0068 — Red Hat OpenStack Platform 13 bug fix and enhancement advisory

The bugs contained in this section are addressed by advisory RHBA-2019:0068. Further information about this advisory is available at link: https://access.redhat.com/errata/RHBA-2019:0068

openstack-tripleo-common

Previously, when you updated a node from the undercloud, the capabilities field values were not always converted to a string value type. After this bug fix, the capabilities field always converts to the string value type during node updates.

openstack-tripleo-heat-templates

This enhancement adds the parameter NeutronOVSTunnelCsum, which allows you to configure neutron::agents::ml2::ovs::tunnel_csum in the heat template. This parameter sets or removes the tunnel header checksum on the GRE/VXLAN tunnel that carries outgoing IP packets in the OVS agent.
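
For example, a minimal environment file sketch that enables the tunnel checksum (the boolean value shown is illustrative):

parameter_defaults:
  # Sets neutron::agents::ml2::ovs::tunnel_csum for the OVS agent.
  NeutronOVSTunnelCsum: true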

OpenDaylight (ODL) configuration files were not recreated during Controller replacement, which caused updates to fail. This fix unmounts /opt/opendaylight/data from the host, which causes the configuration files to be recreated during redeployment.

Previously, the OpenStack Platform Director did not configure authentication for Block Storage (Cinder) to access volumes that use the Nova privileged API. This caused operations on these volumes, such as migrating an in-use volume, to fail.

This bug fix adds the capability to configure Cinder with the Nova authentication data, which allows you to perform operations on volumes that use the privileged API with these credentials.

During an upgrade to a containerized deployment with Ironic, the TFTP server did not shut down correctly, which caused the upgrade to fail. This fix corrects the shutdown process of the TFTP server, so now the server can listen to the port and the upgrade completes successfully.

Previously, the inactivity probe timer for Open vSwitch from ODL was insufficient for larger-scale deployments, which caused the ODL L2 agents to appear as offline after the inactivity timer elapsed.

This bug fix increases the default inactivity probe timer duration and adds the capability to configure the timer in the Director using the OpenDaylightInactivityProbe heat parameter. Default value is 180 seconds.
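
A minimal environment file sketch of the override follows; the number shown is illustrative only, so confirm the unit that this parameter expects in your version of the templates before using it:

parameter_defaults:
  # Illustrative override of the inactivity probe timer; verify the expected unit.
  OpenDaylightInactivityProbe: 360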

The Pacemaker log files for RabbitMQ containers were not created in the correct location, which caused unnecessary log files to be created in /var/log/secure. This fix mounts the /var/log/btmp path when the RabbitMQ container starts, which enables Pacemaker to create the logs in the correct location.

This feature adds the capability to configure the Cinder Dell EMC StorageCenter driver to use a multipath for volume-to-image and image-to-volume transfers. The feature includes a new parameter CinderDellScMultipathXfer with a default value of True. Enabling multipath transfers can reduce the total time of data transfers between volumes and images.
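
For example, multipath transfers can be disabled with an environment file entry such as the following sketch:

parameter_defaults:
  # The default is True; set to false to disable multipath volume/image transfers.
  CinderDellScMultipathXfer: false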

This feature adds a new parameter NovaLibvirtVolumeUseMultipath (boolean), which sets the multipath configuration parameter libvirt/volume_use_multipath in the nova.conf file for Compute nodes. This parameter can be set for each Compute role. Default value is False.
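
As a sketch, the parameter can be set for a single role (the Compute role in this example) in an environment file:

parameter_defaults:
  ComputeParameters:
    # Sets libvirt/volume_use_multipath in nova.conf on Compute nodes; default is False.
    NovaLibvirtVolumeUseMultipath: true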

This feature adds the parameter NovaSchedulerWorkers, which allows you to configure multiple nova-schedule workers for each scheduler node. Default value is 1.
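
For example, a minimal environment file sketch (the worker count shown is illustrative):

parameter_defaults:
  # Number of nova-scheduler workers per scheduler node; default is 1.
  NovaSchedulerWorkers: 4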

Previously, the loopback device associated with an LVM volume group did not always restart after restarting the Controller node. This prevented the LVM volume groups used by the iSCSI Cinder backend from persisting after the restart, and prevented new volumes from being created.

After this bug fix, the loopback device is now restored after you reboot the Controller node, and the LVM volume group is accessible by the node.

This enhancement adds the RabbitAdditionalErlArgs parameter to the Erlang VM, which allows you to define custom arguments for the VM. The default argument is +sbwt none, which instructs the Erlang threads to go to sleep if no additional operations are required. For more information, see the Erlang documentation at: http://erlang.org/doc/man/erl.html#+sbwt
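
A minimal environment file sketch follows; the quoting shown is an assumption about how the argument string is passed through to the Erlang VM:

parameter_defaults:
  # Additional Erlang VM arguments; the advisory notes the default is "+sbwt none".
  RabbitAdditionalErlArgs: "'+sbwt none'"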

openstack-tripleo-heat-templates-compat

Previously, OpenStack 13 Director did not set the correct Ceph version when deploying from OpenStack 12 templates. This caused the overcloud deployment to fail.

This bug fix sets the Ceph version to Jewel, and allows for correct deployment from OpenStack 12 templates.

puppet-opendaylight

This feature adds support for deploying OpenDaylight (ODL) on IPv6 addresses.

4.8. RHBA-2019:0448 — Red Hat OpenStack Platform 13 bug fix and enhancement advisory

The bugs contained in this section are addressed by advisory RHBA-2019:0448. Further information about this advisory is available at link: https://access.redhat.com/errata/RHBA-2019:0448

openstack-tripleo-common

This bug was caused by updated versions of dmidecode (3.1 and later) that return system UUIDs in lowercase. As a consequence, systems deployed with per-node ceph-ansible customization before this change could fail to deploy because of a UUID case mismatch. This fix updates the openstack-tripleo-common package to accept both uppercase and lowercase UUIDs: the dmidecode output is forced to lowercase, which makes the comparison case insensitive.

openstack-tripleo-heat-templates

Previously, Octavia Health Manager did not receive heartbeat messages from amphorae due to a packet drop by the firewall. As a result, the operating_status of load balancers on Octavia composable role deployments never changed to ONLINE.

With this update, load balancers on Octavia composable role deployments change to ONLINE operating status successfully.

With this update, you can use the following parameters to set the default Octavia timeouts for backend member and frontend client:

  • OctaviaTimeoutClientData: Frontend client inactivity timeout
  • OctaviaTimeoutMemberConnect: Backend member connection timeout
  • OctaviaTimeoutMemberData: Backend member inactivity timeout
  • OctaviaTimeoutTcpInspect: Time to wait for TCP packets for content inspection

The value for all of these parameters is in milliseconds.
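
For example, a minimal environment file sketch; the timeout values shown are illustrative, not recommendations from this advisory:

parameter_defaults:
  # All values are in milliseconds.
  OctaviaTimeoutClientData: 50000
  OctaviaTimeoutMemberConnect: 5000
  OctaviaTimeoutMemberData: 50000
  OctaviaTimeoutTcpInspect: 0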

Previously, iSCSI connections created by containerized OpenStack services were not visible on the host. The host must close all iSCSI connections during shutdown, but because the connection information was not visible on the host, it failed to terminate these connections and the shutdown sequence hung.

With this update, connection information for containerized services that create iSCSI connections is now visible on the host, and the shutdown sequence no longer hangs.

With this update, OpenDaylight minor update is now included in the Red Hat OpenStack Platform minor update workflow.

With this update, Compute nodes in a Red Hat OpenStack Platform environment that uses OpenDaylight as a back end can be scaled successfully.

Previously, ODL configuration files were missing after redeployment.

With this update, /opt/opendaylight/data is no longer mounted on the host. As a result, the ODL configuration files are generated during redeployment.

Previously, the rabbitmq pacemaker bundle logged excessively during normal operation.

With this update, the rabbitmq bundle no longer logs excessively. In particular, the rabbitmq bundle does not log the harmless error Failed to connect to system bus: No such file or directory.

openstack-tripleo-image-elements

With this update, you can now boot whole security-hardened images in UEFI mode.

puppet-opendaylight

Previously, OpenDaylight packaging used the default OpenDaylight log_pattern values and included the PaxOsgi appender. These default values are not appropriate for every deployment, and some deployments need to configure custom values.

With this update, puppet-opendaylight has two additional configuration variables:

1) log_pattern: Use this variable to configure which log pattern you want to use with the OpenDaylight logger log4j2.

2) enable_paxosgi_logger: Use this boolean flag to enable or disable the PaxOsgi appender.

puppet-opendaylight also modifies the OpenDaylight defaults. Deployments that use puppet-opendaylight have new defaults:

  • log_pattern: %d{ISO8601} | %-5p | %-16t | %-60c{6} | %m%n
  • enable_paxosgi_logger: false

New variable configuration options

log_pattern

String that controls the log pattern used for logging.

Default: %d{ISO8601} | %-5p | %-16t | %-60c{6} | %m%n

Valid options: A string that is a valid log4j2 pattern.

enable_paxosgi_logger

Boolean that controls whether the PaxOsgi appender is enabled for logging.

If you enable the enable_paxosgi_logger variable, you must also modify the log pattern to utilize the additional capabilities. Modify the log_pattern variable and include a pattern that contains the PaxOsgi tokens. For example, set the log_pattern variable to a string that includes the following values:

'%X{bundle.id} - %X{bundle.name} - %X{bundle.version}'

If you do not edit the log_pattern variable, the PaxOsgi appender is still enabled and continues to run but logging does not utilize the additional functionality.

For example, set the enable_paxosgi_logger variable to true and set the log_pattern variable to the following value:

'%d{ISO8601} | %-5p | %-16t | %-32c{1} | %X{bundle.id} - %X{bundle.name} - %X{bundle.version} | %m%n'

Default: false

Valid options: The boolean values true and false.

puppet-tripleo

Previously, deployments could fail when deploying the Overcloud with a BlockStorage role and setting a pacemaker property on nodes that belong to the BlockStorage role.

With this update, the pacemaker-managed cinder-volume resource starts only on nodes that pacemaker manages. As a result, Overcloud deployments with a BlockStorage role succeed.

4.9. RHBA-2021:2385 — Red Hat OpenStack Platform 13 bug fix and enhancement advisory

The bugs contained in this section are addressed by advisory RHBA-2021:2385. Further information about this advisory is available at link: https://access.redhat.com/errata/RHBA-2021:2385

openstack-cinder component

Before this update, when a Block Storage service (cinder) API response was lost, the NetApp SolidFire back end created an unused duplicate volume.

With this update, a patch to the SolidFire driver first checks if the volume name already exists before trying to create it. The patch also checks for volume creation immediately after it detects a read timeout, and prevents invalid API calls. (BZ#1914590)

Before this update, when using the Block Storage service (cinder) to create a large number of instances (bootable volumes) from snapshots on HP3Par Storage back end servers, timeouts occurred. The HP variable convert_to_base was set to true, which caused HP3Par to create a thick volume of the original volume. This was an unnecessary and unwanted action.

With this update, a newer HP driver (4.0.11) has been backported to RHOSP 13 that includes a new spec:

hpe3par:convert_to_base=True | False
  • True (default) - The volume is created independently from the snapshot (HOS8 behavior).
  • False - The volume is created as a child of snapshot (HOS5 behavior).

Usage

You can set this new spec for HPE3Par volumes by using the cinder type-key command:

cinder type-key <volume-type-name-or-ID> set hpe3par:convert_to_base=False | True

Example

$ cinder type-key myVolType set hpe3par:convert_to_base=False
$ cinder create --name v1 --volume-type myVolType 10
$ cinder snapshot-create --name s1 v1
$ cinder snapshot-list
$ cinder create --snapshot-id <snap_id> --volume-type myVolType --name v2 10

Notes

If the size of v2 is greater than the size of v1, the volume cannot be grown. In this case, to avoid an error, v2 is converted to a base volume (convert_to_base=True). (BZ#1940153)

Before this update, API calls to the NetApp SolidFire back end for the Block Storage service (cinder) could fail with a xNotPrimary error. This type of error occurred when an operation was made to a volume at the same time that SolidFire automatically moved connections to rebalance the cluster workload.

With this update, a SolidFire driver patch adds the xNotPrimary exception to the list of exceptions that can be retried. (BZ#1888417)

Before this update, users experienced timeouts in certain environments, mostly when volumes were too big. Often these multi-terabyte volumes experienced poor network performance or upgrade issues that involved the SolidFire cluster.

With this update, two timeout settings have been added to the SolidFire driver to allow users to set the appropriate timeouts for their environment. (BZ#1888469)

openstack-tripleo-heat-templates

This enhancement enables you to override the Orchestration service (heat) parameter, ServiceNetMap, for a role when you deploy the overcloud.

On spine-and-leaf (edge) deployments that use TLS-everywhere, hiera interpolation has been problematic when used to map networks on roles. Overriding the ServiceNetMap per role fixes the issues seen in some TLS-everywhere deployments, provides an easier interface, and replaces the need for the more complex hiera interpolation. (BZ#1875508)
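
As a sketch of the per-role override, assuming a hypothetical leaf Compute role named ComputeLeaf1 and a composable network named internal_api_leaf1 (the service key shown is also illustrative):

parameter_defaults:
  # Hypothetical role, service key, and network names for illustration only.
  ComputeLeaf1Parameters:
    ServiceNetMap:
      NovaMetadataNetwork: internal_api_leaf1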

The Block Storage backup service sometimes needs access to files on the host that would otherwise not be available in the container that runs the service. This enhancement adds the CinderBackupOptVolumes parameter, which you can use to specify additional container volume mounts for the Block Storage backup service. (BZ#1924727)
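
For example, a minimal environment file sketch; the mount shown is hypothetical and only illustrates the host:container:options form of each entry:

parameter_defaults:
  CinderBackupOptVolumes:
    # Hypothetical extra mount exposed inside the cinder-backup container.
    - /etc/iscsi:/etc/iscsi:ro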

puppet-tripleo

Before this update, the Service Telemetry Framework (STF) client could not connect to the STF server, because the latest version of Red Hat AMQ Interconnect does not allow TLS connections without a CA certificate.

This update corrects this problem by providing a new Orchestration service (heat) parameter, MetricsQdrSSLProfiles.

To obtain a Red Hat OpenShift TLS certificate, enter these commands:

$ oc get secrets
$ oc get secret/default-interconnect-selfsigned -o jsonpath='{.data.ca\.crt}' | base64 -d

Add the MetricsQdrSSLProfiles parameter with the contents of your Red Hat OpenShift TLS certificate to a custom environment file:

MetricsQdrSSLProfiles:
    -   name: sslProfile
        caCertFileContent: |
           -----BEGIN CERTIFICATE-----
           ...
           TOpbgNlPcz0sIoNK3Be0jUcYHVMPKGMR2kk=
           -----END CERTIFICATE-----

Then, redeploy your overcloud with the openstack overcloud deploy command. (BZ#1934440)

python-os-brick

Before this update, when the Compute service (nova) made a terminate-connection call to the Block Storage service (cinder), single and multipath devices were not being flushed and there was a risk of data loss because these devices were in a leftover state.

The cause of this problem was that the os-brick disconnect_volume code assumed that the use_multipath parameter had the same value as the connector that was used in the original connect_volume call.

With this update, the Block Storage service changes how it performs disconnects. The os-brick code now properly flushes and detaches volumes when the multipath configuration in the Compute service changes for volumes attached to instances. (BZ#1943181)