Chapter 3. Release Information

These release notes highlight technology preview items, recommended practices, known issues, and deprecated functionality to be taken into consideration when deploying this release of Red Hat OpenStack Platform.

Notes for updates released during the support lifecycle of this Red Hat OpenStack Platform release will appear in the advisory text associated with each update.

3.1. Red Hat OpenStack Platform 13 GA

These release notes highlight technology preview items, recommended practices, known issues, and deprecated functionality to be taken into consideration when deploying this release of Red Hat OpenStack Platform.

3.1.1. Enhancements

This release of Red Hat OpenStack Platform features the following enhancements:

BZ#1419556

The Object Store service (swift) can now integrate with Barbican to transparently encrypt and decrypt your stored (at-rest) objects. At-rest encryption is distinct from in-transit encryption and refers to the objects being encrypted while being stored on disk.

By default, swift objects are stored as clear text on disk. These disks can pose a security risk if not properly disposed of when they reach end-of-life. Encrypting the objects mitigates that risk.

Swift performs these encryption tasks transparently, with the objects being automatically encrypted when uploaded to swift, then automatically decrypted when served to a user. This encryption and decryption is done using the same (symmetric) key, which is stored in Barbican.
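
As an illustrative sketch only: in a director-based deployment, at-rest encryption is typically enabled through an environment file. The SwiftEncryptionEnabled parameter name below is an assumption to verify against your version of the tripleo-heat-templates, and Barbican must already be deployed:

  parameter_defaults:
    # Assumed parameter name; enables object encryption in the swift proxy
    # pipeline, with the encryption root secret managed through Barbican.
    SwiftEncryptionEnabled: true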

BZ#1540239

This enhancement adds support for sending metrics data to a Gnocchi DB instance.

The following new parameters have been added for the collectd composable service. If CollectdGnocchiAuthMode is set to 'simple', then CollectdGnocchiProtocol, CollectdGnocchiServer, CollectdGnocchiPort, and CollectdGnocchiUser are taken into account for configuration.

If CollectdGnocchiAuthMode is set to 'keystone', then the CollectdGnocchiKeystone* parameters are taken into account for configuration.

The following is a detailed description of the added parameters:

  CollectdGnocchiAuthMode:
    type: string
    description: >
      Type of authentication Gnocchi server is using. Supported values are
      'simple' and 'keystone'.
    default: 'simple'
  CollectdGnocchiProtocol:
    type: string
    description: API protocol Gnocchi server is using.
    default: 'http'
  CollectdGnocchiServer:
    type: string
    description: >
      The name or address of a gnocchi endpoint to which we should
      send metrics.
    default: nil
  CollectdGnocchiPort:
    type: number
    description: The port to which we will connect on the Gnocchi server.
    default: 8041
  CollectdGnocchiUser:
    type: string
    description: >
      Username for authenticating to the remote Gnocchi server using simple
      authentication.
    default: nil
  CollectdGnocchiKeystoneAuthUrl:
    type: string
    description: Keystone endpoint URL to authenticate to.
    default: nil
  CollectdGnocchiKeystoneUserName:
    type: string
    description: Username for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneUserId:
    type: string
    description: User ID for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystonePassword:
    type: string
    description: Password for authenticating to Keystone
    default: nil
  CollectdGnocchiKeystoneProjectId:
    type: string
    description: Project ID for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneProjectName:
    type: string
    description: Project name for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneUserDomainId:
    type: string
    description: User domain ID for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneUserDomainName:
    type: string
    description: User domain name for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneProjectDomainId:
    type: string
    description: Project domain ID for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneProjectDomainName:
    type: string
    description: Project domain name for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneRegionName:
    type: string
    description: Region name for authenticating to Keystone.
    default: nil
  CollectdGnocchiKeystoneInterface:
    type: string
    description: Type of Keystone endpoint to authenticate to.
    default: nil
  CollectdGnocchiKeystoneEndpoint:
    type: string
    description: >
      Explicitly state Gnocchi server URL if you want to override
      Keystone value
    default: nil
  CollectdGnocchiResourceType:
    type: string
    description: >
      Default resource type created by the collectd-gnocchi plugin in Gnocchi
      to store hosts.
    default: 'collectd'
  CollectdGnocchiBatchSize:
    type: number
    description: Minimum number of values Gnocchi should batch.
    default: 10
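
For example, a minimal environment file for simple authentication might look like the following sketch; the server name and user shown are hypothetical placeholders:

  parameter_defaults:
    CollectdGnocchiAuthMode: 'simple'
    CollectdGnocchiProtocol: 'http'
    CollectdGnocchiServer: gnocchi.example.com
    CollectdGnocchiPort: 8041
    CollectdGnocchiUser: collectd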

BZ#1592823

Logs from Ansible playbooks now include timestamps that provide information about the timing of actions during deployment, updates, and upgrades.

3.1.2. Technology Preview

The items listed in this section are provided as Technology Previews. For further information on the scope of Technology Preview status, and the associated support implications, refer to https://access.redhat.com/support/offerings/techpreview/.

BZ#1446311

This release adds support for PCI device NUMA affinity policies, which are configured as part of the “[pci]alias” configuration options. Three policies are supported:

“required” (must have)
“legacy” (default; must have, if available)
“preferred” (nice to have)

In all cases, strict NUMA affinity is provided, if possible. These policies allow you to configure how strict your NUMA affinity should be per PCI alias to maximize resource utilization. The key difference between the policies is how much NUMA affinity you're willing to forsake before failing to schedule.

When the “preferred” policy is configured for a PCI device, nova uses CPUs on a NUMA node other than that of the PCI device if that is all that is available. This results in increased resource utilization, but reduced performance for these instances.
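
A hedged sketch of such a configuration in nova.conf; the vendor and product IDs are hypothetical placeholders, and the numa_policy key is assumed to be how the policy is expressed in the alias definition:

  [pci]
  # Hypothetical device IDs; numa_policy selects one of the policies above.
  alias = { "vendor_id": "8086", "product_id": "154d", "device_type": "type-PF", "name": "nic1", "numa_policy": "preferred" }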

BZ#1488095

From Red Hat OpenStack Platform 12 onward, OpenStack services are containerized. In this release, OpenStack Tempest is containerized as well and is available as a Technology Preview.

3.1.3. Release Notes

This section outlines important details about the release, including recommended practices and notable changes to Red Hat OpenStack Platform. You must take this information into account to ensure the best possible outcomes for your deployment.

BZ#1468020

The Shared File System service (manila) now provides IPv6 access rule support with NetApp ONTAP cDOT driver, which lets you use manila with IPv6 environments.

As a result, the Shared File System service now supports exporting shares backed by NetApp ONTAP back ends over IPv6 networks. Access to the exported shares is controlled by IPv6 client addresses.

BZ#1469208

The Shared File System service (manila) supports mounting shared file systems backed by a Ceph File System (CephFS) via the NFSv4 protocol. NFS-Ganesha servers operating on Controller nodes are used to export CephFS to tenants with high availability (HA). Tenants are isolated from one another and may only access CephFS through the provided NFS gateway interface. This new feature is fully integrated into director, enabling CephFS back end deployment and configuration for the Shared File System service.

BZ#1496584

When neutron services are containerized, trying to run commands in a network namespace might fail with the following error:

# ip netns exec qrouter...
RTNETLINK answers: Invalid argument

In order to run a command inside a network namespace, you must do so from the neutron container that created the namespace. For example, the l3-agent creates network namespaces for routers, so the command would need to change to:

# docker exec neutron_l3_agent ip netns exec qrouter...

Similarly, for network namespaces beginning with 'qdhcp', run the command from the 'neutron_dhcp' container.
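
For example:

# docker exec neutron_dhcp ip netns exec qdhcp...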

BZ#1503521

This version introduces support for internal DNS resolution in networking-ovn. Note that there are two known limitations; one is bz#1581332, which prevents proper resolution of internal fqdn requests via internal dns.

Note that the extension is not configured by default by tripleo in the GA release. See bz#1577592 for a workaround.

BZ#1533206

The openstack-gnocchi packages have been renamed to gnocchi. The openstack- prefix was removed because of an upstream project scoping change. Gnocchi has been moved out of the OpenStack umbrella and is maintained as a stand-alone project.

BZ#1556933

Since version 2.1, python-cryptography checks that the DNS names used in certificates are compliant with IDN standards. If the names found do not follow this specification, cryptography fails to validate the certificate, and various errors may be seen when using the OpenStack command-line interface or in OpenStack service logs.

BZ#1563412

The reserved host memory for OpenStack Compute (nova) has increased from 2048 MB to 4096 MB. This can affect capacity estimations for your environment. If necessary, you can reconfigure the reserved memory using the 'NovaReservedHostMemory' parameter in an environment file. For example:

parameter_defaults:
  NovaReservedHostMemory: 2048

BZ#1564176

python-mistralclient is not part of any supported overcloud use case, so it has been dropped from the -tools channels for the OSP 13 release.

BZ#1567735

OSP 13 with OVN as the networking back end does not include IPv6 support in the first release. There is a problem with the responses to Neighbor Solicitation requests coming from guest VMs, which causes a loss of default routes.

BZ#1575752

In previous versions, the *NetName parameters (e.g. InternalApiNetName) changed the names of the default networks. This is no longer supported.

To change the names of the default networks, use a custom composable network file (network_data.yaml) and include it with your 'openstack overcloud deploy' command using the '-n' option. In this file you should set the "name_lower" field to the custom net name for the network you want to change. For more information, see "Using Composable Networks" in the Advanced Overcloud Customization guide.

In addition, you need to add a local parameter for the ServiceNetMap table to network_environment.yaml and override all the default values for the old network name to the new custom name.  The default values can be found in /usr/share/openstack-tripleo-heat-templates/network/service_net_map.j2.yaml. This requirement to modify ServiceNetMap will not be necessary in future OSP-13 releases.
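
As a hedged illustration only (the lowercase network name shown is a hypothetical example), the relevant fragments might look like the following, with the same custom name repeated for each affected ServiceNetMap entry:

  # network_data.yaml (fragment)
  - name: InternalApi
    name_lower: internal_api_cloud_0

  # network_environment.yaml (fragment)
  parameter_defaults:
    ServiceNetMap:
      NovaApiNetwork: internal_api_cloud_0
      NeutronApiNetwork: internal_api_cloud_0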

BZ#1577537

Fixes an OSP 13 Beta issue where some container images were not available.

BZ#1578312

When the OVSDB server fails over to a different controller node, neutron-server and the metadata agent do not reconnect because they do not detect this condition.

As a result, booting VMs may not work, because the metadata agent will not provision new metadata namespaces, and clustering does not behave as expected.

A possible workaround is to restart the ovn_metadata_agent container on all compute nodes after a new controller has been promoted as master for the OVN databases, and to increase the ovsdb_probe_interval in plugin.ini to a value of 600000 milliseconds.
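
For example (a sketch; the [ovn] section shown for ovsdb_probe_interval in plugin.ini is an assumption to verify for your environment):

  # docker restart ovn_metadata_agent

  [ovn]
  ovsdb_probe_interval = 600000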

BZ#1589849

When the OVN metadata agent is stopped on a Compute node, all the VMs on that node lose access to the metadata service. The impact is that if a new VM is spawned or an existing VM is rebooted, the VM will fail to access metadata until the OVN metadata agent is brought back up again.

BZ#1592528

In rare circumstances, after rebooting controller nodes several times, RabbitMQ may be running in an inconsistent state that will block API operations on the overcloud.

The symptoms for this issue are:
 - Entries in any of the OpenStack service logs of the form:
 DuplicateMessageError: Found duplicate message(629ff0024219488499b0fac0cacaa3a5). Skipping it.
 - "openstack network agent list" returns that some agents are DOWN

To restore normal operation, run the following command on any of the controller nodes (you only need to do this on one controller):
 pcs resource restart rabbitmq-bundle

3.1.4. Known Issues

These known issues exist in Red Hat OpenStack Platform at this time:

BZ#1321179

OpenStack command-line clients that use `python-requests` cannot currently validate certificates that have an IP address in the SAN field.

BZ#1461132

When using Red Hat Ceph Storage as a Block Storage back end for both Cinder volume and Cinder backup, any attempt to perform an incremental backup results in a full backup instead, without any warning. This is a known issue.

BZ#1508449

OVN serves DHCP as an openflow controller with ovn-controller directly on compute nodes. But SR-IOV instances are directly attached to the network through the VF/PF. As such, SR-IOV instances will not be able to get DHCP responses from anywhere.

To work around this issue, change OS::TripleO::Services::NeutronDhcpAgent to:

   OS::TripleO::Services::NeutronDhcpAgent: docker/services/neutron-dhcp.yaml
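
For example, in a custom environment file passed to the deploy command (a sketch; the relative path resolves against your tripleo-heat-templates directory):

  resource_registry:
    OS::TripleO::Services::NeutronDhcpAgent: docker/services/neutron-dhcp.yaml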

BZ#1515815

When the router gateway is cleared, the Layer 3 flows related to learned IP addresses are not removed. The learned IP addresses include the PNF and external gateway IP addresses. This leads to stale flows, but does not cause any functional issue. The external gateway and IP addresses do not change frequently. The stale flows are removed when the external network is deleted.

BZ#1518126

Redis is unable to correctly replicate data across nodes in an HA deployment with TLS enabled. Redis follower nodes will not contain any data from the leader node. It is recommended to disable TLS for Redis deployments.

BZ#1519783

Neutron may issue an error claiming that the quota has been exceeded for Neutron router creation. This is a known issue where multiple router resources are created with a single create request in the Neutron DB, due to a bug with networking-odl. The workaround for this issue is to delete the duplicated routers using the OpenStack Neutron CLI and create the router again, resulting in a single instance.

BZ#1557794

A regression was identified in the procedure for backing up and restoring the director undercloud. As a result, the procedure requires modification and verification before it can be published.

The book 'Back Up and Restore the Director Undercloud' is therefore not available with the general availability of Red Hat OpenStack Platform 13. The procedure will be updated as a priority after the general availability release, and published as soon as it is verified.

BZ#1559055

OpenDaylight logging might be missing earlier logs. This is a known issue with journald logging of OpenDaylight (using the "docker logs opendaylight_api" command). The current workaround is to switch OpenDaylight logging to the "file" mechanism, which logs inside the container to /opt/opendaylight/data/logs/karaf.log. To do this, configure the following heat parameter: OpenDaylightLogMechanism: 'file'.
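
For example, in an environment file:

  parameter_defaults:
    OpenDaylightLogMechanism: 'file'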

BZ#1568012

Connecting to an external IP fails when associating a floating IP to an instance then disassociating the floating IP. This situation happens in a tenant VLAN network when:
* a VM spawned on a non-NAPT switch is associated with a floating IP and
* the floating IP is removed.
This results in a missing flow (sporadically) in the FIB table of the NAPT switch.

Due to the missing FIB table entry, the VM loses connectivity to the public network.

Associating the floating IP to the VM restores connectivity to the public network. As long as the floating IP is associated with the VM, it will be able to connect to the internet. However, you will lose a public IP/floating IP from the external network.

BZ#1568311

Layer 3 connectivity between nova instances across multiple subnets may fail when an instance without a floating IP tries to reach another instance that has a floating IP on another router. This occurs when nova instances are spread across multiple compute nodes. There is no suitable workaround for this issue.

BZ#1568976

During deployment, one or more OpenDaylight instances may fail to start correctly due to a feature loading bug. This may lead to a deployment or functional failure.

When a deployment passes, only two of the three OpenDaylight instances need to be functional for the deployment to succeed. It is possible that the third OpenDaylight instance started incorrectly. Check the health status of each container with the `docker ps` command. If a container is unhealthy, restart it with `docker restart opendaylight_api`.

When a deployment fails, the only option is to restart the deployment. For TLS-based deployments, all OpenDaylight instances must boot correctly or deployment will fail.

BZ#1571864

Temporary removal of Heat stack resources during fast-forward upgrade preparation triggers RHEL unregistration.
As a result, RHEL unregistration is stalled because Heat software deployment signalling does not work properly.

To avoid the problem, while the overcloud is still on OSP 10 and ready to perform the last overcloud minor version update:
1. Edit the template file /usr/share/openstack-tripleo-heat-templates/extraconfig/pre_deploy/rhel-registration/rhel-registration.yaml
2. Delete RHELUnregistration and RHELUnregistrationDeployment resources from the template.
3. Proceed with the minor update and fast-forward upgrade procedure.

BZ#1573597

A poorly performing Swift cluster used as a Gnocchi back end can generate 503 errors in the collectd log and "ConnectionError: ('Connection aborted.', CannotSendRequest())" errors in gnocchi-metricd.conf.
To mitigate the problem, increase the value of the CollectdDefaultPollingInterval parameter or improve the Swift cluster performance.
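
For example, in an environment file (the interval value shown is only an illustrative assumption; choose a value appropriate for your environment):

  parameter_defaults:
    CollectdDefaultPollingInterval: 120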

BZ#1574708

When an OpenDaylight instance is removed from a cluster and reconnected, the instance may not successfully join the cluster. The node will eventually re-join the cluster.

The following actions should be taken in such a situation:
 * Restart the faulty node.
 * Monitor the REST endpoint to verify the cluster member is healthy (see the example query after this list): http://$ODL_IP:8081/jolokia/read/org.opendaylight.controller:Category=ShardManager,name=shard-manager-config,type=DistributedConfigDatastore
 * The response should contain a field “SyncStatus”, and a value of “true” will indicate a healthy cluster member.
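
For example, the endpoint can be queried from the command line (substitute the controller IP for $ODL_IP):

  $ curl http://$ODL_IP:8081/jolokia/read/org.opendaylight.controller:Category=ShardManager,name=shard-manager-config,type=DistributedConfigDatastore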

BZ#1574725

When multiple VMs in the same subnet of a VLAN provider network are scheduled to two different Compute nodes, ARP between the VMs fails sporadically.

Since ARP packets between those VMs fail, there is essentially no networking between the two VMs.

BZ#1575023

The manila-share service fails to initialize because changes to ceph-ansible's complex ceph-keys processing generate incorrect content in the /etc/ceph/ceph.client.manila.keyring file.

To allow the manila-share service to initialize:
1) Make a copy of /usr/share/openstack-tripleo-heat-templates to use for the overcloud deploy.

2) Edit the .../tripleo-heat-templates/docker/services/ceph-ansible/ceph-base.yaml file to change all triple backslashes in line 295 to single backslashes.
Before:
mon_cap: 'allow r, allow command \\\"auth del\\\", allow command \\\"auth caps\\\", allow command \\\"auth get\\\", allow command \\\"auth get-or-create\\\"'
After:
mon_cap: 'allow r, allow command \"auth del\", allow command \"auth caps\", allow command \"auth get\", allow command \"auth get-or-create\"'

3) Deploy the overcloud, substituting the path to your copy of tripleo-heat-templates wherever /usr/share/openstack-tripleo-heat-templates occurred in your original overcloud-deploy command.

The /etc/ceph/ceph.client.manila.keyring file will then have the proper contents, and the manila-share service will initialize properly.

BZ#1575118

Ceph Release 12.2.1 lowers the maximum number of PGs allowed for each OSD. The lower limit may cause the monitor to prematurely issue a HEALTH_WARN message.

The monitor warning threshold has been reduced from 300 to 200 PGs per OSD. 200 is still twice the generally recommended target of 100 PGs per OSD. This limit can be adjusted via the mon_max_pg_per_osd option on the monitors. The older mon_pg_warn_max_per_osd option has been removed.

The number of PGs consumed by a pool cannot be decreased. If the upgrade causes a pre-existing deployment to reach the maximum limit, you can raise the limit to its pre-upgrade value during the ceph-upgrade step. In an environment file, add a parameter setting like this:

  parameter_defaults:
    CephConfigOverrides:
      mon_max_pg_per_osd: 300

The setting is applied to ceph.conf, and the cluster stays in the HEALTH_OK state.

BZ#1575150

There is a known issue where the OpenDaylight cluster may stop responding for up to 30 minutes when an OpenDaylight cluster member is stopped (due to failure or otherwise). The workaround is to wait until the cluster becomes active again.

BZ#1575496

When using a physical host interface for the external network with director, the interface will not pass traffic in an OpenDaylight setup if it is not attached to an OVS bridge. Avoid this type of configuration.

Always use an OVS bridge in the NIC templates for an overcloud external network.  This bridge is named "br-ex" by default in Director (although you may use any name).  You should attach the physical host interface used for the external network to this OVS bridge.

When you use an interface attached to an OVS bridge, the deployment will function correctly and the external network traffic to tenants will work correctly.

BZ#1577975

OpenDaylight may experience periods of very high CPU usage. This issue should not affect the functionality of OpenDaylight, although it could potentially impact other system services.

BZ#1579025

The OVN pacemaker Resource Agent (RA) script sometimes does not handle the promotion action properly when pacemaker tries to promote a slave node. This occurs when the ovsdb-servers report their status as master to the RA script when the master IP is moved to the node. The issue is fixed upstream.

When the issue occurs, the neutron server is unable to connect to the OVN north and south DB servers, and all Create/Update/Delete API calls to the neutron server fail.

Restarting the ovn-dbs-bundle resource resolves the issue. Run the following command on one of the controller nodes:

  pcs resource restart ovn-dbs-bundle

BZ#1579417

SNAT support requires configuring VXLAN tunnels regardless of the encapsulation used in the tenant networks. It is also necessary to configure the MTU correctly when using VLAN tenant networks, since the VXLAN Tunnel header is added to the payload and this could cause the packet to exceed the default MTU (1500 Bytes).

The VXLAN tunnels have to be properly configured in order for the SNAT traffic to flow through them.
When using VLAN tenant networks, use one of the following methods to configure the MTU so that SNAT traffic can flow through the VXLAN tunnels:
 * Configure VLAN-based tenant networks to use an MTU of 1450 on a per-network basis.
 * Set the NeutronGlobalPhysnetMtu heat parameter to 1450, as shown in the example after this list. Note that this means all flat/VLAN provider networks will have a 1450 MTU, which may not be desirable (especially for external provider networks).
 * Configure the tenant network underlay with an MTU of 1550 (or higher). This includes setting the MTU in the NIC templates for the tenant network NIC.
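
For example, to apply the second option in an environment file:

  parameter_defaults:
    NeutronGlobalPhysnetMtu: 1450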

BZ#1581337

To use the PING type health monitor, HAProxy (the default software used by the driver for network load balancing) must be version 1.6 or later. With any older HAProxy version, the health check silently becomes a TCP connect check.

The upstream community fixed this by adding a check in the code that determines the HAProxy version in use and acts accordingly:
If HAProxy is version 1.6 or later, PING is used.
Otherwise, TCP connect continues to be used (in the absence of any other solution for those HAProxy versions, this is better than breaking the health monitor altogether).

The problem in OSP 13 GA is that HAProxy is shipped as part of the RHEL channels, which provide an older version of HAProxy. As a result, when OSP 13 users configure the PING type health monitor, they get a TCP connect check instead.

BZ#1583541

SR-IOV-based Compute instances have no connectivity to OVS Compute instances if they are on different networks. The workaround is to use an external router that is connected to both VLAN provider networks.

BZ#1584518

RHOSP does not configure the availability of DifferentHostFilter / SameHostFilter by default in nova, and these settings are necessary to complete some tests properly. As a result, several security group tests might randomly fail.

You should skip those tests, or alternatively add those filters to your nova configuration.

BZ#1584762

If Telemetry is manually enabled on the undercloud, `hardware.*` metrics do not work due to a misconfiguration of the firewall on each of the nodes.

As a workaround, you need to manually set the `snmpd` subnet to the control plane network by adding an extra template for the undercloud deployment as follows:

parameter_defaults:
  SnmpdIpSubnet: 192.168.24.0/24

BZ#1588186

A race condition causes Open vSwitch to not connect to the OpenDaylight openflowplugin. A fix is currently being implemented for a 13.z release of this product.

BZ#1590114

If Telemetry is manually enabled on the undercloud, `hardware.*` metrics do not work due to a misconfiguration of the firewall on each of the nodes.

As a workaround, you need to manually set the `snmpd` subnet to the control plane network by adding an extra template for the undercloud deployment as follows:

parameter_defaults:
  SnmpdIpSubnet: 192.168.24.0/24

BZ#1590560

The ceph-ansible utility does not always remove the ceph-create-keys container from the same node where it was created.

Because of this, the deployment may fail with the message "Error response from daemon: No such container: ceph-create-keys." This may affect any ceph-ansible run, including fresh deployments, that has:
* multiple compute nodes, or
* a custom role behaving as a Ceph client that also hosts a service consuming Ceph.

BZ#1590938

If you deploy more than three OSDs on RHCS3 and set the PG number for your pools as determined by pgcalc (https://access.redhat.com/labs/cephpgc), deployment will fail because ceph-ansible creates pools before all OSDs are active.

To avoid the problem, set the default PG number to 32 and when the deployment is finished, manually raise the PG number as described in the Storage Strategies Guide, https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/storage_strategies_guide/placement_groups_pgs#set_the_number_of_pgs.
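
As a sketch, assuming the CephPoolDefaultPgNum parameter is available in your version of the tripleo-heat-templates, the default can be set in an environment file:

  parameter_defaults:
    CephPoolDefaultPgNum: 32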

BZ#1590939

Because ceph-ansible OpenStack pool tasks have an incorrect container name, it is not yet possible to colocate Ceph MONs and OSDs.
Standard HCI (Computes + OSDs) is not affected.

BZ#1593290

If the nova-compute service is restarted while a guest with SR-IOV-based network interfaces attached is running, and that guest is then removed, it is no longer possible to attach SR-IOV VFs on that node to any guest. This is because available devices are enumerated at service startup, and a device that is attached to a guest at that time is not included in the list of host devices.

You must restart the 'nova-compute' service after removing the guest. After removing the guest and restarting the service, the list of available SR-IOV devices will be correct.
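
For example, assuming the containerized service name is nova_compute:

  # docker restart nova_compute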

BZ#1593715

The insecure registry list is updated after some container images are pulled during a major upgrade. As a result, container images from a newly introduced insecure registry fail to download during the `openstack overcloud upgrade run` command.

You can use one of the following workarounds:

Option A: Update the /etc/sysconfig/docker file manually on nodes which have containers managed by Pacemaker, and add any newly introduced insecure registries.

Option B: Run the `openstack overcloud deploy` command right before upgrading, and provide the desired new insecure registry list using an environment file with the DockerInsecureRegistryAddress parameter.
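
For Option B, a sketch of such an environment file (the registry host and port are hypothetical):

  parameter_defaults:
    DockerInsecureRegistryAddress:
      - newregistry.example.com:8787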

All container images should download successfully during upgrade.

BZ#1593757

Enabling Octavia on an existing overcloud deployment reports success, but the Octavia API endpoints are not reachable because the firewall rules on the Controller nodes are misconfigured.

Workaround:

On all controller nodes, add firewall rules and make sure they are inserted before the DROP rule:

IPv4:
  # iptables -A INPUT -p tcp -m multiport --dports 9876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy ipv4" -j ACCEPT
  # iptables -A INPUT -p tcp -m multiport --dports 13876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy_ssl ipv4" -j ACCEPT
  # iptables -A INPUT -p tcp -m multiport --dports 9876,13876 -m state --state NEW -m comment --comment "120 octavia_api ipv4" -j ACCEPT


IPv6:
  # ip6tables -A INPUT -p tcp -m multiport --dports 9876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy ipv6" -j ACCEPT
  # ip6tables -A INPUT -p tcp -m multiport --dports 13876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy_ssl ipv6" -j ACCEPT
  # ip6tables -A INPUT -p tcp -m multiport --dports 9876,13876 -m state --state NEW -m comment --comment "120 octavia_api ipv6" -j ACCEPT

Restart HAProxy:
  # docker restart haproxy-bundle-docker-0

BZ#1595363

During the fast forward upgrade process, users upgrade the undercloud from version 10 to version 11. In some situations, the nova-api.log might report the following error:

`Unexpected API Error. Table 'nova_cell0.instances' doesn't exist`

You can resolve this error by running the following command:

$ sudo nova-manage api_db sync

This issue is non-critical and should not impede the fast forward upgrade process in a major way.

3.2. Red Hat OpenStack Platform 13 Maintenance Release 19 July 2018

These release notes highlight technology preview items, recommended practices, known issues, and deprecated functionality to be taken into consideration when deploying this release of Red Hat OpenStack Platform.

3.2.1. Enhancements

This release of Red Hat OpenStack Platform features the following enhancements:

BZ#1592823

Logs from Ansible playbooks now include timestamps that provide information about the timing of actions during deployment, updates, and upgrades.

3.2.2. Release Notes

This section outlines important details about the release, including recommended practices and notable changes to Red Hat OpenStack Platform. You must take this information into account to ensure the best possible outcomes for your deployment.

BZ#1578312

When the OVSDB server fails over to a different controller node, neutron-server and the metadata agent do not reconnect because they do not detect this condition.

As a result, booting VMs may not work, because the metadata agent will not provision new metadata namespaces, and clustering does not behave as expected.

A possible workaround is to restart the ovn_metadata_agent container on all compute nodes after a new controller has been promoted as master for the OVN databases, and to increase the ovsdb_probe_interval in plugin.ini to a value of 600000 milliseconds.

3.2.3. Known Issues

These known issues exist in Red Hat OpenStack Platform at this time:

BZ#1515815

When the router gateway is cleared, the Layer 3 flows related to learned IP addresses are not removed. The learned IP addresses include the PNF and external gateway IP addresses. This leads to stale flows, but does not cause any functional issue. The external gateway and IP addresses do not change frequently. The stale flows are removed when the external network is deleted.

BZ#1519783

Neutron may issue an error claiming that the quota has been exceeded for Neutron router creation. This is a known issue where multiple router resources are created with a single create request in the Neutron DB, due to a bug with networking-odl. The workaround for this issue is to delete the duplicated routers using the OpenStack Neutron CLI and create the router again, resulting in a single instance.

BZ#1559055

OpenDaylight logging might be missing earlier logs. This is a known issue with journald logging of OpenDaylight (using the "docker logs opendaylight_api" command). The current workaround is to switch OpenDaylight logging to the "file" mechanism, which logs inside the container to /opt/opendaylight/data/logs/karaf.log. To do this, configure the following heat parameter: OpenDaylightLogMechanism: 'file'.

BZ#1568311

Layer 3 connectivity between nova instances across multiple subnets may fail when an instance without a floating IP tries to reach another instance that has a floating IP on another router. This occurs when nova instances are spread across multiple compute nodes. There is no suitable workaround for this issue.

BZ#1568976

During deployment, one or more OpenDaylight instances may fail to start correctly due to a feature loading bug. This may lead to a deployment or functional failure.

When a deployment passes, only two of the three OpenDaylight instances need to be functional for the deployment to succeed. It is possible that the third OpenDaylight instance started incorrectly. Check the health status of each container with the `docker ps` command. If a container is unhealthy, restart it with `docker restart opendaylight_api`.

When a deployment fails, the only option is to restart the deployment. For TLS-based deployments, all OpenDaylight instances must boot correctly or deployment will fail.

BZ#1583541

SR-IOV-based Compute instances have no connectivity to OVS Compute instances if they are on different networks. The workaround is to use an external router that is connected to both VLAN provider networks.

BZ#1588186

A race condition causes Open vSwitch to not connect to the OpenDaylight openflowplugin. A fix is currently being implemented for a 13.z release of this product.

BZ#1593757

Enabling Octavia on an existing overcloud deployment reports success, but the Octavia API endpoints are not reachable because the firewall rules on the Controller nodes are misconfigured.

Workaround:

On all controller nodes, add firewall rules and make sure they are inserted before the DROP rule:

IPv4:
  # iptables -A INPUT -p tcp -m multiport --dports 9876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy ipv4" -j ACCEPT
  # iptables -A INPUT -p tcp -m multiport --dports 13876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy_ssl ipv4" -j ACCEPT
  # iptables -A INPUT -p tcp -m multiport --dports 9876,13876 -m state --state NEW -m comment --comment "120 octavia_api ipv4" -j ACCEPT


IPv6:
  # ip6tables -A INPUT -p tcp -m multiport --dports 9876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy ipv6" -j ACCEPT
  # ip6tables -A INPUT -p tcp -m multiport --dports 13876 -m state --state NEW -m comment --comment "100 octavia_api_haproxy_ssl ipv6" -j ACCEPT
  # ip6tables -A INPUT -p tcp -m multiport --dports 9876,13876 -m state --state NEW -m comment --comment "120 octavia_api ipv6" -j ACCEPT

Restart HAProxy:
  # docker restart haproxy-bundle-docker-0

3.3. Red Hat OpenStack Platform 13 Maintenance Release 29 August 2018

These release notes highlight technology preview items, recommended practices, known issues, and deprecated functionality to be taken into consideration when deploying this release of Red Hat OpenStack Platform.

3.3.1. Enhancements

This release of Red Hat OpenStack Platform features the following enhancements:

BZ#1561961

This feature adds support for PCI device NUMA affinity policies. These are configured as part of the `[pci]alias` configuration options. There are three policies supported:
- `required`
- `legacy`
- `preferred`
In all cases, strict NUMA affinity is provided if possible. The key difference between the policies is how much NUMA affinity you can forsake before failing to schedule.
These policies allow you to configure how strict your NUMA affinity is on a per-device basis or, more specifically, per device alias. This is useful to ensure maximum resource utilization.
When the 'preferred' policy is configured for a PCI device, nova now utilizes CPUs on a different NUMA node from the NUMA node of the PCI device if this is all that is available. This results in increased resource utilization with the downside of reduced performance for these instances.

BZ#1564918

Previously, Ironic considered just one IPMI error as retryable. That might have caused unjustified Ironic failure. With this enhancement, Ironic treats more types of IPMI error messages as retryable by the IPMI-backed hardware interfaces, such as power and management hardware interfaces. Specifically, "Node busy", "Timeout", "Out of space", and "BMC initialization in progress" IPMI errors cause Ironic to retry the IPMI command. The result is improved reliability of IPMI based communication with BMC.

BZ#1571741

Nova's libvirt driver now allows the specification of granular CPU feature flags when configuring CPU models.

One benefit of this change is the alleviation of a performance degradation experienced on guests running with certain Intel-based virtual CPU models after application of the "Meltdown" CVE fixes.  This guest performance impact is reduced by exposing the CPU feature flag 'PCID' ("Process-Context ID") to the *guest* CPU, assuming that the PCID flag is available in the physical hardware itself.

For usage details, refer to the documentation of ``[libvirt]/cpu_model_extra_flags`` in ``nova.conf``.
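
A hedged sketch of the relevant nova.conf settings; the CPU model shown is only an example, and this assumes a custom cpu_mode is in use:

  [libvirt]
  cpu_mode = custom
  cpu_model = IvyBridge
  # Expose PCID to guests (requires support in the host CPU).
  cpu_model_extra_flags = pcid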

BZ#1574349

It is possible to create the stonith resources for the cluster automatically before the overcloud deployment.
Before the start of the deployment, run the following command:
openstack overcloud generate fencing --ipmi-lanplus --output /home/stack/fencing.yaml /home/stack/instackenv.json

Then pass '-e /home/stack/fencing.yaml' to the list of arguments to the deploy command. This creates the necessary stonith resources for the cluster automatically.

BZ#1578633

rhosp-director-images are now multi-arch. OSP 13 now has overcloud full and ironic python agent images for ppc64le. The resulting rhosp-director-images were adjusted to accommodate this change.
As a result, rhosp-director-images and rhosp-director-images-ipa are now meta-packages, with rhosp-director-images-<arch> and rhosp-director-images-ipa-<arch> rpms added for multi-arch support.

BZ#1578636

rhosp-director-images are now multi-arch. OSP 13 now has overcloud full and ironic python agent images for ppc64le. The resulting rhosp-director-images were adjusted to accommodate this change.
As a result, rhosp-director-images and rhosp-director-images-ipa are now meta-packages, with rhosp-director-images-<arch> and rhosp-director-images-ipa-<arch> rpms added for multi-arch support.

BZ#1579691

Nova's libvirt driver now allows the specification of granular CPU feature flags when configuring CPU models.
One benefit of this is the alleviation of a performance degradation experienced on guests running with certain Intel-based virtual CPU models after application of the "Meltdown" CVE fixes. This guest performance impact is reduced by exposing the CPU feature flag 'PCID' ("Process-Context ID") to the *guest* CPU, assuming that the PCID flag is available in the physical hardware itself.
This change removes the restriction of having only 'PCID' as the only CPU feature flag and allows for the addition and removal of multiple CPU flags, making way for other use cases.
For more information, refer to the  documentation of ``[libvirt]/cpu_model_extra_flags`` in ``nova.conf``.

BZ#1601472

The procedures for upgrading from RHOSP 10 to RHOSP 13 with NFV deployed have been retested and updated for DPDK and SR-IOV environments.

BZ#1606224

With this update, Ceph storage is supported by KVM virtualization on all CPU architectures supported by Red Hat.

BZ#1609352

This enhancement sees the addition of GA containers for nova and utilities, and Technology Preview containers for Cinder, Glance, Keystone, Neutron, and Swift on IBM Power LE.

BZ#1619311

rhosp-director-images are now multi-arch. OSP 13 now has overcloud full and ironic python agent images for ppc64le. The resulting rhosp-director-images were adjusted to accommodate this change.
As a result, rhosp-director-images and rhosp-director-images-ipa are now meta-packages, with rhosp-director-images-<arch> and rhosp-director-images-ipa-<arch> rpms added for multi-arch support.

3.3.2. Release Notes

This section outlines important details about the release, including recommended practices and notable changes to Red Hat OpenStack Platform. You must take this information into account to ensure the best possible outcomes for your deployment.

BZ#1523864

This update adds support for use of Manila IPv6 export locations and access rules with Dell-EMC Unity and VNX back ends.

BZ#1549770

Containers are now the default deployment method. There is still a way to deploy the baremetal services in environments/baremetal-services.yaml, but this is expected to eventually disappear.

Environment files with resource registries referencing environments/services-docker must be altered to the environments/services paths. If you need to retain any of the deployed baremetal services, update references to environments/services-baremetal instead of the original environments/services.
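
For example, a deploy command argument would change as follows (the octavia.yaml file name is a hypothetical example):

  Before: -e /usr/share/openstack-tripleo-heat-templates/environments/services-docker/octavia.yaml
  After:  -e /usr/share/openstack-tripleo-heat-templates/environments/services/octavia.yaml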

BZ#1565028

A README file has been added to /var/log/opendaylight, stating the correct OpenDaylight log path.

BZ#1570039

The compress option has been added to the containerized logrotate service to compress rotated logs by default. The delaycompress option ensures that the first rotation of a log file remains uncompressed.

BZ#1575752

In previous versions, the *NetName parameters (e.g. InternalApiNetName) changed the names of the default networks. This is no longer supported.
To change the names of the default networks, use a custom composable network file (network_data.yaml) and include it with your 'openstack overcloud deploy' command using the '-n' option. In this file, set the "name_lower" field to the custom net name for the network you want to change. For more information, see "Using Composable Networks" in the Advanced Overcloud Customization guide.
In addition, you need to add a local parameter for the ServiceNetMap table to network_environment.yaml and override all the default values for the old network name to the new custom name. You can find the default values in /usr/share/openstack-tripleo-heat-templates/network/service_net_map.j2.yaml. This requirement to modify ServiceNetMap will not be necessary in future OSP-13 releases.

BZ#1592528

In rare circumstances, after rebooting controller nodes several times, RabbitMQ may be running in an inconsistent state that will block API operations on the overcloud.

The symptoms for this issue are:
 - Entries in any of the OpenStack service logs of the form:
 DuplicateMessageError: Found duplicate message(629ff0024219488499b0fac0cacaa3a5). Skipping it.
 - "openstack network agent list" returns that some agents are DOWN

To restore normal operation, run the following command on any of the controller nodes (you only need to do this on one controller):
 pcs resource restart rabbitmq-bundle

3.3.3. Known Issues

These known issues exist in Red Hat OpenStack Platform at this time:

BZ#1557794

A regression was identified in the procedure for backing up and restoring the director undercloud. As a result, the procedure requires modification and verification before it can be published.

The book 'Back Up and Restore the Director Undercloud' is therefore not available with the general availability of Red Hat OpenStack Platform 13. The procedure will be updated as a priority after the general availability release, and published as soon as it is verified.

BZ#1579025

OVN pacemaker Resource Agent (RA) script sometimes does not handle the promotion action properly when pacemaker tries to promote a slave node. This is seen when the ovsdb-servers report the status as master to the RA script when the master ip is moved to the node. The issue is fixed upstream.

When the issue occurs, the neutron server will not be able to connect the OVN North and South DB servers and all Create/Update/Delete APIs to the neutron server will fail.

Restarting the ovn-dbs-bundle resource will resolve the issue. Run the below command in one of the controller node:

"pcs resource restart ovn-dbs-bundle"

BZ#1584762

If Telemetry is manually enabled on the undercloud, `hardware.*` metrics do not work due to a misconfiguration of the firewall on each of the nodes. As a workaround, you need to manually set the `snmpd` subnet to the control plane network by adding an extra template for the undercloud deployment as follows:
parameter_defaults:
  SnmpdIpSubnet: 192.168.24.0/24