Chapter 4. Technical Notes

This chapter supplements the information contained in the text of Red Hat OpenStack Platform "Newton" errata advisories released through the Content Delivery Network.

4.1. RHEA-2016:2948 — Red Hat OpenStack Platform 10 Enhancement Update

The bugs contained in this section are addressed by advisory RHEA-2016:2948. Further information about this advisory is available at https://access.redhat.com/errata/RHEA-2016:2948.html.

instack-undercloud

BZ#1266509

Previously, instack-undercloud did not verify that a subnet mask was provided for the `local_ip` parameter, and incorrectly used a /32 mask. Consequently, networking would not work correctly on the undercloud in this case (for example, introspection would not work). With this update, instack-undercloud now validates that a correct subnet mask has been provided.
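
A check of this kind could look roughly like the following sketch, which uses Python's standard `ipaddress` module. The function name `validate_local_ip` and the error messages are illustrative assumptions and are not taken from instack-undercloud.

```python
import ipaddress


def validate_local_ip(local_ip):
    """Return the parsed interface, or raise ValueError for a missing or /32 mask.

    Hypothetical helper: the name and messages are not from instack-undercloud.
    """
    if "/" not in local_ip:
        # Without an explicit mask, ip_interface() would silently assume /32.
        raise ValueError("local_ip must include a subnet mask, for example 192.0.2.1/24")
    interface = ipaddress.ip_interface(local_ip)
    if interface.network.prefixlen == interface.network.max_prefixlen:
        raise ValueError("local_ip must not use a host-only mask")
    return interface
```

For example, `validate_local_ip("192.0.2.1/24")` is accepted, while `"192.0.2.1"` and `"192.0.2.1/32"` are rejected.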

BZ#1289614

Prior to this update, there was no automated process for periodically purging expired tokens from the Identity Service (keystone) database. Consequently, the keystone database could potentially continue to grow, resulting in a large database size and the possible consumption of all available disk space.
With this update, a crontab entry was added to periodically query and delete expired tokens in the keystone database, running once per day. As a result, the keystone database will no longer face unlimited growth due to expired tokens.

BZ#1320318

Previously, the `pxe_ilo` Bare Metal Service (ironic) driver would automatically switch to UEFI boot when it detected UEFI-capable hardware, even though the environment might not support UEFI.
Consequently, the deployment process failed with pxe_ilo drivers when an environment did not support UEFI. 
With this update, the pxe_ilo driver defaults to BIOS boot mode, and a deployment using pxe_ilo now works out of the box, regardless of whether UEFI is configured properly.

BZ#1323024

A puppet manifest bug incorrectly disables LVM partition automounting during the undercloud installation process. As a result, it is possible for undercloud hosts with partitions other than root and swap (activated on kernel command line) to only boot into an emergency shell.

There are several ways to work around this issue. Choose one from the following:

1. Remove the mountpoints manually from /etc/fstab. Doing so will prevent the issue from manifesting in all future cases. Other partitions could also be removed, and the space added to other partitions (like root or swap).

2. Configure the partitions to be activated in /etc/lvm.conf. Doing so will work until the next update/upgrade, when the undercloud installation is re-run.

3. Restrict initial deployment to only root and swap partitions. This will avoid the issue completely.

BZ#1324842

Previously, the director auto-generated a value for 'readonly_user_name' (in /etc/ceilometer/ceilometer.conf) that exceeded the 32-character limit. This resulted in ValueSizeConstraint errors during upgrades. With this release, the director now sets 'readonly_user_name' to 'ro_snmp_user' by default, which ensures compliance with the character limit.

BZ#1355818

Previously, the swift proxy pipeline was misconfigured, with the consequence that swift memory usage continued to grow until it was killed. With this fix, proxy-logging has been configured earlier in the swift proxy pipeline. As a result, swift memory usage will not grow continuously.

mariadb-galera

BZ#1375184

Because Red Hat Enterprise Linux 7.3 changed the return format of the "systemctl is-enabled" command as consumed by shell scripts, the mariadb-galera RPM package, upon installation, erroneously detected that the MariaDB service was enabled when it was not. As a result, the Red Hat OpenStack Platform installer, which then tried to run mariadb-galera using Pacemaker and not systemd, failed to start Galera. With this update, mariadb-galera's RPM installation scripts use a different systemctl command that correctly detects the default MariaDB service as disabled, and the installer can succeed.

BZ#1373598

Previously, both the 'mariadb-server' and 'mariadb-galera-server' packages shipped the client-facing libraries: 'dialog.so' and 'mysql_clear_password.so'. As a result, the 'mariadb-galera-server' package would fail to install because of package conflicts.  

With this update, the 'dialog.so' and 'mysql_clear_password.so' libraries have been moved from 'mariadb-galera-server' to 'mariadb-libs'. As a result, the 'mariadb-galera-server' package installs successfully.

openstack-gnocchi

BZ#1377763

With Gnocchi 2.2, job dispatch is coordinated between controllers using Redis. As a result, you can expect improved processing of Telemetry measures.

openstack-heat

BZ#1349120

Prior to this update, Heat would occasionally consider a `FloatingIP` resource deleted while the deletion was in fact still in progress. Consequently, resources that the `FloatingIP` depended on would sometimes fail to be deleted because the `FloatingIP` still existed.
With this update, Heat now checks that the `FloatingIP` can no longer be found before considering the resource deleted, and stack deletes should proceed normally.
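
The "confirm it is really gone" pattern described above can be sketched as follows. This is not Heat's implementation: the `NotFound` exception and the `fetch` callable are hypothetical stand-ins for a client lookup that fails once the resource no longer exists.

```python
import time


class NotFound(Exception):
    """Hypothetical error raised when the resource no longer exists."""


def wait_until_deleted(fetch, attempts=10, delay=0.0):
    """Return True once fetch() raises NotFound, False if it never does.

    fetch is any zero-argument callable that looks the resource up.
    """
    for _ in range(attempts):
        try:
            fetch()
        except NotFound:
            # Only now is it safe to treat the resource as deleted.
            return True
        time.sleep(delay)
    return False
```

Dependent resources would only be deleted after this function returns True.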

BZ#1375930

Previously, the `str_replace` intrinsic function worked by calling the Python `str.replace()` method for each string to be replaced. Consequently, if the replacement text for one replacement contained another of the strings to be replaced, the replacement text itself could be replaced. The result was non-deterministic, since the replacement order was not guaranteed. Therefore users had to be careful to use techniques, such as guard characters, to ensure that there was no misinterpretation.
With this update, replacements are now performed in a single pass, so only the original text is subject to replacement.
As a result, the output of `str_replace` is now deterministic, and consistent with user expectations even without the use of guard characters. When keys overlap in the input, longer matches are preferred. Lexicographically smaller strings will be replaced first if there is still ambiguity.
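
To illustrate the difference, the following sketch performs replacement so that only the original text is ever matched, preferring longer keys and then lexicographically smaller ones. It is a simplified model of the behavior described above, not Heat's code, and the function name `single_pass_replace` is an assumption.

```python
def single_pass_replace(template, params):
    """Replace each key of params in template without re-scanning inserted text."""
    # Longer keys win; ties are broken lexicographically.
    keys = sorted(params, key=lambda k: (-len(k), k))

    def replace(text, remaining):
        if not text or not remaining:
            return text
        key, rest = remaining[0], remaining[1:]
        if not key:
            # str.split("") is an error, so empty keys are skipped.
            return replace(text, rest)
        # Split on the current key, process the pieces with the remaining
        # keys only, then join with the replacement value. The value itself
        # is never searched again.
        return params[key].join(replace(part, rest) for part in text.split(key))

    return replace(template, keys)
```

With chained `str.replace()` calls, `"a b"` with `{"a": "b", "b": "c"}` could become `"c c"`; with the sketch above it is always `"b c"`.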

BZ#1314080

With this enhancement, `heat-manage` now supports a `reset_stack_status` subcommand. This was added to manage situations where `heat-engine` was unable to contact the database, causing any stacks that were in progress to remain stuck due to outdated database information. When this occurred, administrators needed a way to reset the status to allow these stacks to be updated again.
As a result, administrators can now use the `heat-manage reset_stack_status` command to reset a stuck stack.

openstack-ironic

BZ#1347475

This update adds a socat-based serial console for IPMItool drivers. This was added because users may want to access a bare metal node's serial console in the same way that they access a virtual node's console. As a result, the new driver `pxe_ipmitool_socat` was added, with support for the serial console using the `socat` utility.

BZ#1310883

The Bare Metal provisioning service now wipes a disk's metadata before partitioning and writing an image into it. This ensures that the new image boots normally. In previous releases, the Bare Metal provisioning service didn't remove old metadata before starting work on a device, which made it possible for a deployment to fail.

BZ#1319841

The openstack-ironic-conductor service now checks whether all drivers specified in the 'enabled_drivers' option are unique. The service then removes duplicated entries and logs a warning. In previous releases, duplicate entries in the 'enabled_drivers' option simply caused the openstack-ironic-conductor service to fail, thereby preventing the Bare Metal provisioning service from loading any drivers.
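
The behavior amounts to an order-preserving de-duplication with a warning. A minimal sketch, assuming the option has already been parsed into a list of driver names (the function name is hypothetical and this is not the conductor's actual code):

```python
import logging

LOG = logging.getLogger(__name__)


def unique_drivers(enabled_drivers):
    """Drop duplicate driver names, keeping first-seen order, and warn about them."""
    seen = set()
    unique = []
    duplicates = set()
    for name in enabled_drivers:
        if name in seen:
            duplicates.add(name)
        else:
            seen.add(name)
            unique.append(name)
    if duplicates:
        LOG.warning("Ignoring duplicate entries in enabled_drivers: %s",
                    ", ".join(sorted(duplicates)))
    return unique
```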

BZ#1344004

Previously, 'ironic-conductor' did not correctly pass the authentication token to the 'python-neutronclient'. As a result, automatic node cleaning failed with a tear down error. 

With this update, OpenStack Baremetal Provisioning (ironic) was migrated to use the 'keystoneauth' sessions rather than directly constructing Identity service client objects. As a result, nodes can now be successfully torn down after cleaning.

BZ#1385114

To determine which node is being deployed, the deploy ramdisk (IPA) provides the Bare Metal provisioning service with a list of MAC addresses as unique identifiers for that node. In previous releases, the Bare Metal provisioning service only expected normal MAC address formats; namely, 6 octets. The GID of an InfiniBand NIC, however, has 20 octets. As such, whenever an InfiniBand NIC was present on the node, the deployment would fail since the Bare Metal provisioning API could not validate the MAC address correctly.

With this release, the Bare Metal provisioning service now ignores MAC addresses that don't conform with the normal MAC address format of 6 octets.
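
As an illustration, filtering a reported address list down to 6-octet MAC addresses could look like the sketch below. The colon-separated hexadecimal notation and the function name are assumptions made for the example.

```python
import re

# Exactly six colon-separated hexadecimal octets.
MAC_RE = re.compile(r"^(?:[0-9a-f]{2}:){5}[0-9a-f]{2}$", re.IGNORECASE)


def filter_mac_addresses(addresses):
    """Keep only addresses in the normal 6-octet MAC format."""
    return [address for address in addresses if MAC_RE.match(address)]
```

A 20-octet InfiniBand identifier in the list is dropped rather than causing the whole lookup to fail.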

BZ#1387322

This release removes a redundant 'dhcp' command from the iPXE templates for deployment and introspection. In some cases, this redundant command caused an incorrect interface to receive an IP address.

openstack-ironic-inspector

BZ#1323735

Previously, the modification dates were not being set on the IPA RAM disk logs when creating a tarfile. As a result, the introspection logs appeared to have the modification date of 1970-01-01, causing GNU tar to issue a warning when extracting the files. 

With this update, the modification dates are set correctly when creating a tarfile. The timestamps are now correct and GNU tar no longer issues the warning.
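
The underlying pitfall is easy to reproduce with Python's `tarfile` module: a `TarInfo` object created by hand has `mtime` set to 0 (1970-01-01) unless it is assigned. The sketch below shows the general fix; the function name and the in-memory layout are assumptions, not the inspector's code.

```python
import io
import tarfile
import time


def build_log_archive(files):
    """Pack {name: bytes} into an in-memory .tar.gz with current modification times."""
    buffer = io.BytesIO()
    now = int(time.time())
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            # Without this assignment the member is dated 1970-01-01.
            info.mtime = now
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()
```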

openstack-ironic-python-agent

BZ#1393008

This release features more thorough error checking and handling around LLDP discovery. This enhancement prevents malformed packets from failing LLDP discovery; in addition, failed LLDP discovery no longer fails the whole introspection process.

openstack-manila

BZ#1380482

Prior to this update, the Manila Ceph FS driver did not check if it could connect to the Ceph server.
Consequently, if the connection to the Ceph server did not work, the `manila-share` service kept crashing or respawning without any timeout.
With this update, the Manila Ceph FS driver checks the Ceph connection when it is initialized. If the check fails, the driver is not initialized and no further steps are performed.

openstack-neutron

BZ#1381620

Previously, the maximum number of client connections (that is, greenlets spawned at a time) opened at any time by the WSGI server was set to 100 with 'wsgi_default_pool_size'. While this setting was adequate for the OpenStack Networking API server, the state change server created heavy CPU loads on the L3 agent, which caused the agent to crash.

With this release, you can now use the new 'ha_keepalived_state_change_server_threads' setting to configure the number of threads in the state change server. Client connections are no longer limited by 'wsgi_default_pool_size', thereby avoiding an L3 agent crash when many state change server threads are spawned.

BZ#1382717

Previously, the 'vport_gre' kernel module had a dependency on the 'ip_gre' kernel module in Red Hat Enterprise Linux 7.3. The 'ip_gre' module created two new interfaces: 'gre0' and 'gretap0'. These interfaces are created in each namespace and cannot be removed. As a result, when 'neutron-netns-cleanup' purged all the interfaces during the namespace cleanup, the 'gre0' and 'gretap0' interfaces were not removed. This prevented the network namespace from being deleted because some interfaces were still present.

With this update, the 'gre0' and 'gretap0' interfaces are added to the whitelist of interfaces and are ignored when checking whether the namespace contains any interface. As a result, the network namespace is deleted even when it contains the 'gre0' and 'gretap0' interfaces.
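
The emptiness check can be pictured as follows. This is a simplified sketch, not the neutron code: the names are hypothetical, and only the two interfaces mentioned above are placed in the ignore list.

```python
# Interfaces that the kernel creates in every namespace and that cannot be removed.
IGNORED_DEVICES = frozenset({"gre0", "gretap0"})


def namespace_is_empty(device_names, ignored=IGNORED_DEVICES):
    """Treat a namespace as empty if every remaining device is in the ignore list."""
    return all(name in ignored for name in device_names)
```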

BZ#1384334

This release adds an HTTPProxyToWSGI middleware in front of the OpenStack Networking API to set up a request URL correctly in case a proxy (for example, HAProxy) is used between the client and server. This ensures that when a client uses SSL, the server recognizes this and responds using the correct protocol. Previously, using a proxy made it possible for the server to respond with HTTP (instead of HTTPS) even when a client used SSL.

BZ#1387546

Previously, it was possible for the OpenStack Networking OVS agent to compare a non-translated string with translated, UTF-16 strings when a subprocess didn't run properly. On non-English locales, this could result in an exception, thereby preventing instances from booting.

To address this, failure checks were updated to depend on the actual return value of failed subprocesses instead of strings. This ensures that subprocess failures are handled properly under non-English locales.
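
The general idea, deciding failure from the exit status rather than from message text that may be localized, can be sketched with the standard `subprocess` module. The helper name is an assumption and this is not the agent's code.

```python
import subprocess
import sys


def command_failed(argv):
    """Run argv and report failure from the exit status, not from its output text."""
    result = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # A non-zero return code means failure in every locale.
    return result.returncode != 0
```

For example, `command_failed([sys.executable, "-c", "import sys; sys.exit(3)"])` reports a failure without inspecting any error message.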

BZ#1325682

With this update, IP traffic can be managed by DSCP marking rules attached to QoS policies, which are in turn applied to networks and ports.
This was added because different sources of traffic may require different levels of prioritisation at the network level, especially when dealing with real-time information, or critical control data. As a result, the traffic from the specific ports and networks can be marked with DSCP flags. Note that only Open vSwitch is supported in this release.

openstack-nova

BZ#1188175

This enhancement adds support for virtual device role tagging. This was added because an instance's operating system may need extra information about the virtual devices it is running on. For example, in an instance with multiple virtual network interfaces, the guest operating system needs to distinguish between their intended usage in order to provision them accordingly.
With this update, virtual device role tagging allows users to tag virtual devices when creating an instance. Those tags are then presented to the instance (along with other device metadata) using the metadata API, and through the config drive (if enabled). For more information, see the chapter `Use Tagging for Virtual Device Identification` in the Red Hat OpenStack Platform 10 Networking Guide: https://access.redhat.com/documentation/en/red-hat-openstack-platform/

BZ#1189551

This update adds the `real time` feature, which provides stronger guarantees for worst-case scheduler latency for vCPUs. This update assists tenants that need to run workloads concerned with CPU execution latency, and that require the guarantees offered by a real time KVM guest configuration.

BZ#1233920

This enhancement adds support for virtual device role tagging. This was added because an instance's operating system may need extra information about the virtual devices it is running on. For example, in an instance with multiple virtual network interfaces, the guest operating system needs to distinguish between their intended usage in order to provision them accordingly.
With this update, virtual device role tagging allows users to tag virtual devices when creating an instance. Those tags are then presented to the instance (along with other device metadata) using the metadata API, and through the config drive (if enabled). For more information, see the chapter `Use Tagging for Virtual Device Identification` in the Red Hat OpenStack Platform 10 Networking Guide: https://access.redhat.com/documentation/en/red-hat-openstack-platform/

BZ#1263816

Previously, the nova ironic virt driver wrote an instance UUID in the Bare Metal Provisioning (ironic) node before starting a deployment. If something failed between writing the UUID and starting the deployment, Compute did not remove the instance after it failed to spawn the instance. As a result, the Bare Metal Provisioning (ironic) node would have an instance UUID set and would not be picked for another deployment. 

With this update, if spawning an instance fails at any stage of the deployment, the ironic virt driver ensures that the instance UUID is cleaned up. As a result, nodes will not have an instance UUID set and will be picked up for a new deployment.

openstack-puppet-modules

BZ#1284058

Previously, Object Storage service deployed using the director used ceilometer middleware that had been deprecated since the Red Hat OpenStack Platform 8 (liberty) release.

With this update, the Object Storage service has been fixed to use the ceilometer middleware from python-ceilometermiddleware, which is the supported version for this release.

BZ#1372821

Previously, the Time Series Database-as-a-Service (gnocchi) API workers were configured to be deployed by default with a single process and with a thread count equal to the logical cpu_core count. As a result, the gnocchi API running in httpd was deployed with a single process.

As a best practice, gnocchi recommends setting the number of processes and threads to 1.5 * cpu_count. With this update, the worker count is set to max(($::processorcount + 0)/4, 2) and the thread count to 1. As a result, the gnocchi API workers run with the right number of workers and threads for better performance.
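
To make the worker formula above concrete, here is the same arithmetic as a small function. It assumes that the division of the two integers truncates, as integer division does in Puppet; the function name is illustrative.

```python
def gnocchi_api_workers(processor_count):
    """Worker count from the formula max(processorcount / 4, 2), using integer division."""
    return max(processor_count // 4, 2)
```

Under that assumption, hosts with up to 11 processors get 2 workers, a 16-processor host gets 4, and a 48-processor host gets 12.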

openstack-tripleo-common

BZ#1382174

Previously, the 'DeployIdentifier' was not being updated for package update, resulting in Puppet not being run on the non-controller nodes. 

With this update, the 'DeployIdentifier' value is incremented. As a result, Puppet runs and updates packages on the non-controller nodes.

BZ#1323700

Previously, in the OpenStack Director, the 'upgrade-non-controller.sh' script used by an operator on the Undercloud to upgrade the non-controller nodes as a part of the major upgrade workflow did not report the upgrade status when the '--query' option was used. As a result, the '--query' option did not work as documented by the '-h' helptext. 

With this update, the '--query' option now provides the last few lines of the 'yum.log' file from the given node as an indication of the upgrade status. Also, the script now accepts the long and short versions for each of the options ('-q' and '--query'). As a result, the 'upgrade-non-controller.sh' script is now improved to provide at least some indication of the node upgrade status.

BZ#1383627

Nodes that are imported using "openstack baremetal import --json instackenv.json" should be powered off prior to attempting import. If the nodes are powered on, Ironic will not attempt to add the nodes or attempt introspection.
As a workaround, power off all overcloud nodes prior to running "openstack baremetal import --json instackenv.json".
When the nodes are powered off, the import should work successfully.

openstack-tripleo-heat-templates

BZ#1262064

It is now possible to deploy 'cinder-backup' in the overcloud using a Heat environment file when launching the stack deployment. The environment file which enables 'cinder-backup' is /usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml. The 'cinder-backup' service will initially support the use of Swift or Ceph as backends. The 'cinder-backup' service performs backups of Cinder volumes on backends different than the one where the volumes are stored. The 'cinder-backup' service will be running in the overcloud if included at deployment time.

BZ#1282491

Prior to this update, the RabbitMQ maximum open file descriptors was set to 4096. Consequently, customers with larger deployments could hit this limit and face stability issues. With this update, the maximum open file descriptor limit for RabbitMQ has been increased to 65536. As a result, larger deployments should now be significantly less likely to run into this issue.

BZ#1242593

With this enhancement, the OpenStack Bare Metal provisioning service (ironic) can be deployed in the overcloud to support the provision of bare metal instances. This was added because customers may want to deploy bare metal instances in their overcloud.
As a result, the Red Hat OpenStack Platform director can now optionally deploy the Bare metal service in order to provision bare metal instances in the overcloud.

BZ#1274196

With this update, the iptables firewall on the overcloud controller nodes is enabled to ensure better security. The necessary ports are opened so that overcloud services will continue to function as before.

BZ#1290251

With this update, a new feature to enable connecting the overcloud to a monitoring infrastructure adds availability monitoring agents (sensu-client) to be deployed on the overcloud nodes. 

To enable the monitoring agents deployment, use the environment file '/usr/share/openstack-tripleo-heat-templates/environments/monitoring-environment.yaml' and fill in the following parameters in the configuration YAML file:

MonitoringRabbitHost: host where the RabbitMQ instance for monitoring purposes is running
MonitoringRabbitPort: port on which the RabbitMQ instance for monitoring purposes is running
MonitoringRabbitUserName: username to connect to RabbitMQ instance
MonitoringRabbitPassword: password to connect to RabbitMQ instance
MonitoringRabbitVhost: RabbitMQ vhost used for monitoring purposes

BZ#1309460

You can now use the director to deploy Ceph RadosGW as your object storage gateway. To do so, include /usr/share/openstack-tripleo-heat-templates/environments/ceph-radosgw.yaml in your overcloud deployment. When you use this heat template, the default Object Storage service (swift) will not be deployed.

BZ#1325680

Typically, the installation and configuration of OVS+DPDK in OpenStack is performed manually after overcloud deployment. This can be very challenging for the operator and tedious to do over a large number of Compute nodes. The installation of OVS+DPDK has now been automated in tripleo. Identification of the hardware capabilities for DPDK was previously done manually, and is now automated during introspection. This hardware detection also provides the operator with the data needed for configuring Heat templates. At present, Compute nodes with DPDK-enabled hardware cannot coexist with Compute nodes without DPDK-enabled hardware.
The `ironic` Python Agent discovers the following hardware details and stores them in a swift blob:
* CPU flags for hugepages support - If pse exists, then 2MB hugepages are supported. If pdpe1gb exists, then 1GB hugepages are supported.
* CPU flags for IOMMU - If VT-d/svm exists, then IOMMU is supported, provided IOMMU support is enabled in BIOS.
* Compatible nics - compared with the list of NICs whitelisted for DPDK, as listed here http://dpdk.org/doc/nics

Nodes without any of the above-mentioned capabilities cannot be used for the Compute role with DPDK.
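
The hugepage part of the detection listed above can be sketched as a lookup in the CPU flag list, for example the `flags` line of /proc/cpuinfo on Linux. The function name is hypothetical and this is not the agent's code; the IOMMU and NIC checks are omitted.

```python
def supported_hugepage_sizes(cpu_flags):
    """Map a space-separated CPU flag string to the supported hugepage sizes."""
    flags = set(cpu_flags.split())
    sizes = []
    if "pse" in flags:
        sizes.append("2MB")
    if "pdpe1gb" in flags:
        sizes.append("1GB")
    return sizes
```

Splitting into whole flags matters here: a flag such as `pse36` must not be mistaken for `pse`.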

* Operator will have a provision to enable DPDK on Compute nodes.
* The overcloud image for the nodes identified to be Compute-capable and having DPDK NICs, will have the OVS+DPDK package instead of OVS. It will also have packages `dpdk` and `driverctl`.
* The device names of the DPDK-capable NICs will be obtained from T-H-T. The PCI address of each DPDK NIC needs to be identified from the device name. It is required for whitelisting the DPDK NICs during PCI probe.
* Hugepages need to be enabled in the Compute nodes with DPDK.
* CPU isolation needs to be done so that the CPU cores reserved for DPDK Poll Mode Drivers (PMD) are not used by the general kernel balancing, interrupt handling and scheduling algorithms.
* On each Compute node with a DPDK-enabled NIC, puppet will configure the DPDK_OPTIONS for whitelisted NICs, CPU mask, and number of memory channels for DPDK PMD. The DPDK_OPTIONS needs to be set in /etc/sysconfig/openvswitch.

`Os-net-config` performs the following steps:
* Associate the given interfaces with the dpdk drivers (default as vfio-pci driver) by identifying the pci address of the given interface. The driverctl will be used to bind the driver persistently.
* Understand the ovs_user_bridge and ovs_dpdk_port types and configure the ifcfg scripts accordingly.
* The “TYPE” ovs_user_bridge will translate to OVS type OVSUserBridge and based on this OVS will configure the datapath type to ‘netdev’.
* The “TYPE” ovs_dpdk_port will translate to OVS type OVSDPDKPort and based on this OVS adds the port to the bridge with interface type as ‘dpdk’.
* Understand the ovs_dpdk_bond and configure the ifcfg scripts accordingly.

On each Compute node with a DPDK-enabled NIC, puppet will perform the following steps:
* Enable OVS+DPDK in /etc/neutron/plugins/ml2/openvswitch_agent.ini [OVS] datapath_type=netdev vhostuser_socket_dir=/var/run/openvswitch
* Configure vhostuser ports in /var/run/openvswitch to be owned by qemu.

On each controller node, puppet will perform the following steps:
* Add NUMATopologyFilter to scheduler_default_filters in nova.conf.

As a result, the automation of the above-mentioned enhanced platform awareness has been completed, and verified by QA testing.

BZ#1337782

This release now features Composable Roles. TripleO can now be deployed in a composable way, allowing customers to select what services should run on each node. This, in turn, allows support for more complex use-cases.

BZ#1337783

Generic nodes can now be deployed during the hardware provisioning phase. These nodes are deployed with a generic operating system (namely, Red Hat Enterprise Linux); customers can then deploy additional services directly on these nodes.

BZ#1381628

As described in https://bugs.launchpad.net/tripleo/+bug/1630247, the Sahara service in upstream Newton TripleO is now disabled by default. As part of the upgrade procedure from Red Hat OpenStack Platform 9 to Red Hat OpenStack Platform 10, the Sahara services are enabled/retained by default. If the operator decides they do not want Sahara after the upgrade, they need to include the provided `-e 'major-upgrade-remove-sahara.yaml'` environment file as part of the deployment command for the controller upgrade and converge steps. Note: this environment file must be specified last, especially for the converge step, but it could be done for both steps to avoid confusion. In this case, the Sahara services would not be restarted after the major upgrade.
This approach allows Sahara services to be properly handled during the OSP9 to OSP10 upgrade. As a result, Sahara services are retained as part of the upgrade. In addition, the operator can still explicitly disable Sahara, if necessary.

BZ#1389502

This update allows custom values for the kernel.pid_max sysctl key to be set using the KernelPidMax Heat parameter, which defaults to 1048576. On nodes working as Ceph clients there might be a large number of running threads, depending on the number of ceph-osd instances. In such cases, the number of threads might reach the pid_max limit and cause I/O errors. The pid_max key now has a higher default and can be customized using the KernelPidMax parameter.

BZ#1243483

Previously, polling the Orchestration service for server metadata resulted in REST API calls to Compute, resulting in a constant load on the nova-api which worsened as the cloud was scaled up. 

With this update, Object Storage service is now polled for server metadata and loading the heat stack no longer makes unnecessary calls to the nova-api. As a result, there is a significant reduction in the load on the undercloud as the overcloud scales up.

BZ#1315899

Previously, the director-deployed swift used a deprecated version of ceilometer middleware that had been dropped in Red Hat OpenStack Platform 8. With this update, the swift proxy config uses ceilometer middleware from python-ceilometermiddleware. As a result, swift proxy now uses a supported version of ceilometer middleware.

BZ#1361285

OpenStack Image Storage (glance) is now configured with more workers by default, which improves performance. The count is automatically scaled depending on the number of processors.

BZ#1367678

This enhancement adds `NeutronOVSFirewallDriver`, a new parameter for configuring the Open vSwitch (OVS) firewall driver in Red Hat OpenStack Platform director.
This was added because the neutron OVS agent supports a new mechanism for implementing security groups: the 'openvswitch' firewall. `NeutronOVSFirewallDriver` allows users to directly control which implementation is used:
* `hybrid` - configures neutron to use the old iptables/hybrid based implementation.
* `openvswitch` - enables the new flow-based implementation.
The new firewall driver offers higher performance and reduces the number of interfaces and bridges used to connect guests to the project network. As a result, users can more easily evaluate the new security group implementation.

BZ#1256850

The Telemetry API (ceilometer-api) now uses apache-wsgi instead of eventlet. When upgrading to this release, ceilometer-api will be migrated accordingly.

This change provides greater flexibility for per-deployment performance and scaling adjustments, as well as straightforward use of SSL.

BZ#1303093

With this update, it is possible to disable the Object Storage service (swift) in the overcloud by using an additional environment file when deploying the overcloud. The environment file should contain the following:

resource_registry:
  OS::TripleO::Services::SwiftProxy: OS::Heat::None
  OS::TripleO::Services::SwiftStorage: OS::Heat::None
  OS::TripleO::Services::SwiftRingBuilder: OS::Heat::None

As a result, the Object Storage service will not be running in the overcloud and there will not be an endpoint for the Object Storage service in the overcloud Identity service.

BZ#1314732

Previously, while deploying Red Hat OpenStack Platform 8 using director, the Telemetry service was not configured in Compute, causing some of the OpenStack Integration Test Suite tests to fail. 

With this update, the OpenStack Telemetry service is configured in the Compute configuration. As a result, the notification driver is set correctly and the OpenStack Integration Test Suite tests pass.

BZ#1316016

Previously, Telemetry (ceilometer) notifications would fail due to missing messaging configuration in Image Service (glance). Consequently, glance notifications failed to be processed. With this update, the tripleo templates have been amended to add the correct configuration. As a result, glance notifications are now processed correctly.

BZ#1347371

With this enhancement, RabbitMQ introduces the new HA feature of Queue Master distribution. One of the strategies is `min-masters`, which picks the node hosting the minimum number of masters.
This was added because one of the controllers may become unavailable, in which case Queue Masters are located on the remaining available controllers during queue declarations. Once the lost controller becomes available again, masters of newly-declared queues are not preferentially placed on the controller that hosts noticeably fewer queue masters. Consequently, the distribution may become unbalanced, with one of the controllers under significantly higher load after multiple fail-overs.
As a result, this enhancement spreads out the queues across controllers after a controller fail-over.
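
The selection rule behind `min-masters` can be illustrated in a few lines. This is only a model of the strategy as described above, not RabbitMQ's implementation; the function name and the alphabetical tie-break are assumptions.

```python
def pick_queue_master(master_counts):
    """Pick the node that hosts the fewest queue masters.

    master_counts maps a node name to the number of queue masters it hosts.
    Ties are broken alphabetically by node name (an assumption for determinism).
    """
    return min(sorted(master_counts), key=lambda node: master_counts[node])
```

A controller that has just rejoined the cluster hosts no masters, so it is chosen for newly declared queues until the counts even out.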

BZ#1351271

The Red Hat OpenStack Platform director now creates an OpenStack Block Storage (cinder) v3 API endpoint in OpenStack Identity (keystone) to support the newer Cinder API version.

BZ#1364478

This update allows any isolated network to be used on any role. Some scenarios, like a deployment where 'ceph-osd' is collocated with 'nova-compute', assume that nodes have access to multiple isolated networks. Custom NIC templates can now configure any of the isolated networks on any role.

BZ#1366721

The Telemetry service (ceilometer) now uses gnocchi as its default meter dispatcher back end. Gnocchi is more scalable, and is more aligned to the future direction that the Telemetry service is facing.

BZ#1368218

With this update, you can now configure Object Storage service (swift) with additional raw disks by deploying the overcloud with an additional environment file, for example: 

parameter_defaults:
  ExtraConfig:
    SwiftRawDisks:
      sdb:
        byte_size: 2048
        mnt_base_dir: /src/sdb
      sdc:
        byte_size: 2048

As a result, the Object Storage service is not limited by the local node `root` filesystem.

BZ#1369426

AODH now uses MySQL as its default database back end. Previously, AODH used MongoDB as its default back end to make the transition from Ceilometer to AODH easier.

BZ#1373853

The Compute role and Object Storage role upgrade scripts for upgrading from Red Hat OpenStack Platform 9 (Mitaka) to Red Hat OpenStack Platform 10 (Newton) did not exit on error as expected. As a result, the 'upgrade-non-controller.sh' script returned code 0 (success) even when the upgrade failed.

With this update, the Compute role and Object Storage role upgrade scripts now exit on error during the upgrade process, and the 'upgrade-non-controller.sh' script returns a non-zero (failure) value if the upgrade fails.

BZ#1379719

With the move to composable services, the hieradata which was used to configure the NTP servers on overcloud nodes was configured incorrectly. 

This update uses the correct hieradata so the overcloud nodes get the NTP servers configured.

BZ#1385368

To accommodate composable services, NFS mounts used as an Image Service (glance) back end are no longer managed by Pacemaker. As a result, the glance NFS back end parameter interface has changed: the new method is to use an environment file to enable the glance NFS back end. For example:
----
parameter_defaults:
  GlanceBackend: file
  GlanceNfsEnabled: true
  GlanceNfsShare: IP:/some/exported/path
----
Note: the GlanceNfsShare setting will vary depending on your deployment.
In addition, mount options can be customized using the `GlanceNfsOptions` parameter. If the Glance NFS backend was previously used in Red Hat OpenStack Platform 9, the environment file contents must be updated to match the Red Hat OpenStack Platform 10 format.

BZ#1387390

Previously, TCP port '16509' was blocked in 'iptables'. As a result, the 'nova' Compute 'libvirt' instances could not be live migrated between Compute nodes.

With this update, TCP port '16509' is opened in 'iptables'. As a result, the 'nova' Compute 'libvirt' instances can now be live migrated between Compute nodes.

BZ#1389189

Previously, due to a race condition between Hiera data getting written and Puppet execution on nodes, Puppet on the Overcloud nodes failed occasionally due to the missing Hiera data. 

With this update, ordering is introduced: the Hiera data is first written on all nodes, and only then does Puppet execution take place. As a result, Puppet no longer fails during execution, because all the necessary Hiera data is present.

BZ#1392773

Previously, after upgrading from Red Hat OpenStack Platform 9 (Mitaka) to Red Hat OpenStack Platform 10 (Newton), the 'ceilometer-compute-agent' failed to collect data.

With this update, the 'ceilometer-compute-agent' is restarted after the upgrade, which fixes the issue and allows the agent to gather the relevant data.

BZ#1393487

OpenStack Platform director did not update the firewall rules when deploying OpenStack File Share API (manila-api). If you moved the manila-api service off controllers to its own role, the default firewall rules blocked the endpoints. This fix updates the manila-api firewall rules in the overcloud Heat template collection. You can now reach the endpoints even when manila-api is on a role separate from the controller nodes.

BZ#1382579

The director set the cloudformation (heat-cfn) endpoint to "RegionOne" instead of "regionOne". This caused the UI to display two regions with different services. This fix sets the endpoint to use "regionOne". The UI now displays all services under the same region.

openstack-tripleo-ui

BZ#1353796

With this update, you can now add nodes manually using the UI.

os-collect-config

BZ#1306140

Prior to this update, HTTP requests made by `os-collect-config` for configuration did not specify a request timeout. Consequently, polling for data while the undercloud was inaccessible (for example, while the undercloud was rebooting or during network connectivity issues) resulted in `os-collect-config` stalling, performing no polling or configuration. This often only became apparent when an overcloud stack operation was performed and software configuration operations timed out.
With this update, `os-collect-config` HTTP requests now always specify a timeout period.
As a result, polling for data will fail when the undercloud is unavailable, and then resume when it is available again.
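
The behavior described above, where a poll fails quickly and polling then resumes, can be sketched as follows. This is an illustration of the general pattern only and is not the `os-collect-config` code; the function names, the 10-second timeout, and the 30-second interval are assumptions made for the example.

```python
import time
import urllib.request


def fetch_config(url, timeout=10.0):
    """Fetch a URL; return the body, or None if the request fails or times out."""
    try:
        # Without a timeout, this call can block indefinitely on a dead peer.
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except OSError:
        # urllib.error.URLError and socket.timeout are both OSError subclasses.
        return None


def poll(fetch, attempts, interval=30.0, sleep=time.sleep):
    """Call fetch() `attempts` times; a failed poll is skipped, not fatal."""
    results = []
    for _ in range(attempts):
        body = fetch()
        if body is not None:
            results.append(body)
        sleep(interval)
    return results
```

A caller might run `poll(lambda: fetch_config(url), attempts=3)` with its own metadata URL. Polls that fail while the server is unreachable are dropped, and later polls succeed once it returns.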

os-net-config

BZ#1391031

Prior to this update, improvements in the integration between Open vSwitch and neutron could cause issues with the resumption of connectivity after a restart. Consequently, nodes could become unreachable or have reduced connectivity.
With this update, `os-net-config` configures `fail_mode=standalone` by default to allow network traffic if no controlling agent has started yet. As a result, the connection issues on reboot have been resolved.

puppet-ceph

BZ#1372804

Previously, the Ceph Storage nodes used the local filesystem formatted with `ext4` as the back end for the `ceph-osd` service.

Note: Some `overcloud-full` images for Red Hat OpenStack Platform 9 (Mitaka) were created using `ext4` instead of `xfs`.

With the Jewel release, `ceph-osd` checks the maximum file name length allowed by the back end and refuses to start if the limit is lower than the one configured for Ceph itself. As a workaround, it is possible to verify the filesystem in use for `ceph-osd` by logging in to the Ceph Storage nodes and using the following command:

# df -l --output=fstype /var/lib/ceph/osd/ceph-$ID

Here, $ID is the OSD ID, for example: 

# df -l --output=fstype /var/lib/ceph/osd/ceph-0

Note: A single Ceph Storage node might host multiple `ceph-osd` instances, in which case there will be multiple subdirectories in `/var/lib/ceph/osd/`, one for each instance.

If *any* of the OSD instances is backed by an `ext4` filesystem, it is necessary to configure Ceph to use shorter file names, which is possible by deploying/upgrading with an additional environment file, containing the following:

parameter_defaults:
  ExtraConfig:
    ceph::profile::params::osd_max_object_name_len: 256
    ceph::profile::params::osd_max_object_namespace_len: 64

As a result, you can now verify if each and every `ceph-osd` instance is up and running after an upgrade from Red Hat OpenStack Platform 9 to Red Hat OpenStack Platform 10.
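
The per-OSD filesystem check described above can also be scripted. The following is a minimal sketch that assumes mount information in the `/proc/mounts` format. The helper names and the sample mount table are hypothetical, and it complements rather than replaces the `df` command shown earlier.

```python
def fstype_for(path, mounts_text):
    """Return the filesystem type of the mount that contains `path`."""
    path = path.rstrip("/") + "/"
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the longest matching mount point; later entries win ties.
        if path.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fstype
    return best_type


def needs_short_names(osd_dirs, mounts_text):
    """True if any OSD directory is backed by ext4."""
    return any(fstype_for(d, mounts_text) == "ext4" for d in osd_dirs)


# Hypothetical mount table: ceph-0 lives on the ext4 root, ceph-1 on its own xfs disk.
MOUNTS = """\
/dev/sda2 / ext4 rw,relatime 0 0
/dev/sdb1 /var/lib/ceph/osd/ceph-1 xfs rw,noatime 0 0
"""
OSD_DIRS = ["/var/lib/ceph/osd/ceph-0", "/var/lib/ceph/osd/ceph-1"]
print(needs_short_names(OSD_DIRS, MOUNTS))
```

On a real node, the text could be read from `/proc/mounts` and the directory list built from the contents of `/var/lib/ceph/osd/`.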

BZ#1346401

It is now possible to confine 'ceph-osd' instances with SELinux policies. In OSP10, new deployments have SELinux configured in 'enforcing' mode on the Ceph Storage nodes.

BZ#1370439

Reusing Ceph nodes from a previous cluster in a new overcloud caused the new Ceph cluster to fail without any indication during the overcloud deployment process. This was because the old Ceph OSD node disks needed cleaning before reusing them. This fix adds a check to the Ceph OpenStack Puppet module to make sure the disks are clean as per the instructions in the OpenStack Platform documentation [1]. Now the overcloud deployment process properly fails if it detects non-clean OSD disks. The 'openstack stack failures list overcloud' command indicates the disks which have an FSID mismatch.
[1] https://access.redhat.com/documentation/en/red-hat-openstack-platform/10/single/red-hat-ceph-storage-for-the-overcloud/#Formatting_Ceph_Storage_Nodes_Disks_to_GPT

puppet-cinder

BZ#1356683

A race condition existed between loop device configuration and a check for LVM physical volumes on block storage nodes. This caused the major upgrade convergence step to fail due to Puppet failing to detect existing LVM physical volumes and attempting to recreate the volume. This fix waits for udev events to complete after setting up the loop device. This means that Puppet waits for the loop device configuration to complete before attempting to check for an existing LVM physical volume. Block storage nodes with LVM backends now upgrade successfully.

puppet-heat

BZ#1381561

The OpenStack Platform director exceeded the default memory limits for using OpenStack Orchestration (heat) YAQL expressions. This caused an "Expression consumed too much memory" error during an overcloud deployment and subsequent deployment failure. This fix increases the default memory limits for the director, which results in an error-free overcloud deployment.

puppet-ironic

BZ#1314665

The ironic-inspector server did not have an iPXE version that worked with UEFI bootloaders. Machines with UEFI bootloaders could not chainload the introspection ramdisk. This fix ensures the ipxe.efi ROM is present on the ironic-inspector server and updates the dnsmasq configuration to send it to the UEFI-based machine during introspection. Now the director can inspect both BIOS and UEFI machines.

puppet-tripleo

BZ#1386611

rabbitmqctl failed to function in an IPv6 environment due to a missing parameter. This fix modifies the RabbitMQ Puppet configuration and adds the missing parameter to /etc/rabbitmq/rabbitmq-env.conf. Now rabbitmqctl does not fail in IPv6 environments.

BZ#1389413

Prior to this update, HAProxy checking of MySQL resulted in a long timeout (16 seconds) before a failed node would be removed from service. Consequently, OpenStack services connected to a failed MySQL node could return API errors to users/operators/tools.
With this update, the check interval settings have been reduced to drop failed MySQL nodes within 6 seconds of failure. As a result, OpenStack services should fail over to working MySQL nodes much faster and produce fewer API errors to their consumers.

BZ#1262070

You can now use the director to configure Ceph RBD as a Block Storage backup target. This will allow you to deploy an overcloud where volumes are set to back up to a Ceph target. By default, volume backups will be stored in a Ceph pool called 'backups'.

Backup settings are configured in the following environment file (on the undercloud):

/usr/share/openstack-tripleo-heat-templates/environments/cinder-backup.yaml

BZ#1378391

Both Redis and RabbitMQ had start and stop timeouts of 120s in Pacemaker. In some environments, this was not enough and caused restarts to fail. This fix increases the timeout to 200s, which is the same value used for the other systemd resources. Now Redis and RabbitMQ should have enough time to restart in the majority of environments.

BZ#1279554

Using the RBD backend driver (Ceph Storage) for OpenStack Compute (nova) ephemeral disks applies two additional settings to libvirt:

hw_disk_discard : unmap
disk_cachemodes : network=writeback

This allows reclaiming of unused blocks on the Ceph pool and caching of network writes, which improves the performance for OpenStack Compute ephemeral disks using the RBD driver.

Also see http://docs.ceph.com/docs/master/rbd/rbd-openstack/

python-cotyledon

BZ#1374690

Previously, a bug in an older version of `cotyledon` caused `metricsd` to not start properly and throw a traceback.
This update includes a newer 1.2.7-2 `cotyledon` package. As a result, no traceback occurs and `metricsd` starts correctly.

python-django-horizon

BZ#1198602

This enhancement allows the `admin` user to view a list of the floating IPs allocated to instances, using the admin console. This list spans all projects in the deployment.
Previously, this information was only available from the command line.

BZ#1328830

This update adds support for multiple theme configurations. This was added to allow a user to change a theme dynamically, using the front end. Some use-cases include the ability to toggle between a light and dark theme, or the ability to turn on a high contrast theme for accessibility reasons.
As a result, users can now choose a theme at run time.

python-django-openstack-auth

BZ#1287586

With this enhancement, domain-scoped tokens can be used to log in to the Dashboard (horizon).
This was added to fully support the management of identity in keystone v3 when using a richer role set, where a domain-scoped token is required. django_openstack_auth must support obtaining and maintaining this type of token for the session.
As a result, horizon support for domain-scoped tokens has been available since Red Hat OpenStack Platform 9.

python-gnocchiclient

BZ#1346370

This update provides the latest client for OpenStack Telemetry Metrics (gnocchi) to support resource types.

python-ironic-lib

BZ#1381511

OpenStack Bare Metal (ironic) provides user data to new nodes through the creation of a configdrive as an extra primary partition. This requires a free primary partition available on the node's disk. However, a bug caused OpenStack Bare Metal to not distinguish between primary and extended partitions, which caused the partition count to report no free partitions available for the configdrive. This fix distinguishes between primary and extended partitions. Deployments now succeed without error.
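
The distinction matters because an MBR partition table has four primary slots. An extended partition occupies one of them, while the logical partitions inside it do not. The following sketch shows how counting every partition can wrongly report a full table. It is only an illustration of the idea and is not the ironic-lib code; the data structure and function names are hypothetical.

```python
MBR_PRIMARY_SLOTS = 4  # an MBR partition table has four primary slots


def primary_slots_used(partitions):
    """Count entries that occupy a primary slot: primary and extended, not logical."""
    return sum(1 for p in partitions if p["type"] in ("primary", "extended"))


def has_room_for_configdrive(partitions):
    """True if one more primary partition can still be created."""
    return primary_slots_used(partitions) < MBR_PRIMARY_SLOTS


# Hypothetical disk layout: two primaries plus an extended partition with two logicals.
layout = [
    {"number": 1, "type": "primary"},
    {"number": 2, "type": "primary"},
    {"number": 3, "type": "extended"},
    {"number": 5, "type": "logical"},
    {"number": 6, "type": "logical"},
]

naive_count = len(layout)  # counts logical partitions too, so the table looks full
print(naive_count >= MBR_PRIMARY_SLOTS, has_room_for_configdrive(layout))
```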

BZ#1387148

OpenStack Bare Metal (ironic) contained parsing errors in the configdrive implementation for whole disk images, which caused deployment failure. This fix corrects the return value parsing in the configdrive implementation. It is now possible to deploy whole disk images with configdrive.

python-tripleoclient

BZ#1364220

OpenStack Dashboard (horizon) was incorrectly included in the list of services the director uses to create endpoints in OpenStack Identity (keystone). A misleading 'Skipping "horizon" postconfig' message appeared when deploying the overcloud. This fix removes horizon from the list of services whose endpoints are added to keystone and modifies the "skipping postconfig" messages to only appear in debug mode. The misleading 'Skipping "horizon" postconfig' message no longer appears.

BZ#1383930

If using DHCP HA, the `NeutronDhcpAgentsPerNetwork` value should be set either equal to the number of dhcp-agents, or 3 (whichever is lower), using composable roles. If this is not done, the value will default to `ControllerCount` which may not be optimal as there may not be enough dhcp-agents running to satisfy spawning that many DHCP servers for each network.
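
The recommended value is the smaller of the dhcp-agent count and 3. As a minimal sketch (the function name is hypothetical and is not part of any OpenStack tool):

```python
def dhcp_agents_per_network(dhcp_agent_count, cap=3):
    """Return the suggested NeutronDhcpAgentsPerNetwork value: min(agent count, 3)."""
    if dhcp_agent_count < 1:
        raise ValueError("at least one dhcp-agent is required")
    return min(dhcp_agent_count, cap)


print(dhcp_agents_per_network(2))
print(dhcp_agents_per_network(5))
```

With two dhcp-agents the suggested value is 2, and with five it is capped at 3.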

BZ#1384246

Node delete functions used Heat's 'parameters' instead of 'parameter_defaults'. This caused Heat to redeploy some resources unintentionally, including nodes. This fix switches the node delete functions to use only 'parameter_defaults'. Heat resources are correctly left in place and not redeployed.

python-twisted

BZ#1394150

The python-twisted package failed to install as a part of the Red Hat OpenStack Platform 10 undercloud installation due to missing "Obsoletes" for the package. This fix includes a packaging change with an "Obsoletes" list, which removes the obsolete packages during the python-twisted package installation and provides a seamless update and cleanup.

As a manual workaround, make sure not to install any python-twisted-* packages from the Red Hat Enterprise Linux 7.3 Optional repository, such as python-twisted-core. If the undercloud contains these obsolete packages, remove them with:

$ yum erase python-twisted-*

rabbitmq-server

BZ#1357522

RabbitMQ would bind to port 35672. However, port 35672 is in the ephemeral range, which leaves the possibility of other services opening up the same port. This could cause RabbitMQ to fail to start. This fix changes the RabbitMQ port to 25672, which is outside of the ephemeral port range. No other service listens on the same port and RabbitMQ starts successfully.
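
Whether a port falls inside the ephemeral range can be checked against the kernel setting. The sketch below assumes a Linux host, where the range is exposed in `/proc/sys/net/ipv4/ip_local_port_range`. The sample value "32768 60999" is a common default used here as an assumption, and the function names are hypothetical.

```python
def parse_port_range(text):
    """Parse the two whitespace-separated integers of ip_local_port_range."""
    low, high = (int(value) for value in text.split())
    return low, high


def in_ephemeral_range(port, port_range):
    """True if `port` lies within the inclusive ephemeral range."""
    low, high = port_range
    return low <= port <= high


# Assumed sample value; on a real host, read the file named above instead.
port_range = parse_port_range("32768\t60999")
for port in (35672, 25672):
    print(port, in_ephemeral_range(port, port_range))
```

With this sample range, 35672 falls inside the ephemeral range and 25672 falls outside it, which matches the reasoning behind the port change.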

rhosp-release

BZ#1317669

This update includes a release file to identify the overcloud version deployed with OSP director. This gives a clear indication of the installed version and aids debugging. The overcloud-full image includes a new package (rhosp-release). Upgrades from older versions also install this RPM. All versions starting with OSP 10 will now have a release file. This only applies to Red Hat OpenStack Platform director-based installations. However, users can manually install the rhosp-release package and achieve the same result.

sahara-image-elements

BZ#1371649

This enhancement updates the main script in `sahara-image-elements` to only allow the creation of images for supported plugins. For example, you can use the following command to create a CDH 5.7 image using Red Hat Enterprise Linux 7:
----
>> ./diskimage-create/diskimage-create.sh -p cloudera -v 5.7

Usage: diskimage-create.sh
       [-p cloudera|mapr|ambari]
       [-v 5.5|5.7|2.3|2.4]
       [-r 5.1.0]
----
