ovs-vswitchd fails to start with DPDK Open vSwitch 2.9 in Red Hat OpenStack Platform 10
Environment
- Red Hat OpenStack Platform 10
- Open vSwitch 2.9
Issue
- ovs-vswitchd.service starts but does not complete successfully
- journalctl shows: Failed to start Open vSwitch Forwarding Unit
- OSP Director deployment does not progress beyond creation/update of controller and compute nodes
Resolution
In order to deploy with Open vSwitch 2.9, the ovs-vswitchd service file workarounds must be removed.
- These are usually located in first-boot.yaml and were required in older versions of Open vSwitch.
- Note that older versions of the OSP 10 minor upgrade guide also instructed users to add the same changes to their post-install.yaml.
Advice: Run egrep in your template directory and search for RuntimeDirectoryMode|ovs-vswitchd.service:
cd /home/stack/templates
egrep -R 'RuntimeDirectoryMode|ovs-vswitchd.service' .
- Note that the modifications to the tuned service files are still mandatory: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/upgrading_red_hat_openstack_platform/index#ap-rhosp10-ovs-2-9-update-post-install
  Without them, you may run into the issue described in: https://access.redhat.com/solutions/3552051
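The search above can also be wrapped in a small helper that prints only the names of the files that still contain the workarounds. This is a minimal sketch, not part of the original procedure: the function name find_ovs_workarounds is hypothetical, and the default directory is simply the one used in the command above.

```shell
#!/bin/bash
# List template files that still reference the old ovs-vswitchd workarounds.
# find_ovs_workarounds is a hypothetical helper name used for illustration.
# Usage: find_ovs_workarounds [template_dir]
find_ovs_workarounds() {
    local dir="${1:-/home/stack/templates}"
    # -R: recurse into subdirectories, -E: extended regex,
    # -l: print only the names of files with at least one match
    grep -REl 'RuntimeDirectoryMode|ovs-vswitchd\.service' "$dir"
}
```

Calling find_ovs_workarounds /home/stack/templates prints one file name per line. Each listed file should then be reviewed by hand, since a match only shows that the string occurs, not that the whole workaround is present.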
In the example below, the set_ovs_config OS::Heat::SoftwareConfig resource contains the code that should be removed from first-boot.yaml. Modifications to post-install.yaml would be similar:
set_ovs_config:
  type: OS::Heat::SoftwareConfig
  properties:
    config:
      str_replace:
        template: |
          #!/bin/bash
          FORMAT=$COMPUTE_HOSTNAME_FORMAT
          if [[ -z $FORMAT ]] ; then
            FORMAT="compute" ;
          else
            # Assumption: only %index% and %stackname% are the variables in Host name format
            FORMAT=$(echo $FORMAT | sed 's/\%index\%//g' | sed 's/\%stackname\%//g') ;
          fi
          if [[ $(hostname) == *$FORMAT* ]] ; then
            if [ -f /usr/lib/systemd/system/openvswitch-nonetwork.service ]; then
              ovs_service_path="/usr/lib/systemd/system/openvswitch-nonetwork.service"
            elif [ -f /usr/lib/systemd/system/ovs-vswitchd.service ]; then
              ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
            fi
            grep -q "RuntimeDirectoryMode=.*" $ovs_service_path
            if [ "$?" -eq 0 ]; then
              sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' $ovs_service_path
            else
              echo "RuntimeDirectoryMode=0775" >> $ovs_service_path
            fi
            grep -Fxq "Group=qemu" $ovs_service_path
            if [ ! "$?" -eq 0 ]; then
              echo "Group=qemu" >> $ovs_service_path
            fi
            grep -Fxq "UMask=0002" $ovs_service_path
            if [ ! "$?" -eq 0 ]; then
              echo "UMask=0002" >> $ovs_service_path
            fi
            ovs_ctl_path='/usr/share/openvswitch/scripts/ovs-ctl'
            grep -q "umask 0002 \&\& start_daemon \"\$OVS_VSWITCHD_PRIORITY\"" $ovs_ctl_path
            if [ ! "$?" -eq 0 ]; then
              sed -i 's/start_daemon \"\$OVS_VSWITCHD_PRIORITY.*/umask 0002 \&\& start_daemon \"$OVS_VSWITCHD_PRIORITY\" \"$OVS_VSWITCHD_WRAPPER\" \"$@\"/' $ovs_ctl_path
            fi
          fi
        params:
          $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}
Note: There are other changes required for OVS 2.9. See section 2.1.2. NFV Pre-Configuration of the OSP 10 Upgrade guide for new template parameters.
Also, if you are performing an overcloud update, virtual machines that use DPDK must be migrated. See section 2.3.2. NFV Post-Configuration of the upgrade guide.
Root Cause
Open vSwitch 2.9 runs with reduced privileges and a different vhost-user mode. Both changes require updates to the Heat templates, and the older service file workarounds conflict with them.
For further details, also see https://bugzilla.redhat.com/show_bug.cgi?id=1602102
Diagnostic Steps
- Open vSwitch Forwarding Unit (ovs-vswitchd) fails to start:
[root@overcloud-compute-0 ~]# journalctl -p err -b | grep 'Open vSwitch'
Jul 19 16:24:32 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:24:32 overcloud-compute-0 systemd[1]: Dependency failed for Open vSwitch.
Jul 19 16:24:41 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:24:50 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:24:58 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:25:07 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:25:21 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:25:32 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
Jul 19 16:25:44 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
- The ovs-vswitchd.service is in state activating:
[root@overcloud-compute-0 ~]# systemctl status ovs-vswitchd -l
● ovs-vswitchd.service - Open vSwitch Forwarding Unit
Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
Active: activating (start) since Thu 2018-07-19 18:05:23 EDT; 5s ago
Process: 63540 ExecStartPre=/usr/bin/chmod 0775 /dev/hugepages (code=exited, status=0/SUCCESS)
Process: 63537 ExecStartPre=/bin/sh -c /usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages (code=exited, status=0/SUCCESS)
Control: 63543 (ovs-ctl)
Tasks: 4
Memory: 14.3M
CGroup: /system.slice/ovs-vswitchd.service
├─63543 /bin/sh /usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --no-monitor --system-id=random --ovs-user=openvswitch:hugetlbfs start
├─63575 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach
└─63576 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach
Jul 19 18:05:23 overcloud-compute-0 systemd[1]: Starting Open vSwitch Forwarding Unit...
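As an additional check that is not part of the original diagnostic steps, the unit file on an affected node can be inspected for the exact lines that the old first-boot workaround appends. The sketch below assumes that the stock Open vSwitch 2.9 unit file does not carry these lines itself, so a match is only a hint that the workaround was applied. The function name check_ovs_unit is hypothetical.

```shell
#!/bin/bash
# Report which of the old workaround lines are present in a unit file.
# check_ovs_unit is a hypothetical helper name used for illustration.
# Usage: check_ovs_unit [unit_file]
check_ovs_unit() {
    local unit="${1:-/usr/lib/systemd/system/ovs-vswitchd.service}"
    local found=0
    local setting
    # These are the exact lines the old workaround writes to the unit file.
    for setting in 'RuntimeDirectoryMode=0775' 'Group=qemu' 'UMask=0002'; do
        # -F: fixed string, -x: match the whole line, -q: quiet
        if grep -Fxq "$setting" "$unit"; then
            echo "workaround present: ${setting}"
            found=1
        fi
    done
    # Return 0 when no workaround line was found, 1 otherwise.
    return "$found"
}
```

Running check_ovs_unit as root on a compute node prints one line per workaround setting found and returns a non-zero status if any is present.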