ovs-vswitchd fails to start with DPDK Open vSwitch 2.9 in Red Hat OpenStack Platform 10

Environment

  • Red Hat OpenStack Platform 10
  • Open vSwitch 2.9

Issue

  • ovs-vswitchd.service starts but does not complete successfully
  • journalctl shows "Failed to start Open vSwitch Forwarding Unit"
  • OSP Director deployment does not progress beyond creation/update of controller and compute nodes

Resolution

In order to deploy with Open vSwitch 2.9, the ovs-vswitchd service file workarounds must be removed.

  • These are usually located in first-boot.yaml and were required in older versions of Open vSwitch.
  • Note that older versions of the OSP 10 minor upgrade guide also instructed users to add the same changes to their post-install.yaml.
    Advice: Run egrep in your template directory to search for RuntimeDirectoryMode|ovs-vswitchd.service:
    cd /home/stack/templates
    egrep -R 'RuntimeDirectoryMode|ovs-vswitchd.service' .
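The template search can also be wrapped in a small helper that prints only the names of the affected files. This is a minimal sketch, not part of the official procedure: the function name find_ovs_workarounds is hypothetical, and the default directory /home/stack/templates is taken from the advice above.

```shell
#!/bin/bash
# Hypothetical helper: list template files that still contain the old
# ovs-vswitchd service file workaround.
find_ovs_workarounds() {
  # Assumption: templates live in /home/stack/templates unless a path is given.
  local template_dir="${1:-/home/stack/templates}"
  # -r: recurse into subdirectories, -l: print file names only,
  # -E: extended regular expression (same patterns as the egrep advice)
  grep -rlE 'RuntimeDirectoryMode|ovs-vswitchd\.service' "$template_dir"
}
```

Calling find_ovs_workarounds with no argument searches the default directory. The function returns a non-zero status when no file matches, which suggests that no workaround remains in the templates.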

In the example below, the set_ovs_config OS::Heat::SoftwareConfig resource contains the code that should be removed from first-boot.yaml. Modifications to post-install.yaml would be similar:

set_ovs_config:
    type: OS::Heat::SoftwareConfig
    properties:
      config:
        str_replace:
          template: |
            #!/bin/bash
            FORMAT=$COMPUTE_HOSTNAME_FORMAT
            if [[ -z $FORMAT ]] ; then
              FORMAT="compute" ;
            else
              # Assumption: only %index% and %stackname% are the variables in Host name format
              FORMAT=$(echo $FORMAT | sed  's/\%index\%//g' | sed 's/\%stackname\%//g') ;
            fi
            if [[ $(hostname) == *$FORMAT* ]] ; then
              if [ -f /usr/lib/systemd/system/openvswitch-nonetwork.service ]; then
                ovs_service_path="/usr/lib/systemd/system/openvswitch-nonetwork.service"
              elif [ -f /usr/lib/systemd/system/ovs-vswitchd.service ]; then
                ovs_service_path="/usr/lib/systemd/system/ovs-vswitchd.service"
              fi
              grep -q "RuntimeDirectoryMode=.*" $ovs_service_path
              if [ "$?" -eq 0 ]; then
                sed -i 's/RuntimeDirectoryMode=.*/RuntimeDirectoryMode=0775/' $ovs_service_path
              else
                echo "RuntimeDirectoryMode=0775" >> $ovs_service_path
              fi
              grep -Fxq "Group=qemu" $ovs_service_path
              if [ ! "$?" -eq 0 ]; then
                echo "Group=qemu" >> $ovs_service_path
              fi
              grep -Fxq "UMask=0002" $ovs_service_path
              if [ ! "$?" -eq 0 ]; then
                echo "UMask=0002" >> $ovs_service_path
              fi
              ovs_ctl_path='/usr/share/openvswitch/scripts/ovs-ctl'
              grep -q "umask 0002 \&\& start_daemon \"\$OVS_VSWITCHD_PRIORITY\"" $ovs_ctl_path
              if [ ! "$?" -eq 0 ]; then
                sed -i 's/start_daemon \"\$OVS_VSWITCHD_PRIORITY.*/umask 0002 \&\& start_daemon \"$OVS_VSWITCHD_PRIORITY\"             \"$OVS_VSWITCHD_WRAPPER\" \"$@\"/' $ovs_ctl_path
              fi
            fi
          params:
            $COMPUTE_HOSTNAME_FORMAT: {get_param: ComputeHostnameFormat}

Note: There are other changes required for OVS 2.9. See section 2.1.2. NFV Pre-Configuration of the OSP 10 Upgrade guide for new template parameters.

Also, if you are performing an overcloud update, virtual machines that use DPDK must be migrated. See section 2.3.2. NFV Post-Configuration of the upgrade guide.

Root Cause

Open vSwitch 2.9 runs with reduced privileges and a different Vhost user mode. These require changes to heat templates.

For further details, also see https://bugzilla.redhat.com/show_bug.cgi?id=1602102

Diagnostic Steps

  • Open vSwitch Forwarding Unit (ovs-vswitchd) fails to start:
    [root@overcloud-compute-0 ~]# journalctl -p err -b  | grep 'Open vSwitch'
    Jul 19 16:24:32 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:24:32 overcloud-compute-0 systemd[1]: Dependency failed for Open vSwitch.
    Jul 19 16:24:41 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:24:50 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:24:58 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:25:07 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:25:21 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:25:32 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
    Jul 19 16:25:44 overcloud-compute-0 systemd[1]: Failed to start Open vSwitch Forwarding Unit.
  • The ovs-vswitchd.service is in the activating state:
    [root@overcloud-compute-0 ~]# systemctl status ovs-vswitchd -l
    ● ovs-vswitchd.service - Open vSwitch Forwarding Unit
       Loaded: loaded (/usr/lib/systemd/system/ovs-vswitchd.service; static; vendor preset: disabled)
       Active: activating (start) since Thu 2018-07-19 18:05:23 EDT; 5s ago
      Process: 63540 ExecStartPre=/usr/bin/chmod 0775 /dev/hugepages (code=exited, status=0/SUCCESS)
      Process: 63537 ExecStartPre=/bin/sh -c /usr/bin/chown :$${OVS_USER_ID##*:} /dev/hugepages (code=exited, status=0/SUCCESS)
      Control: 63543 (ovs-ctl)
        Tasks: 4
       Memory: 14.3M
       CGroup: /system.slice/ovs-vswitchd.service
               ├─63543 /bin/sh /usr/share/openvswitch/scripts/ovs-ctl --no-ovsdb-server --no-monitor --system-id=random --ovs-user=openvswitch:hugetlbfs start
               ├─63575 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach
               └─63576 ovs-vswitchd unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info --mlockall --user openvswitch:hugetlbfs --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log --pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach

    Jul 19 18:05:23 overcloud-compute-0 systemd[1]: Starting Open vSwitch Forwarding Unit...
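
As a further check on an affected node, the unit file can be inspected for the lines that the old first-boot script wrote. This is a minimal sketch under stated assumptions: the function name list_ovs_unit_workarounds is hypothetical, the default path is the unit file shown in the systemctl output above, and the three patterns are the exact values the example script sets (RuntimeDirectoryMode=0775, Group=qemu, UMask=0002).

```shell
#!/bin/bash
# Hypothetical helper: report workaround lines left in a systemd unit file.
list_ovs_unit_workarounds() {
  # Assumption: the unit file path matches the one shown by systemctl status.
  local unit_file="${1:-/usr/lib/systemd/system/ovs-vswitchd.service}"
  # -n: prefix each match with its line number, -x: match whole lines only,
  # -E: extended regular expression
  grep -nxE 'RuntimeDirectoryMode=0775|Group=qemu|UMask=0002' "$unit_file"
}
```

Any output suggests the workaround was applied to that unit file. A non-zero exit status with no output means none of the three lines were found.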
