Chapter 9. High Availability

9.1. Internal Traffic

On each host, NICs 3 and 4 are dedicated to carrying all internal traffic. This includes:

  • Internal API traffic (VLAN 1020)
  • Storage (VLAN 1030)
  • Storage management (VLAN 1040)

To achieve high availability, NICs 3 and 4 are bonded. This is configured via the following entry in the network-environments.yaml file:

network-environments.yaml file

# Customize bonding options, e.g. "mode=4 lacp_rate=1 updelay=1000 miimon=100"
BondInterfaceOvsOptions: "mode=802.3ad"
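
This option is consumed by the NIC configuration template for the role, where the bond, its member interfaces, and the VLANs are defined under br-ex. The following is only an illustrative sketch of the relevant network_config entry; the nic numbering and the Internal API parameter names are assumptions and may differ from the templates used in this deployment:

    - type: ovs_bridge
      name: br-ex
      members:
        - type: linux_bond
          name: bond1
          bonding_options: {get_param: BondInterfaceOvsOptions}
          members:
            - type: interface
              name: nic3
            - type: interface
              name: nic4
        - type: vlan
          vlan_id: {get_param: InternalApiNetworkVlanID}
          addresses:
            - ip_netmask: {get_param: InternalApiIpSubnet}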

After OpenStack is deployed, bonds are created on all of the nodes. Here is an example of the resulting configuration on overcloud-compute-0:

[root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens3f0
# This file is autogenerated by os-net-config
DEVICE=ens3f0
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
[root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens3f1
# This file is autogenerated by os-net-config
DEVICE=ens3f1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
MASTER=bond1
SLAVE=yes
BOOTPROTO=none
[root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond1
# This file is autogenerated by os-net-config
DEVICE=bond1
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSPort
OVS_BRIDGE=br-ex
BONDING_OPTS="mode=802.3ad"
[root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-vlan1020
# This file is autogenerated by os-net-config
DEVICE=vlan1020
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSIntPort
OVS_BRIDGE=br-ex
OVS_OPTIONS="tag=1020"
BOOTPROTO=static
IPADDR=172.17.0.20
NETMASK=255.255.255.0
[root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-vlan1030
# This file is autogenerated by os-net-config
DEVICE=vlan1030
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSIntPort
OVS_BRIDGE=br-ex
OVS_OPTIONS="tag=1030"
BOOTPROTO=static
IPADDR=172.18.0.20
NETMASK=255.255.255.0
[root@overcloud-compute-0 ~]# cat /etc/sysconfig/network-scripts/ifcfg-vlan1040
# This file is autogenerated by os-net-config
DEVICE=vlan1040
ONBOOT=yes
HOTPLUG=no
NM_CONTROLLED=no
PEERDNS=no
DEVICETYPE=ovs
TYPE=OVSIntPort
OVS_BRIDGE=br-ex
OVS_OPTIONS="tag=1040"
BOOTPROTO=static
IPADDR=192.0.2.26
NETMASK=255.255.255.0

[root@overcloud-compute-0 ~]# cat /proc/net/bonding/bond1
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer2 (0)
MII Status: up
MII Polling Interval (ms): 0
Up Delay (ms): 0
Down Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: 2c:60:0c:84:32:79
Active Aggregator Info:
    Aggregator ID: 1
    Number of ports: 2
    Actor Key: 13
    Partner Key: 13
    Partner Mac Address: 44:38:39:ff:01:02

Slave Interface: ens3f0
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 2c:60:0c:84:32:79
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 2c:60:0c:84:32:79
    port key: 13
    port priority: 255
    port number: 1
    port state: 61
details partner lacp pdu:
    system priority: 65535
    system mac address: 44:38:39:ff:01:02
    oper key: 13
    port priority: 255
    port number: 1
    port state: 63

Slave Interface: ens3f1
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 2c:60:0c:84:32:7a
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: none
Actor Churned Count: 0
Partner Churned Count: 0
details actor lacp pdu:
    system priority: 65535
    system mac address: 2c:60:0c:84:32:79
    port key: 13
    port priority: 255
    port number: 2
    port state: 61
details partner lacp pdu:
    system priority: 65535
    system mac address: 44:38:39:ff:01:02
    oper key: 13
    port priority: 255
    port number: 1
    port state: 63

VLANs 1020, 1030, and 1040, as well as bond1, are attached to the br-ex bridge as follows:

[root@overcloud-compute-0 ~]# ovs-vsctl list-ports br-ex
bond1
phy-br-ex
vlan100
vlan1020
vlan1030
vlan1040
vlan1050

ens3f0 and ens3f1 are connected to switch3 and switch4 ports (one to each). This protects against the failure of either of the uplink leaf switches.
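
A quick way to confirm that both links joined the LACP aggregate on every node is to check /proc/net/bonding from the undercloud. The loop below is a minimal sketch; the node hostnames and the heat-admin user are assumptions based on a default deployment:

for node in overcloud-controller-0 overcloud-compute-0 overcloud-compute-1; do
    echo "== ${node} =="
    # "Number of ports: 2" indicates both NICs joined the active aggregator
    ssh heat-admin@${node} "grep -E 'Number of ports|MII Status' /proc/net/bonding/bond1"
done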

9.2. HA with SR-IOV

9.2.1. Bonding and SR-IOV

The compute servers have two physical NICs, 5 and 6, that are dedicated to the dataplane traffic of the mobile network. These two NICs are connected to different leaf switches so that the failure of a single switch will not result in total isolation of the server.

When using SR-IOV, it is not possible to perform NIC bonding at the host level. Instead, the vNICs that correspond to the VFs can be bonded at the VNF (VM) level. Again, it is important to pick two vNICs that are mapped to VFs that are in turn backed by different physical NICs.
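
One way to ensure this is to create each SR-IOV port on a network whose physical network maps to a different PNIC and attach both ports to the VM. The commands below are only a sketch; the network, flavor, and image names are assumptions (the management vNIC is omitted):

openstack port create --network sriov-net1 --vnic-type direct vm1-sriov-port1
openstack port create --network sriov-net2 --vnic-type direct vm1-sriov-port2
openstack server create --flavor m1.large --image rhel7 \
    --port vm1-sriov-port1 --port vm1-sriov-port2 vm1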

For example, on VM1:

[root@test-sriov ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1446 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:55:bf:02 brd ff:ff:ff:ff:ff:ff
    inet 192.20.1.12/16 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe55:bf02/64 scope link
       valid_lft forever preferred_lft forever
3: ens5: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether fa:16:3e:fd:ac:20 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f816:3eff:fefd:ac20/64 scope link
       valid_lft forever preferred_lft forever
4: ens6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP qlen 1000
    link/ether fa:16:3e:89:a8:bd brd ff:ff:ff:ff:ff:ff
    inet6 fe80::f816:3eff:fe89:a8bd/64 scope link
       valid_lft forever preferred_lft forever
5: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether fa:16:3e:fd:ac:20 brd ff:ff:ff:ff:ff:ff
    inet 192.30.0.20/24 brd 192.30.0.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 2620:52:0:136c:f816:3eff:fefd:ac20/64 scope global mngtmpaddr dynamic
       valid_lft 2591940sec preferred_lft 604740sec
    inet6 fe80::f816:3eff:fefd:ac20/64 scope link
       valid_lft forever preferred_lft forever
Note

The bond bond0 assumes the MAC address of the active interface of the bond, in this case ens5 (fa:16:3e:fd:ac:20). This is necessary because if the active link of the bond fails, traffic must be sent on the backup link, which becomes active upon failure.

The ens5 and ens6 ports on the VM are enslaved to bond0. The configurations for ens5, ens6, and bond0 are as follows:

[root@test-sriov ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens5
NAME=bond0-slave0
DEVICE=ens5
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

[root@test-sriov ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens6
NAME=bond0-slave1
DEVICE=ens6
TYPE=Ethernet
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes

[root@test-sriov ~]# cat /etc/sysconfig/network-scripts/ifcfg-bond0
NAME=bond0
DEVICE=bond0
BONDING_MASTER=yes
TYPE=Bond
IPADDR=192.30.0.20
NETMASK=255.255.255.0
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=active-backup miimon=100 fail_over_mac=active"
Note

In the bond configuration, it is critical to use "fail_over_mac=active". Without it, if the active link fails, the bond will not switch over to the MAC address of the backup interface. Another important point is that the host tags SR-IOV packets towards the switch with a VLAN tag (4075 in this case). Because of this, unless the layer 2 switch supports LACP on VLAN-tagged interfaces, it is not possible to form an LACP adjacency between the VM ports and the switch. Since the validation lab layer 2 switch does not support LACP on VLAN-tagged interfaces, active-backup mode with fail_over_mac=active had to be used.

The actual state of the bond can be examined as follows:

[root@test-sriov ~]# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: fault-tolerance (active-backup) (fail_over_mac active)
Primary Slave: None
Currently Active Slave: ens5
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0

Slave Interface: ens5
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: fa:16:3e:fd:ac:20
Slave queue ID: 0

Slave Interface: ens6
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: fa:16:3e:89:a8:bd
Slave queue ID: 0
[root@test-sriov ~]#

On the host, the VM's ens5 and ens6 are backed by VFs on the two physical ports ens6f0 and ens6f1:

[root@overcloud-compute-0 ~]# ip l show ens6f0
6: ens6f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether a0:36:9f:47:e4:70 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC 9e:f1:2f:a1:8f:ac, spoof checking on, link-state auto, trust off
    vf 1 MAC 96:4d:7e:1f:a5:38, spoof checking on, link-state auto, trust off
    vf 2 MAC b6:1e:0f:00:8d:87, spoof checking on, link-state auto, trust off
    vf 3 MAC 2e:51:7a:71:9c:c9, spoof checking on, link-state auto, trust off
    vf 4 MAC fa:16:3e:28:ec:b1, vlan 4075, spoof checking on, link-state auto, trust off

[root@overcloud-compute-0 ~]# ip l show ens6f1
7: ens6f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT qlen 1000
    link/ether a0:36:9f:47:e4:72 brd ff:ff:ff:ff:ff:ff
    vf 0 MAC a6:5e:b0:68:1b:f9, spoof checking on, link-state auto, trust off
    vf 1 MAC 22:54:b2:fb:82:6a, spoof checking on, link-state auto, trust off
    vf 2 MAC 1e:09:5c:a7:83:b1, spoof checking on, link-state auto, trust off
    vf 3 MAC 2e:ad:e3:cb:14:f8, spoof checking on, link-state auto, trust off
    vf 4 MAC fa:16:3e:9b:48:17, vlan 4075, spoof checking on, link-state auto, trust off
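
To confirm that a VM's two vNICs are backed by VFs on different physical functions, the MAC addresses of its Neutron ports can be matched against the VF entries above. This is only a sketch; the first command runs wherever the overcloud credentials are sourced, the others on the compute node:

openstack port list --server test-sriov -c Name -c "MAC Address"
# Each port MAC should appear as a VF under a different PF
ip link show ens6f0 | grep -i 'fa:16:3e'
ip link show ens6f1 | grep -i 'fa:16:3e'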

Looking at compute node1 in Figure 7, ens6f0 (NIC 5) connects to switch3 in the lab and ens6f1 (NIC 6) connects to switch4 (two different switches). Because the bond is created in the VM, dataplane traffic is protected even if one of these two switches fails.


Figure 7: Traffic flow from VNF1 to VNF2 during steady state for SR-IOV setup

With two VMs (running CentOS 7) configured with bonding, when the switch port connected to the active link of the bond was brought down, a small amount of packet loss was observed before traffic passed over the backup link. The traffic flow after the failure is shown in Figure 8.
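
The failover can be observed from inside the VM while the switch port is brought down; a minimal sketch (the peer address is an assumption):

watch -n1 'grep "Currently Active Slave" /proc/net/bonding/bond0'
# In a second terminal, measure loss across the failover; the peer address is illustrative
ping -i 0.2 192.30.0.21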


Figure 8: Traffic flow from VNF1 to VNF2 after failure of primary link switch port on switch3 with SR-IOV setup

9.3. HA with OVS-DPDK

9.3.1. Bonding and OVS-DPDK

The compute servers have two physical NICs, 5 and 6, that are dedicated to the dataplane traffic of the mobile network. These two NICs are connected to different leaf switches (switch3 and switch4) so that the failure of a single switch will not result in total isolation of the server.
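
In the compute role NIC template, these two NICs are aggregated into an OVS-DPDK bond on a user-space bridge. The snippet below is only a sketch; the bridge name is an assumption, and the nic numbering and bond mode follow the output shown next:

    - type: ovs_user_bridge
      name: br-link
      members:
        - type: ovs_dpdk_bond
          name: dpdkbond0
          ovs_options: "bond_mode=active-backup"
          members:
            - type: ovs_dpdk_port
              name: dpdk0
              members:
                - type: interface
                  name: nic5
            - type: ovs_dpdk_port
              name: dpdk1
              members:
                - type: interface
                  name: nic6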

On the compute node, the OVS-DPDK bond (dpdkbond0) looks like this:

[root@overcloud-compute-0 ~]# ovs-appctl bond/show dpdkbond0
---- dpdkbond0 ----
bond_mode: active-backup
bond may use recirculation: no, Recirc-ID : -1
bond-hash-basis: 0
updelay: 0 ms
downdelay: 0 ms
lacp_status: negotiated
active slave mac: a0:36:9f:47:e4:70(dpdk0)

slave dpdk0: enabled
        active slave
        may_enable: true

slave dpdk1: enabled
        may_enable: true

During steady state, traffic from VM1 to VM2 goes over the DPDK bond on NIC 5 (the active link in the bond) to switch3, flows over the MLAG connection to switch4, and then enters NIC 6 of node2 and reaches eth0 of VM2. This is shown in Figure 9.


Figure 9: Traffic flow from VNF1 to VNF2 during steady state for OVS-DPDK setup

With two VMs (running CentOS 7) configured with bonding, when the switch port connected to the active link of the bond was brought down, a small amount of packet loss was observed before traffic passed over the backup link. The traffic flow after the failure is shown in Figure 10.
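
The failover can also be observed on the compute node itself; a minimal sketch:

watch -n1 'ovs-appctl bond/show dpdkbond0 | grep -E "active slave|enabled"'

When the switch3 port is brought down, the active slave should move from dpdk0 to dpdk1.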


Figure 10: Traffic flow from VNF1 to VNF2 after failure of primary link switch port on switch3 with OVS-DPDK setup