Intermittent packet loss in the namespaces of OSP HA when pinging Neutron router IPs

Solution In Progress

Environment

Red Hat Enterprise Linux OpenStack Platform 6.0
Red Hat Enterprise Linux OpenStack Platform 7.0
Red Hat OpenStack Platform 8.0

Issue

In a high availability (HA) environment, intermittent packet loss occurs when pinging a namespace's external interface (router IP). This does not appear to be a physical network problem. The issue does not occur with instances that are directly connected to the external network, and it affects multiple tenants.

Ping to namespace external interface (router IP)
--- 172.16.0.10 ping statistics ---
200 packets transmitted, 187 received, 6% packet loss, time 40409ms
rtt min/avg/max/mdev = 0.234/1.803/13.303/2.489 ms, ipg/ewma 203.060/1.364 ms

Ping to namespace external interface (Floating IP)
--- 172.16.0.50 ping statistics ---
200 packets transmitted, 165 received, +20 errors, 17% packet loss, time 39934ms
rtt min/avg/max/mdev = 2.602/3.945/7.688/0.523 ms, pipe 16, ipg/ewma 200.678/3.842 ms
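
To compare repeated ping runs before and after a change, the summary lines can be parsed programmatically. The following is a minimal sketch; the function name parse_ping_summary is hypothetical, and the two input lines are copied from the tests above.

```python
import re


def parse_ping_summary(line):
    """Return (transmitted, received, loss_percent) from a ping summary line."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received,.*?(\d+(?:\.\d+)?)% packet loss",
        line,
    )
    if match is None:
        raise ValueError("not a ping summary line: %r" % line)
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


# Summary lines copied from the router IP and floating IP tests above.
router_ip = parse_ping_summary(
    "200 packets transmitted, 187 received, 6% packet loss, time 40409ms")
floating_ip = parse_ping_summary(
    "200 packets transmitted, 165 received, +20 errors, 17% packet loss, time 39934ms")
print(router_ip, floating_ip)
```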

Resolution

Neutron on one of the controllers (controller2) seems to be misconfigured:

The IP address of the controller node is 10.0.0.6:

ip a ls dev eth2
3: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1550 qdisc mq state UP qlen 1000
    link/ether 11:11:11:11:11 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.6/24 brd 10.0.0.255 scope global eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::66 scope link 
       valid_lft forever preferred_lft forever

However, local_ip in ovs_neutron_plugin.ini is set to 10.0.0.7:

grep local_ip * -R | grep -v '#'
plugins/openvswitch/ovs_neutron_plugin.ini:local_ip=10.0.0.7

Because local_ip is wrong, VXLAN tunnels cannot form between controller2 and the other controllers. keepalived on controller2 needs to receive VRRP advertisements from its neighbors over the VXLAN tunnels. Without them, it assumes that it has lost contact with the MASTER and transitions from the BACKUP to the MASTER role. As a consequence, tenant external and internal gateway IPs are configured on both controller1 and controller2 at the same time, which explains the packet loss when pinging external gateway IPs from the outside.
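
The mismatch described above can be checked mechanically by comparing the configured local_ip with the addresses that actually exist on the tunnel interface. The following is a minimal sketch that works on captured command output; the function names are hypothetical, and the sample data is copied from the eth2 output and the grep result shown above.

```python
import re

# Output of "ip a ls dev eth2" on controller2, copied from above.
IP_ADDR_OUTPUT = (
    "3: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1550 qdisc mq state UP qlen 1000\n"
    "    link/ether 11:11:11:11:11 brd ff:ff:ff:ff:ff:ff\n"
    "    inet 10.0.0.6/24 brd 10.0.0.255 scope global eth2\n"
    "       valid_lft forever preferred_lft forever\n"
    "    inet6 fe80::66 scope link\n"
    "       valid_lft forever preferred_lft forever\n"
)


def interface_ipv4_addresses(ip_addr_output):
    """Collect IPv4 addresses (without prefix length) from "ip a" output."""
    return re.findall(
        r"^\s*inet (\d+\.\d+\.\d+\.\d+)/\d+", ip_addr_output, flags=re.MULTILINE)


def configured_local_ip(ini_text):
    """Return the first uncommented local_ip value, or None if there is none."""
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue
        match = re.match(r"local_ip\s*=\s*(\S+)", stripped)
        if match:
            return match.group(1)
    return None


def local_ip_is_valid(ip_addr_output, ini_text):
    """True only if local_ip is set and assigned to the inspected interface."""
    local_ip = configured_local_ip(ini_text)
    return local_ip is not None and local_ip in interface_ipv4_addresses(ip_addr_output)


print(local_ip_is_valid(IP_ADDR_OUTPUT, "local_ip=10.0.0.7"))  # misconfigured value
print(local_ip_is_valid(IP_ADDR_OUTPUT, "local_ip=10.0.0.6"))  # corrected value
```

Running the same comparison on every controller is one way to rule out this class of misconfiguration after a deployment.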

First, change local_ip to 10.0.0.6 and confirm the new value:

grep local_ip * -R | grep -v '#'
plugins/openvswitch/ovs_neutron_plugin.ini:local_ip=10.0.0.6

Afterwards, put the node into standby mode and then bring it back online with Pacemaker:

pcs cluster standby pcmk-controller2
pcs cluster unstandby pcmk-controller2

Root Cause

keepalived provides high availability for Neutron's L3 agent. Its VRRP keepalive messages travel over a dedicated HA network.
For more details, please check the upstream documentation: https://wiki.openstack.org/wiki/Neutron/L3_High_Availability_VRRP

Because local_ip was wrong, VXLAN tunnels did not form between controller2 and the other controllers. keepalived on controller2 needs to receive VRRP advertisements from its neighbors over the VXLAN tunnels on the HA network. Without them, it assumes that it has lost contact with the MASTER and transitions from the BACKUP to the MASTER role. As a consequence, the external-facing interfaces (qg-...) and internal-facing interfaces (qr-...) of all tenant routers carried the same IP addresses on controller1 and controller2, whereas controller3 behaved correctly and did not configure these addresses. Only one node should act as the active L3 router for any given tenant router at any given time. This split-brain scenario explains the observed intermittent packet loss.
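
The BACKUP to MASTER transition described above can be illustrated with a toy model. This is only a sketch of the timing rule, not keepalived's implementation: it uses the VRRPv2 formula from RFC 3768 (master down interval = 3 * advertisement interval + skew time, with skew time = (256 - priority) / 256), and the advertisement interval of 2 seconds and priority of 50 are assumed example values that are not taken from this environment.

```python
def master_down_interval(advert_int, priority):
    """Seconds a BACKUP waits without advertisements before taking over (RFC 3768)."""
    skew_time = (256 - priority) / 256
    return 3 * advert_int + skew_time


def next_state(state, seconds_since_last_advert, advert_int=2, priority=50):
    """Toy VRRP state step; advert_int and priority defaults are assumptions."""
    if state == "BACKUP" and seconds_since_last_advert > master_down_interval(
            advert_int, priority):
        return "MASTER"
    return state


# A BACKUP that still hears the MASTER stays BACKUP.
print(next_state("BACKUP", seconds_since_last_advert=1))
# A BACKUP that is cut off from the HA network (as controller2 was) takes over.
print(next_state("BACKUP", seconds_since_last_advert=10))
```

Because the original MASTER is still alive and simply unreachable from the isolated node, both nodes end up in the MASTER role, which is the split-brain condition.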

Diagnostic Steps

First, make sure that the underlying physical network is healthy: verify the interface counters and run mtr to make sure that the issue is not elsewhere in the network.

Verify that the interface is part of br-ex

# ovs-vsctl list-ports br-ex
eth0

Verify that the interface is up

# ethtool eth0
Settings for eth0:
        Supported ports: [ TP ]
        Supported link modes:   10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Supported pause frame use: Symmetric
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full 
                                100baseT/Half 100baseT/Full 
                                1000baseT/Full 
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Full
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        MDI-X: on (auto)
        Supports Wake-on: pumbg
        Wake-on: g
        Current message level: 0x00000007 (7)
                               drv probe link
        Link detected: yes

Verify interface error counters for the interface, the bridge and in the tenant's namespace

# ip -s link ls dev eth0
4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT qlen 1000
    link/ether 00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    38289033908 373689339 0       0       0       27343265 
    TX: bytes  packets  errors  dropped carrier collsns 
    103630677437 138884305 0       0       0       0       

# ip -s link ls dev br-ex
11: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/ether 00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    20777934960 330157120 0       164306  0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    83430      1963     0       0       0       0       

# ip netns exec qrouter-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa ip -s link ls
257: qg-00000000-01: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/ether 00:00:00:00:02 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    124183203  1758290  0       34      0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    110        1        0       0       0       0       
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    RX: bytes  packets  errors  dropped overrun mcast   
    0          0        0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0       
240: ha-00000000-10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/ether 00:00:00:00:00:03 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    815063     14957    0       47      0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    846        9        0       0       0       0       
248: qr-00000000-20: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT 
    link/ether 00:00:00:00:00:04 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    2029197    34365    0       101     0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    298        3        0       0       0       0       
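
When many interfaces have to be checked, the counters can be extracted from the "ip -s link" output with a small parser. The following is a minimal sketch; the function names are hypothetical, and the sample is copied from the eth0 and br-ex output above. Note that these counters are cumulative, so a non-zero dropped value does not by itself prove an ongoing problem; comparing two captures taken before and after a ping test is more meaningful.

```python
import re

# Output of "ip -s link ls" for eth0 and br-ex, copied from above.
IP_S_LINK_OUTPUT = (
    "4: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP mode DEFAULT qlen 1000\n"
    "    link/ether 00:00:00:00:01 brd ff:ff:ff:ff:ff:ff\n"
    "    RX: bytes  packets  errors  dropped overrun mcast\n"
    "    38289033908 373689339 0       0       0       27343265\n"
    "    TX: bytes  packets  errors  dropped carrier collsns\n"
    "    103630677437 138884305 0       0       0       0\n"
    "11: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT\n"
    "    link/ether 00:00:00:00:01 brd ff:ff:ff:ff:ff:ff\n"
    "    RX: bytes  packets  errors  dropped overrun mcast\n"
    "    20777934960 330157120 0       164306  0       0\n"
    "    TX: bytes  packets  errors  dropped carrier collsns\n"
    "    83430      1963     0       0       0       0\n"
)


def parse_ip_s_link(output):
    """Return {interface: {"rx_bytes": ..., "rx_dropped": ..., "tx_bytes": ...}}."""
    stats = {}
    current = None
    lines = output.splitlines()
    for index, line in enumerate(lines):
        header = re.match(r"^\d+: ([^:@\s]+)", line)
        if header:
            current = header.group(1)
            stats[current] = {}
            continue
        fields = line.split()
        is_counter_header = bool(fields) and fields[0] in ("RX:", "TX:")
        if current and is_counter_header and index + 1 < len(lines):
            direction = fields[0].rstrip(":").lower()
            # The line after "RX:"/"TX:" holds the values for the named columns.
            values = [int(value) for value in lines[index + 1].split()]
            for name, value in zip(fields[1:], values):
                stats[current][direction + "_" + name] = value
    return stats


def interfaces_with_drops(stats):
    """Return the sorted names of interfaces with non-zero RX or TX drops."""
    return sorted(
        name for name, counters in stats.items()
        if counters.get("rx_dropped", 0) > 0 or counters.get("tx_dropped", 0) > 0)


stats = parse_ip_s_link(IP_S_LINK_OUTPUT)
print(interfaces_with_drops(stats))
```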

Verify whether the issue is somewhere else in the network or at the last hop.
In this case, mtr shows that the problem is at the last hop. Hops that report 100% packet loss do not reply to the probes and can therefore be ignored during the analysis. What matters in the following output is that there is 0% packet loss at every responding hop in the network but 10% packet loss at the last hop.

$ mtr -n -r 172.16.0.10
Start: Fri Feb 12 16:00:34 2016
HOST: localhost Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 192.168.0.10               0.0%    10    0.2   0.2   0.1   0.3   0.0
  2.|-- 192.168.1.1                0.0%    10    0.6   0.6   0.5   0.8   0.0
  3.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
  4.|-- 10.10.1.101             0.0%    10    2.3   2.6   2.2   4.0   0.5
  5.|-- 10.10.2.150               0.0%    10    3.7  13.1   2.0  50.1  15.0
  6.|-- 10.10.3.51             0.0%    10    4.6   4.6   2.2  13.6   3.3
  7.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
  8.|-- 10.10.81.3              0.0%    10    2.9   3.0   2.2   3.6   0.0
  9.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
 10.|-- 10.10.254.254              0.0%    10   15.5   4.8   3.2  15.5   3.8
 11.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
 12.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0
 13.|-- 172.16.0.10            10.0%    10    2.7   3.5   2.7   4.7   0.4
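
The reading of the report described above (ignore silent hops, then check whether loss appears only at the final hop) can be expressed as a short script. This is a minimal sketch with hypothetical function names; the sample is an abbreviated copy of the report above that keeps only hops 1 to 3, 12 and 13.

```python
# Abbreviated copy of the "mtr -n -r" report above (hops 4 to 11 omitted).
MTR_REPORT = (
    "Start: Fri Feb 12 16:00:34 2016\n"
    "HOST: localhost Loss%   Snt   Last   Avg  Best  Wrst StDev\n"
    "  1.|-- 192.168.0.10               0.0%    10    0.2   0.2   0.1   0.3   0.0\n"
    "  2.|-- 192.168.1.1                0.0%    10    0.6   0.6   0.5   0.8   0.0\n"
    "  3.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0\n"
    " 12.|-- ???                       100.0    10    0.0   0.0   0.0   0.0   0.0\n"
    " 13.|-- 172.16.0.10            10.0%    10    2.7   3.5   2.7   4.7   0.4\n"
)


def parse_mtr_report(report):
    """Return a list of (host, loss_percent) tuples, one per hop line."""
    hops = []
    for line in report.splitlines():
        parts = line.split()
        # Hop lines start with a token such as "13.|--"; skip everything else.
        if len(parts) < 3 or not parts[0].endswith(".|--"):
            continue
        hops.append((parts[1], float(parts[2].rstrip("%"))))
    return hops


def loss_is_at_last_hop(hops):
    """True if only the final responding hop shows loss; "???" hops are ignored."""
    responding = [(host, loss) for host, loss in hops if host != "???"]
    if not responding:
        return False
    intermediate, last = responding[:-1], responding[-1]
    return last[1] > 0 and all(loss == 0 for _, loss in intermediate)


hops = parse_mtr_report(MTR_REPORT)
print(loss_is_at_last_hop(hops))
```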

Inspecting the IP address configuration in the tenant's router namespace shows duplicate L3 addresses on controller1 and controller2, while controller3 behaves correctly.
In addition, controller2 cannot ping controller1 or controller3 within the namespace on the HA network (169.254.192.0/18).

controller1

ip netns exec qrouter-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa ifconfig -a
ha-00000000-10: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.192.1  netmask 255.255.192.0  broadcast 169.254.255.255
        inet6 fe80::3  prefixlen 64  scopeid 0x20<link>
        ether 00:00:00:00:00:03  txqueuelen 0  (Ethernet)
        RX packets 5096  bytes 347716 (339.5 KiB)
        RX errors 0  dropped 89  overruns 0  frame 0
        TX packets 618875  bytes 29706450 (28.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
(...)
qg-00000000-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.0.10  netmask 255.255.248.0  broadcast 0.0.0.0
        inet6 fe80::2  prefixlen 64  scopeid 0x20<link>
        ether 00:00:00:00:02  txqueuelen 0  (Ethernet)
        RX packets 302702570  bytes 19996121350 (18.6 GiB)
        RX errors 0  dropped 172556  overruns 0  frame 0
        TX packets 13697858  bytes 13265858397 (12.3 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qr-00000000-20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::3  prefixlen 64  scopeid 0x20<link>
        ether 00:00:00:00:00:04  txqueuelen 0  (Ethernet)
        RX packets 13235872  bytes 13244125286 (12.3 GiB)
        RX errors 0  dropped 24  overruns 0  frame 0
        TX packets 10814422  bytes 1667874147 (1.5 GiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

controller2

ip netns exec qrouter-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa ifconfig -a
ha-11111111-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.192.3  netmask 255.255.192.0  broadcast 169.254.255.255
        inet6 fe80::10  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:99:5b:f0  txqueuelen 0  (Ethernet)
        RX packets 8232  bytes 559049 (545.9 KiB)
        RX errors 0  dropped 231  overruns 0  frame 0
        TX packets 621261  bytes 29820954 (28.4 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
(...)
qg-00000000-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.16.0.10  netmask 255.255.248.0  broadcast 0.0.0.0
        inet6 fe80::2  prefixlen 64  scopeid 0x20<link>
        ether 00:00:00:00:02  txqueuelen 0  (Ethernet)
        RX packets 292779125  bytes 18409783683 (17.1 GiB)
        RX errors 0  dropped 1677040  overruns 0  frame 0
        TX packets 1751236  bytes 109440906 (104.3 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qr-00000000-20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.0.1  netmask 255.255.255.0  broadcast 0.0.0.0
        inet6 fe80::3  prefixlen 64  scopeid 0x20<link>
        ether 00:00:00:00:00:04  txqueuelen 0  (Ethernet)
        RX packets 7810  bytes 527886 (515.5 KiB)
        RX errors 0  dropped 95  overruns 0  frame 0
        TX packets 790052  bytes 46850188 (44.6 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

controller3

ip netns exec qrouter-aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa ifconfig -a 
ha-2222222-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 169.254.192.2  netmask 255.255.192.0  broadcast 169.254.255.255
        inet6 fe80::50  prefixlen 64  scopeid 0x20<link>
        ether fa:16:3e:85:c5:5d  txqueuelen 0  (Ethernet)
        RX packets 632355  bytes 30631801 (29.2 MiB)
        RX errors 0  dropped 43  overruns 0  frame 0
        TX packets 21  bytes 1494 (1.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
(...)
qg-00000000-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:00:00:00:02  txqueuelen 0  (Ethernet)
        RX packets 288826788  bytes 18262780164 (17.0 GiB)
        RX errors 0  dropped 176150995  overruns 0  frame 0
        TX packets 2  bytes 220 (220.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qr-00000000-20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        ether 00:00:00:00:00:04  txqueuelen 0  (Ethernet)
        RX packets 1661723  bytes 94516487 (90.1 MiB)
        RX errors 0  dropped 64  overruns 0  frame 0
        TX packets 2  bytes 188 (188.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
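
The comparison of the three outputs above can be automated: collect the IPv4 address of every qg- and qr- interface per controller and report any address that is configured on more than one node. The following is a minimal sketch with hypothetical function names; the samples are shortened copies of the ifconfig output above that keep only the interface header and inet lines.

```python
import re
from collections import defaultdict

# Shortened copies of the "ifconfig -a" output from the three controllers.
CONTROLLER1 = (
    "qg-00000000-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
    "        inet 172.16.0.10  netmask 255.255.248.0  broadcast 0.0.0.0\n"
    "qr-00000000-20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
    "        inet 192.168.0.1  netmask 255.255.255.0  broadcast 0.0.0.0\n"
)
CONTROLLER2 = CONTROLLER1  # controller2 shows the same qg-/qr- addresses
CONTROLLER3 = (
    "qg-00000000-01: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
    "        ether 00:00:00:00:02  txqueuelen 0  (Ethernet)\n"
    "qr-00000000-20: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500\n"
    "        ether 00:00:00:00:00:04  txqueuelen 0  (Ethernet)\n"
)


def parse_ifconfig(output):
    """Map interface name -> IPv4 address (None if no inet line is present)."""
    addresses = {}
    current = None
    for line in output.splitlines():
        header = re.match(r"^(\S+): flags=", line)
        if header:
            current = header.group(1)
            addresses[current] = None
            continue
        inet = re.match(r"^\s+inet (\d+\.\d+\.\d+\.\d+)\s", line)
        if inet and current is not None:
            addresses[current] = inet.group(1)
    return addresses


def find_duplicate_router_ips(per_controller):
    """Return {ip: [controllers]} for qg-/qr- addresses held by several nodes."""
    holders = defaultdict(set)
    for controller, output in per_controller.items():
        for interface, address in parse_ifconfig(output).items():
            if address and interface.startswith(("qg-", "qr-")):
                holders[address].add(controller)
    return {ip: sorted(names) for ip, names in holders.items() if len(names) > 1}


duplicates = find_duplicate_router_ips({
    "controller1": CONTROLLER1,
    "controller2": CONTROLLER2,
    "controller3": CONTROLLER3,
})
print(duplicates)
```

An empty result is the expected healthy state, because only the node in the MASTER role should carry the router addresses.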

For further analysis, the keepalived configuration files can be located from the process list and then inspected:

ps auxwww 
root      5555  0.0  0.0 111640  1320 ?        Ss   Feb03   0:22 keepalived -P -f /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999999999/keepalived.conf -p /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999999999.pid -r /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999999999.pid-vrrp
root      5556  0.0  0.0 111640  1320 ?        Ss   Feb03   0:22 keepalived -P -f /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999998888/keepalived.conf -p /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999998888.pid -r /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999998888.pid-vrrp
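
The configuration file of each keepalived instance is the argument of the -f option in the process list. The following is a minimal sketch that extracts these paths from captured ps output; the function name is hypothetical, and treating the parent directory name under ha_confs as the router ID is an assumption based on the path layout shown above.

```python
import posixpath
import re

# The two keepalived lines from the "ps auxwww" output above.
PS_OUTPUT = (
    "root      5555  0.0  0.0 111640  1320 ?        Ss   Feb03   0:22 keepalived -P"
    " -f /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999999999/keepalived.conf"
    " -p /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999999999.pid"
    " -r /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999999999.pid-vrrp\n"
    "root      5556  0.0  0.0 111640  1320 ?        Ss   Feb03   0:22 keepalived -P"
    " -f /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999998888/keepalived.conf"
    " -p /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999998888.pid"
    " -r /var/lib/neutron/ha_confs/99999999-9999-9999-9999-999999998888.pid-vrrp\n"
)


def keepalived_configs(ps_output):
    """Return a list of (config_path, parent_directory_name) tuples."""
    found = []
    for line in ps_output.splitlines():
        match = re.search(r"\bkeepalived\b.*?\s-f\s+(\S+)", line)
        if match:
            path = match.group(1)
            # Assumption: the directory name identifies the router.
            found.append((path, posixpath.basename(posixpath.dirname(path))))
    return found


configs = keepalived_configs(PS_OUTPUT)
for path, directory in configs:
    print(directory, path)
```

Each reported file can then be read on the controller to compare the VRRP settings between nodes.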
