Both ports of dpdkbond occasionally flap, causing complete loss of traffic

Solution In Progress - Updated -

Issue

  • Recently, a VNF in the production network went into split brain, causing complete loss of traffic. The split brain was triggered by complete network isolation (followed by recovery from that isolation) of some VMs within that VNF.

  • The isolated VMs are on compute host overcloud-computedpdk-0.

  • When we checked ovs-vswitchd.log on that compute host, we saw that both ports (dpdk0 and dpdk1) of dpdkbond0 flap occasionally. This has happened several times:

2019-08-18T08:52:07.940Z|05892|bond|INFO|interface dpdk0: link state down
2019-08-18T08:52:07.940Z|05893|bond|INFO|interface dpdk0: disabled
2019-08-18T08:52:09.216Z|05894|bond|INFO|interface dpdk1: link state down
2019-08-18T08:52:09.216Z|05895|bond|INFO|interface dpdk1: disabled
2019-08-18T08:52:09.216Z|05896|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-18T08:52:47.216Z|05897|bond|INFO|interface dpdk0: link state up
2019-08-18T08:52:47.216Z|05898|bond|INFO|interface dpdk0: enabled
2019-08-18T08:52:47.216Z|05899|bond|INFO|bond dpdkbond0: active interface is now dpdk0
2019-08-18T08:52:48.226Z|05900|bond|INFO|interface dpdk1: link state up
2019-08-18T08:52:48.226Z|05901|bond|INFO|interface dpdk1: enabled
...
2019-08-18T08:52:58.052Z|05903|bond|INFO|interface dpdk0: link state down
2019-08-18T08:52:58.052Z|05904|bond|INFO|interface dpdk0: disabled
2019-08-18T08:52:58.052Z|05905|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-18T08:52:58.062Z|05906|bond|INFO|interface dpdk1: link state down
2019-08-18T08:52:58.062Z|05907|bond|INFO|interface dpdk1: disabled
2019-08-18T08:52:58.062Z|05908|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-18T08:53:00.055Z|05909|bond|INFO|interface dpdk1: link state up
2019-08-18T08:53:00.055Z|05910|bond|INFO|interface dpdk1: enabled
2019-08-18T08:53:00.055Z|05911|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-18T08:53:01.039Z|05912|bond|INFO|interface dpdk0: link state up
2019-08-18T08:53:01.040Z|05913|bond|INFO|interface dpdk0: enabled
...
2019-08-19T18:59:48.183Z|10376|bond|INFO|interface dpdk0: link state down
2019-08-19T18:59:48.183Z|10377|bond|INFO|interface dpdk0: disabled
2019-08-19T18:59:49.184Z|10378|bond|INFO|interface dpdk1: link state down
2019-08-19T18:59:49.184Z|10379|bond|INFO|interface dpdk1: disabled
2019-08-19T18:59:49.184Z|10380|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-19T18:59:54.184Z|10381|bond|INFO|interface dpdk0: link state up
2019-08-19T18:59:54.184Z|10382|bond|INFO|interface dpdk0: enabled
2019-08-19T18:59:54.184Z|10383|bond|INFO|bond dpdkbond0: active interface is now dpdk0
2019-08-19T18:59:54.698Z|10384|bond|INFO|interface dpdk1: link state up
2019-08-19T18:59:54.698Z|10385|bond|INFO|interface dpdk1: enabled
...
2019-08-19T19:00:01.063Z|10386|bond|INFO|interface dpdk0: link state down
2019-08-19T19:00:01.063Z|10387|bond|INFO|interface dpdk0: disabled
2019-08-19T19:00:01.063Z|10388|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-19T19:00:01.709Z|10389|bond|INFO|interface dpdk1: link state down
2019-08-19T19:00:01.709Z|10390|bond|INFO|interface dpdk1: disabled
2019-08-19T19:00:01.709Z|10391|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-19T19:00:02.131Z|02629|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
2019-08-19T19:00:02.131Z|02630|timeval|WARN|Unreasonably long 4423ms poll interval (1ms user, 0ms system)
2019-08-19T19:00:02.131Z|02631|timeval|WARN|context switches: 6 voluntary, 0 involuntary
2019-08-19T19:00:02.131Z|02632|coverage|INFO|Skipping details of duplicate event coverage for hash=c4d7daac
2019-08-19T19:00:02.142Z|02633|netdev_dpdk|INFO|vHost Device '/var/lib/vhost_sockets/vhuf2cfeb2a-b6' has been removed
2019-08-19T19:00:02.142Z|02634|dpdk|INFO|VHOST_CONFIG: vring base idx:0 file:54447
2019-08-19T19:00:02.142Z|02635|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
2019-08-19T19:00:02.142Z|02636|dpdk|INFO|VHOST_CONFIG: vring base idx:1 file:12237
...
2019-08-19T19:00:17.940Z|02963|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
2019-08-19T19:00:17.940Z|02964|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
2019-08-19T19:00:17.940Z|02965|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.940Z|02966|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02967|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02968|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02969|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02970|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
2019-08-19T19:00:17.940Z|02971|dpdk|INFO|VHOST_CONFIG: vring kick idx:1 file:166
2019-08-19T19:00:17.940Z|02972|dpdk|INFO|VHOST_CONFIG: virtio is now ready for processing.
2019-08-19T19:00:17.940Z|10392|bond|INFO|interface dpdk1: link state up
2019-08-19T19:00:17.940Z|10393|bond|INFO|interface dpdk1: enabled
2019-08-19T19:00:17.940Z|10394|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-19T19:00:17.941Z|02973|netdev_dpdk|INFO|vHost Device '/var/lib/vhost_sockets/vhu76b0ce10-ab' has been added on numa node 0
2019-08-19T19:00:17.941Z|02974|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02975|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02976|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02977|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02978|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
2019-08-19T19:00:17.941Z|02979|dpdk|INFO|VHOST_CONFIG: vring call idx:1 file:167
2019-08-19T19:00:17.941Z|02980|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
...
2019-08-19T19:00:17.948Z|03047|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 0
2019-08-19T19:00:17.948Z|03048|netdev_dpdk|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/var/lib/vhost_sockets/vhufffdd6d6-80'changed to 'enabled'
2019-08-19T19:00:17.948Z|03049|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.948Z|03050|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:17.948Z|03051|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.949Z|03052|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:17.949Z|03053|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.949Z|03054|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:17.949Z|03055|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.949Z|03056|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:18.185Z|10395|bond|INFO|interface dpdk0: link state up
2019-08-19T19:00:18.185Z|10396|bond|INFO|interface dpdk0: enabled
...
2019-08-19T19:00:25.130Z|10398|bond|INFO|interface dpdk1: link state down
2019-08-19T19:00:25.130Z|10399|bond|INFO|interface dpdk1: disabled
2019-08-19T19:00:25.130Z|10400|bond|INFO|interface dpdk0: link state down
2019-08-19T19:00:25.130Z|10401|bond|INFO|interface dpdk0: disabled
2019-08-19T19:00:25.130Z|10402|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-19T19:00:31.147Z|10403|bond|INFO|interface dpdk1: link state up
2019-08-19T19:00:31.147Z|10404|bond|INFO|interface dpdk1: enabled
2019-08-19T19:00:31.147Z|10405|bond|INFO|interface dpdk0: link state up
2019-08-19T19:00:31.147Z|10406|bond|INFO|interface dpdk0: enabled
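
To see how long the bond was left with no usable member during each flap, the bond messages above can be paired up: each "all interfaces disabled" line starts an outage, and the next "interface ...: enabled" line ends it. The following is a minimal sketch, not a supported tool. The function name bond_outages, the regular expression, and the SAMPLE list are illustrative, and the sketch assumes that the log contains only one bond, so any re-enabled interface is taken to belong to dpdkbond0.

```python
import re
from datetime import datetime

# Matches ovs-vswitchd.log lines emitted by the bond module, for example:
# 2019-08-18T08:52:09.216Z|05896|bond|INFO|bond dpdkbond0: all interfaces disabled
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z"
    r"\|\d+\|bond\|INFO\|(?P<msg>.*)$"
)


def bond_outages(lines, bond="dpdkbond0"):
    """Return (start, end, seconds) for each window with no enabled bond member.

    Assumption: only one bond is logged, so the first 'interface X: enabled'
    message after 'all interfaces disabled' is treated as the end of the outage.
    """
    outages = []
    down_since = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a bond INFO line (vhost, timeval, "..." and so on)
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
        msg = match.group("msg")
        if msg == f"bond {bond}: all interfaces disabled":
            down_since = ts
        elif (down_since is not None
              and msg.startswith("interface ")
              and msg.endswith(": enabled")):
            outages.append((down_since, ts, (ts - down_since).total_seconds()))
            down_since = None
    return outages


# A few lines copied from the excerpt above, used only as sample input.
SAMPLE = [
    "2019-08-18T08:52:09.216Z|05895|bond|INFO|interface dpdk1: disabled",
    "2019-08-18T08:52:09.216Z|05896|bond|INFO|bond dpdkbond0: all interfaces disabled",
    "2019-08-18T08:52:47.216Z|05897|bond|INFO|interface dpdk0: link state up",
    "2019-08-18T08:52:47.216Z|05898|bond|INFO|interface dpdk0: enabled",
    "...",
    "2019-08-18T08:52:58.062Z|05908|bond|INFO|bond dpdkbond0: all interfaces disabled",
    "2019-08-18T08:53:00.055Z|05909|bond|INFO|interface dpdk1: link state up",
    "2019-08-18T08:53:00.055Z|05910|bond|INFO|interface dpdk1: enabled",
]

for start, end, seconds in bond_outages(SAMPLE):
    print(f"{start.isoformat()} -> {end.isoformat()}: {seconds:.3f}s without an enabled member")
```

Applied to the first event in the excerpt, this pairs 08:52:09.216 with 08:52:47.216, which is roughly 38 seconds during which dpdkbond0 had no enabled interface. To process the full log, pass an open file object instead of SAMPLE.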

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
