Both ports of dpdkbond occasionally flap, causing complete loss of traffic

Solution In Progress

Issue

  • Recently, a VNF in the production network went into split brain, causing complete loss of traffic. The split brain was triggered by complete network isolation (followed by recovery from that isolation) of some VMs within the VNF.

  • The isolated VMs are on compute host overcloud-computedpdk-0.

  • When we checked ovs-vswitchd.log on that compute host, we saw that both ports (dpdk0 and dpdk1) of dpdkbond0 flap occasionally. This has happened several times:

2019-08-18T08:52:07.940Z|05892|bond|INFO|interface dpdk0: link state down
2019-08-18T08:52:07.940Z|05893|bond|INFO|interface dpdk0: disabled
2019-08-18T08:52:09.216Z|05894|bond|INFO|interface dpdk1: link state down
2019-08-18T08:52:09.216Z|05895|bond|INFO|interface dpdk1: disabled
2019-08-18T08:52:09.216Z|05896|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-18T08:52:47.216Z|05897|bond|INFO|interface dpdk0: link state up
2019-08-18T08:52:47.216Z|05898|bond|INFO|interface dpdk0: enabled
2019-08-18T08:52:47.216Z|05899|bond|INFO|bond dpdkbond0: active interface is now dpdk0
2019-08-18T08:52:48.226Z|05900|bond|INFO|interface dpdk1: link state up
2019-08-18T08:52:48.226Z|05901|bond|INFO|interface dpdk1: enabled
...
2019-08-18T08:52:58.052Z|05903|bond|INFO|interface dpdk0: link state down
2019-08-18T08:52:58.052Z|05904|bond|INFO|interface dpdk0: disabled
2019-08-18T08:52:58.052Z|05905|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-18T08:52:58.062Z|05906|bond|INFO|interface dpdk1: link state down
2019-08-18T08:52:58.062Z|05907|bond|INFO|interface dpdk1: disabled
2019-08-18T08:52:58.062Z|05908|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-18T08:53:00.055Z|05909|bond|INFO|interface dpdk1: link state up
2019-08-18T08:53:00.055Z|05910|bond|INFO|interface dpdk1: enabled
2019-08-18T08:53:00.055Z|05911|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-18T08:53:01.039Z|05912|bond|INFO|interface dpdk0: link state up
2019-08-18T08:53:01.040Z|05913|bond|INFO|interface dpdk0: enabled
...
2019-08-19T18:59:48.183Z|10376|bond|INFO|interface dpdk0: link state down
2019-08-19T18:59:48.183Z|10377|bond|INFO|interface dpdk0: disabled
2019-08-19T18:59:49.184Z|10378|bond|INFO|interface dpdk1: link state down
2019-08-19T18:59:49.184Z|10379|bond|INFO|interface dpdk1: disabled
2019-08-19T18:59:49.184Z|10380|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-19T18:59:54.184Z|10381|bond|INFO|interface dpdk0: link state up
2019-08-19T18:59:54.184Z|10382|bond|INFO|interface dpdk0: enabled
2019-08-19T18:59:54.184Z|10383|bond|INFO|bond dpdkbond0: active interface is now dpdk0
2019-08-19T18:59:54.698Z|10384|bond|INFO|interface dpdk1: link state up
2019-08-19T18:59:54.698Z|10385|bond|INFO|interface dpdk1: enabled
...
2019-08-19T19:00:01.063Z|10386|bond|INFO|interface dpdk0: link state down
2019-08-19T19:00:01.063Z|10387|bond|INFO|interface dpdk0: disabled
2019-08-19T19:00:01.063Z|10388|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-19T19:00:01.709Z|10389|bond|INFO|interface dpdk1: link state down
2019-08-19T19:00:01.709Z|10390|bond|INFO|interface dpdk1: disabled
2019-08-19T19:00:01.709Z|10391|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-19T19:00:02.131Z|02629|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
2019-08-19T19:00:02.131Z|02630|timeval|WARN|Unreasonably long 4423ms poll interval (1ms user, 0ms system)
2019-08-19T19:00:02.131Z|02631|timeval|WARN|context switches: 6 voluntary, 0 involuntary
2019-08-19T19:00:02.131Z|02632|coverage|INFO|Skipping details of duplicate event coverage for hash=c4d7daac
2019-08-19T19:00:02.142Z|02633|netdev_dpdk|INFO|vHost Device '/var/lib/vhost_sockets/vhuf2cfeb2a-b6' has been removed
2019-08-19T19:00:02.142Z|02634|dpdk|INFO|VHOST_CONFIG: vring base idx:0 file:54447
2019-08-19T19:00:02.142Z|02635|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_GET_VRING_BASE
2019-08-19T19:00:02.142Z|02636|dpdk|INFO|VHOST_CONFIG: vring base idx:1 file:12237
...
2019-08-19T19:00:17.940Z|02963|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
2019-08-19T19:00:17.940Z|02964|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_NUM
2019-08-19T19:00:17.940Z|02965|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.940Z|02966|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02967|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02968|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02969|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_BASE
2019-08-19T19:00:17.940Z|02970|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
2019-08-19T19:00:17.940Z|02971|dpdk|INFO|VHOST_CONFIG: vring kick idx:1 file:166
2019-08-19T19:00:17.940Z|02972|dpdk|INFO|VHOST_CONFIG: virtio is now ready for processing.
2019-08-19T19:00:17.940Z|10392|bond|INFO|interface dpdk1: link state up
2019-08-19T19:00:17.940Z|10393|bond|INFO|interface dpdk1: enabled
2019-08-19T19:00:17.940Z|10394|bond|INFO|bond dpdkbond0: active interface is now dpdk1
2019-08-19T19:00:17.941Z|02973|netdev_dpdk|INFO|vHost Device '/var/lib/vhost_sockets/vhu76b0ce10-ab' has been added on numa node 0
2019-08-19T19:00:17.941Z|02974|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02975|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02976|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02977|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ADDR
2019-08-19T19:00:17.941Z|02978|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_CALL
2019-08-19T19:00:17.941Z|02979|dpdk|INFO|VHOST_CONFIG: vring call idx:1 file:167
2019-08-19T19:00:17.941Z|02980|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_KICK
...
2019-08-19T19:00:17.948Z|03047|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 0
2019-08-19T19:00:17.948Z|03048|netdev_dpdk|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/var/lib/vhost_sockets/vhufffdd6d6-80'changed to 'enabled'
2019-08-19T19:00:17.948Z|03049|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.948Z|03050|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:17.948Z|03051|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.949Z|03052|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:17.949Z|03053|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.949Z|03054|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:17.949Z|03055|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
2019-08-19T19:00:17.949Z|03056|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
2019-08-19T19:00:18.185Z|10395|bond|INFO|interface dpdk0: link state up
2019-08-19T19:00:18.185Z|10396|bond|INFO|interface dpdk0: enabled
...
2019-08-19T19:00:25.130Z|10398|bond|INFO|interface dpdk1: link state down
2019-08-19T19:00:25.130Z|10399|bond|INFO|interface dpdk1: disabled
2019-08-19T19:00:25.130Z|10400|bond|INFO|interface dpdk0: link state down
2019-08-19T19:00:25.130Z|10401|bond|INFO|interface dpdk0: disabled
2019-08-19T19:00:25.130Z|10402|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-19T19:00:31.147Z|10403|bond|INFO|interface dpdk1: link state up
2019-08-19T19:00:31.147Z|10404|bond|INFO|interface dpdk1: enabled
2019-08-19T19:00:31.147Z|10405|bond|INFO|interface dpdk0: link state up
2019-08-19T19:00:31.147Z|10406|bond|INFO|interface dpdk0: enabled
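
To see how long the bond was fully down during each flap, the bond messages above can be summarized with a small script. The following is only a minimal sketch, not a supported tool: the function names (parse_line, bond_outages) are hypothetical, it assumes the pipe-separated ovs-vswitchd.log layout shown in the excerpt, and it assumes a single bond on the host, because the "interface ...: enabled" message does not name the bond it belongs to.

```python
from datetime import datetime
import re

# Message patterns taken from the log excerpt in this article.
ALL_DOWN = re.compile(r"^bond (\S+): all interfaces disabled$")
ENABLED = re.compile(r"^interface (\S+): enabled$")


def parse_line(line):
    """Return (timestamp, message) for a 'bond' module line, else None."""
    parts = line.strip().split("|", 4)
    if len(parts) != 5 or parts[2] != "bond":
        return None
    ts = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S.%fZ")
    return ts, parts[4]


def bond_outages(lines):
    """List (bond, start, end, seconds) windows with every member disabled.

    Assumption: only one bond logs to this file, so the first
    'interface X: enabled' after 'all interfaces disabled' ends the outage.
    """
    outages = []
    down_since = None
    bond = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        ts, msg = parsed
        match = ALL_DOWN.match(msg)
        if match:
            if down_since is None:
                down_since, bond = ts, match.group(1)
            continue
        if down_since is not None and ENABLED.match(msg):
            seconds = (ts - down_since).total_seconds()
            outages.append((bond, down_since, ts, seconds))
            down_since = None
    return outages


# Lines copied from the first flap in the excerpt above.
SAMPLE = """\
2019-08-18T08:52:07.940Z|05892|bond|INFO|interface dpdk0: link state down
2019-08-18T08:52:07.940Z|05893|bond|INFO|interface dpdk0: disabled
2019-08-18T08:52:09.216Z|05894|bond|INFO|interface dpdk1: link state down
2019-08-18T08:52:09.216Z|05895|bond|INFO|interface dpdk1: disabled
2019-08-18T08:52:09.216Z|05896|bond|INFO|bond dpdkbond0: all interfaces disabled
2019-08-18T08:52:47.216Z|05897|bond|INFO|interface dpdk0: link state up
2019-08-18T08:52:47.216Z|05898|bond|INFO|interface dpdk0: enabled
2019-08-18T08:52:47.216Z|05899|bond|INFO|bond dpdkbond0: active interface is now dpdk0
"""

if __name__ == "__main__":
    for name, start, end, seconds in bond_outages(SAMPLE.splitlines()):
        print(f"{name}: all interfaces down for {seconds:.3f}s "
              f"({start.isoformat()} to {end.isoformat()})")
```

For the first flap in the excerpt, this reports a window of 38 seconds between "all interfaces disabled" and the first interface being enabled again. To run it against a full log, pass the lines of the ovs-vswitchd.log file to bond_outages() instead of the sample.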

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
