OVS is dead

Solution In Progress - Updated

Issue

The following error appears in the neutron openvswitch agent log, and it is no longer possible to create any VM on the compute node:

2020-01-07 00:08:22.140 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-3ea9dc56-412e-4f8f-8f36-89c75107b263 - - - - -] ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0xbd3bc40b,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out: Timeout: 300 seconds
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-3ea9dc56-412e-4f8f-8f36-89c75107b263 - - - - -] Failed to communicate with the switch: RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0xbd3bc40b,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 54, in check_canary_table
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     flows = self.dump_flows(constants.CANARY_TABLE)
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 147, in dump_flows
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     reply_multi=True)
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 95, in _send_msg
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     raise RuntimeError(m)
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0xbd3bc40b,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2020-01-07 00:08:22.142 21430 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-3ea9dc56-412e-4f8f-8f36-89c75107b263 - - - - -] OVS is dead. OVSNeutronAgent will keep running and checking OVS status periodically.
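
The timeout above is raised by the agent's periodic OVS health check: it sends an OFPFlowStatsRequest for a dedicated canary table (table_id=23) on br-int and, if no reply arrives within the OpenFlow request timeout (300 seconds here), reports OVS as dead and keeps retrying. The following Python 3 sketch is a simplified illustration of that check, not the actual neutron code; it substitutes the ovs-ofctl CLI for the agent's native OpenFlow connection, and the constants are taken from the log above.

# Simplified illustration of the check seen in the traceback above; not the
# actual neutron source. The real agent uses its native OpenFlow connection;
# this sketch uses the ovs-ofctl CLI to show the same probe-and-timeout shape.
import subprocess

CANARY_TABLE = 23          # table_id=23 in the OFPFlowStatsRequest above
OF_REQUEST_TIMEOUT = 300   # matches "timed out: Timeout: 300 seconds"

OVS_NORMAL, OVS_DEAD = "normal", "dead"

def check_canary_table(bridge="br-int"):
    """Dump flows from the canary table; a timeout means OVS is unresponsive."""
    try:
        subprocess.run(
            ["ovs-ofctl", "-O", "OpenFlow13", "dump-flows", bridge,
             "table=%d" % CANARY_TABLE],
            check=True, capture_output=True, timeout=OF_REQUEST_TIMEOUT)
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        # The agent logs "OVS is dead" and keeps checking OVS status periodically.
        return OVS_DEAD
    return OVS_NORMAL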
  • All ovs-ofctl and ovs-appctl commands time out, while ovs-vsctl commands still run without any problem (see the diagnostic sketch after the ovs-vswitchd log below).

  • VMs are failing to spawn on this compute node.

  • The following messages are seen in /var/log/openvswitch/ovs-vswitchd.log:

2020-01-06T16:38:22.751Z|00001|ovs_rcu(urcu3)|WARN|blocked 1000 ms waiting for main to quiesce
2020-01-06T16:38:23.751Z|00002|ovs_rcu(urcu3)|WARN|blocked 2000 ms waiting for main to quiesce
2020-01-06T16:38:25.751Z|00003|ovs_rcu(urcu3)|WARN|blocked 4000 ms waiting for main to quiesce
2020-01-06T16:38:29.751Z|00004|ovs_rcu(urcu3)|WARN|blocked 8000 ms waiting for main to quiesce
2020-01-06T16:38:37.751Z|00005|ovs_rcu(urcu3)|WARN|blocked 16000 ms waiting for main to quiesce
2020-01-06T16:38:53.751Z|00006|ovs_rcu(urcu3)|WARN|blocked 32000 ms waiting for main to quiesce
2020-01-06T16:39:17.135Z|00012|ofproto_dpif_xlate(pmd365)|WARN|Dropped 3 log messages in last 63 seconds (most recently, 55 seconds ago) due to excessive rate
2020-01-06T16:39:17.135Z|00013|ofproto_dpif_xlate(pmd365)|WARN|dropping VLAN 1462 tagged packet received on port tpi-8c052be3-29 configured as VLAN 8 access port while processing arp,in_port=1,dl_vlan=1462,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=00:00:07:da:05:18,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.212.197.35,arp_tpa=10.212.197.39,arp_op=1,arp_sha=00:00:07:da:05:18,arp_tha=00:00:00:00:00:00 on bridge br-int
2020-01-06T16:39:25.751Z|00007|ovs_rcu(urcu3)|WARN|blocked 64000 ms waiting for main to quiesce
2020-01-06T16:40:29.751Z|00008|ovs_rcu(urcu3)|WARN|blocked 128000 ms waiting for main to quiesce
2020-01-06T16:42:37.751Z|00009|ovs_rcu(urcu3)|WARN|blocked 256000 ms waiting for main to quiesce
2020-01-06T16:46:53.751Z|00010|ovs_rcu(urcu3)|WARN|blocked 512000 ms waiting for main to quiesce
2020-01-06T16:55:25.751Z|00011|ovs_rcu(urcu3)|WARN|blocked 1024000 ms waiting for main to quiesce
2020-01-06T17:12:29.751Z|00012|ovs_rcu(urcu3)|WARN|blocked 2048000 ms waiting for main to quiesce
2020-01-06T17:46:37.751Z|00013|ovs_rcu(urcu3)|WARN|blocked 4096000 ms waiting for main to quiesce
2020-01-06T18:54:53.751Z|00014|ovs_rcu(urcu3)|WARN|blocked 8192000 ms waiting for main to quiesce
2020-01-06T21:11:25.751Z|00015|ovs_rcu(urcu3)|WARN|blocked 16384000 ms waiting for main to quiesce
2020-01-07T01:44:29.751Z|00016|ovs_rcu(urcu3)|WARN|blocked 32768000 ms waiting for main to quiesce
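
The ovs_rcu messages show the RCU thread blocked, with an ever-doubling backoff, waiting for ovs-vswitchd's main thread to quiesce, i.e. the main thread is stuck. Because ovs-ofctl and ovs-appctl both need a response from ovs-vswitchd while ovs-vsctl only talks to ovsdb-server, this also explains why only ovs-vsctl keeps working. The Python 3 sketch below is a hedged way to confirm that split; the commands are standard OVS CLI tools and the 10-second timeout is arbitrary.

# Hedged diagnostic sketch: compare the ovsdb path (ovs-vsctl) with the two
# paths that depend on ovs-vswitchd's blocked main thread (ovs-ofctl, ovs-appctl).
import subprocess

CHECKS = [
    ("ovs-vsctl  (ovsdb-server)     ", ["ovs-vsctl", "--timeout=5", "show"]),
    ("ovs-ofctl  (OpenFlow/vswitchd)", ["ovs-ofctl", "dump-flows", "br-int"]),
    ("ovs-appctl (unixctl/vswitchd) ", ["ovs-appctl", "ofproto/list"]),
]

for label, cmd in CHECKS:
    try:
        subprocess.run(cmd, check=True, capture_output=True, timeout=10)
        print(label, "OK")
    except subprocess.TimeoutExpired:
        print(label, "TIMED OUT (matches the symptom above)")
    except (subprocess.CalledProcessError, FileNotFoundError) as exc:
        print(label, "ERROR:", exc)

# On a node in this state, only the ovs-vsctl check is expected to succeed.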

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
  • Red Hat OpenStack Platform 10.0 (RHOSP)
