OVS is dead
Issue
- We got the following error in the neutron openvswitch agent and are no longer able to create any VMs on the compute node:
2020-01-07 00:08:22.140 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-3ea9dc56-412e-4f8f-8f36-89c75107b263 - - - - -] ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0xbd3bc40b,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out: Timeout: 300 seconds
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-3ea9dc56-412e-4f8f-8f36-89c75107b263 - - - - -] Failed to communicate with the switch: RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0xbd3bc40b,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 54, in check_canary_table
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int flows = self.dump_flows(constants.CANARY_TABLE)
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 147, in dump_flows
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int reply_multi=True)
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 95, in _send_msg
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int raise RuntimeError(m)
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: ofctl request version=0x4,msg_type=0x12,msg_len=0x38,xid=0xbd3bc40b,OFPFlowStatsRequest(cookie=0,cookie_mask=0,flags=0,match=OFPMatch(oxm_fields={}),out_group=4294967295,out_port=4294967295,table_id=23,type=1) timed out
2020-01-07 00:08:22.141 21430 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int
2020-01-07 00:08:22.142 21430 WARNING neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-3ea9dc56-412e-4f8f-8f36-89c75107b263 - - - - -] OVS is dead. OVSNeutronAgent will keep running and checking OVS status periodically.
- Every ovs-ofctl and ovs-appctl command we try times out, while ovs-vsctl commands run without any problem.
- VMs are failing to spawn on this compute node.
- The following messages are seen in /var/log/openvswitch/ovs-vswitchd.log:
2020-01-06T16:38:22.751Z|00001|ovs_rcu(urcu3)|WARN|blocked 1000 ms waiting for main to quiesce
2020-01-06T16:38:23.751Z|00002|ovs_rcu(urcu3)|WARN|blocked 2000 ms waiting for main to quiesce
2020-01-06T16:38:25.751Z|00003|ovs_rcu(urcu3)|WARN|blocked 4000 ms waiting for main to quiesce
2020-01-06T16:38:29.751Z|00004|ovs_rcu(urcu3)|WARN|blocked 8000 ms waiting for main to quiesce
2020-01-06T16:38:37.751Z|00005|ovs_rcu(urcu3)|WARN|blocked 16000 ms waiting for main to quiesce
2020-01-06T16:38:53.751Z|00006|ovs_rcu(urcu3)|WARN|blocked 32000 ms waiting for main to quiesce
2020-01-06T16:39:17.135Z|00012|ofproto_dpif_xlate(pmd365)|WARN|Dropped 3 log messages in last 63 seconds (most recently, 55 seconds ago) due to excessive rate
2020-01-06T16:39:17.135Z|00013|ofproto_dpif_xlate(pmd365)|WARN|dropping VLAN 1462 tagged packet received on port tpi-8c052be3-29 configured as VLAN 8 access port while processing arp,in_port=1,dl_vlan=1462,dl_vlan_pcp=0,vlan_tci1=0x0000,dl_src=00:00:07:da:05:18,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=10.212.197.35,arp_tpa=10.212.197.39,arp_op=1,arp_sha=00:00:07:da:05:18,arp_tha=00:00:00:00:00:00 on bridge br-int
2020-01-06T16:39:25.751Z|00007|ovs_rcu(urcu3)|WARN|blocked 64000 ms waiting for main to quiesce
2020-01-06T16:40:29.751Z|00008|ovs_rcu(urcu3)|WARN|blocked 128000 ms waiting for main to quiesce
2020-01-06T16:42:37.751Z|00009|ovs_rcu(urcu3)|WARN|blocked 256000 ms waiting for main to quiesce
2020-01-06T16:46:53.751Z|00010|ovs_rcu(urcu3)|WARN|blocked 512000 ms waiting for main to quiesce
2020-01-06T16:55:25.751Z|00011|ovs_rcu(urcu3)|WARN|blocked 1024000 ms waiting for main to quiesce
2020-01-06T17:12:29.751Z|00012|ovs_rcu(urcu3)|WARN|blocked 2048000 ms waiting for main to quiesce
2020-01-06T17:46:37.751Z|00013|ovs_rcu(urcu3)|WARN|blocked 4096000 ms waiting for main to quiesce
2020-01-06T18:54:53.751Z|00014|ovs_rcu(urcu3)|WARN|blocked 8192000 ms waiting for main to quiesce
2020-01-06T21:11:25.751Z|00015|ovs_rcu(urcu3)|WARN|blocked 16384000 ms waiting for main to quiesce
2020-01-07T01:44:29.751Z|00016|ovs_rcu(urcu3)|WARN|blocked 32768000 ms waiting for main to quiesce
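The ovs_rcu warnings above are logged with a doubling interval (1000 ms, 2000 ms, 4000 ms, ...), so each one implies when the main thread stopped quiescing: the warning timestamp minus the reported blocked duration. The following is a minimal Python sketch of that arithmetic, not an official tool. The function names and the regular expression are illustrative assumptions based only on the log format shown in this article, and the sketch assumes the input covers a single stall episode.

```python
import re
from datetime import datetime, timedelta

# Matches ovs_rcu warnings in the format shown above, for example:
# 2020-01-06T16:38:22.751Z|00001|ovs_rcu(urcu3)|WARN|blocked 1000 ms waiting for main to quiesce
BLOCKED_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z\|\d+\|"
    r"ovs_rcu\((?P<thread>[^)]+)\)\|WARN\|"
    r"blocked (?P<ms>\d+) ms waiting for (?P<target>\S+) to quiesce"
)


def parse_blocked(lines):
    """Yield (timestamp, reporting_thread, blocked_target, blocked_ms) per warning."""
    for line in lines:
        match = BLOCKED_RE.match(line.strip())
        if match:
            # Timestamps in the log are UTC ("Z"); they are kept as naive datetimes.
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%dT%H:%M:%S.%f")
            yield ts, match.group("thread"), match.group("target"), int(match.group("ms"))


def estimate_stall_start(lines):
    """Return (estimated_start, longest_blocked_ms), or None if no warning matches.

    Assumes all matching warnings belong to one stall episode and uses the
    warning with the longest blocked duration.
    """
    entries = list(parse_blocked(lines))
    if not entries:
        return None
    ts, _thread, _target, blocked_ms = max(entries, key=lambda entry: entry[3])
    return ts - timedelta(milliseconds=blocked_ms), blocked_ms
```

Passing the lines of /var/log/openvswitch/ovs-vswitchd.log to estimate_stall_start() (for example, an open file object) gives the approximate time the main thread became stuck, which can then be compared with the first timeout reported in the neutron openvswitch agent log.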
Environment
- Red Hat OpenStack Platform 13.0 (RHOSP)
- Red Hat OpenStack Platform 10.0 (RHOSP)