Internal interfaces stopped working

Solution In Progress - Updated -

Issue

  • The internal interfaces (an LACP bond) carrying OpenStack API and storage network traffic stopped working on one compute node. The cause was traced to OVS: the OVS log files show a large number of errors starting at the same time as the outage.

  • All instances on this compute node stopped working because they use NetApp SolidFire volumes, and the API bond carries both internal OpenStack traffic and storage traffic.

  • Rebooting the compute node fixed the issue.

  • The following errors are seen in /var/log/openvswitch/ovs-vswitchd.log:

Nov 18 19:51:44 overcloud-compute028 ovs-vswitchd: ovs|00011|util(handler34)|EMER|../lib/dp-packet.h:307: assertion pad_size <= dp_packet_size(b) failed in dp_packet_set_l2_pad_size()
Nov 18 19:51:44 overcloud-compute028 ovs-ctl: 2020-11-18T19:51:44Z|00001|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.39219.ctl
Nov 18 19:51:44 overcloud-compute028 ovs-appctl: ovs|00001|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.39219.ctl
Nov 18 19:51:44 overcloud-compute028 ovs-ctl: ovs-appctl: cannot connect to "/var/run/openvswitch/ovs-vswitchd.39219.ctl" (Connection refused)

  • The following errors are seen in /var/log/containers/neutron/openvswitch-agent.log:

2020-11-18 19:52:14.886 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-6766bd40-20b5-42cb-a889-45d982278397 - - - - -] Switch connection timeout
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-6766bd40-20b5-42cb-a889-45d982278397 - - - - -] Failed to communicate with the switch: RuntimeError: Switch connection timeout
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 52, in check_canary_table
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     flows = self.dump_flows(constants.CANARY_TABLE)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 141, in dump_flows
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     (dp, ofp, ofpp) = self._get_dp()
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py", line 67, in _get_dp
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     self._cached_dpid = new_dpid
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     self.force_reraise()
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     six.reraise(self.type_, self.value, self.tb)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py", line 50, in _get_dp
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     dp = self._get_dp_by_dpid(self._cached_dpid)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 69, in _get_dp_by_dpid
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     raise RuntimeError(m)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: Switch connection timeout
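
To check whether another node shows the same symptom, the ovs-vswitchd log can be scanned for entries at the EMER level. The following is a minimal sketch only, assuming the log lines follow the `|sequence|module|LEVEL|message` layout seen in the excerpts above. The function name `find_emergencies` and the regular expression are illustrative and are not part of any OVS or OpenStack tooling.

```python
import re

# Assumed layout, taken from the excerpts above:
#   <prefix>|<sequence>|<module>|<LEVEL>|<message>
# The prefix differs between lines ("ovs" or an ISO timestamp), so it is not matched.
OVS_FIELDS = re.compile(
    r"\|(?P<seq>\d+)\|(?P<module>[^|]+)\|(?P<level>[A-Z]+)\|(?P<msg>.*)$"
)


def find_emergencies(lines):
    """Return (module, message) pairs for every log line at EMER level."""
    hits = []
    for line in lines:
        match = OVS_FIELDS.search(line)
        if match and match.group("level") == "EMER":
            hits.append((match.group("module"), match.group("msg").strip()))
    return hits


# Two lines copied from the excerpt above: one EMER, one WARN.
sample = [
    "Nov 18 19:51:44 overcloud-compute028 ovs-vswitchd: "
    "ovs|00011|util(handler34)|EMER|../lib/dp-packet.h:307: assertion "
    "pad_size <= dp_packet_size(b) failed in dp_packet_set_l2_pad_size()",
    "Nov 18 19:51:44 overcloud-compute028 ovs-appctl: "
    "ovs|00001|unixctl|WARN|failed to connect to "
    "/var/run/openvswitch/ovs-vswitchd.39219.ctl",
]

for module, message in find_emergencies(sample):
    print(module, message)
```

To run it against a real node, pass an open file object for /var/log/openvswitch/ovs-vswitchd.log to `find_emergencies` in place of `sample`.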

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
