Internal interfaces stopped working

Solution In Progress

Issue

  • The internal interfaces (an LACP bond) carrying OpenStack API and storage network traffic stopped working on one compute node. The failure was caused by Open vSwitch (OVS): the OVS log files show many errors starting at the same time as the issue.

  • All instances on this compute node stopped working because they use NetApp SolidFire volumes, and the API bond carries both internal OpenStack traffic and storage traffic.

  • Rebooting the compute node fixed the issue.

  • The following errors are seen in /var/log/openvswitch/ovs-vswitchd.log:

Nov 18 19:51:44 overcloud-compute028 ovs-vswitchd: ovs|00011|util(handler34)|EMER|../lib/dp-packet.h:307: assertion pad_size <= dp_packet_size(b) failed in dp_packet_set_l2_pad_size()
Nov 18 19:51:44 overcloud-compute028 ovs-ctl: 2020-11-18T19:51:44Z|00001|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.39219.ctl
Nov 18 19:51:44 overcloud-compute028 ovs-appctl: ovs|00001|unixctl|WARN|failed to connect to /var/run/openvswitch/ovs-vswitchd.39219.ctl
Nov 18 19:51:44 overcloud-compute028 ovs-ctl: ovs-appctl: cannot connect to "/var/run/openvswitch/ovs-vswitchd.39219.ctl" (Connection refused)

  • The following errors are seen in /var/log/containers/neutron/openvswitch-agent.log:

2020-11-18 19:52:14.886 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch [req-6766bd40-20b5-42cb-a889-45d982278397 - - - - -] Switch connection timeout
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int [req-6766bd40-20b5-42cb-a889-45d982278397 - - - - -] Failed to communicate with the switch: RuntimeError: Switch connection timeout
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int Traceback (most recent call last):
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py", line 52, in check_canary_table
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     flows = self.dump_flows(constants.CANARY_TABLE)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 141, in dump_flows
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     (dp, ofp, ofpp) = self._get_dp()
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py", line 67, in _get_dp
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     self._cached_dpid = new_dpid
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     self.force_reraise()
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     six.reraise(self.type_, self.value, self.tb)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ovs_bridge.py", line 50, in _get_dp
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     dp = self._get_dp_by_dpid(self._cached_dpid)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/ofswitch.py", line 69, in _get_dp_by_dpid
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int     raise RuntimeError(m)
2020-11-18 19:52:14.887 75071 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.br_int RuntimeError: Switch connection timeout
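
To check whether another node shows the same sequence, the two log excerpts above can be correlated by timestamp. The following is a minimal Python sketch, not a supported tool. The function names, the regular expressions, and the 120-second window are assumptions made for illustration. It also assumes both logs are written in the same time zone and that the year of the syslog-style entries is known, since those entries do not record it.

```python
import re
from datetime import datetime

# Syslog-style entry from ovs-vswitchd.log, for example:
#   Nov 18 19:51:44 <host> ovs-vswitchd: ovs|00011|util(handler34)|EMER|<message>
VSWITCHD_EMER = re.compile(
    r"^(?P<ts>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ ovs-vswitchd: "
    r"\S*\|EMER\|(?P<msg>.*)$"
)

# Neutron Open vSwitch agent entry, for example:
#   2020-11-18 19:52:14.886 75071 ERROR <module> [...] Switch connection timeout
AGENT_TIMEOUT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ \d+ ERROR "
    r".*Switch connection timeout"
)


def find_emer_times(lines, year):
    """Return (timestamp, message) for each ovs-vswitchd EMER entry.

    The year is passed in because syslog-style timestamps omit it.
    """
    hits = []
    for line in lines:
        m = VSWITCHD_EMER.match(line)
        if m:
            ts = datetime.strptime(f"{year} {m.group('ts')}", "%Y %b %d %H:%M:%S")
            hits.append((ts, m.group("msg")))
    return hits


def find_agent_timeouts(lines):
    """Return the timestamp of each 'Switch connection timeout' agent error."""
    times = []
    for line in lines:
        m = AGENT_TIMEOUT.match(line)
        if m:
            times.append(datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"))
    return times


def timeouts_after(emer_ts, timeouts, window_seconds=120):
    """Return agent timeouts logged within window_seconds after emer_ts."""
    return [
        t for t in timeouts
        if 0 <= (t - emer_ts).total_seconds() <= window_seconds
    ]


if __name__ == "__main__":
    # Sample entries copied from the excerpts in this article.
    vswitchd_lines = [
        "Nov 18 19:51:44 overcloud-compute028 ovs-vswitchd: "
        "ovs|00011|util(handler34)|EMER|../lib/dp-packet.h:307: assertion "
        "pad_size <= dp_packet_size(b) failed in dp_packet_set_l2_pad_size()",
    ]
    agent_lines = [
        "2020-11-18 19:52:14.886 75071 ERROR "
        "neutron.plugins.ml2.drivers.openvswitch.agent.openflow.native.ofswitch "
        "[req-6766bd40-20b5-42cb-a889-45d982278397 - - - - -] "
        "Switch connection timeout",
    ]
    agent_times = find_agent_timeouts(agent_lines)
    for emer_ts, msg in find_emer_times(vswitchd_lines, year=2020):
        for t in timeouts_after(emer_ts, agent_times):
            delay = (t - emer_ts).total_seconds()
            print(f"EMER at {emer_ts}: agent timeout {delay:.0f}s later ({msg})")
```

In practice the two line lists would be read from /var/log/openvswitch/ovs-vswitchd.log and the agent log on the affected compute node. A match only shows that the agent lost its connection to the switch shortly after the assertion was logged; it does not by itself establish the root cause.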

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
