Unable to launch instances after a while - Failed to allocate the network(s)

Solution In Progress

Issue

  • We've been deploying numerous instances and heat stacks in our RHOSP 13 environment using OVN. We just updated on Monday to the container versions released late last week. Today all was going well, and we are up to 115 stacks in one project. All of a sudden we are unable to launch stacks or individual instances in any project, so it does not appear to be a quota problem. We are seeing Failed to allocate the network(s) messages on controller 3. We are uploading our controller sosreports, but this is a major issue, as the group is here to deploy services for only one more day. Thank you.

  • The following error is seen on the controllers in /var/log/containers/neutron/server.log:

2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers [req-3938763b-d18c-4e47-8073-a4ea6ec8ba80 63abb58b0cfa46bca9f7c2e9772f7183 fdf586078d6f465695db34d793436147 - default default] Mechanism driver 'ovn' failed in update_port_postcommit: RowNotFound: Cannot find Logical_Switch_Port with name=9934adc9-eb68-4acd-9c63-20e9895bb842
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/managers.py", line 427, in _call_on_drivers
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers     getattr(driver.obj, method_name)(context)
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python2.7/site-packages/networking_ovn/ml2/mech_driver.py", line 531, in update_port_postcommit
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers     self._ovn_client.update_port(port, port_object=original_port)
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python2.7/site-packages/networking_ovn/common/ovn_client.py", line 440, in update_port
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers     ovn_port = self._nb_idl.lookup('Logical_Switch_Port', port['id'])
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 103, in lookup
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers     return self._lookup(table, record)
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 143, in _lookup
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers     row = idlutils.row_by_value(self, rl.table, rl.column, record)
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers   File "/usr/lib/python2.7/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 63, in row_by_value
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers     raise RowNotFound(table=table, col=column, match=match)
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers RowNotFound: Cannot find Logical_Switch_Port with name=9934adc9-eb68-4acd-9c63-20e9895bb842
2019-07-18 15:44:50.671 36 ERROR neutron.plugins.ml2.managers 
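The RowNotFound exception means that the ML2/OVN mechanism driver found the port in the Neutron database but could not find a matching Logical_Switch_Port row in the OVN northbound database, i.e. the two databases are out of sync for that port. As a quick cross-check, the two sides can be compared for the port ID from the traceback. The commands below are a generic sketch: the ovn_controller container name and the tcp:<NB_DB_IP>:6641 endpoint are typical for RHOSP 13 but may differ in your deployment.

# Does the port still exist in Neutron? (run with overcloud credentials sourced)
openstack port show 9934adc9-eb68-4acd-9c63-20e9895bb842

# Does the corresponding Logical_Switch_Port exist in the OVN northbound database?
docker exec ovn_controller ovn-nbctl --db=tcp:<NB_DB_IP>:6641 list Logical_Switch_Port 9934adc9-eb68-4acd-9c63-20e9895bb842

If the openstack command succeeds while the ovn-nbctl lookup returns nothing, the port exists only on the Neutron side, which matches the error above.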
  • Almost all nodes have similar error messages in /var/log/messages:
Jul 16 16:13:31 overcloud-control-0 journal: 2019-07-16T16:13:31Z|96095|poll_loop|INFO|wakeup due to [POLLIN] on fd 15 (10.10.10.10:56806<->10.10.10.10:6642) at ../lib/stream-fd.c:157 (100% CPU usage)
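This message is logged by the OVS/OVN daemons when their main loop runs continuously; here the affected process is a client of the OVN southbound database (port 6642) and is consuming a full CPU core just servicing that connection. To gauge how widespread the problem is, a simple check like the following (a generic sketch, not specific to any deployment) can be run on each node:

# How often has the 100% CPU condition been logged on this node?
grep -c 'poll_loop.*100% CPU usage' /var/log/messages

# Which of the OVN-related processes are currently pinning a core?
top -b -n 1 -p "$(pgrep -d, -f 'ovn-controller|ovn-northd|ovsdb-server')"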
  • At some point, performance degrades so badly that VMs can no longer be spawned, and the following error messages appear in /var/log/containers/nova/nova-compute.log on the compute nodes:
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [req-c3d2fdd0-b514-4514-91d4-622a39d6327d 70bfd15a9f9a3f32a589a3cfcbcb78886d3be060e961dd38048eef87870202a9 01b728e67a5a4d88b655e5b3bc232450 - 0ef23cf14b4a4cecade7d9f1cee1862b 0ef23cf14b4a4cecade7d9f1cee1862b] [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6] Instance failed to spawn: VirtualInterfaceCreateException: Virtual Interface creation failed
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6] Traceback (most recent call last):
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2273, in _build_resources
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]     yield resources
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2053, in _build_and_run_instance
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]     block_device_info=block_device_info)
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3129, in spawn
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]     destroy_disks_on_failure=True)
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5652, in _create_domain_and_network
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6]     raise exception.VirtualInterfaceCreateException()
2019-07-17 19:15:42.696 1 ERROR nova.compute.manager [instance: 6014d6fc-dff7-493c-96ac-c01dee94a8d6] VirtualInterfaceCreateException: Virtual Interface creation failed
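nova-compute raises VirtualInterfaceCreateException when it gives up waiting for the network-vif-plugged event that Neutron sends once a port is successfully wired up; with Neutron and OVN in the state described above, that event never arrives. Searching /var/log/containers/nova/nova-compute.log for network-vif-plugged around the failure typically shows the corresponding timeout. For context only (not a fix), the wait is governed by the following nova.conf options on the compute nodes; the path is the usual RHOSP 13 container-generated location and the values shown are the upstream defaults:

# /var/lib/config-data/puppet-generated/nova_libvirt/etc/nova/nova.conf
[DEFAULT]
# How long nova-compute waits for Neutron's network-vif-plugged event (seconds).
vif_plugging_timeout = 300
# When True, a boot that never receives the event fails with VirtualInterfaceCreateException.
vif_plugging_is_fatal = True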

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
