After power cycling, VMs cannot be created on or migrated to some compute nodes

Solution In Progress

Issue

  • Our datacenter suffered a power outage lasting more than an hour. After power was restored, we observed that it was impossible to create or migrate VMs on the compute nodes overcloud-compute-0 and overcloud-compute-1.

  • We see the following error messages in /var/log/containers/nova/nova-compute.log (a connectivity check is sketched after the log excerpt):

2022-06-13 04:28:33.064 7 ERROR nova.compute.manager [req-a5ca03b4-ea83-41da-a069-8006680c9d18 - - - - -] Error updating resources for node overcloud-compute-0.localdomain.: TimedOut: [errno 110] error connecting to the cluster
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager Traceback (most recent call last):
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 7573, in update_available_resource_for_node
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 690, in update_available_resource
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 6467, in get_available_resource
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     disk_info_dict = self._get_local_gb_info()
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5734, in _get_local_gb_info
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     info = LibvirtDriver._get_rbd_driver().get_pool_info()
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 368, in get_pool_info
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     with RADOSClient(self) as client:
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 102, in __init__
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     self.cluster, self.ioctx = driver._connect_to_rados(pool)
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py", line 133, in _connect_to_rados
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager     client.connect()
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager   File "rados.pyx", line 885, in rados.Rados.connect (/builddir/build/BUILD/ceph-12.2.12/build/src/pybind/rados/pyrex/rados.c:9785)
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager TimedOut: [errno 110] error connecting to the cluster
2022-06-13 04:28:33.064 7 ERROR nova.compute.manager 
2022-06-13 04:28:33.066 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [8808e101-b9b5-469c-81a1-6ebd22fe4de7] AMQP server on overcloud-controller-2.localdomain:5672 is unreachable: [Errno 32] Broken pipe. Trying again in 1 seconds.: error: [Errno 32] Broken pipe
2022-06-13 04:28:33.066 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [29696748-533d-41c2-ae75-18439c4ffe79] AMQP server on overcloud-controller-2.localdomain:5672 is unreachable: [Errno 32] Broken pipe. Trying again in 1 seconds.: error: [Errno 32] Broken pipe
2022-06-13 04:28:33.068 7 ERROR oslo.messaging._drivers.impl_rabbit [-] [2ad8a021-678f-4f4e-9711-537f208b70ba] AMQP server on overcloud-controller-0.localdomain:5672 is unreachable: [Errno 32] Broken pipe. Trying again in 1 seconds.: error: [Errno 32] Broken pipe
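The traceback shows nova-compute's resource tracker failing in _connect_to_rados() with a timeout against the Ceph cluster, while the oslo.messaging errors show that the RabbitMQ service on the controllers is also unreachable. The following is a minimal sketch, run from an affected compute node, to confirm both failures independently of nova. It assumes the python-rados bindings are installed and uses placeholder values (the Ceph config path, the CephX user name 'openstack', and the controller hostnames) that must be adjusted to match your deployment.

#!/usr/bin/env python
# Minimal connectivity sketch for the failures seen in nova-compute.log.
# Assumptions: python-rados is installed on the compute node; the Ceph client
# config/keyring paths, CephX user, and controller hostnames below are examples
# and must be adapted to the actual deployment.
import socket

import rados  # provided by the python-rados package

CEPH_CONF = '/etc/ceph/ceph.conf'   # assumed client config path
CEPH_USER = 'openstack'             # assumed CephX user for the overcloud
CONTROLLERS = ['overcloud-controller-0.localdomain',
               'overcloud-controller-1.localdomain',
               'overcloud-controller-2.localdomain']
AMQP_PORT = 5672


def check_ceph():
    # Mirrors what nova's _connect_to_rados() does: open a RADOS connection.
    cluster = rados.Rados(conffile=CEPH_CONF, rados_id=CEPH_USER)
    try:
        cluster.connect(timeout=10)
        print('Ceph monitors reachable, cluster fsid: %s' % cluster.get_fsid())
    except rados.TimedOut:
        # Same errno 110 timeout reported by nova-compute above.
        print('Ceph connection timed out')
    finally:
        cluster.shutdown()


def check_amqp():
    # Simple TCP reachability check against the RabbitMQ port on each controller.
    for host in CONTROLLERS:
        try:
            sock = socket.create_connection((host, AMQP_PORT), timeout=5)
            sock.close()
            print('%s:%d reachable' % (host, AMQP_PORT))
        except (socket.error, socket.timeout) as exc:
            print('%s:%d unreachable: %s' % (host, AMQP_PORT, exc))


if __name__ == '__main__':
    check_ceph()
    check_amqp()

If both checks fail, the problem is outside nova itself: the Ceph monitors and the RabbitMQ cluster on the controllers have not recovered correctly from the power outage, and nova-compute will keep logging these errors until they are restored.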

Environment

  • Red Hat OpenStack Platform 13.0 (RHOSP)
