An instance fails to hard reboot even though only one path of the multipath device has failed
Issue
- Unable to hard reboot or stop/start the instance when the primary path of a multipath cinder volume is disconnected. The issue is not reproducible when the secondary path is down. In the scenario below, the issue was reproducible only when path "21:0:0:0" was down, not when path "22:0:0:0" was down.
# multipath -l
360014058459641xxxxxxxxxxxxxx dm-0 LIO-ORG ,IBLOCK
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 21:0:0:0 sda 8:0 failed faulty running <-- this path has failed.
`-+- policy='service-time 0' prio=0 status=active
  `- 22:0:0:0 sdb 8:16 active undef running
# multipath -l /dev/sdb
360014058459641xxxxxxxxxxxxxx dm-0 LIO-ORG ,IBLOCK <-- the multipath ID can still be obtained through the healthy path.
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 21:0:0:0 sda 8:0 failed faulty running
`-+- policy='service-time 0' prio=0 status=active
  `- 22:0:0:0 sdb 8:16 active undef running
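The path states shown in the `multipath -l` output above can also be picked out programmatically. The following is a minimal Python sketch, not part of nova or multipath-tools: the names `PathStatus`, `parse_paths`, and `failed_paths` are hypothetical, and the regular expression assumes path lines in the same layout as this article's output (H:C:T:L, device node, major:minor, device-mapper state, checker state).

```python
import re
from typing import List, NamedTuple


class PathStatus(NamedTuple):
    """One path line from `multipath -l` output (hypothetical helper type)."""
    hctl: str           # SCSI host:channel:target:lun, e.g. "21:0:0:0"
    device: str         # kernel device node name, e.g. "sda"
    dm_state: str       # device-mapper state, e.g. "active" or "failed"
    checker_state: str  # path checker state, e.g. "faulty" or "undef"


# Assumed layout: "<h:c:t:l> <dev> <major:minor> <dm_state> <checker_state> ..."
PATH_LINE = re.compile(
    r"(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<device>\S+)\s+\d+:\d+\s+"
    r"(?P<dm_state>\S+)\s+(?P<checker_state>\S+)"
)


def parse_paths(multipath_output: str) -> List[PathStatus]:
    """Collect every path line; header and policy lines do not match."""
    paths = []
    for line in multipath_output.splitlines():
        match = PATH_LINE.search(line)
        if match:
            paths.append(PathStatus(**match.groupdict()))
    return paths


def failed_paths(multipath_output: str) -> List[PathStatus]:
    """Return only the paths whose device-mapper state is 'failed'."""
    return [p for p in parse_paths(multipath_output) if p.dm_state == "failed"]


# Sample text copied from the output in this article.
SAMPLE = """\
360014058459641xxxxxxxxxxxxxx dm-0 LIO-ORG ,IBLOCK
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 21:0:0:0 sda 8:0 failed faulty running
`-+- policy='service-time 0' prio=0 status=active
  `- 22:0:0:0 sdb 8:16 active undef running
"""

if __name__ == "__main__":
    for path in failed_paths(SAMPLE):
        print(path.hctl, path.device, path.dm_state, path.checker_state)
```

On a compute node, the same functions could be fed the captured output of `multipath -l` instead of `SAMPLE`.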
- The following call trace is reported in the nova-compute log file while trying to hard reboot the instance.
2017-03-08 15:14:48.835 8704 ERROR oslo_messaging.rpc.dispatcher [req-1da033c4-1fca-42c5-859c-d7c0e569ca61 fa0110b263b64da283f29fef80c1d993 1c868a13b2b248ee94322cdf0d89c190 - - -] Exception during message handling: internal error: process exited while connecting to monitor: 2017-03-08T06:14:48.245900Z qemu-kvm: -drive file=/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0,if=none,id=drive-virtio-disk1,format=raw,serial=5dbdf72c-7afd-4289-9ab7-0e06459e69d0,cache=none,iops=750: Could not open '/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0': No such device or address
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6862, in reboot_instance
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher reboot_type)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher payload)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 341, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance_uuid=instance_uuid)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 312, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 391, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 369, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 357, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3237, in reboot_instance
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher self._set_instance_obj_error_state(context, instance)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3218, in reboot_instance
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher bad_volumes_callback=bad_volumes_callback)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2141, in reboot
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher block_device_info)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2253, in _hard_reboot
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher vifs_already_plugged=True)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4587, in _create_domain_and_network
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher power_on=power_on)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4518, in _create_domain
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher LOG.error(err)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4508, in _create_domain
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher domain.createWithFlags(launch_flags)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher result = proxy_call(self._autowrap, f, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher rv = execute(f, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(c, e, tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher rv = meth(*args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1059, in createWithFlags
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher libvirtError: internal error: process exited while connecting to monitor: 2017-03-08T06:14:48.245900Z qemu-kvm: -drive file=/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0,if=none,id=drive-virtio-disk1,format=raw,serial=5dbdf72c-7afd-4289-9ab7-0e06459e69d0,cache=none,iops=750: Could not open '/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0': No such device or address
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher
- The issue is reproducible only when performing actions that require the instance to be rebuilt, such as nova stop/start and nova hard reboot. The issue was not observed with a soft reboot of the instance.
- It does not matter whether the instance was spawned from an image or from a cinder volume; the issue is always reported when the primary path of a multipath cinder volume attached to the instance is down.
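The qemu-kvm error in the call trace shows the drive being opened through a `/dev/disk/by-path/` entry and failing with "No such device or address", which is the message for the `ENXIO` errno. As a rough diagnostic aid, the sketch below resolves such a path and attempts a read-only open, reporting the errno name on failure. The function name `probe_block_device` is hypothetical, the check only approximates what QEMU does when it opens the drive, and opening real block devices normally requires root privileges.

```python
import errno
import os
from typing import Optional, Tuple


def probe_block_device(path: str) -> Tuple[str, Optional[str]]:
    """Return (resolved_path, errno_name).

    errno_name is None when a read-only open succeeds; otherwise it is the
    symbolic errno name, for example "ENXIO" or "ENOENT".
    """
    # Follow symlinks so the caller can see which device node is behind
    # a /dev/disk/by-path/ entry.
    resolved = os.path.realpath(path)
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return resolved, errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)
    return resolved, None


if __name__ == "__main__":
    import sys

    # Pass the by-path entry from the qemu-kvm error as the first argument.
    for device_path in sys.argv[1:]:
        resolved, error = probe_block_device(device_path)
        print(device_path, "->", resolved, "OK" if error is None else error)
```

Comparing the resolved device node with the failed path in the `multipath -l` output indicates whether the by-path entry points at the disconnected path.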
Environment
- Red Hat OpenStack Platform 7.0
- Red Hat OpenStack Platform 8.0
- Red Hat OpenStack Platform 9.0
- Red Hat OpenStack Platform 10.0
