An instance fails to hard reboot even when only one path of the multipath device has failed.
Issue
- Unable to hard reboot or stop/start an instance when the primary path of a multipath cinder volume is disconnected. The issue is not reproducible when the secondary path is down. In the scenario below, the issue was reproducible only when the "21:0:0:0" path was down, not when the "22:0:0:0" path was down.
# multipath -l
360014058459641xxxxxxxxxxxxxx dm-0 LIO-ORG ,IBLOCK
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 21:0:0:0 sda 8:0 failed faulty running <-- this path has failed.
`-+- policy='service-time 0' prio=0 status=active
`- 22:0:0:0 sdb 8:16 active undef running
# multipath -l /dev/sdb
360014058459641xxxxxxxxxxxxxx dm-0 LIO-ORG ,IBLOCK <-- The multipath ID can still be obtained through the healthy path.
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 21:0:0:0 sda 8:0 failed faulty running
`-+- policy='service-time 0' prio=0 status=active
`- 22:0:0:0 sdb 8:16 active undef running
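To spot the failed path without reading the tree by eye, the `multipath -l` output above can be parsed with a short script. The following is a minimal sketch, not part of the original report: the function names and the field labels `dm_state`, `checker_state`, and `online_state` are illustrative names chosen here for the three status columns on each path line, and the sample text is the output shown above.

```python
import re

# Matches a path line such as "| `- 21:0:0:0 sda 8:0 failed faulty running".
# The group names for the three status columns are labels chosen for this sketch.
PATH_LINE = re.compile(
    r"(?P<hctl>\d+:\d+:\d+:\d+)\s+(?P<dev>\S+)\s+(?P<majmin>\d+:\d+)\s+"
    r"(?P<dm_state>\S+)\s+(?P<checker_state>\S+)\s+(?P<online_state>\S+)"
)


def parse_paths(multipath_output):
    """Return one dict per path line found in `multipath -l` style output."""
    paths = []
    for line in multipath_output.splitlines():
        match = PATH_LINE.search(line)
        if match:
            paths.append(match.groupdict())
    return paths


def failed_paths(multipath_output):
    """Return only the paths whose first status column reads 'failed'."""
    return [p for p in parse_paths(multipath_output) if p["dm_state"] == "failed"]


# Sample taken from the output in this article.
SAMPLE = """\
360014058459641xxxxxxxxxxxxxx dm-0 LIO-ORG ,IBLOCK
size=1.0G features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=enabled
| `- 21:0:0:0 sda 8:0 failed faulty running
`-+- policy='service-time 0' prio=0 status=active
  `- 22:0:0:0 sdb 8:16 active undef running
"""

if __name__ == "__main__":
    for path in failed_paths(SAMPLE):
        print(path["hctl"], path["dev"], path["dm_state"])
```

On a compute node, the same functions could be fed the captured output of `multipath -l` instead of the embedded sample.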
- The following call trace is reported in the nova-compute log file when attempting to hard reboot the instance.
2017-03-08 15:14:48.835 8704 ERROR oslo_messaging.rpc.dispatcher [req-1da033c4-1fca-42c5-859c-d7c0e569ca61 fa0110b263b64da283f29fef80c1d993 1c868a13b2b248ee94322cdf0d89c190 - - -] Exception during message handling: internal error: process exited while connecting to monitor: 2017-03-08T06:14:48.245900Z qemu-kvm: -drive file=/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0,if=none,id=drive-virtio-disk1,format=raw,serial=5dbdf72c-7afd-4289-9ab7-0e06459e69d0,cache=none,iops=750: Could not open '/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0': No such device or address
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 6862, in reboot_instance
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher reboot_type)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher payload)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 341, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance_uuid=instance_uuid)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 312, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 391, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 369, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 357, in decorated_function
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3237, in reboot_instance
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher self._set_instance_obj_error_state(context, instance)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 3218, in reboot_instance
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher bad_volumes_callback=bad_volumes_callback)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2141, in reboot
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher block_device_info)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2253, in _hard_reboot
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher vifs_already_plugged=True)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4587, in _create_domain_and_network
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher power_on=power_on)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4518, in _create_domain
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher LOG.error(err)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4508, in _create_domain
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher domain.createWithFlags(launch_flags)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher result = proxy_call(self._autowrap, f, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher rv = execute(f, *args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher six.reraise(c, e, tb)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher rv = meth(*args, **kwargs)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1059, in createWithFlags
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher libvirtError: internal error: process exited while connecting to monitor: 2017-03-08T06:14:48.245900Z qemu-kvm: -drive file=/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0,if=none,id=drive-virtio-disk1,format=raw,serial=5dbdf72c-7afd-4289-9ab7-0e06459e69d0,cache=none,iops=750: Could not open '/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489.lx-openstack0-0095.target0073-lun-0': No such device or address
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher
2017-03-08 15:14:48.835 8704 TRACE oslo_messaging.rpc.dispatcher
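The key detail in the trace is the final qemu-kvm message, which names the device that could not be opened and the reason. The path is a `/dev/disk/by-path` entry that includes a single iSCSI portal address, and "No such device or address" is the standard Linux text for `ENXIO`. As a minimal sketch for pulling those two fields out of a nova-compute log line, the snippet below uses a regular expression; the function name `extract_open_failure` is hypothetical, and the sample string is an excerpt of the message shown above.

```python
import re

# Matches the tail of the qemu-kvm message: Could not open '<path>': <reason>
OPEN_ERROR = re.compile(r"Could not open '(?P<path>[^']+)': (?P<reason>.+)$")


def extract_open_failure(log_line):
    """Return (device_path, reason) from a log line, or None if it does not match."""
    match = OPEN_ERROR.search(log_line)
    if match is None:
        return None
    return match.group("path"), match.group("reason").strip()


# Excerpt of the error message from the trace in this article.
SAMPLE_LINE = (
    "libvirtError: internal error: process exited while connecting to monitor: "
    "Could not open '/dev/disk/by-path/ip-xx.xx.xx.100:3260-iscsi-"
    "iqn.2001-03.example.com:storage01:ist-m000-sn-0000000j1bn00489."
    "lx-openstack0-0095.target0073-lun-0': No such device or address"
)

if __name__ == "__main__":
    result = extract_open_failure(SAMPLE_LINE)
    if result is not None:
        device_path, reason = result
        print("device:", device_path)
        print("reason:", reason)
```

Comparing the extracted device path with the paths listed by `multipath -l` can help confirm which path the instance was configured to open.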
- The issue is reproducible only when performing actions that require the instance to be rebuilt, such as nova stop/start and nova hard reboot. The issue was not observed with a soft reboot of the instance.
- It does not matter whether the instance was spawned from an image or from a cinder volume; the issue is always reported when the primary path of a multipath cinder volume attached to the instance is down.
Environment
- Red Hat OpenStack Platform 7.0
- Red Hat OpenStack Platform 8.0
- Red Hat OpenStack Platform 9.0
- Red Hat OpenStack Platform 10.0