Live migration of RHEV VM fails with "could not open disk image" and "Permission denied"

Environment

  • Red Hat Virtualization 4.x

Issue

  • When trying to migrate a VM in RHEV, the migration fails immediately.
  • Cannot migrate RHEV VM from existing host to a newly added host.
  • In the RHEV GUI, the following events can be seen:
Migration started (VM: VM1, Source: rhevh1.example.com, Destination: rhevh2.example.com, User: admin).
Migration failed due to Error: Fatal error during migration (VM: VM1, Source: rhevh1.example.com, Destination: rhevh2.example.com).
  • The vdsm.log file on the source host indicates that the migration failed with the error could not open disk image: Permission denied. Refer to the Diagnostic Steps below for details.

Resolution

  • Verify that the SELinux settings are consistent across all hypervisors in the environment:
    • If some hypervisors have SELinux enabled and others have it disabled, migrations will fail. All hypervisors in the environment must have SELinux enabled.
    • SELinux may be Permissive on some hypervisors and Enforcing on others; as a workaround, this mix still allows VM migration between hosts.
    • If a host was built with SELinux in Disabled mode, reboot it after changing SELinux to Enforcing or Permissive.
    • If SELinux is in Enforcing mode and migration still fails, the SELinux file contexts on the files used by the migration process may be wrong. Check under 'Diagnostic Steps'.
    • Use the restorecon command to set the correct SELinux context, or re-label the entire filesystem.
    • Follow this solution: How to enable SELinux in RHEV?
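As a first consistency check on each hypervisor, the running SELinux mode can be compared with the mode configured for boot; a minimal sketch (the fallback for a missing getenforce binary is an assumption, for stripped-down builds):

```shell
# Report the running and configured SELinux mode on this hypervisor.
# Run on every host; all of them should report Enforcing (or Permissive).
if command -v getenforce >/dev/null 2>&1; then
    current=$(getenforce)
else
    current="unknown (getenforce not installed)"
fi
# The SELINUX= line in /etc/selinux/config is the mode applied at boot;
# a mismatch with the runtime mode means the host was toggled without a reboot
configured=$(awk -F= '/^SELINUX=/{print $2}' /etc/selinux/config 2>/dev/null)
echo "runtime mode:    $current"
echo "configured mode: ${configured:-unknown}"
```

If the two values disagree on any host (e.g. runtime Disabled but configured enforcing), that host needs the reboot described above before migrations to or from it can be trusted.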

Root Cause

  • The existing hypervisors were originally configured with SELinux Enabled/Enforcing.
  • The newly added hypervisors were originally configured with SELinux Disabled, and were later changed to Enabled/Enforcing.
  • Any files created while SELinux is disabled are created without SELinux labels, which introduces problems once SELinux is re-enabled.
  • Files that have no security context are assigned the generic file_t label.
  • As per the audit.log file on the destination host, qemu-kvm is denied read access to the symlink because of the wrong SELinux label.
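The unlabelled files this describes can be located directly; a hedged sketch, assuming GNU find built with SELinux support and the VDSM paths shown in this article:

```shell
# Search the VDSM-managed paths for files carrying the generic file_t label,
# i.e. files created while SELinux was disabled. Errors (missing paths, or a
# find built without SELinux support) are silenced so the scan keeps going.
scanned=0
for p in /rhev /var/run/vdsm/storage; do
    find "$p" -context '*:file_t:*' -exec ls -ldZ {} \; 2>/dev/null
    scanned=$((scanned + 1))
done
echo "Scanned $scanned path(s)."
```

Any path printed by the scan is a candidate for the restorecon fix in the Resolution.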

Diagnostic Steps

  • The following error should be logged in /var/log/ovirt-engine/engine.log:
2015-01-07 18:21:13,329 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (org.ovirt.thread.pool-4-thread-50) Command MigrateStatusVDSCommand(HostName = rhevh1.example.com, HostId = 7c92b07e-888d-428e-8271-8705d9fb5ecc, vmId=09ea2554-6e83-4b9c-9108-15b4c9fb325c) execution failed. Exception: VDSErrorException: VDSGenericException: VDSErrorException: Failed to MigrateStatusVDS, error = Fatal error during migration, code = 12

2015-01-07 18:21:13,334 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-4-thread-50) Correlation ID: 155b50c8, Job ID: a79af7f8-65e6-4dd8-989f-9a6f0a042c14, Call Stack: null, Custom Event ID: -1, Message: Migration failed due to Error: Fatal error during migration (VM: VM1, Source: rhevh1.example.com, Destination: rhevh2.example.com).
  • The vdsm.log on the source host will contain the following:
Thread-15429::DEBUG::2015-01-08 12:31:18,347::libvirtconnection::124::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 10 level: 2 message: internal error process exited while connecting to monitor: char device redirected to /dev/pts/1 qemu-kvm: -drive file=/rhev/data-center/ea3f61b2-c234-4335-8515-ab44298a7b0b/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9/d5cf1619-89ee-4cce-9f83-f3103c93b0c2,if=none,id=drive-virtio-disk0,format=raw,serial=9f32792a-fce9-4e73-9370-c1616545bfc9,cache=none,werror=stop,rerror=stop,aio=native: could not open disk image /rhev/data-center/ea3f61b2-c234-4335-8515-ab44298a7b0b/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9/d5cf1619-89ee-4cce-9f83-f3103c93b0c2: Permission denied

Thread-15429::ERROR::2015-01-08 12:31:18,352::vm::229::vm.Vm::(_recover) vmId=`09ea2554-6e83-4b9c-9108-15b4c9fb325c`::internal error process exited while connecting to monitor: char device redirected to /dev/pts/1 qemu-kvm: -drive file=/rhev/data-center/ea3f61b2-c234-4335-8515-ab44298a7b0b/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9/d5cf1619-89ee-4cce-9f83-f3103c93b0c2,if=none,id=drive-virtio-disk0,format=raw,serial=9f32792a-fce9-4e73-9370-c1616545bfc9,cache=none,werror=stop,rerror=stop,aio=native: could not open disk image /rhev/data-center/ea3f61b2-c234-4335-8515-ab44298a7b0b/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9/d5cf1619-89ee-4cce-9f83-f3103c93b0c2: Permission denied

Thread-15429::ERROR::2015-01-08 12:31:19,186::vm::328::vm.Vm::(run) vmId=`09ea2554-6e83-4b9c-9108-15b4c9fb325c`::Failed to migrate
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 314, in run
    self._startUnderlyingMigration(time.time())
  File "/usr/share/vdsm/vm.py", line 396, in _startUnderlyingMigration
    None, maxBandwidth)
  File "/usr/share/vdsm/vm.py", line 891, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 92, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: internal error process exited while connecting to monitor: char device redirected to /dev/pts/1
qemu-kvm: -drive file=/rhev/data-center/ea3f61b2-c234-4335-8515-ab44298a7b0b/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9/d5cf1619-89ee-4cce-9f83-f3103c93b0c2,if=none,id=drive-virtio-disk0,format=raw,serial=9f32792a-fce9-4e73-9370-c1616545bfc9,cache=none,werror=stop,rerror=stop,aio=native: could not open disk image /rhev/data-center/ea3f61b2-c234-4335-8515-ab44298a7b0b/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9/d5cf1619-89ee-4cce-9f83-f3103c93b0c2: Permission denied
  • The audit.log file on the destination host will contain:
type=AVC msg=audit(Tue Jan 13 11:18:41 2015.524:2156): avc:  denied  { read } for  pid=9503 comm="qemu-kvm" name="9f32792a-fce9-4e73-9370-c1616545bfc9" dev=dm-1 ino=9318 scontext=system_u:system_r:svirt_t:s0:c864,c961 tcontext=system_u:object_r:file_t:s0 tclass=lnk_file
type=AVC msg=audit(Tue Jan 13 11:18:41 2015.525:2157): avc:  denied  { read } for  pid=9503 comm="qemu-kvm" name="9f32792a-fce9-4e73-9370-c1616545bfc9" dev=dm-1 ino=9318 scontext=system_u:system_r:svirt_t:s0:c864,c961 tcontext=system_u:object_r:file_t:s0 tclass=lnk_file
type=AVC msg=audit(Tue Jan 13 11:18:41 2015.525:2158): avc:  denied  { read } for  pid=9503 comm="qemu-kvm" name="9f32792a-fce9-4e73-9370-c1616545bfc9" dev=dm-1 ino=9318 scontext=system_u:system_r:svirt_t:s0:c864,c961 tcontext=system_u:object_r:file_t:s0 tclass=lnk_file
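The decisive fields in these records are tcontext (the label on the object, here the generic file_t) and tclass=lnk_file (the object is a symlink). They can be pulled out with standard text tools; a small sketch using the first denial above as sample input:

```shell
# Sample AVC record, copied from the audit.log excerpt above
avc='type=AVC msg=audit(Tue Jan 13 11:18:41 2015.524:2156): avc:  denied  { read } for  pid=9503 comm="qemu-kvm" name="9f32792a-fce9-4e73-9370-c1616545bfc9" dev=dm-1 ino=9318 scontext=system_u:system_r:svirt_t:s0:c864,c961 tcontext=system_u:object_r:file_t:s0 tclass=lnk_file'

# Print the source context, target context and object class
for f in scontext tcontext tclass; do
    val=$(printf '%s\n' "$avc" | grep -o "$f=[^ ]*" | cut -d= -f2-)
    echo "$f = $val"
done
```

For this record it prints tcontext = system_u:object_r:file_t:s0 and tclass = lnk_file, confirming that the denied object is a mislabelled symlink rather than the disk image itself.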
  • Find the file with the inode number reported in the AVC denial (ino=9318) and check its SELinux context:
# find / -inum 9318 -exec ls -liaZ '{}' \;
# find /var -inum 9318 -exec ls -liaZ '{}' \;
  • List the SELinux context of the block device (dm-xx), the symlinks, /var/run/vdsm/storage and /rhev/data-center:
# ls -lZ /dev/dm-33
brw-rw----. vdsm qemu system_u:object_r:svirt_image_t:s0:c452,c861 /dev/dm-33

# ls -lZ /rhev/data-center/mnt
drwxr-xr-x. vdsm kvm system_u:object_r:file_t:s0      blockSD

# ls -lZ /var/run/vdsm/storage/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/9f32792a-fce9-4e73-9370-c1616545bfc9
lrwxrwxrwx. vdsm kvm system_u:object_r:virt_var_run_t:s0 d5cf1619-89ee-4cce-9f83-f3103c93b0c2 -> /dev/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/d5cf1619-89ee-4cce-9f83-f3103c93b0c2

# ls -lZ /rhev/data-center/mnt/blockSD/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9
lrwxrwxrwx. vdsm kvm system_u:object_r:file_t:s0      /rhev/data-center/mnt/blockSD/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9 -> /var/run/vdsm/storage/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/9f32792a-fce9-4e73-9370-c1616545bfc9
  • Note the incorrect SELinux context file_t in the /rhev/data-center/mnt/blockSD/44d43c93-4b8c-4e90-bfa1-65b5da80afa5/images/9f32792a-fce9-4e73-9370-c1616545bfc9 symlink.

  • As per the SELinux base policy, it should have the mnt_t context:

# semanage fcontext -l | grep rhev | head -3
/rhev                                              directory          system_u:object_r:mnt_t:s0 
/rhev(/[^/]*)?                                     directory          system_u:object_r:mnt_t:s0 
/rhev/[^/]*/.*                                     all files          <<None>>
  • To preview the context changes that restorecon would make to a file or path, run it with -n (dry run) and -v (verbose):
# restorecon -nvr /path/to/file
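A guarded sketch over the paths from this article (the availability check is an assumption, for running the snippet outside a hypervisor); drop -n once the previewed changes look correct:

```shell
# Preview (with -n) the relabelling of the paths involved in the failure.
# -R recurses into directories, -v prints each file whose context would change.
for p in /rhev/data-center /var/run/vdsm/storage; do
    if command -v restorecon >/dev/null 2>&1; then
        restorecon -nvR "$p"
    else
        echo "restorecon not available here; would check $p"
    fi
done
echo "Preview complete."
```

The heavier alternative mentioned in the Resolution is relabelling the entire filesystem (touch /.autorelabel and reboot), which fixes all mislabelled files at the cost of a lengthy reboot.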

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
