RHEVM : VM no longer bootable after snapshot removal
Hi Everybody
Just been hit by a rather nasty problem. I'm kinda hoping that we're doing something wrong, because I fear that if what we have found is a bug, then we may have a big problem. If you wish to reproduce this issue, please do NOT do so on a VM that is important to you, as it has the potential to trash it.
Version : RHEV 3.1.0-50
Issue: VM is no longer bootable after snapshot removal. VM reports "Boot failed. Could not read the bootdisk. No bootable device"
Reproducible: Yes, every time
VM type: RHEL5/6 built on PreAllocated disk
Steps to reproduce (don't do this on a VM that you want !)
1. Via the webadmin tool, take a "live" snapshot of a VM. ("live" = with the VM powered up)
2. Make some changes to some files on your VM to prove the snap process
3. Power off your VM
4. Via the webadmin tool, remove the snapshot you created in step 1. Wait for confirmation of snap removal
5. Power on your VM and start console session
Console reports errors and VM fails to boot.
This problem is reproducible on all of our RHEV 3.1.0-50 environments, but the same steps work fine on our RHEV 3.0.7 environment (albeit not with a "live" snapshot).
Support case raised.
Responses
I tried to reproduce this a couple of times today by following the steps given, without any luck. Could you provide the details below?
- What are the changes done on the guest? Was it during snapshot creation or after that?
- Does the "qemu-img info <lv>" output on the base volume look the same before creating the snapshot and after it was deleted? (The base volume will be different before snapshot creation and after deletion)
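In case it helps with the second question, here is a rough sketch of how the before/after comparison could be scripted. It assumes Python 3 is available on the host and that "qemu-img info" prints its usual human-readable "key: value" lines. The function names are made up for illustration, and the volume path is whatever <lv> is in your environment.

```python
import subprocess


def parse_qemu_img_info(text):
    """Parse the 'key: value' lines of human-readable qemu-img info output."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            info[key.strip()] = value.strip()
    return info


def diff_info(before, after):
    """Return {field: (before_value, after_value)} for every field that differs."""
    keys = sorted(set(before) | set(after))
    return {
        k: (before.get(k), after.get(k))
        for k in keys
        if before.get(k) != after.get(k)
    }


def qemu_img_info(path):
    """Run 'qemu-img info <path>' on the host and return the parsed fields."""
    out = subprocess.check_output(["qemu-img", "info", path])
    return parse_qemu_img_info(out.decode())
```

Call qemu_img_info() on the base volume before creating the snapshot, keep the result, call it again after the snapshot has been removed, and pass both results to diff_info(). An empty result means the reported fields are identical. A changed "file format" or a leftover "backing file" entry would be worth attaching to the support case.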
Hi,
We ran into this issue today, and discovered that the "normal" deactivate/activate vdisk trick didn't work. However, once we did a storage migration to a different domain, the VM booted again.
..in case anyone else encounters this
Regards,
Brian
Maybe you are facing the same issue that was fixed in http://rhn.redhat.com/errata/RHSA-2013-0886.html
See bug: https://bugzilla.redhat.com/show_bug.cgi?id=962549
Details: After upgrading to 3.1, a snapshot of a virtual machine from the older environment can be successfully removed, but the virtual machine would fail to start. This was due to a failure to tear down the snapshot's volume path on the host storage manager prior to merging the snapshot, which left the volume activated on both the storage pool manager and the host storage manager. This update removes unnecessary volume paths and deactivates the snapshot volumes after they are deleted, so virtual machines can run successfully under these conditions.
Sounds like the same problem, but not quite the same conditions as the bz. This was on a VM created from scratch, not from a template, and with thin provisioned disks. The snapshot might have come from an older environment, though.
Anyways, we'll upgrade to 3.2 soon, and we do have a lot of snapshot clean-up to do, so fingers crossed this is fixed in 3.2.
