Unable to start one of the VMs because the storage domain is not available

Dear all, can anyone explain to me why this happened? I have a RHEV environment with 3 hosts in one cluster, connected to one FC storage, and everything was fine. Suddenly, after we powered off one of the VMs (running RHEL 5.7), it could not boot and stopped with an error saying that inittab does not exist,

whereas the other VMs are still up and running.

I checked the events and found this error:

VM xxx is down. Exit message: Domain not found: no domain with matching uuid "xxxx"

 

When I tried to boot the VM in rescue mode, I found the disk but without any LVM partitions.

I tried to deactivate and activate the disk, without any luck. Finally I cloned the VM from the latest snapshot, and after that finished I started the original VM again as a last try and found it up and running without any issues. So I need to know the reason and how I can prevent it in the future.

Responses

Admin note: this topic appeared to have a duplicate post which has been deleted.

When this problem happened, did you check the state of each snapshot? If yes, did you see any snapshot in BROKEN status?

Please open a case with support with the details below.

- Log collector output with an sosreport from the SPM and from the hypervisor where this VM started in the broken state. The log collector output should include the database backup.

- Name of the VM and the approximate time and date this problem happened.

As an aside, the message:

VM xxx is down. Exit message: Domain not found: no domain with matching uuid "xxxx"

sounds frightening, but it does not always indicate a serious issue, or any issue at all. I see the same message occasionally while I am doing maintenance, etc. (similar, I believe, to when you migrate a VM between hosts and the event log shows the VM as down).

I have run into a similar issue where my VM was unable to boot. My problem turned out to be that my Satellite had pushed an lvm.conf created for physical machines out to my VM. It had no entry for the vd* (virtio) devices, so LVM inside the guest did not see the disk.
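
If you want to rule out the lvm.conf cause on your own guest, here is a minimal sketch of a check. It assumes the usual /etc/lvm/lvm.conf location and uses a rough heuristic only: the function name check_virtio_filter is made up for this example, and it simply looks for an active (uncommented) filter line that does not mention "vd". It is not a full parser of the LVM filter syntax.

```shell
#!/bin/sh
# Rough heuristic, not an LVM filter parser.
# Returns 0 when the file has no active "filter =" line, or when at least
# one active filter line mentions "vd" (virtio disks such as /dev/vda).
# Returns 1 when an active filter exists but never mentions "vd".
check_virtio_filter() {
    conf="${1:-/etc/lvm/lvm.conf}"   # default path is an assumption
    # Keep only uncommented lines of the form "filter = ..."
    filters=$(grep -E '^[[:space:]]*filter[[:space:]]*=' "$conf" || true)
    [ -z "$filters" ] && return 0
    printf '%s\n' "$filters" | grep -q 'vd'
}

# Example use inside the guest or a rescue shell:
#   check_virtio_filter /etc/lvm/lvm.conf || echo "filter may exclude vd* devices"
```

If the check complains, compare the filter line with the one on a working VM. After correcting it, running pvscan, vgscan and vgchange -ay inside the guest should make the volume groups visible again, assuming the filter really was the cause.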
I agree with Sadique that you should run the log collector on your RHEV manager. I'm not sure how long the logs are kept, but I would run it fairly soon so that the relevant entries don't roll off. The collector will reach out from your manager to all of your hypervisors as well. (I believe you do not need to run the tool on the individual hosts, but support can advise you on that.)

https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.0/html/Installation_Guide/References_RHEV_3_Log_Collector.html

If you happen to be curious and want to look around:

# tail -f /var/log/ovirt-engine/engine.log
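
To look back at what was logged around the time of the failure instead of following the log live, a small search helper can be handy. This is only a sketch: the function name search_engine_log is made up here, and the default path is the one from the tail command above, which may differ on your version.

```shell
#!/bin/sh
# Print matching log lines with their line numbers.
# The pattern is treated as a fixed string and matched case-insensitively.
search_engine_log() {
    pattern="$1"
    log="${2:-/var/log/ovirt-engine/engine.log}"   # path taken from the post above
    grep -n -i -F -- "$pattern" "$log"
}

# Example use on the manager:
#   search_engine_log "no domain with matching uuid"
#   search_engine_log "name-of-your-vm"
```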

There are also some postgres commands that could help you look at status.