RHEVM: "Storage Permission Problem"


Hi

I wanted to share a problem we encountered recently when starting a RHEV-based VM following a complete powerdown of the RHEV environment. Maybe someone has hit the same issue.

Due to a scheduled datacentre powerdown, we were forced to shut down our entire RHEV environment, consisting of a Manager instance and seven hypervisors. The VMs, hypervisors and manager were all shut down gracefully with no problem. Upon restart, one single VM out of 30+ failed to boot and instead kept reverting to a "suspend" state, showing the Paused icon in the manager. The event log showed the message "VM has paused due to a storage permission problem".

The console showed that the VM was booting to the point of interacting with its second disk, at which point it would become suspended. We suspected an issue with the underlying LV that provided the second disk. Luckily the VM was fairly small in terms of attached disk, and we were able to export, remove and then re-import it to get around the problem, but we are keen to understand what caused the initial "storage permission problem".

Support case 00760769 has been raised and we're currently awaiting results of log analysis.

I'll keep you posted.

Responses

Happy new year, Rich!

Made any progress with this issue?

Hi David. Season's greetings :)

Daniele Consoli is dealing with this case for me.  
Nothing as yet. Probably still sifting through my 980 MB logcollect. Poor thing!



Hello,

 

Was the problem uncovered? I had a similar incident today, where an active VM went into a suspended state during the late night hours. I am moving the VM to another storage domain and will see if it can boot into an active state.

 

thanks

Hi Mike. It looks like this case was resolved, so Rich might be able to shed some light on the situation for you. This knowledgebase solution was also linked to the case, so it may or may not be helpful with your issue: https://access.redhat.com/site/solutions/130323

After some analysis, support suspected that the issue was caused by some malformed entries in the "vm_snapshot_id" field in the "images" table of the RHEV DB. In short, instead of reflecting the correct id, it simply contained zeros (i.e. 000000-00000-00000-......).
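For anyone wanting to check their own data for the same symptom, here is a minimal sketch of how zeroed ids could be spotted. It assumes the rows have already been read out of the "images" table into plain dictionaries; the sample rows, the "disk_label" key and the function names are hypothetical and not taken from the support case, and it treats a "zeroed" id as the standard all-zero UUID.

```python
import uuid

# The all-zero ("nil") UUID: 00000000-0000-0000-0000-000000000000
NIL_UUID = uuid.UUID(int=0)


def is_zeroed_id(value):
    """Return True if value parses as a UUID and consists only of zeros."""
    try:
        return uuid.UUID(str(value)) == NIL_UUID
    except ValueError:
        # Not a well-formed UUID at all (e.g. None or free text)
        return False


def find_zeroed_snapshot_ids(rows):
    """Return the rows whose vm_snapshot_id is the all-zero UUID."""
    return [row for row in rows if is_zeroed_id(row.get("vm_snapshot_id"))]


# Hypothetical sample data standing in for rows of the "images" table.
sample_rows = [
    {"disk_label": "disk-a",
     "vm_snapshot_id": "00000000-0000-0000-0000-000000000000"},
    {"disk_label": "disk-b",
     "vm_snapshot_id": "3f2b8c1e-5d4a-4e6b-9a7c-1b2d3e4f5a6b"},
]

suspect_rows = find_zeroed_snapshot_ids(sample_rows)
for row in suspect_rows:
    print("zeroed vm_snapshot_id on", row["disk_label"])
```

This only identifies suspect rows; any change to the RHEV DB itself should go through support.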

This was tracked down to a bug with virt-v2v.  The VM which was experiencing problems was migrated from a legacy KVM platform using virt-v2v.

It transpired that ALL of the VMs we had migrated had the zeroed field, which was causing the conflict.

Coincidentally, this was also identified as the cause of another issue that I had raised where we were unable to remove a snapshot.

The fix was to run a simple SQL insert which replaced the zeroed entries with something more meaningful.

This fixed the snapshot problem, but we will never know for sure if it would have fixed the "Storage Permission" issue, since I got around it by exporting and re-importing the affected VM.

The virt-v2v bug was subsequently fixed.

Apologies for not updating this thread earlier.

No problem, Rich! Thanks for dropping back in to help out.