RHEV 3.0: Moving a hypervisor to maintenance mode shuts down its iSCSI sessions
Hi,
We are running RHEV 3.0 managing two RHEV-H 6.2 hypervisors in an iSCSI data center. When I moved a hypervisor to maintenance mode, all of its iSCSI sessions were disconnected, which caused the storage domains to lose all their underlying devices.
Can someone help me understand why the iSCSI sessions are dropped when a host goes into maintenance mode?
Thanks,
Inbaraj
Responses
A host in maintenance mode is a host you want to be able to take down for service. This means all VMs will be live-migrated away from it and all of its storage connections will be disconnected.
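If you want to see this for yourself, you can watch the session list from the hypervisor shell (assuming you have console access; iscsiadm ships on RHEV-H):

  # before maintenance mode: one line per active iSCSI session
  iscsiadm -m session
  # after the host has entered maintenance mode, the same command
  # should report that there are no active sessions
  iscsiadm -m session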
Storage domains should not be affected, because they are still being accessed by the other hypervisors. If the host that was put into maintenance mode was the SPM, a new SPM is elected automatically (this takes a few seconds, so SDs might briefly appear to be down, but this does not really affect running VMs).
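If you want to confirm which host took over the SPM role, vdsClient on any active hypervisor should tell you (the pool UUID below is a placeholder for your own):

  # run on any active hypervisor; substitute your storage pool UUID
  vdsClient -s 0 getSpmStatus <storage-pool-uuid>
  # the host currently holding the role reports spmStatus=SPM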
Hi Dan,
I am facing exactly the same issue as Inbaraj. It looks like a potential issue with RHEV.
Also, I wanted to know if you have any write-up of what happens when you move a hypervisor into maintenance mode.
Thanks,
-Ranjan
Hey Inbaraj,
I suspect the 'reboot' operation of RHEV-H failed at the time of the upgrade.
Are you sure you saw the upgraded version of RHEV-H before you rebooted, or only when you manually executed the iSCSI logins?
My guess is that this is a scenario where the upgrade reboot failed.
How did you verify which version you were actually running after the error?
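For example, a couple of quick checks from the RHEV-H console would confirm what you actually booted into (just a suggestion; use whatever you normally rely on):

  cat /etc/redhat-release   # base OS release the node booted into
  rpm -q vdsm               # vdsm package version actually installed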
Please let us know more details on this.
Also, logging out all of the iSCSI sessions when a host enters maintenance mode is expected behaviour.
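For example, once the host is in maintenance mode, this is roughly what I would expect to see (the target IQN and portal below are placeholders):

  iscsiadm -m session       # should report no active sessions
  # activating the host again should restore the sessions via vdsm;
  # a manual login, if you ever need one for testing, would look like
  iscsiadm -m node -T iqn.2012-01.com.example:target1 -p 10.0.0.1:3260 --login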
--Humble
So now the update of a host in maintenance mode failed, right? That has nothing to do with iSCSI... Maybe we should start a separate thread for this?
Hi Inbaraj,
I checked your logs. It looks like there is a problem with vdsm.
Dan, Humble, can you confirm?
-Ranjan
OK, it looks like this case requires deeper investigation than a simple forum can provide. Please open a support case for this, and provide full log collector output.
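If I remember the 3.0 tooling correctly, that is run on the RHEV-M server roughly like this (check 'rhevm-log-collector --help' for the exact options on your version):

  # gathers engine and hypervisor logs into a single archive
  rhevm-log-collector collect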
Hi Inbaraj,
Thanks for the additional information.
Did you get a chance to run the various LVM commands on your system both before the vdsm/RHEV update and after the host moved to maintenance mode? As you said, it seems the LVM layer is hung, which causes the further issues.
To check further, we need the status of the LVM commands and other outputs such as 'multipath -ll', etc.
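A rough sketch of what would help here (wrapping the LVM commands in a timeout so that a hung LVM layer does not hang your shell as well):

  timeout 30 pvs
  timeout 30 vgs
  timeout 30 lvs
  multipath -ll                            # path state of each multipathed LUN
  ps -eo pid,stat,cmd | awk '$2 ~ /^D/'    # processes stuck in uninterruptible I/O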
It is better to open a support case, as Dan mentioned.
I still have to catch up with the logs, but it is better to move this to a support case and continue there.
--Humble
Hi Ranjan,
I still have to catch up on the logs and the vdsm source code for further confirmation.
Even though I see little chance of a vdsm bug, I cannot be 100% sure.
AFAIK, this issue has not been reported by any other customers.
Are you also seeing the same behaviour (LVM commands hanging) in your setup?
>>
Looks like there is a problem with vdsm.
>>
Did you notice anything in the logs that makes you think it is a bug in vdsm? If so, please let me know, so that I can pay more attention to it.
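Even a quick scan like the following (standard vdsm log location; adjust if yours differs) would be a useful pointer:

  grep -iE 'error|traceback' /var/log/vdsm/vdsm.log | tail -50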
--Humble