RHEV-H hosts become Non Operational after first power off
Sirs
We are doing a RHEV lab with the following environment:
RHEV-H host: ica.optec2.com (6.3-20120710.0.el6_3)
Eth2 : 192.1.80.171/24
Eth1 : Unconfigured
Eth0 : 192.2.80.12/24
RHEV-H host: lima.optec2.com (6.3-20120710.0.el6_3)
Eth2 : 192.1.80.170/24
Eth1 : Unconfigured
Eth0 : 192.2.80.11/24
iSCSI Storage: Openfiler Ver 2.3
Target1: 192.2.80.170
Target2: 192.2.80.171
Logical rhevm Network : 192.1.80.0/24
Logical Storage Network: 192.2.80.0/24
The installation was done with no problems: the hosts were approved, the storage network was configured as ISCSI-share (Data Master Domain), and we also attached the ISO domain. At this point everything was OK. We then put the hosts in maintenance and shut down all the equipment (first the hosts, then the storage, and then the Manager console).
After that we powered on the equipment in reverse order, but we noticed:
- We can activate one host; the other one becomes Non Operational and shows a red "X" in RHEV Manager. If we put the active host in maintenance mode, we can activate the other, but not both at the same time.
- The storage network is disabled on both hosts (we see that in the admin environment of each host).
- When we try to activate the storage ISCSI-share (Data Master Domain) in RHEV Manager, we get "Cannot find master domain (Error code: 304)".
- If we try to activate the ISO domain, we get "Cannot activate Storage. The relevant Storage Domain is inaccessible. Please handle Storage Domain issues and retry the operation. Failed to activate Storage due to an error on the Data Center Master Domain. Activate the Master Domain first."
We have been stuck since Friday with no solution. Please help; we don't want to think this could happen in a production environment with lots of virtual machines.
Thanks in advance
Responses
When you power RHEV down, you do it like this:
1. turn VMs off
2. put storage domains in maintenance
3. put hosts in maintenance
4. power off storage, hosts and RHEV-M
To turn everything back on:
1. turn on storage
2. turn on hosts
3. turn on RHEV-M
4. activate hosts
5. activate storage, wait for a host to become SPM
6. start VMs.
Did you follow this order?
Right now, according to what you are describing, the storage is still down. All you need to do is bring one host up and activate the data storage domains. Once they are up and the host is SPM, you can activate the rest.
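Before activating the data domain, it may help to confirm that the iSCSI portals answer at all. The following is only a minimal sketch, assuming Python 3 on any machine attached to the storage network and assuming the targets listen on the default iSCSI port 3260; the function names are made up for illustration. A successful TCP connection only shows that the portal is listening, not that the LUN is visible or that the host is logged in.

```python
import socket

# Default iSCSI port; assumed to be unchanged on the storage targets.
ISCSI_PORT = 3260


def portal_reachable(host, port=ISCSI_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers connection refused, no route to host, and timeouts.
        return False
    sock.close()
    return True


def check_portals(portals, port=ISCSI_PORT, timeout=3.0):
    """Map each portal address to True (reachable) or False (not reachable)."""
    return {host: portal_reachable(host, port, timeout) for host in portals}
```

With the target addresses from the original post, check_portals(["192.2.80.170", "192.2.80.171"]) would return a dictionary showing which portals accept connections. If a portal is not reachable, the problem is in the network or on the storage side rather than in RHEV-M.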
"Cannot find master domain (Error code: 304)" means the host could not access the master data domain. Is it visible in the output of "multipath -ll" on any of the hosts? If it is, I suggest you open a support case, as this is not supposed to happen, if not - check your storage connection and make sure the host can access the storage.
One additional note: in my own experience, I have found Openfiler to be very unstable under VM loads. It died under stress so many times that I stopped using it and switched to RHEL with tgtd and nfs-utils, which is much more stable. In production, of course, I would use proper enterprise-grade storage, but for testing... Just saying it might be an Openfiler issue.
Try it with a simple RHEL 6 server:
yum install scsi-target-utils
edit /etc/tgt/targets.conf and add a section that looks like this:
<target iqn.2008-09.com.redhat.example:tgt1>
    allow-in-use yes
    backing-store /dev/sdb
</target>
service tgtd restart
This will share /dev/sdb as a target called tgt1. You can share out LVs if you don't want to hand an entire disk out.
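For the LV case, the section might look like the sketch below. The volume group vg_data, the logical volume lv_rhev, and the target name tgt2 are hypothetical names chosen for illustration; substitute whatever exists on your server.

```
# Hypothetical example: export a logical volume instead of a whole disk
<target iqn.2008-09.com.redhat.example:tgt2>
    backing-store /dev/vg_data/lv_rhev
</target>
```

As with the whole-disk example, restart tgtd after editing /etc/tgt/targets.conf so the new target is picked up.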