RHEV-H hosts become Non Operational after first power off


Sirs

We are doing a RHEV lab with the following environment:

RHEV-H host: ica.optec2.com  (6.3-20120710.0.el6_3)

Eth2 : 192.1.80.171/24

Eth1 : Unconfigured

Eth0 : 192.2.80.12/24

RHEV-H host: lima.optec2.com (6.3-20120710.0.el6_3)

Eth2 : 192.1.80.170/24

Eth1 : Unconfigured

Eth0 : 192.2.80.11/24

iSCSI Storage: Openfiler Ver 2.3 

Target1: 192.2.80.170

Target2: 192.2.80.171

Logical rhevm Network : 192.1.80.0/24

Logical Storage Network: 192.2.80.0/24 

The installation went with no problems: the hosts were approved, the storage network was configured as the ISCSI-share (Data Master Domain), and we also attached the ISO domain. At this point everything was OK. Then we put the hosts in maintenance and shut down all the equipment (first the hosts, then the storage, and then the Manager console).

After that we powered the equipment on in reverse order, but we noticed:

- We can activate one host; the other one becomes Non Operational and shows a red X in the RHEV Manager. If we put the active host in maintenance mode, we can activate the other one, but not both at the same time.

- The storage network is disabled on both hosts (we can see that in the admin environment of each host).

- When we try to activate the ISCSI-share storage (Data Master Domain) in RHEV Manager we get "Cannot find master domain (Error code: 304)".

- If we try to activate the ISO domain we get: "Cannot activate Storage. The relevant Storage Domain is inaccessible. Please handle Storage Domain issues and retry the operation. Failed to activate Storage due to an error on the Data Center Master Domain. Activate the Master Domain first."

We have been stuck since Friday with no solution. Please help us; we don't want to think this could happen in a production environment with lots of virtual machines.

Thanks in advance

Responses

When you power RHEV down, you do it like this:

1. turn VMs off

2. put storage domains in maintenance

3. put hosts in maintenance

4. power off storage, hosts and RHEV-M

 

To turn everything back on:

1. turn on storage

2. turn on hosts

3. turn on RHEV-M

4. activate hosts (a quick check that each host can actually reach the storage is sketched after this list)

5. activate storage, wait for a host to become SPM

6. start VMs.
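
Once the hosts are up (step 4), it can be worth verifying from a host's shell that the iSCSI portal is actually reachable before activating storage. A minimal sketch, assuming the lab's storage network above (Target1 at 192.2.80.170) and the stock iscsiadm/multipath tools on the host:

# Confirm basic IP reachability of the storage network
ping -c 3 192.2.80.170

# Ask the portal which targets it exports (sendtargets discovery)
iscsiadm -m discovery -t sendtargets -p 192.2.80.170:3260

# List the currently logged-in iSCSI sessions
iscsiadm -m session

# The data domain LUN should show up as a multipath device
multipath -ll

If discovery fails here, the problem is in the network or on the storage box, not in RHEV-M.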

 

Did you follow this order?

 

Right now, according to what you are describing, you still have the storage down. All you need to do is bring one host up and activate the data storage domains. Once they are up, and the host is SPM, you can activate the rest.

Dan,

Thanks for your answer.

When powering RHEV down, I didn't put the storage in maintenance mode.

To turn it back on, I did it like you say... but:

I can activate only one host at a time (either of them), not both (one of them stays Non Operational).

Anyway, with one host activated, when I try to activate the storage domain I get the "Cannot find master domain (Error code: 304)" message.

so ... 

I cannot activate the storage domains (nor the other host).

Any additional suggestions? Do you think this is happening because I did not put the storage in maintenance mode before shutdown? Please share your comments.

"Cannot find master domain (Error code: 304)" means the host could not access the master data domain. Is it visible in the output of "multipath -ll" on any of the hosts? If it is, I suggest you open a support case, as this is not supposed to happen, if not - check your storage connection and make sure the host can access the storage.

 

One additional note - in my own experience, I have found Openfiler to be very unstable under VM loads; it died under stress so many times that I stopped using it and switched to RHEL with tgtd and nfs-utils, which is much more stable. In production, of course, I would use proper enterprise-grade storage, but for testing... Just saying it might be an Openfiler issue.

Dan

 

I found a "No valid Data StorageDomain, Check your Storage infraestructure"  message in events Tab from RHEV manager, 

Also, the storage volume is not seen in the multipath -ll output (it only shows the internal disks of each host... I don't know why!).

I can see the Openfiler targets in the output of the "iscsiadm -m session" command... It is weird: when I test the Openfiler with a physical host it works OK, so I don't know why it doesn't work with RHEV.
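
One thing worth trying when the session is logged in but no LUN appears: force a rescan of the session and check whether the kernel picked up a disk at all. A small sketch, assuming the standard iscsi-initiator-utils and device-mapper-multipath packages:

# Rescan every logged-in iSCSI session for new or resized LUNs
iscsiadm -m session -R

# Check whether the kernel now sees a SCSI disk from the target
cat /proc/scsi/scsi
ls -l /dev/disk/by-path/ | grep iscsi

# Re-run multipath discovery and list the maps again
multipath -v2
multipath -ll

If no disk shows up even after a rescan, the problem is on the target side, not on the host.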

Also, I have noticed there is a phantom volume in Openfiler (the same capacity as the Storage Domain) that was not there before I created the storage domain... and I cannot delete it... weird, weird...

 

I'm going to test with FreeNAS, and failing that, with an NFS server... here we go, starting from zero again...

 

Thanks for the help

Try it with a simple RHEL 6 server:

yum install scsi-target-utils

edit /etc/tgt/targets.conf

add a section that looks like this:


<target iqn.2008-09.com.redhat.example:tgt1>

        allow-in-use yes

        backing-store /dev/sdb

</target>


service tgtd restart

This will share /dev/sdb as a target called tgt1. You can share out LVs if you don't want to hand an entire disk out. 
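
If the hosts cannot discover the target afterwards, two things are worth checking on the same server; both are assumptions about a default RHEL 6 install rather than part of the steps above:

# Confirm tgtd is exporting the target and its LUN
tgt-admin --show

# Start tgtd on boot and open the iSCSI port (TCP 3260) in the default firewall
chkconfig tgtd on
iptables -I INPUT -p tcp --dport 3260 -j ACCEPT
service iptables save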

Thanks for your valuable help. I finally got the storage domain installed with iSCSI on FreeNAS; I only had trouble seeing the LUNs in the storage. I noticed that a LUN must be blank in order to be seen; if there was any partition on it, I had problems. Today everything is fine, and I am testing the other functions.

The fact that LUNs should be blank is well documented, by the way
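
For anyone hitting the same LUN-visibility problem: one common way to blank a previously used LUN is to zero out its first megabytes on the storage server before exporting it, which wipes any old partition table and metadata. A hedged sketch, reusing /dev/sdb from the tgt example above:

# DESTRUCTIVE: erases the partition table and any leftover metadata
# on the device that will be exported as the LUN (example device only)
dd if=/dev/zero of=/dev/sdb bs=1M count=100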