iLo-Setup: "Test Failed, Host Status is: unknown"
Hi there,
Our situation:
We put the iLo-Interfaces into a separate, private network (172.32.32.0/24). Our RHEV-Manager has the IP 172.32.32.1 on its second interface to communicate with the iLos. We've tested it with an HP ProLiant DL380 G7 (using iLo3) and an HP ProLiant DL380 G5 (using iLo2).
Problem:
While trying to configure Power-Management in the Manager-GUI, we get the following error message (from rhevm.log):
2012-06-13 15:49:06,371 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (http-0.0.0.0-8443-9) FINISH, FenceVdsVDSCommand, return: Test Failed, Host Status is: unknown. The fence-agent script reported the following error: Getting status of IPMI:172.32.32.12...Chassis power = Unknown
Our troubleshooting so far:
- ping is ok, iLo is reachable from the manager
- iptables are ok as well
- tcpdump didn't give us any usable results
- Entering an IP that isn't assigned to anything (in the Manager-GUI) resulted in the same error
Did we miss something or what could be the source of this error?
Greetings and thanks in advance
Responses
RHEV-M does not fence hosts itself; it delegates the fence request to another node in the cluster. So if node X needs fencing, RHEV-M finds another available node, Y, and tells Y to issue a fence_xxxx command with X as the target.
This means that if you have only one node in the setup, fencing will not work.
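To make the delegation concrete, here is a minimal sketch of the selection logic described above. It is not RHEV-M's actual code; the `Host` type and the `pick_fence_proxy` function are hypothetical names made up for illustration, and "available" is reduced to a single `is_up` flag.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for a hypervisor known to the manager."""
    name: str
    is_up: bool


def pick_fence_proxy(target: Host, hosts: Sequence[Host]) -> Optional[Host]:
    """Return a host other than `target` that is up, or None if there is none.

    The returned host (node Y) would be the one asked to run the
    fence_xxxx command against `target` (node X).
    """
    for host in hosts:
        if host.name != target.name and host.is_up:
            return host
    return None


# With a single node there is nobody left to act as proxy.
node_x = Host("X", is_up=False)
node_y = Host("Y", is_up=True)
print(pick_fence_proxy(node_x, [node_x]))          # no proxy available
print(pick_fence_proxy(node_x, [node_x, node_y]))  # node Y is chosen
```

The point of the sketch is only that the fence command originates from node Y, so it is Y, not the manager, that needs a network path to X's iLO.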
As I explained, RHEV-M does not contact the fence devices at all. What it does is contact one of the hypervisors, and tell it to run a fence command against another hypervisor. This is done over the management network.
According to what you posted, you're using the NIC IP for fencing instead of the iLO IP, so this will not work, of course.
Typically, the fencing devices are placed on the rhevm network, since it's the management network and the hosts are on this network anyway, so no need for the extra NIC, extra logical network or additional IPs.
No problem. By the way, the iLOs have to be enabled for lanplus access, iirc.
Also, if you are sure the hosts are able to contact the iLOs and fencing still fails, the next step would be to take a look in vdsm.log on the host that is attempting to fence the other host. The output of the fence_xxxx command should be in there for diagnostics.
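If it helps, the check can also be reproduced by hand from the host that is doing the fencing. The sketch below makes two assumptions: that `ipmitool` is installed on that host (it is used here as a stand-in for whatever fence_xxxx agent is configured), and that the agent's message has the same shape as the one quoted from rhevm.log above. The function names and the credentials in the usage example are made up.

```python
import re
import subprocess
from typing import List, Optional, Tuple


def build_ipmi_status_command(ilo_ip: str, user: str, password: str) -> List[str]:
    """Build an ipmitool call that asks the iLO for the chassis power state.

    "-I lanplus" selects the IPMI v2.0 interface mentioned above.
    """
    return [
        "ipmitool", "-I", "lanplus",
        "-H", ilo_ip, "-U", user, "-P", password,
        "chassis", "power", "status",
    ]


def run_status_check(ilo_ip: str, user: str, password: str) -> str:
    """Run the command on this host and return stdout and stderr combined."""
    result = subprocess.run(
        build_ipmi_status_command(ilo_ip, user, password),
        capture_output=True, text=True, timeout=30,
    )
    return result.stdout + result.stderr


# Pattern modelled on the message in the rhevm.log line quoted in this thread.
_FENCE_STATUS = re.compile(
    r"Getting status of IPMI:(?P<ip>\d+(?:\.\d+){3})\.\.\."
    r"Chassis power = (?P<state>\w+)"
)


def parse_fence_status(message: str) -> Optional[Tuple[str, str]]:
    """Extract (ip, power_state) from a fence-agent message, or None."""
    match = _FENCE_STATUS.search(message)
    if match is None:
        return None
    return match.group("ip"), match.group("state")


log_line = "Getting status of IPMI:172.32.32.12...Chassis power = Unknown"
print(parse_fence_status(log_line))  # ('172.32.32.12', 'Unknown')
```

Calling `run_status_check("172.32.32.12", "admin", "secret")` on the fencing host (with real credentials) should show whether the iLO answers over lanplus at all, independently of RHEV-M.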
Hi everybody
I work with Daniel Balsiger on the power management within our RHEV environment. Currently, our power management works, meaning we can, for example, stop hypervisors that are in maintenance mode from the admin portal.
But we're not sure whether RHEV-M really powers the hypervisors down/up according to the power saving policy applied to the cluster containing all our hypervisors (3 hosts). The policy itself works: depending on the thresholds, VMs are migrated to/from hosts.
Is it correct that hosts without load should be shut down? That's the behavior we expect from the power management, according to the Technical Reference section 5.1 "Power Management".
Thx in advance for your help
Kind regards,
Sebastian
Hi Sebastian,
RHEV 3.0 doesn't actually power hosts down; it only prepares them for shutdown by migrating the VMs away. With no load, most modern servers will idle and save power even without being shut down.
5.1. Power Management
The Red Hat Enterprise Virtualization Manager is capable of rebooting hosts that have entered a non-operational or non-responsive state, as well as preparing to power off under-utilized hosts to save power.