We have a RHEV 4.4 system running on two Dell 740s, RHEV01 and RHEV02. I went to apply an update to RHEV01, so I migrated the Hosted Engine to RHEV02 and halted the domain VMs.
The update completed, but it turns out I applied the wrong update, and now RHEV01 is down and out. All the VMs could be started on RHEV02, so the system seems usable for now.
Apart from that trouble, the Hosted Engine seems to reset every 10 minutes due to Power Management. We lose connectivity for a minute or so and then have to log in again.
I have tried a few things:
1. Put the dead host into maintenance mode in the manager. It does show as being in maintenance mode.
2. Unplugged the iDRAC network connection.
3. Unchecked the Power Management check box for the dead host.
None of that worked. Maybe I need to do #3 on the host the Hosted Engine is running on?
I also see mention of fencing. That might be the real issue, as I do see it configured under Advanced Features. It shows Cluster followed by dc.
Also, every 10 minutes this is logged:
"rhev manager event execution of power management status on host RHEV01 using proxy host RHEV02 and fence agent ipmilan ."
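To double-check that the logged event really lines up with the 10-minute resets, something like the rough Python sketch below could be run against an exported copy of the events. It is only a sketch under assumptions: I am assuming each line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, the function name `event_gaps_minutes` is made up, and the sample lines are invented for illustration rather than copied from our logs.

```python
import re
from datetime import datetime

# Assumption: each log line starts with a "YYYY-MM-DD HH:MM:SS" timestamp.
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")

def event_gaps_minutes(lines, needle="power management status"):
    """Return the gaps, in minutes, between consecutive lines containing needle."""
    stamps = []
    for line in lines:
        m = TS.match(line)
        # Case-insensitive match on the event text; lines without a timestamp are skipped.
        if m and needle in line.lower():
            stamps.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    return [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]

# Invented sample lines, for illustration only (not real log output).
sample = [
    "2024-01-01 10:00:00 execution of power management status on host RHEV01",
    "2024-01-01 10:05:00 some unrelated event",
    "2024-01-01 10:10:00 execution of power management status on host RHEV01",
]
print(event_gaps_minutes(sample))
```

If the gaps come out at a steady 10 minutes and match the times we get logged out, that would at least tie the resets to the power management status check.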
Any ideas? Thanks.