High Availability tuning in RHEV beta 3


Hi, I've been testing RHEV beta 3 and tried out the HA functionality. To do that, I simulated a non-responsive host (RHEL 6.1) by taking down the rhevm and logical network interfaces:

# ifdown rhevm
# ifdown rhev_lab

Question: it took 4 minutes until the host was set to non-responsive! Is it possible to tune this so that the RHEV Manager reacts faster?





Hi Vlad,


The normal timeout is 1 minute. Since it took 4 minutes for the host to become non-responsive, I suppose the queries RHEV-M sent to the host did not fail immediately. A cleaner way to test this (and keep the timeout to 1 minute) is to stop the vdsmd service on the host in question.
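
As a rough sketch of that test, assuming a RHEL 6 host where services are managed with the SysV "service" command: the two helper names below are made up for illustration and are not part of RHEV or VDSM.

```shell
# Run as root on the host under test. RHEL 6 uses SysV init scripts,
# so the "service" command is assumed here.

# Hypothetical helper: print the current time, then stop VDSM so that
# RHEV-M loses contact with the host.
stop_vdsm_for_test() {
    date '+%H:%M:%S stopping vdsmd'
    service vdsmd stop
}

# Hypothetical helper: bring VDSM back once the test is done and
# print its status.
restore_vdsm() {
    service vdsmd start
    service vdsmd status
}
```

Call stop_vdsm_for_test, watch in the manager how long the host takes to go non-responsive, then call restore_vdsm.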


To answer your question: there are several options in rhevm-config that control these timeouts (the timeout itself and the number of retries), but they are there for a reason. For example, you don't want a server running critical VMs to get fenced because of a minor network glitch, so the host needs to be allowed time to recover, and the communication failure should be retested before it is confirmed.
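
To see which options are involved, something like the following might help. It is only a sketch: it assumes rhevm-config accepts -l to list keys and -g to read one, and the helper names and the grep pattern are my own guesses rather than documented key names. Check the tool's help output on your version before relying on it.

```shell
# Run on the RHEV Manager machine.

# Hypothetical helper: list configuration keys whose names look related
# to timeouts or retries. The pattern is a guess, not a documented list.
list_timeout_keys() {
    rhevm-config -l | grep -i -E 'timeout|attempt|retr'
}

# Hypothetical helper: show the current value of one key,
# for example a name printed by list_timeout_keys.
show_key() {
    rhevm-config -g "$1"
}
```

Reading the current values first makes it easier to judge how much margin the defaults give a host to recover before it is fenced.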


Hope this helps