Why is fail-over taking too long when restarting one gluster server?
Issue
Fail-over time differs depending on how the gluster server is taken down.
When glusterd is stopped or restarted, the time taken is:
real 0m0.039s
user 0m0.002s
sys 0m0.004s
For a reboot:
real 0m25.954s <<<<<=====
user 0m0.000s
sys 0m0.007s
For a halt:
real 1m22.415s <<<<<=====
user 0m0.002s
sys 0m0.009s
real 1m14.180s <<<<<=====
user 0m0.002s
sys 0m0.007s
Therefore, with the node completely down, it takes around 1m14s to switch to the other node.
With a reboot, the delay depends on how long the reboot takes.
The time to switch is about the same when the NIC is brought down.
ifdown ens3:
real 1m8.210s <<<<<=====
user 0m0.002s
sys 0m0.005s
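The figures above are `time` output, but the article does not show which command was timed. As a rough sketch only, a comparable measurement could be taken from a client by timing a single access to the mounted volume while one server is being stopped, rebooted, or halted. The mount point `/mnt/glustervol` and the helper name `elapsed_seconds` are hypothetical and not taken from the original.

```shell
#!/bin/sh
# Assumption: the gluster volume is mounted on the client at this
# hypothetical path; override MOUNT_POINT to match the real mount.
MOUNT_POINT="${MOUNT_POINT:-/mnt/glustervol}"

# elapsed_seconds CMD [ARGS...]
# Runs the command, discards its output, and prints how many whole
# seconds it took. An access to the volume blocks until the client
# switches to the surviving server, so the printed value approximates
# the fail-over delay (to one-second resolution).
elapsed_seconds() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

# Time one directory listing on the mount point.
echo "listing ${MOUNT_POINT} took $(elapsed_seconds ls "$MOUNT_POINT")s"
```

Running this once for each scenario (glusterd stop, reboot, halt, `ifdown`) gives numbers that can be compared with the `real` values listed above. For sub-second results such as the glusterd restart case, `time ls "$MOUNT_POINT"` gives finer resolution.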
Environment
Red Hat Gluster Storage (RHGS) 3.1*
