Why does fail-over take too long when one gluster server is restarted?


Issue

Fail-over time differs depending on how the gluster server is restarted or taken down.

When glusterd is stopped or restarted, the time taken is:

real    0m0.039s
user    0m0.002s
sys     0m0.004s

For a reboot:

real    0m25.954s    <<<<<=====
user    0m0.000s
sys     0m0.007s

For a halt:

real    1m22.415s     <<<<<=====
user    0m0.002s
sys     0m0.009s

real    1m14.180s     <<<<<=====
user    0m0.002s
sys     0m0.007s

Therefore, with the node completely down, it takes around 1m14s to switch to the other node.
With a reboot, the delay depends on how long the reboot takes.

The time to switch is about the same when the NIC is taken down with ifdown ens3:

real    1m8.210s     <<<<<=====
user    0m0.002s
sys     0m0.005s
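
The timings above look like output from the shell's time builtin, but the article does not show which command was timed. As a minimal sketch of how a comparable number could be collected from a client, the following times a single os.stat() call on a mounted volume and prints the result in the same XmY.ZZZs style. The mount point /mnt/glustervol and the helper names time_stat and format_like_time are assumptions made for illustration; they do not come from the article.

```python
import os
import sys
import time


def time_stat(path):
    """Time a single os.stat() call on path.

    Returns (elapsed_seconds, succeeded). On a client whose server has
    just gone away, the call may block until the client gives up on it.
    """
    start = time.monotonic()
    try:
        os.stat(path)
        succeeded = True
    except OSError:
        # The call failed (for example, path missing or transport down);
        # the elapsed time is still worth reporting.
        succeeded = False
    return time.monotonic() - start, succeeded


def format_like_time(elapsed):
    """Format seconds in the 'XmY.ZZZs' style shown in the timings above."""
    minutes, seconds = divmod(elapsed, 60)
    return f"{int(minutes)}m{seconds:.3f}s"


if __name__ == "__main__":
    # Hypothetical mount point; pass the real one as the first argument.
    mount = sys.argv[1] if len(sys.argv) > 1 else "/mnt/glustervol"
    elapsed, ok = time_stat(mount)
    status = "ok" if ok else "failed"
    print(f"real    {format_like_time(elapsed)}    ({status})")
```

Running this on a client immediately after stopping glusterd, rebooting, halting, or downing the NIC on one server would give one measurement per scenario, which can then be compared against the figures above.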

Environment

RHGS 3.1*
