RHCS 6.3: How to create a WORKING cluster config with qdisk AND I/O fencing?

Hello!

I'm trying to configure a two-node cluster that genuinely provides an HA service across several failure cases:
- public network failure
- private network failure
- power supply failure
- SAN failure

I've tried various configs: with and without a quorum disk, with power fencing only, and with both power and I/O fencing. I can't determine the best practice, because in every config there is at least one failure case where the cluster does not behave satisfactorily:

  • a power supply failure on one node can be addressed through I/O fencing, but this is NOT compatible with using a quorum disk (when the failed node restarts, it cannot unfence itself, because qdiskd starts before fenced and causes cman to stop);
  • a private network failure HAS to be addressed with a quorum disk plus a delay on one of the fence devices, to avoid entering a fence loop in which each node kills the other on restart.
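
As a rough illustration of why the quorum disk matters in the private network case, here is a small Python sketch of the vote arithmetic. It assumes the usual majority rule (quorum = expected_votes // 2 + 1), one vote per node, one vote for the quorum disk, and that the quorum disk vote ends up with exactly one node after the split. The function and variable names are illustrative only and are not part of any cluster tool.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed to be quorate under a simple majority rule (assumed)."""
    return expected_votes // 2 + 1


def is_quorate(votes_held: int, expected_votes: int) -> bool:
    """True if a partition holding votes_held votes keeps quorum."""
    return votes_held >= quorum_threshold(expected_votes)


# Assumed vote layout for a two-node cluster with a quorum disk.
NODE_VOTES = 1
QDISK_VOTES = 1
expected = 2 * NODE_VOTES + QDISK_VOTES

# Private network failure: each node sees only itself.
# Assumption: only one node keeps the quorum disk vote.
winner_votes = NODE_VOTES + QDISK_VOTES
loser_votes = NODE_VOTES

print("quorum threshold:", quorum_threshold(expected))
print("node with qdisk vote quorate:", is_quorate(winner_votes, expected))
print("node without qdisk vote quorate:", is_quorate(loser_votes, expected))
```

Under these assumptions only one side of the split stays quorate, so only that side is entitled to fence the other. The delay on one fence device then settles which node wins when both try to fence at the same moment.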

So it seems that I cannot have a single cluster config that addresses all of these failures at the same time...
Do you have any suggestions or recommendations?
Is it better to have a quorum disk or not?
Is there a way to get a node to unfence itself BEFORE qdiskd starts?

Regards,
Frédéric Hummel.

Responses