gfs2 cluster issue
Hi guys,
Thanks, as always, for your help.
I have a gfs2 shared file system on 11 nodes. The problem is that when one node goes down, the entire cluster stops working and the file system becomes unresponsive.
Scenario: A node had a memory issue and was taken offline. The rest of the nodes became unresponsive. Users could not log in, and commands were hanging. Yes, I tried restarting the cluster service on all nodes, and it failed.
Eventually, the clustered nodes were rebooted. Even after the reboot, the gfs2 file system would not start. In fact, the clvmd service would not start until the node with the bad memory was brought back up.
Is there any way to remediate this problem so that the system stays up properly if a member leaves the cluster?
Thanks