One or more nodes repeatedly reporting "cpg_mcast_joined retry XXX MSG_PLOCK" in a RHEL cluster
Issue
- Why does a six-node cluster log the "cpg_mcast_joined retry XXX MSG_PLOCK" error?
- When one node leaves the cluster, the other cluster nodes start logging a series of repeating "cpg_mcast_joined retry" error messages, and POSIX lock requests do not return until the node rejoins the RHEL 5 cluster:
Jul 3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 100 MSG_PLOCK
Jul 3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 200 MSG_PLOCK
Jul 3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 300 MSG_PLOCK
- The cluster nodes are logging the following messages on RHEL 6:
Mar 6 19:55:16 node42 dlm_controld[4905]: cpg_mcast_joined retry 100 plock_drop
[....]
Mar 6 19:57:47 node42 dlm_controld[5523]: cpg_mcast_joined retry 141300 plock_drop
Mar 6 19:57:47 node42 dlm_controld[5523]: cpg_mcast_joined retry 141400 plock_drop
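To check how widespread the messages quoted above are, a log scan along these lines may help. This is a minimal sketch, not a Red Hat tool: the function name `summarize_retries` is hypothetical, and the pattern is based only on the sample lines shown in this article, so it may need adjusting for other syslog formats.

```python
import re

# Pattern derived only from the sample log lines quoted in this article.
RETRY_RE = re.compile(
    r"(?P<host>\S+) (?P<daemon>gfs_controld|dlm_controld)\[(?P<pid>\d+)\]: "
    r"cpg_mcast_joined retry (?P<count>\d+) (?P<msg>\S+)"
)

def summarize_retries(lines):
    """Return {(host, daemon, message type): highest retry count seen}."""
    highest = {}
    for line in lines:
        m = RETRY_RE.search(line)
        if m is None:
            continue  # not a cpg_mcast_joined retry line
        key = (m["host"], m["daemon"], m["msg"])
        highest[key] = max(highest.get(key, 0), int(m["count"]))
    return highest

# Sample input copied from the log excerpts above.
sample = [
    "Jul 3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 100 MSG_PLOCK",
    "Jul 3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 300 MSG_PLOCK",
    "Mar 6 19:57:47 node42 dlm_controld[5523]: cpg_mcast_joined retry 141400 plock_drop",
]
print(summarize_retries(sample))
```

A retry count that keeps climbing for the same host and daemon suggests the node is still affected. In practice the lines would come from the system log on each node rather than from a hard-coded list.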
Environment
- Red Hat Enterprise Linux Server 5 to 8 High Availability or Resilient Storage cluster.
- GFS or GFS2 file system
- One or more applications utilizing POSIX locking via fcntl or similar against a file on a GFS or GFS2 file system
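For reference, the kind of application described in the Environment section takes POSIX locks roughly as follows. This is an illustrative sketch only: the helper name `with_posix_lock` is hypothetical, and the demo uses a temporary local file, whereas an affected application would be locking a file on a GFS or GFS2 mount. Python's `fcntl.lockf` is a wrapper around the `fcntl` locking calls.

```python
import fcntl
import os
import tempfile

def with_posix_lock(path):
    """Take and release an exclusive POSIX (fcntl) lock on the whole file."""
    with open(path, "a+") as f:
        # Without LOCK_NB this call blocks until the lock is granted.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            f.write("locked by pid %d\n" % os.getpid())
            f.flush()
        finally:
            # Release the lock before the file is closed.
            fcntl.lockf(f, fcntl.LOCK_UN)
    return True

# Demo on a temporary local file (assumption: stands in for a file on GFS/GFS2).
fd, demo_path = tempfile.mkstemp()
os.close(fd)
try:
    locked_ok = with_posix_lock(demo_path)
    print(locked_ok)
finally:
    os.remove(demo_path)
```

A blocking request like the `LOCK_EX` call above is the sort of POSIX lock request that, per the Issue section, does not return while the retry messages are being logged.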