One or more nodes repeatedly reporting "cpg_mcast_joined retry XXX MSG_PLOCK" in a RHEL cluster

Solution Unverified - Updated -

Issue

  • Why does a six-node cluster log the "cpg_mcast_joined retry XXX MSG_PLOCK" error?
  • When one node leaves the RHEL 5 cluster, the other cluster nodes start logging a series of repeating cpg_mcast_joined retry error messages, and POSIX lock requests do not return until the node rejoins the cluster:
Jul  3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 100 MSG_PLOCK
Jul  3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 200 MSG_PLOCK
Jul  3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 300 MSG_PLOCK
  • The cluster nodes are logging the following messages on RHEL 6:
Mar  6 19:55:16 node42 dlm_controld[4905]: cpg_mcast_joined retry 100 plock_drop
[....]
Mar  6 19:57:47 node42 dlm_controld[5523]: cpg_mcast_joined retry 141300 plock_drop
Mar  6 19:57:47 node42 dlm_controld[5523]: cpg_mcast_joined retry 141400 plock_drop
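To gauge how far the retry counter has climbed on each node, the messages above can be summarized from a syslog file. The following is a minimal sketch, not a Red Hat tool: it assumes the classic syslog line layout shown in the excerpts above, and the function name highest_retries is hypothetical.

```python
import re

# Matches lines such as:
#   Jul  3 00:52:21 node1 gfs_controld[6003]: cpg_mcast_joined retry 100 MSG_PLOCK
# Assumption: classic "Mon DD HH:MM:SS host daemon[pid]: message" syslog layout.
RETRY_RE = re.compile(
    r"^\S+\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+(?P<daemon>\w+)\[(?P<pid>\d+)\]: "
    r"cpg_mcast_joined retry (?P<count>\d+) (?P<msg>\S+)"
)


def highest_retries(lines):
    """Return {(host, daemon, pid, message type): highest retry count seen}."""
    result = {}
    for line in lines:
        match = RETRY_RE.match(line)
        if match is None:
            continue  # not a cpg_mcast_joined retry message
        key = (
            match.group("host"),
            match.group("daemon"),
            int(match.group("pid")),
            match.group("msg"),
        )
        count = int(match.group("count"))
        result[key] = max(count, result.get(key, 0))
    return result
```

For example, highest_retries(open("/var/log/messages")) returns one entry per host, daemon process, and message type. A count that keeps growing between runs suggests the daemon is still unable to send the message to its CPG group.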

Environment

  • Red Hat Enterprise Linux Server 5 to 8 High Availability or Resilient Storage cluster.
  • GFS or GFS2 file system
    • One or more applications taking POSIX locks (via fcntl or similar) on a file on a GFS or GFS2 file system
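For reference, the kind of POSIX lock request described above can be sketched as follows. This is only an illustration of an fcntl-based lock, not the code of any specific affected application; the helper names and the example path are hypothetical.

```python
import fcntl
import os


def lock_whole_file(path):
    """Open path and take an exclusive POSIX (fcntl) lock on the whole file.

    The call blocks until the lock is granted. Returns the open file descriptor.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # lockf() is implemented with fcntl() record locks (POSIX locks);
        # the default length of 0 means "lock to the end of the file".
        fcntl.lockf(fd, fcntl.LOCK_EX)
    except OSError:
        os.close(fd)
        raise
    return fd


def unlock_and_close(fd):
    """Release the lock taken by lock_whole_file() and close the descriptor."""
    fcntl.lockf(fd, fcntl.LOCK_UN)
    os.close(fd)
```

A caller would use it as fd = lock_whole_file("/mnt/gfs2/app.lock") followed later by unlock_and_close(fd), where /mnt/gfs2 is an assumed mount point. On a GFS or GFS2 file system, a blocking request like this is the type of call that does not return while the retry messages are being logged.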
