rgmanager segfaults in a RHEL 6 High Availability cluster using RRP and cpglockd when stopping cman

Solution Unverified

Issue

  • When stopping cman on a node, rgmanager crashes.
  • Running service cman stop causes the node to reboot itself. The following messages are logged:
Jun  4 14:40:37 node1 corosync[42343]:   [QUORUM] Members[5]: 1 3
Jun  4 14:40:37 node1 corosync[42343]:   [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jun  4 14:40:37 node1 kernel: dlm: closing connection to node 2
Jun  4 14:40:37 node1 corosync[42343]:   [CPG   ] chosen downlist: sender r(0) ip(192.168.10.11) r(1) ip(192.168.11.11) ; members(old:6 left:1)
Jun  4 14:40:37 node1 corosync[42343]:   [MAIN  ] Completed service synchronization, ready to provide service.
Jun  4 14:43:17 node1 kernel: dlm: closing connection to node 3
Jun  4 14:43:17 node1 kernel: dlm: closing connection to node 1
Jun  4 14:43:20 node1 cpglockd[43118]: cman requested shutdown. Exiting.
Jun  4 14:43:20 node1 abrtd: Directory 'ccpp-2016-06-04-14:43:20-43197' creation detected
Jun  4 14:43:20 node1 abrt[40893]: Saved core dump of pid 43197 (/usr/sbin/rgmanager) to /var/spool/abrt/ccpp-2016-06-04-14:43:20-43197 (54935552 bytes)
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Unloading all Corosync service engines.
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync extended virtual synchrony service
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync configuration service
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync cluster closed process group service v1.01
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync cluster config database access v1.01
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync profile loading service
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: openais checkpoint service B.01.01
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync CMAN membership service 2.90
Jun  4 14:43:20 node1 corosync[42343]:   [SERV  ] Service engine unloaded: corosync cluster quorum service v0.1
Jun  4 14:43:20 node1 corosync[42343]:   [MAIN  ] Corosync Cluster Engine exiting with status 0 at main.c:1947.
  • rgmanager segfaults with a backtrace showing a SIGSEGV in _cpg_lock (a sketch of how to examine the saved core follows the backtraces below):
Core was generated by `rgmanager'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
42               sig);
#0  0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x00000000004235f2 in _cpg_lock (mode=<value optimized out>, lksb=0x6323b0, options=<value optimized out>, resource=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/clulib/cpg_lock.c:78
#2  0x00000000004117d6 in event_master () at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:339
#3  0x0000000000411a75 in _event_thread_f (arg=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:419
#4  0x000000309f8079d1 in start_thread (arg=0x7f3fddcb4700) at pthread_create.c:301
#5  0x000000309f4e8b6d in signalfd (fd=-573880576, mask=0x7f3fddcb3b20, flags=16843009) at ../sysdeps/unix/sysv/linux/signalfd.c:30
#6  0x0000000000000000 in ?? ()
Core was generated by `rgmanager'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
42               sig);
#0  0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1  0x00000000004235f2 in _cpg_lock (mode=<value optimized out>, lksb=0x6323b0, options=<value optimized out>, resource=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/clulib/cpg_lock.c:78
#2  0x00000000004117d6 in event_master () at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:339
#3  0x0000000000411a75 in _event_thread_f (arg=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:419
#4  0x000000309f8079d1 in start_thread (arg=0x7f3225963700) at pthread_create.c:301
#5  0x000000309f4e8b6d in signalfd (fd=630601472, mask=0x7f3225962b20, flags=16843009) at ../sysdeps/unix/sysv/linux/signalfd.c:30
#6  0x0000000000000000 in ?? ()
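
For reference, a backtrace like the ones above can be regenerated from the core dump that abrt saved (the /var/spool/abrt/ccpp-2016-06-04-14:43:20-43197 directory shown in the log). The commands below are a sketch only; they assume the rgmanager debuginfo package is installable on the node and that abrt placed the core in a file named coredump inside that directory.

# Install debugging symbols for rgmanager (assumes debuginfo repositories are enabled)
debuginfo-install rgmanager
# Open the core saved by abrt; the directory name is taken from the log excerpt above
gdb /usr/sbin/rgmanager /var/spool/abrt/ccpp-2016-06-04-14:43:20-43197/coredump
# At the gdb prompt, print the backtrace of the crashing thread
(gdb) bt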

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • Cluster configured to use the redundant ring protocol (RRP): the <clusternode> entries contain an <altname/> element in /etc/cluster/cluster.conf (see the example snippet after this list)
  • rgmanager
    • Use of RRP causes cpglockd to be running
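
As an illustration, an RRP configuration of this kind typically looks like the fragment of /etc/cluster/cluster.conf below. The node names and altname values are hypothetical placeholders, not taken from the affected cluster, and fencing configuration is omitted for brevity.

<clusternodes>
    <clusternode name="node1.example.com" nodeid="1">
        <!-- altname defines the address used for the second (redundant) ring -->
        <altname name="node1-alt.example.com"/>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2">
        <altname name="node2-alt.example.com"/>
    </clusternode>
</clusternodes>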
