rgmanager segfaults in a RHEL 6 High Availability cluster using RRP and cpglockd when stopping cman
Issue
- When stopping cman on a node, rgmanager crashes.
- If I run service cman stop, the node reboots itself. Messages similar to the following are logged:
Jun 4 14:40:37 node1 corosync[42343]: [QUORUM] Members[5]: 1 3
Jun 4 14:40:37 node1 corosync[42343]: [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jun 4 14:40:37 node1 kernel: dlm: closing connection to node 2
Jun 4 14:40:37 node1 corosync[42343]: [CPG ] chosen downlist: sender r(0) ip(192.168.10.11) r(1) ip(192.168.11.11) ; members(old:6 left:1)
Jun 4 14:40:37 node1 corosync[42343]: [MAIN ] Completed service synchronization, ready to provide service.
Jun 4 14:43:17 node1 kernel: dlm: closing connection to node 3
Jun 4 14:43:17 node1 kernel: dlm: closing connection to node 1
Jun 4 14:43:20 node1 cpglockd[43118]: cman requested shutdown. Exiting.
Jun 4 14:43:20 node1 abrtd: Directory 'ccpp-2016-06-04-14:43:20-43197' creation detected
Jun 4 14:43:20 node1 abrt[40893]: Saved core dump of pid 43197 (/usr/sbin/rgmanager) to /var/spool/abrt/ccpp-2016-06-04-14:43:20-43197 (54935552 bytes)
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Unloading all Corosync service engines.
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync extended virtual synchrony service
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync configuration service
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync cluster closed process group service v1.01
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync cluster config database access v1.01
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync profile loading service
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: openais checkpoint service B.01.01
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync CMAN membership service 2.90
Jun 4 14:43:20 node1 corosync[42343]: [SERV ] Service engine unloaded: corosync cluster quorum service v0.1
Jun 4 14:43:20 node1 corosync[42343]: [MAIN ] Corosync Cluster Engine exiting with status 0 at main.c:1947.
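The abrt messages above show where the core was written. As a sketch of how the backtrace below can be pulled from it (this assumes the matching rgmanager debuginfo package is installed, and that abrt stored the core under its usual file name, coredump, inside the dump directory):
# debuginfo-install rgmanager
# gdb /usr/sbin/rgmanager /var/spool/abrt/ccpp-2016-06-04-14:43:20-43197/coredump
(gdb) bt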
rgmanager segfaults with a backtrace showing a SIGSEGV in _cpg_lock. Note the timing in the log above: cpglockd exits on cman's shutdown request in the same second that abrt saves the rgmanager core:
Core was generated by `rgmanager'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
42 sig);
#0 0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1 0x00000000004235f2 in _cpg_lock (mode=<value optimized out>, lksb=0x6323b0, options=<value optimized out>, resource=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/clulib/cpg_lock.c:78
#2 0x00000000004117d6 in event_master () at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:339
#3 0x0000000000411a75 in _event_thread_f (arg=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:419
#4 0x000000309f8079d1 in start_thread (arg=0x7f3fddcb4700) at pthread_create.c:301
#5 0x000000309f4e8b6d in signalfd (fd=-573880576, mask=0x7f3fddcb3b20, flags=16843009) at ../sysdeps/unix/sysv/linux/signalfd.c:30
#6 0x0000000000000000 in ?? ()
A second core, apparently from another occurrence of the same crash, shows an identical backtrace:
Core was generated by `rgmanager'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
42 sig);
#0 0x000000309f80f5db in raise (sig=11) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:42
#1 0x00000000004235f2 in _cpg_lock (mode=<value optimized out>, lksb=0x6323b0, options=<value optimized out>, resource=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/clulib/cpg_lock.c:78
#2 0x00000000004117d6 in event_master () at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:339
#3 0x0000000000411a75 in _event_thread_f (arg=<value optimized out>) at /usr/src/debug/rgmanager-3.0.12.1/rgmanager/src/daemons/rg_event.c:419
#4 0x000000309f8079d1 in start_thread (arg=0x7f3225963700) at pthread_create.c:301
#5 0x000000309f4e8b6d in signalfd (fd=630601472, mask=0x7f3225962b20, flags=16843009) at ../sysdeps/unix/sysv/linux/signalfd.c:30
#6 0x0000000000000000 in ?? ()
Environment
- Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
- Cluster configured to use RRP: clusternode entries contain <altname/> in /etc/cluster/cluster.conf (see the example below)
- rgmanager
- Use of RRP causes cpglockd to be running
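For reference, a minimal sketch of what such an RRP configuration looks like in /etc/cluster/cluster.conf. The cluster name, hostnames, and node IDs below are hypothetical, and fencing is omitted for brevity:

<cluster name="ha-cluster" config_version="1">
  <clusternodes>
    <!-- Each clusternode carries an <altname/> naming its interface on the
         second (redundant) ring; all names here are illustrative only. -->
    <clusternode name="node1.example.com" nodeid="1">
      <altname name="node1-alt.example.com"/>
    </clusternode>
    <clusternode name="node2.example.com" nodeid="2">
      <altname name="node2-alt.example.com"/>
    </clusternode>
    <clusternode name="node3.example.com" nodeid="3">
      <altname name="node3-alt.example.com"/>
    </clusternode>
  </clusternodes>
  <rm/>  <!-- rgmanager resource/service definitions go here -->
</cluster>

It is the presence of the <altname/> entries that brings cpglockd into the picture, which is consistent with the _cpg_lock frames in the backtrace above.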
