Cluster node becomes unresponsive and services fail after executing SysRq-T in RHEL 5
Issue
- A node temporarily stopped responding and services in our cluster went down after we ran SysRq-T to dump process states while troubleshooting another problem.
- A status check on an ip resource failed after triggering a sysrq key:
Jun 13 16:49:52 node1 clurgmgrd[16908]: <notice> status on ip "192.168.10.10" returned 1 (generic error)
- qdiskd started reporting that the other node was missing updates when I executed a sysrq key on that other node:
Jun 13 16:49:05 node2 qdiskd[8138]: <debug> Node 1 missed an update (2/20)
Jun 13 16:49:13 node2 qdiskd[8138]: <debug> Node 1 missed an update (3/20)
Jun 13 16:49:21 node2 qdiskd[8138]: <debug> Node 1 missed an update (4/20)
Jun 13 16:49:29 node2 qdiskd[8138]: <debug> Node 1 missed an update (5/20)
Jun 13 16:49:37 node2 qdiskd[8138]: <debug> Node 1 missed an update (6/20)
Jun 13 16:49:45 node2 qdiskd[8138]: <debug> Node 1 missed an update (7/20)
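The qdiskd messages above can be used to gauge how long the node was stalled. The following is a minimal Python sketch, not part of any cluster tooling, that parses such lines and estimates the update interval and how much time remained before the miss counter reached its limit. It assumes the second number in "(7/20)" is the miss limit at which the node would be declared dead; the function names and the placeholder year are illustrative only.

```python
import re
from datetime import datetime

# Sample lines copied from the qdiskd output shown in this article.
LOG = """\
Jun 13 16:49:05 node2 qdiskd[8138]: <debug> Node 1 missed an update (2/20)
Jun 13 16:49:13 node2 qdiskd[8138]: <debug> Node 1 missed an update (3/20)
Jun 13 16:49:21 node2 qdiskd[8138]: <debug> Node 1 missed an update (4/20)
Jun 13 16:49:29 node2 qdiskd[8138]: <debug> Node 1 missed an update (5/20)
Jun 13 16:49:37 node2 qdiskd[8138]: <debug> Node 1 missed an update (6/20)
Jun 13 16:49:45 node2 qdiskd[8138]: <debug> Node 1 missed an update (7/20)
"""

PATTERN = re.compile(
    r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ qdiskd\[\d+\]: <\w+> "
    r"Node (\d+) missed an update \((\d+)/(\d+)\)$"
)


def parse_missed_updates(text):
    """Return (timestamp, node, missed, limit) tuples for matching lines."""
    events = []
    for line in text.splitlines():
        m = PATTERN.match(line.strip())
        if not m:
            continue
        # syslog timestamps carry no year, so a placeholder year is prepended.
        ts = datetime.strptime("2000 " + m.group(1), "%Y %b %d %H:%M:%S")
        events.append((ts, int(m.group(2)), int(m.group(3)), int(m.group(4))))
    return events


def summarize(events):
    """Estimate seconds per missed update and seconds left before the limit."""
    first_ts, node, first_missed, limit = events[0]
    last_ts, _, last_missed, _ = events[-1]
    interval = (last_ts - first_ts).total_seconds() / (last_missed - first_missed)
    # Assumption: the node is declared dead once 'missed' reaches 'limit'.
    remaining = (limit - last_missed) * interval
    return node, interval, remaining


if __name__ == "__main__":
    node, interval, remaining = summarize(parse_missed_updates(LOG))
    print(f"node {node}: ~{interval:.0f}s per update, ~{remaining:.0f}s before limit")
```

For the excerpt above, the messages are 8 seconds apart, so each additional missed update represents roughly another 8 seconds during which the node did not write to the quorum disk.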
Environment
- Red Hat Enterprise Linux (RHEL) 5 with the High Availability Add On
- Executing SysRq keys such as T or P
