Cluster node becomes unresponsive and is fenced after running out of swap on Red Hat Enterprise Linux 5
Issue
- We would also like to know why the server was rebooted. Can we get a reason for the reboot?
Environment
- Red Hat Enterprise Linux Server 5 (with the High Availability Add-On)
- Red Hat High Availability Cluster with 2 or more nodes
- One of the nodes stops responding and is fenced by the remaining node(s) in the cluster.
- sar data captured before the fencing event shows that the server ran low on memory and consumed all of its swap space:
00:00:01 kbmemfree kbmemused  %memused kbbuffers  kbcached kbswpfree kbswpused  %swpused  kbswpcad
20:00:01    286264   7758972     96.44     13644   1054116     98768   8289832     98.82    484860
20:10:01     46120   7999116     99.43      3672   2195512    388072   8000528     95.37    186380
20:20:02     46076   7999160     99.43      3976   1924392    391864   7996736     95.33    185884
20:30:01     46912   7998324     99.42      4936   1481048    394296   7994304     95.30    183444
20:40:03     46520   7998716     99.42      4744    527692    395064   7993536     95.29    183816
20:50:02     44944   8000292     99.44      3668    684088    397332   7991268     95.26    184520
21:00:02     44444   8000792     99.45      5328    485688       324   8388276    100.00    169476
21:10:02     43672   8001564     99.46      3168    660664      2076   8386524     99.98    102240
21:20:02     43948   8001288     99.45      3868    500476         0   8388600    100.00    100888  <-- kbswpfree is 0 kB
21:30:04     42504   8002732     99.47      3124    361372         4   8388596    100.00    127104  <-- kbswpfree is 4 kB
Average:    357611   7687625     95.55      8947   1292453   2906231   5482369     65.35    575329
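
The sar samples above can be scanned automatically for intervals where swap was nearly exhausted. The following is a minimal Python sketch, not a Red Hat tool. It assumes the ten-column layout shown in this article (a timestamp followed by nine numeric fields, with kbswpfree in the seventh column). The function name find_swap_exhaustion and the 1024 kB threshold are illustrative assumptions.

```python
# Hypothetical helper: the name and the default threshold are illustrative
# assumptions, not part of sysstat or any Red Hat tooling.
def find_swap_exhaustion(sar_text, min_free_kb=1024):
    """Return (timestamp, kbswpfree, %swpused) for samples with little swap left."""
    flagged = []
    for line in sar_text.splitlines():
        # Drop trailing annotations such as "<-- kbswpfree is 0kb free".
        fields = line.split("<--")[0].split()
        # Expect a timestamp plus nine columns; skip the summary row.
        if len(fields) != 10 or fields[0] == "Average:":
            continue
        try:
            kbswpfree = int(fields[6])
            pct_swpused = float(fields[8])
        except ValueError:
            continue  # header row or unrelated text
        if kbswpfree < min_free_kb:
            flagged.append((fields[0], kbswpfree, pct_swpused))
    return flagged


# A few rows copied from the sar output in this article.
SAMPLE = """\
00:00:01 kbmemfree kbmemused %memused kbbuffers kbcached kbswpfree kbswpused %swpused kbswpcad
20:50:02 44944 8000292 99.44 3668 684088 397332 7991268 95.26 184520
21:00:02 44444 8000792 99.45 5328 485688 324 8388276 100.00 169476
21:10:02 43672 8001564 99.46 3168 660664 2076 8386524 99.98 102240
21:20:02 43948 8001288 99.45 3868 500476 0 8388600 100.00 100888 <-- kbswpfree is 0kb free
Average: 357611 7687625 95.55 8947 1292453 2906231 5482369 65.35 575329
"""

if __name__ == "__main__":
    for ts, free_kb, pct in find_swap_exhaustion(SAMPLE):
        print(f"{ts}  kbswpfree={free_kb} kB  %swpused={pct:.2f}")
```

With the sample rows, the sketch flags the 21:00:02 and 21:20:02 intervals. The threshold is arbitrary and should be adjusted to the amount of swap configured on the node.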