RHEL8: systemd crashes with a segmentation fault, causing slow logins and eventually an unusable system
Issue
All of the below symptoms are seen simultaneously:
- Logging into the server using ssh or the console takes 25 seconds to complete, and the following message can be seen in the journal:
  [...] pam_systemd(...): Failed to create session: Connection timed out
- Cron jobs take 25 seconds to start, and the following message can be seen in the journal:
  [...] pam_systemd(crond:session): Failed to create session: Connection timed out
- On an OCP 4.8.z cluster, the following error is seen on a node:
  [...] Failed to list units: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
- systemd crashed with a SEGV signal, and the following messages can be seen in the journal:
  [...] systemd-coredump[SOMEPID]: Due to PID 1 having crashed coredump collection will now be turned off.
  [...] systemd[1]: Caught <SEGV>, dumped core as pid SOMEPID.
  [...] systemd[1]: Freezing execution.
- systemd has not crashed yet (the messages above are absent), but the following kernel stack can be seen:
  # cat /proc/1/stack
  [<0>] futex_wait_queue_me+0xb6/0x110
  [<0>] futex_wait+0x11f/0x210
  [<0>] do_futex+0x317/0x4b0
  [<0>] __x64_sys_futex+0x145/0x1f0
  [<0>] do_syscall_64+0x5b/0x1a0
  [<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
- The coredump of systemd shows one of the following backtraces (non-exhaustive list), all related to memory allocation issues (addresses may vary):

  #0 0x00007fc726d9f67b in kill () at ../sysdeps/unix/syscall-template.S:78
  #1 0x000055efd5314f7a in crash (sig=6) at ../src/core/main.c:194
  #2 <signal handler called>
  #3 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  #4 0x00007fc726d89db5 in __GI_abort () at abort.c:79
  #5 0x00007fc726de24e7 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7fc726ef1a0e "%s\n") at ../sysdeps/posix/libc_fatal.c:181
  #6 0x00007fc726de95ec in malloc_printerr (str=str@entry=0x7fc726ef3a88 "malloc(): smallbin double linked list corrupted") at malloc.c:5374
  [...]

  #0 0x00007f2cf051667b in kill () at ../sysdeps/unix/syscall-template.S:78
  #1 0x000055e7679b6f7a in crash (sig=11) at ../src/core/main.c:194
  #2 <signal handler called>
  #3 tcache_get (tc_idx=1) at malloc.c:2951
  #4 __GI___libc_malloc (bytes=bytes@entry=34) at malloc.c:3058
  #5 0x00007f2cf056880e in __GI___strdup (...) at strdup.c:42
  [...]

  #0 0x00007f2f50ac467b in kill () at ../sysdeps/unix/syscall-template.S:78
  #1 0x00005558a6d7bf7a in crash (sig=11) at ../src/core/main.c:194
  #2 <signal handler called>
  #3 0x00007f2f50b11818 in _int_malloc (av=av@entry=0x7f2f50e4cbc0 <main_arena>, bytes=bytes@entry=14) at malloc.c:3683
  #4 0x00007f2f50b12c72 in __GI___libc_malloc (bytes=bytes@entry=14) at malloc.c:3073
  #5 0x00007f2f50b1680e in __GI___strdup (...) at strdup.c:42
  [...]

  #0 0x00007f2f7ad0f67b in kill () at ../sysdeps/unix/syscall-template.S:78
  #1 0x00005571223aef7a in crash (sig=11) at ../src/core/main.c:194
  #2 <signal handler called>
  #3 _int_malloc (av=av@entry=0x7f2f7b097bc0 <main_arena>, bytes=bytes@entry=28) at malloc.c:3655
  #4 0x00007f2f7ad5dc72 in __GI___libc_malloc (bytes=bytes@entry=28) at malloc.c:3073
  #5 0x00007f2f7c4d4261 in malloc_multiply (need=28, size=1) at ../src/basic/alloc-util.h:63
  [...]

  #0 0x00007f7e5221d67b in kill () at ../sysdeps/unix/syscall-template.S:78
  #1 0x0000559c07060f7a in crash (sig=11) at ../src/core/main.c:194
  #2 <signal handler called>
  #3 _int_malloc (av=av@entry=0x7f7e525a5bc0 <main_arena>, bytes=bytes@entry=24) at malloc.c:3655
  #4 0x00007f7e5226c8d6 in __libc_calloc (n=n@entry=1, elem_size=elem_size@entry=24) at malloc.c:3444
  [...]
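
The journal messages and kernel stack listed above can be checked for mechanically. The following is a minimal sketch, not an official diagnostic tool: the message patterns are copied from the excerpts in this article, while the function names and symptom labels (`count_symptoms`, `pid1_waiting_on_futex`, `session_timeout`, and so on) are hypothetical names chosen for illustration. It only matches the `<SEGV>` variant of the crash message shown above.

```python
import re
from collections import Counter

# Message signatures copied from the journal excerpts in this article.
# The dictionary keys are illustrative labels, not systemd terminology.
SYMPTOM_PATTERNS = {
    "session_timeout": re.compile(
        r"pam_systemd\([^)]*\): Failed to create session: Connection timed out"
    ),
    "dbus_activation_timeout": re.compile(
        r"Failed to activate service 'org\.freedesktop\.systemd1': timed out"
    ),
    "pid1_crash": re.compile(r"systemd\[1\]: Caught <SEGV>"),
    "pid1_frozen": re.compile(r"systemd\[1\]: Freezing execution\."),
}


def count_symptoms(lines):
    """Count how many journal lines match each symptom signature."""
    counts = Counter()
    for line in lines:
        for name, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def pid1_waiting_on_futex(stack_text):
    """Return True if the text of /proc/1/stack contains the futex wait
    frame shown in this article (the "not crashed yet" symptom)."""
    return "futex_wait_queue_me" in stack_text
```

One way to use it is to pass the lines printed by `journalctl -b --no-pager` to `count_symptoms`, and the contents of `/proc/1/stack` (readable by root) to `pid1_waiting_on_futex`. A non-zero count is only a hint that the symptoms match; it does not confirm the root cause.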
Environment
- Red Hat Enterprise Linux 8.4
  - systemd-239-45.el8_4.8 and earlier
- Red Hat Enterprise Linux 8.5
  - systemd-239-51.el8_5.1 and earlier
- Red Hat CoreOS on Red Hat OpenShift Container Platform
  - 4.8.35 with systemd-239-45.el8_4.8
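
To compare an installed package against the builds listed above, a check along the following lines may help. This is a simplified sketch under stated assumptions: it compares only the numeric version, release, and z-stream fields within the same `el8_N` dist tag and is not a reimplementation of RPM's version comparison. The names `LAST_AFFECTED`, `parse_nvr`, and `is_listed_as_affected` are hypothetical, and any dist tag not listed in this article yields `None` rather than a guess.

```python
import re

# Last affected build per minor release, as listed in the Environment section.
LAST_AFFECTED = {
    "el8_4": "systemd-239-45.el8_4.8",
    "el8_5": "systemd-239-51.el8_5.1",
}

# Matches strings such as "systemd-239-45.el8_4.8" or
# "systemd-239-45.el8_4.8.x86_64" (as printed by `rpm -q systemd`).
NVR_RE = re.compile(r"^systemd-(\d+)-(\d+)\.(el8_\d+)(?:\.(\d+))?")


def parse_nvr(nvr):
    """Return (dist_tag, (version, release, zstream)), or None if no match."""
    m = NVR_RE.match(nvr)
    if m is None:
        return None
    version, release, dist, zstream = m.groups()
    # A missing z-stream field is treated as 0 (assumption for this sketch).
    return dist, (int(version), int(release), int(zstream or 0))


def is_listed_as_affected(nvr):
    """True or False if the dist tag is covered by LAST_AFFECTED, else None."""
    parsed = parse_nvr(nvr)
    if parsed is None or parsed[0] not in LAST_AFFECTED:
        return None
    dist, key = parsed
    _, limit = parse_nvr(LAST_AFFECTED[dist])
    return key <= limit
```

For example, `is_listed_as_affected("systemd-239-45.el8_4.8.x86_64")` returns `True`, while a later 8.5 build such as `systemd-239-51.el8_5.2` returns `False`.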