systemd hangs when tmpfile cleanup is requested if a process has a spinning SCHED_RR class thread

Solution Verified - Updated 2024-06-14T16:18:35+00:00

Issue

The test case below is the simplest one I could put together that exhibits the problem. It creates a thread, puts it into the real-time scheduling class (SCHED_RR), and lets it spin. When systemd is then forced to do temporary file cleanup, systemd hangs in an rmdir() syscall.

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-862.el7.x86_64 root=/dev/mapper/VGi0-LV_root ro nofb quiet splash=quiet transparent_hugepage=never crashkernel=auto rd.lvm.lv=VGi0/LV_root rd.lvm.lv=VGi0/LV_swap console=ttyS0,115200 idle=poll LANG=en_US.UTF-8 isolcpus=3-19 nohz=on nohz_full=3-19 rcu_nocb_poll rcu_nocbs=3-19 transparent_hugepage=never intel_idle.max_cstate=1 intel_pstate=disable nosoftlockup skew_tick=1

As the command line shows, CPUs 3-19 are isolated on this system (isolcpus=3-19), but I can reproduce the problem with the same symptoms regardless of whether systemd and the spinning process are bound to the same set of CPUs or to different ones.
Compile the following code:

$ g++ -o rtsched rtsched.cc -lpthread

$ cat rtsched.cc
#include <iostream>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <time.h>

using namespace std;

class A
{
public:
    A(int prio)
    {
        struct sched_param param;
        struct timespec ts = { 5, 0 };
        int policy = SCHED_RR;
        int rc;

        param.sched_priority = prio;

        // pthread_setschedparam() returns the error number directly;
        // it does not set errno.
        rc = pthread_setschedparam(pthread_self(), policy, &param);
        if (rc != 0)
        {
            cout << "Failed to set scheduling policy: "
                 << strerror(rc) << endl;
            exit(1);
        }
        cout << "tid: " << syscall(SYS_gettid)
             << " spinning in SCHED_RR class" << endl;
        cout.flush();
        nanosleep(&ts, NULL);

        // Spin forever in the SCHED_RR class.
        while (1)
        {
            ;
        }
    }
    ~A() { };
};

void *bar(void *)
{
    A myA(40);
    return NULL;  // not reached: the constructor of A never returns
}


int main(int argc, char *argv[])
{
    pthread_t t1;
    void* result;

    pthread_create(&t1, NULL, &bar, NULL);
    sleep(5);
    pthread_join(t1, &result);

    return 0;
}
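
To vary whether the spinning thread shares CPUs with systemd, as discussed above, the thread can be pinned before it starts spinning. The following is a minimal sketch and not part of the original test case. The helper names first_allowed_cpu() and pin_calling_thread() are hypothetical, and the sketch assumes Linux with glibc and g++, where sched_setaffinity() with a pid of 0 applies to the calling thread.

```cpp
#include <sched.h>   // sched_getaffinity(), sched_setaffinity(), CPU_* macros

// Hypothetical helper: returns the lowest-numbered CPU the calling thread
// is currently allowed to run on, or -1 on error.
int first_allowed_cpu()
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

// Hypothetical helper: restricts the calling thread to a single CPU.
// Returns 0 on success and -1 on failure (errno is set by sched_setaffinity).
int pin_calling_thread(int cpu)
{
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Calling pin_calling_thread(5) at the top of the constructor of A, for example, would place the spinner on one of the isolated CPUs (3-19) from the command line above.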

On the target system run the executable (assuming the user has appropriate permissions to set scheduling class). Then make sure the thread whose tid is printed is in the SCHED_RR class and is spinning:

# ps -eLfc | grep rtsched
jfs       2471 13707  2471    2 TS   19 20:06 pts/5    00:00:00 ./rtsched
jfs       2471 13707  2472    2 RR   80 20:06 pts/5    00:00:18 ./rtsched
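
The same check can be done programmatically. The following is a minimal sketch, assuming Linux, where sched_getscheduler() and sched_getparam() accept a kernel thread ID such as the tid printed by rtsched. The helper names policy_name(), thread_policy() and thread_rt_priority() are hypothetical.

```cpp
#include <sched.h>      // sched_getscheduler(), sched_getparam(), SCHED_*
#include <sys/types.h>  // pid_t
#include <string>

// Hypothetical helper: maps a scheduling policy constant to its name.
std::string policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

// Hypothetical helper: returns the scheduling policy of the thread with the
// given kernel thread ID (0 means the calling thread), or -1 on error.
int thread_policy(pid_t tid)
{
    int policy = sched_getscheduler(tid);
    if (policy < 0)
        return -1;
#ifdef SCHED_RESET_ON_FORK
    policy &= ~SCHED_RESET_ON_FORK;  // strip the optional flag bit
#endif
    return policy;
}

// Hypothetical helper: returns the real-time priority of the thread,
// or -1 on error. Non-real-time threads report 0.
int thread_rt_priority(pid_t tid)
{
    struct sched_param param;
    if (sched_getparam(tid, &param) != 0)
        return -1;
    return param.sched_priority;
}
```

For the spinning thread in the ps output above, policy_name(thread_policy(2472)) would be expected to give SCHED_RR, and thread_rt_priority(2472) the value 40 that bar() passes to the constructor.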

In another shell, as root, strace the rmdir(2) calls that systemd is making. Then, in a third shell, execute the systemctl command shown below:

# strace -e rmdir -p 1
strace: Process 1 attached
rmdir("/sys/fs/cgroup/systemd/system.slice/systemd-tmpfiles-clean.service") = 0
rmdir("/sys/fs/cgroup/cpu/system.slice/systemd-tmpfiles-clean.service") = 0
rmdir("/sys/fs/cgroup/cpuacct/system.slice/systemd-tmpfiles-clean.service") = -1 ENOENT (No such file or directory)
rmdir("/sys/fs/cgroup/blkio/system.slice/systemd-tmpfiles-clean.service") = 0
rmdir("/sys/fs/cgroup/memory/system.slice/systemd-tmpfiles-clean.service"

systemctl command:

# systemctl start systemd-tmpfiles-clean.service

The strace output above shows systemd (PID 1) hanging in the rmdir() call on the memory cgroup directory; that call never returns. The kernel stack of PID 1 shows where it is blocked:

# cat /proc/1/stack
[<ffffffffa16b3e5d>] flush_work+0xfd/0x190
[<ffffffffa17a3792>] lru_add_drain_all+0x142/0x190
[<ffffffffa180ce94>] mem_cgroup_reparent_charges+0x34/0x3c0
[<ffffffffa180d3d4>] mem_cgroup_css_offline+0x84/0x140
[<ffffffffa171c1da>] cgroup_destroy_locked+0xea/0x370
[<ffffffffa171c482>] cgroup_rmdir+0x22/0x40
[<ffffffffa18295ac>] vfs_rmdir+0xdc/0x150
[<ffffffffa182cb31>] do_rmdir+0x1f1/0x220
[<ffffffffa182dd66>] SyS_rmdir+0x16/0x20
[<ffffffffa1d1fa51>] tracesys+0x9d/0xc3
[<ffffffffffffffff>] 0xffffffffffffffff

This problem is reproducible with high frequency (about 70% of the time). If it does not reproduce, restart the 'rtsched' process and try again.
This is impacting systems under development for our next major software release and is causing problems with a new architecture within our low-latency business that is looking to use real-time threads. It therefore needs resolving within a reasonable time frame if possible (weeks rather than months).

Environment

Red Hat Enterprise Linux 7
