systemd hangs when tmpfile cleanup is requested if a process has a spinning SCHED_RR class thread

Solution Verified - Updated -

Issue

  • The test case below is the simplest one I could put together that exhibits the problem. It creates a thread, places it in the real-time scheduling class (SCHED_RR), and lets it spin. Forcing systemd to run temporary-file cleanup then causes systemd to hang in an rmdir() syscall.
$ cat /proc/cmdline 
BOOT_IMAGE=/vmlinuz-3.10.0-862.el7.x86_64 root=/dev/mapper/VGi0-LV_root ro nofb quiet splash=quiet transparent_hugepage=never crashkernel=auto rd.lvm.lv=VGi0/LV_root rd.lvm.lv=VGi0/LV_swap console=ttyS0,115200 idle=poll LANG=en_US.UTF-8 isolcpus=3-19 nohz=on nohz_full=3-19 rcu_nocb_poll rcu_nocbs=3-19 transparent_hugepage=never intel_idle.max_cstate=1 intel_pstate=disable nosoftlockup skew_tick=1
  • As the isolcpus=3-19 parameter shows, this system has isolated CPUs. However, I can reproduce the problem with the same symptoms regardless of whether systemd and the spinning process are bound to the same set of CPUs or to different ones.
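
  • To see exactly which CPUs a parameter such as isolcpus=3-19 covers, the list can be expanded in code. This is a minimal sketch and not part of the original report: parse_cpu_list is a hypothetical helper name, and it only handles the plain comma-separated "N" and "N-M" forms.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: expand a kernel CPU list such as "3-19" or "0,2-4"
// into the set of CPU numbers it names. Only "N" and "N-M" items are handled.
std::set<int> parse_cpu_list(const std::string &list)
{
    std::set<int> cpus;
    std::istringstream items(list);
    std::string item;

    while (std::getline(items, item, ',')) {
        if (item.empty())
            continue;
        // A dash separates the first and last CPU of a range.
        std::string::size_type dash = item.find('-');
        int first = std::stoi(item.substr(0, dash));
        int last = (dash == std::string::npos)
                       ? first
                       : std::stoi(item.substr(dash + 1));
        for (int cpu = first; cpu <= last; ++cpu)
            cpus.insert(cpu);
    }
    return cpus;
}
```

For the command line above, parse_cpu_list("3-19") yields CPUs 3 through 19, that is, 17 isolated CPUs.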

  • Compile the following code:

$ g++ -o rtsched rtsched.cc -lpthread

$ cat rtsched.cc
#include <iostream>
#include <unistd.h>
#include <pthread.h>
#include <errno.h>
#include <string.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <time.h>

using namespace std;

class A
{
public:
    A(int prio)
    {
        struct sched_param param;
        struct timespec ts = { 5, 0 };
        int policy = SCHED_RR;
        int rc;

        param.sched_priority = prio;

        // pthread_setschedparam() returns the error number; it does not set errno.
        rc = pthread_setschedparam(pthread_self(), policy, &param);
        if (rc != 0)
        {
            cout << "Failed to set scheduling policy: "
                 << strerror(rc) << endl;
            exit(1);
        }
        cout << "tid: " << syscall(SYS_gettid)
             << " spinning in SCHED_RR class" << endl;
        cout.flush();
        // Sleep for 5 seconds before starting to spin.
        nanosleep(&ts, NULL);

        while (1)
        {
            ;
        }
     };
    ~A() { };
};

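void *bar(void *)
{
    A myA(40);      // the constructor spins forever and never returns
    return NULL;    // not reached; keeps the non-void function well-formed
}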


int main(int argc, char *argv[])
{
    pthread_t t1;
    void* result;

    pthread_create(&t1, NULL, &bar, NULL);
    sleep(5);
    pthread_join(t1, &result);

    return 0;
}
  • On the target system, run the executable as a user with permission to set the scheduling class. Then confirm that the thread whose tid is printed is in the SCHED_RR class (shown as RR by ps) and is spinning:
# ps -eLfc | grep rtsched
jfs       2471 13707  2471    2 TS   19 20:06 pts/5    00:00:00 ./rtsched
jfs       2471 13707  2472    2 RR   80 20:06 pts/5    00:00:18 ./rtsched
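  • The same check can be made from code instead of ps. This is a minimal sketch, assuming Linux, where sched_getscheduler() accepts a thread ID as returned by gettid and 0 means the calling thread. The names policy_name and thread_policy are hypothetical helpers, not part of the original test case.

```cpp
#include <sched.h>
#include <sys/types.h>
#include <string>

// Hypothetical helper: map a scheduling policy constant to a readable name.
std::string policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

// Hypothetical helper: report the policy of the thread with the given tid
// (0 means the calling thread), or "error" if the lookup fails.
std::string thread_policy(pid_t tid)
{
    int policy = sched_getscheduler(tid);
    if (policy == -1)
        return "error";
#ifdef SCHED_RESET_ON_FORK
    // The reset-on-fork flag may be ORed into the returned value; mask it off.
    policy &= ~SCHED_RESET_ON_FORK;
#endif
    return policy_name(policy);
}
```

For the run shown above, thread_policy(2472) would be expected to report SCHED_RR, and thread_policy(2471) to report SCHED_OTHER, which ps displays as TS.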
  • In another shell, as root, use strace to trace the rmdir(2) calls that systemd (PID 1) makes. Then, from a second shell, run the systemctl command shown further below:
# strace -e rmdir -p 1
strace: Process 1 attached
rmdir("/sys/fs/cgroup/systemd/system.slice/systemd-tmpfiles-clean.service") = 0
rmdir("/sys/fs/cgroup/cpu/system.slice/systemd-tmpfiles-clean.service") = 0
rmdir("/sys/fs/cgroup/cpuacct/system.slice/systemd-tmpfiles-clean.service") = -1 ENOENT (No such file or directory)
rmdir("/sys/fs/cgroup/blkio/system.slice/systemd-tmpfiles-clean.service") = 0
rmdir("/sys/fs/cgroup/memory/system.slice/systemd-tmpfiles-clean.service"
  • In the second shell, start the cleanup service with systemctl:
# systemctl start systemd-tmpfiles-clean.service
  • The strace output shows PID 1 hanging in the rmdir() call on the memory cgroup directory: the call never returns. The kernel stack of PID 1 shows where it is blocked:
# cat /proc/1/stack
[<ffffffffa16b3e5d>] flush_work+0xfd/0x190
[<ffffffffa17a3792>] lru_add_drain_all+0x142/0x190
[<ffffffffa180ce94>] mem_cgroup_reparent_charges+0x34/0x3c0
[<ffffffffa180d3d4>] mem_cgroup_css_offline+0x84/0x140
[<ffffffffa171c1da>] cgroup_destroy_locked+0xea/0x370
[<ffffffffa171c482>] cgroup_rmdir+0x22/0x40
[<ffffffffa18295ac>] vfs_rmdir+0xdc/0x150
[<ffffffffa182cb31>] do_rmdir+0x1f1/0x220
[<ffffffffa182dd66>] SyS_rmdir+0x16/0x20
[<ffffffffa1d1fa51>] tracesys+0x9d/0xc3
[<ffffffffffffffff>] 0xffffffffffffffff
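  • The stack shows PID 1 inside lru_add_drain_all(), waiting in flush_work() for queued work to complete. One plausible reading, stated here as an assumption rather than a confirmed diagnosis, is that the work queued on the CPU occupied by the spinning SCHED_RR thread never gets a chance to run. The sketch below shows how saved /proc/<pid>/stack text could be scanned for this pattern; stack_symbols and waits_on_lru_drain are hypothetical names.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: extract symbol names from /proc/<pid>/stack text.
// Each frame is expected to look like "[<addr>] symbol+0xOFF/0xSIZE".
std::vector<std::string> stack_symbols(const std::string &text)
{
    std::vector<std::string> symbols;
    std::istringstream lines(text);
    std::string line;

    while (std::getline(lines, line)) {
        std::string::size_type start = line.find("] ");
        if (start == std::string::npos)
            continue;
        start += 2;
        std::string::size_type end = line.find('+', start);
        if (end == std::string::npos)
            continue;   // e.g. the trailing 0xffffffffffffffff line
        symbols.push_back(line.substr(start, end - start));
    }
    return symbols;
}

// True when the innermost frame is flush_work and its caller is
// lru_add_drain_all, as in the stack shown above.
bool waits_on_lru_drain(const std::vector<std::string> &symbols)
{
    return symbols.size() >= 2 &&
           symbols[0] == "flush_work" &&
           symbols[1] == "lru_add_drain_all";
}
```

Reading /proc/1/stack requires root, so the helpers take the text as a string; the caller is assumed to have captured it already.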
  • The problem reproduces roughly 70% of the time. If it does not reproduce, restart the 'rtsched' process and try again.

  • This is impacting systems under development for our next major release of software. This is causing problems with a new architecture within our low-latency business that is looking to use real-time threads. This therefore needs resolving within a reasonable time frame if possible (weeks rather than months).

Environment

  • Red Hat Enterprise Linux 7
