Processes requiring Real-Time Scheduling fail with "sched_setscheduler: Operation not permitted" error or similar



Environment

  • Red Hat Enterprise Linux (RHEL) 7
  • Red Hat Enterprise Linux 8
    • systemd


Issue

  • Services trying to acquire real-time scheduling fail to start. Running strace on the service executable shows an EPERM (Operation not permitted) error when the sched_setscheduler syscall is called with the SCHED_RR policy:

    # strace <program> 2>&1 >/dev/null | grep sched_setscheduler
    sched_setscheduler(0, SCHED_RR, { 99 }) = -1 EPERM (Operation not permitted)
  • Services acquiring real-time scheduling start normally at boot but fail when restarted later, usually with the error message shown above

  • Oracle RAC or other applications that use real-time process scheduling fail, but run without issue after running cgclear
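
The strace check above can be wrapped in a small filter so that it reports only the failed calls. This is a minimal sketch: the function name rt_sched_denied is hypothetical, and it assumes strace output in the format shown in the example line above.

```shell
# Hypothetical helper: read strace output on stdin and print only the
# sched_setscheduler calls that failed with EPERM.
# Exit status 0 means at least one such call was found.
rt_sched_denied() {
    grep -E 'sched_setscheduler\(.*\) *= *-1 EPERM'
}

# Example usage (replace <program> with the service executable):
#   strace -f -e trace=sched_setscheduler <program> 2>&1 >/dev/null | rt_sched_denied
```

A non-zero exit status only means that no failed call appeared in the trace; it does not prove that the process obtained real-time scheduling.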


Root Cause

  • If a service acquiring real-time scheduling starts normally at boot but fails to be restarted later, then

    • either CPU Accounting is partially enabled at boot (i.e. enabled in /etc/systemd/system.conf but not in the initramfs)
    • or CPU Accounting is enabled at run-time by a service starting after the service acquiring real-time scheduling
  • If a service acquiring real-time scheduling doesn't start anymore at boot after a reboot of the system, then

    • either CPU Accounting was enabled at boot explicitly prior to rebooting
    • or a service which was not started prior to rebooting is now enabling CPU Accounting implicitly or explicitly
  • For insights-client, the unit file for the application included CPUQuota=30%, which triggered the behavior described above. This has since been changed.
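
To see whether a given process is affected, it can help to check which cpu cgroup it has been placed in. The sketch below assumes the cgroup v1 layout used in this article and a kernel built with real-time group scheduling, where, as far as the kernel documentation describes, a newly created cpu cgroup starts with cpu.rt_runtime_us set to 0. The function names are hypothetical.

```shell
# Hypothetical helper: print the cgroup path of the cpu controller from a file
# in /proc/<pid>/cgroup format (cgroup v1 lines look like "id:controllers:path").
cpu_cgroup_path() {
    awk -F: '{
        n = split($2, ctrl, ",")
        for (i = 1; i <= n; i++)
            if (ctrl[i] == "cpu") {
                sub(/^[^:]*:[^:]*:/, "")   # keep everything after the second colon
                print
                exit
            }
    }' "$1"
}

# Hypothetical helper: describe what the path suggests for real-time scheduling.
describe_cpu_cgroup() {
    case "$1" in
        /)  echo "root cpu cgroup: not limited by a child cgroup" ;;
        "") echo "no cpu controller line found" ;;
        *)  echo "child cpu cgroup $1: check cpu.rt_runtime_us" ;;
    esac
}

# Example usage (assumes the cpu hierarchy is mounted under /sys/fs/cgroup/cpu):
#   cg=$(cpu_cgroup_path /proc/self/cgroup)
#   describe_cpu_cgroup "$cg"
#   cat "/sys/fs/cgroup/cpu$cg/cpu.rt_runtime_us"
```

A process that sits in a child cgroup such as /system.slice/<unit> with a real-time budget of 0 would be expected to receive EPERM from sched_setscheduler, whereas a process in the root cgroup would not be restricted this way.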

Diagnostic Steps

  • Verify that systemd makes use of the cpu and cpuacct CGroup controllers

    # ls -d /sys/fs/cgroup/{cpu,cpuacct}/*.slice
    ls: cannot access /sys/fs/cgroup/cpu/*.slice: No such file or directory
    ls: cannot access /sys/fs/cgroup/cpuacct/*.slice: No such file or directory

    In the example above, systemd is not currently using the cpu and cpuacct CGroup controllers.
    In that case, the issue is not covered by this solution.

    # ls -d /sys/fs/cgroup/{cpu,cpuacct}/*.slice
    /sys/fs/cgroup/cpuacct/system.slice    /sys/fs/cgroup/cpu/system.slice
    /sys/fs/cgroup/cpuacct/user.slice      /sys/fs/cgroup/cpu/user.slice

    In the example above, systemd is currently using the cpu and cpuacct CGroup controllers.
    This happens when CPU Accounting has been enabled at boot or at run-time.
    Please proceed further.

  • Verify if systemd is configured to enable CPU Accounting at boot

    # grep ^DefaultCPUAccounting= /etc/systemd/system.conf

    In the example above, CPU Accounting is enabled at boot. If this is not the case, proceed directly to the next diagnostic step.

    As a sanity check, verify that the booted initramfs also configures CPU Accounting at boot, otherwise this may lead to unexpected behaviour:

    # lsinitrd /boot/initramfs-$(uname -r).img /etc/systemd/system.conf | grep ^DefaultCPUAccounting=

    In the example above, the /etc/systemd/system.conf file on the system and the copy in the initramfs are not synchronized, which may lead to unexpected behaviour.
    In that case, rebuild the initramfs:

    # dracut -f /boot/initramfs-$(uname -r).img $(uname -r)

    Since CPU Accounting is enabled at boot, proceed directly to the Resolution section.

  • Find out why systemd enabled CPU Accounting at run-time

    Even though CPU Accounting is disabled at boot, systemd may enable it automatically when a unit being started makes use of a CPU* property, such as CPUAccounting=yes or CPUQuota=<value> (refer to the systemd.resource-control(5) man page for a full list).
    Note that Delegate=yes also enables CPU Accounting.

    To find out which unit made systemd enable CPU Accounting, the following command can be executed:

    # egrep -ri "^(Startup)?CPU.*=(.*%|1|yes|true|on)" /usr/lib/systemd/system /etc/systemd/system

    In the example above, a custom drop-in for the mariadb.service unit implicitly enables CPU Accounting because it makes use of CPUQuota.

    Since CPU Accounting is enabled at run-time by starting a service which implicitly or explicitly enables it, proceed to the Resolution section.

  • If you reach this step, then CPU Accounting is enabled for a reason not covered by this solution. Please contact your Red Hat Support representative.
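
The diagnostic steps above can be sketched as a few shell functions, which makes them easier to repeat across several systems. This is only a sketch: all function names are hypothetical, and the Delegate= alternative in the last pattern is an addition to the egrep command shown earlier, made because Delegate=yes is noted above as also enabling CPU Accounting.

```shell
# Sketch of the diagnostic steps as functions. All function names are
# hypothetical.

# Step 1: print *.slice directories in the cpu and cpuacct hierarchies.
# $1 is the cgroup mount root (default /sys/fs/cgroup). No output means
# systemd is not using these controllers. The body runs in a subshell so
# its variables do not leak.
cpu_slices() (
    root=${1:-/sys/fs/cgroup}
    for d in "$root"/cpu/*.slice "$root"/cpuacct/*.slice; do
        if [ -d "$d" ]; then printf '%s\n' "$d"; fi
    done
)

# Step 2: print the DefaultCPUAccounting value from system.conf-style text
# on stdin. Commented-out lines are ignored; if several uncommented
# assignments exist, the last one is printed.
default_cpu_accounting() {
    sed -n 's/^DefaultCPUAccounting=//p' | tail -n 1
}

# Step 3: list unit settings that enable CPU Accounting in the given
# directories. Same pattern as the egrep command above, plus Delegate=.
cpu_accounting_units() {
    grep -Eri '^((Startup)?CPU.*=(.*%|1|yes|true|on)|Delegate=(1|yes|true|on))' "$@"
}

# Example usage on a live system:
#   cpu_slices
#   default_cpu_accounting < /etc/systemd/system.conf
#   lsinitrd /boot/initramfs-$(uname -r).img /etc/systemd/system.conf | default_cpu_accounting
#   cpu_accounting_units /usr/lib/systemd/system /etc/systemd/system
```

If the two default_cpu_accounting results differ, the system.conf file and the initramfs are out of sync, which is the case where rebuilding the initramfs with dracut is suggested above.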



The solution article is somewhat misleading and is missing a significant amount of exposition, as the issue is not as clear-cut as indicated. There is a longer and better explanation of the issue here:

It says:

"SysV services, even those with root privileges, cannot acquire real-time scheduling when the CPUAccounting option is enabled. With CPUAccounting enabled for any service, systemd makes use of the CGroup CPU bandwidth controller globally, and subsequent sched_setscheduler() system calls terminate unexpectedly due to real-time scheduling priority. To avoid this error to recur, the CGroup cpu.rt_runtime_us option can be set for the real-time using service."

That means that if CPUAccounting is enabled this can happen, but if it is not then it will not happen. CPUAccounting can be enabled explicitly (by default it is not) or implicitly by using one of the options here that imply "CPUAccounting=true":

So if you were to define a service that had CPUQuota=30% in its definition, that would imply that CPUAccounting was true and cause the issue to be seen. If CPUAccounting is not true anywhere (and is off by default) and nothing uses any configuration that implies it, then the issue wouldn't be seen (from what I can work out).

I've seen a RHEL 7.5 system with a service that defined CPUQuota=30% on a two-node cluster. On one node, with that service disabled, a process could get real-time scheduling; on the other node, where the service was running, it could not.

If, for example, that has changed in 7.6 (CPUAccounting is now always true), you need to be more explicit about when the behaviour changed and why, and not say that it applies to all of RHEL 7.

I can see that it's not 100% accurate, at least up to 7.5, but the article presents it as just the way that all of RHEL 7 works. Accurate explanations of the cause of issues help with understanding (in my case, disabling the service that defined CPUQuota=30% should allow the real-time process to start, which allows the system to function as expected while a fix is developed for the real-time process). In the longer term, the real-time process on the system I was looking at does need to change to use a real-time slice, but saying that it simply does not work on RHEL 7 would appear to be untrue without any exposition of the underlying reasons for how it can happen.

Please improve the quality of the solution article.

Thanks for pointing that out, I'll update the document.