Chapter 13. Using control groups version 1 with systemd

The following sections provide an overview of tasks related to creation, modification and removal of the control groups (cgroups). The utilities provided by the systemd system and service manager are the preferred way of the cgroups management and will be supported in the future.

13.1. Role of systemd in control groups version 1

Red Hat Enterprise Linux 8 moves the resource management settings from the process level to the application level by binding the system of cgroup hierarchies with the systemd unit tree. Therefore, you can manage the system resources with the systemctl command, or by modifying the systemd unit files.

By default, the systemd system and service manager makes use of the slice, the scope and the service units to organize and structure processes in the control groups. The systemctl command enables you to further modify this structure by creating custom slices. Also, systemd automatically mounts hierarchies for important kernel resource controllers in the /sys/fs/cgroup/ directory.

Three systemd unit types are used for resource control:

  • Service - A process or a group of processes, which systemd started according to a unit configuration file. Services encapsulate the specified processes so that they can be started and stopped as one set. Services are named in the following way:

    <name>.service
  • Scope - A group of externally created processes. Scopes encapsulate processes that are started and stopped by the arbitrary processes through the fork() function and then registered by systemd at runtime. For example, user sessions, containers, and virtual machines are treated as scopes. Scopes are named as follows:

    <name>.scope
  • Slice - A group of hierarchically organized units. Slices organize a hierarchy in which scopes and services are placed. The actual processes are contained in scopes or in services. Every name of a slice unit corresponds to the path to a location in the hierarchy. The dash ("-") character acts as a separator of the path components to a slice from the -.slice root slice. In the following example:

    <parent-name>.slice

    parent-name.slice is a sub-slice of parent.slice, which is a sub-slice of the -.slice root slice. parent-name.slice can have its own sub-slice named parent-name-name2.slice, and so on.

The service, the scope, and the slice units directly map to objects in the control group hierarchy. When these units are activated, they map directly to control group paths built from the unit names.

The following is an abbreviated example of a control group hierarchy:

Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 967 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1035 /usr/libexec/gdm-x-session gnome-session --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1054 /usr/libexec/Xorg vt1 -displayfd 3 -auth /run/user/42/gdm/Xauthority -background none -noreset -keeptty -verbose 3
│ │ │ ├─1212 /usr/libexec/gnome-session-binary --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1369 /usr/bin/gnome-shell
│ │ │ ├─1732 ibus-daemon --xim --panel disable
│ │ │ ├─1752 /usr/libexec/ibus-dconf
│ │ │ ├─1762 /usr/libexec/ibus-x11 --kill-daemon
│ │ │ ├─1912 /usr/libexec/gsd-xsettings
│ │ │ ├─1917 /usr/libexec/gsd-a11y-settings
│ │ │ ├─1920 /usr/libexec/gsd-clipboard
…​
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
  ├─rngd.service
  │ └─800 /sbin/rngd -f
  ├─systemd-udevd.service
  │ └─659 /usr/lib/systemd/systemd-udevd
  ├─chronyd.service
  │ └─823 /usr/sbin/chronyd
  ├─auditd.service
  │ ├─761 /sbin/auditd
  │ └─763 /usr/sbin/sedispatch
  ├─accounts-daemon.service
  │ └─876 /usr/libexec/accounts-daemon
  ├─example.service
  │ ├─ 929 /bin/bash /home/jdoe/example.sh
  │ └─4902 sleep 1
  …​

The example above shows that services and scopes contain processes and are placed in slices that do not contain processes of their own.

Additional resources

  • For more information about systemd, unit files, and a complete list of systemd unit types, see the relevant sections in Configuring basic system settings.
  • For more information about resource controllers, see the What are kernel resource controllers section and the systemd.resource-control(5), cgroups(7) manual pages.
  • For more information about fork(), see the fork(2) manual pages.

13.2. Creating transient control groups

The transient cgroups set limits on resources consumed by a unit (service or scope) during its runtime.

Procedure

  • To create a transient control group, use the systemd-run command in the following format:

    # systemd-run --unit=<name> --slice=<name>.slice <command>

    This command creates and starts a transient service or a scope unit and runs a custom command in such a unit.

    • The --unit=<name> option gives a name to the unit. If --unit is not specified, the name is generated automatically.
    • The --slice=<name>.slice option makes your service or scope unit a member of a specified slice. Replace <name>.slice with the name of an existing slice (as shown in the output of systemctl -t slice), or create a new slice by passing a unique name. By default, services and scopes are created as members of the system.slice.
    • Replace <command> with the command you wish to execute in the service or the scope unit.

      The following message is displayed to confirm that you created and started the service or the scope successfully:

      # Running as unit <name>.service
  • Optionally, keep the unit running after its processes finished to collect run-time information:

    # systemd-run --unit=<name> --slice=<name>.slice --remain-after-exit <command>

    The command creates and starts a transient service unit and runs a custom command in such a unit. The --remain-after-exit option ensures that the service keeps running after its processes have finished.

Additional resources

  • For more information about the concept of control groups, see Understanding control groups.
  • For more information about systemd and cgroups cooperation, see Role of systemd in control groups section.
  • For more information about systemd, unit configuration files and their locations, and a complete list of systemd unit types, see the relevant sections in Configuring basic system settings.
  • For a detailed description of systemd-run including further options and examples, see the systemd-run(1) manual pages.

13.3. Creating persistent control groups

To assign a persistent control group to a service, it is necessary to edit its unit configuration file. The configuration is preserved after the system reboot, so it can be used to manage services that are started automatically.

Procedure

  • To create a persistent control group, execute:

    # systemctl enable <name>.service

    The command above automatically creates a unit configuration file into the /usr/lib/systemd/system/ directory and by default, it assigns <name>.service to the system.slice unit.

Additional resources

  • For more information about the concept of control groups, see Understanding control groups.
  • For more information about systemd and cgroups cooperation, see Role of systemd in control groups section.
  • For more information about systemd, unit configuration files and their locations, and a complete list of systemd unit types, see the relevant sections in Configuring basic system settings.
  • For a detailed description of systemd-run including further options and examples, see the systemd-run(1) manual pages.

13.4. Configuring memory resource control settings on the command-line

Executing commands in the command-line interface is one of the ways how to set limits, prioritize, or control access to hardware resources for groups of processes.

Procedure

  • To limit the memory usage of a service, run the following:

    # systemctl set-property example.service MemoryLimit=1500K

    The command instantly assigns the memory limit of 1,500 kilobytes to processes executed in a control group the example.service service belongs to. The MemoryLimit parameter, in this configuration variant, is defined in the /etc/systemd/system.control/example.service.d/50-MemoryLimit.conf file and controls the value of the /sys/fs/cgroup/memory/system.slice/example.service/memory.limit_in_bytes file.

  • Optionally, to temporarily limit the memory usage of a service, run:

    # systemctl set-property --runtime example.service MemoryLimit=1500K

    The command instantly assigns the memory limit to the example.service service. The MemoryLimit parameter is defined until the next reboot in the /run/systemd/system.control/example.service.d/50-MemoryLimit.conf file. With a reboot, the whole /run/systemd/system.control/ directory and MemoryLimit are removed.

Note

The 50-MemoryLimit.conf file stores the memory limit as a multiple of 4096 bytes - one kernel page size specific for AMD64 and Intel 64. The actual number of bytes depends on a CPU architecture.

Additional resources

13.5. Configuring memory resource control settings with unit files

Each persistent unit is supervised by the systemd system and service manager, and has a unit configuration file in the /usr/lib/systemd/system/ directory. To change the resource control settings of the persistent units, modify its unit configuration file either manually in a text editor or from the command-line interface.

Manually modifying unit files is one of the ways how to set limits, prioritize, or control access to hardware resources for groups of processes.

Procedure

  1. To limit the memory usage of a service, modify the /usr/lib/systemd/system/example.service file as follows:

    …​
    [Service]
    MemoryLimit=1500K
    …​

    The configuration above places a limit on maximum memory consumption of processes executed in a control group, which example.service is part of.

    Note

    Use suffixes K, M, G, or T to identify Kilobyte, Megabyte, Gigabyte, or Terabyte as a unit of measurement.

  2. Reload all unit configuration files:

    # systemctl daemon-reload
  3. Restart the service:

    # systemctl restart example.service
  4. Reboot the system.
  5. Optionally, check that the changes took effect:

    # cat /sys/fs/cgroup/memory/system.slice/example.service/memory.limit_in_bytes
    1536000

    The example output shows that the memory consumption was limited at around 1,500 Kilobytes.

    Note

    The memory.limit_in_bytes file stores the memory limit as a multiple of 4096 bytes - one kernel page size specific for AMD64 and Intel 64. The actual number of bytes depends on a CPU architecture.

Additional resources

13.6. Removing transient control groups

You can use the systemd system and service manager to remove transient control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.

Transient cgroups are automatically released once all the processes that a service or a scope unit contains, finish.

Procedure

  • To stop the service unit with all its processes, execute:

    # systemctl stop <name>.service
  • To terminate one or more of the unit processes, execute:

    # systemctl kill <name>.service --kill-who=PID,…​ --signal=signal

    The command above uses the --kill-who option to select process(es) from the control group you wish to terminate. To kill multiple processes at the same time, pass a comma-separated list of PIDs. The --signal option determines the type of POSIX signal to be sent to the specified processes. The default signal is SIGTERM.

Additional resources

13.7. Removing persistent control groups

You can use the systemd system and service manager to remove persistent control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.

Persistent cgroups are released when a service or a scope unit is stopped or disabled and its configuration file is deleted.

Procedure

  1. Stop the service unit:

    # systemctl stop <name>.service
  2. Disable the service unit:

    # systemctl disable <name>.service
  3. Remove the relevant unit configuration file:

    # rm /usr/lib/systemd/system/<name>.service
  4. Reload all unit configuration files so that changes take effect:

    # systemctl daemon-reload

Additional resources

13.8. Listing systemd units

The following procedure describes how to use the systemd system and service manager to list its units.

Procedure

  • To list all active units on the system, execute the # systemctl command and the terminal will return an output similar to the following example:

    # systemctl
    UNIT                                                LOAD   ACTIVE SUB       DESCRIPTION
    …​
    init.scope                                          loaded active running   System and Service Manager
    session-2.scope                                     loaded active running   Session 2 of user jdoe
    abrt-ccpp.service                                   loaded active exited    Install ABRT coredump hook
    abrt-oops.service                                   loaded active running   ABRT kernel log watcher
    abrt-vmcore.service                                 loaded active exited    Harvest vmcores for ABRT
    abrt-xorg.service                                   loaded active running   ABRT Xorg log watcher
    …​
    -.slice                                             loaded active active    Root Slice
    machine.slice                                       loaded active active    Virtual Machine and Container Slice system-getty.slice                                                                       loaded active active    system-getty.slice
    system-lvm2\x2dpvscan.slice                         loaded active active    system-lvm2\x2dpvscan.slice
    system-sshd\x2dkeygen.slice                         loaded active active    system-sshd\x2dkeygen.slice
    system-systemd\x2dhibernate\x2dresume.slice         loaded active active    system-systemd\x2dhibernate\x2dresume>
    system-user\x2druntime\x2ddir.slice                 loaded active active    system-user\x2druntime\x2ddir.slice
    system.slice                                        loaded active active    System Slice
    user-1000.slice                                     loaded active active    User Slice of UID 1000
    user-42.slice                                       loaded active active    User Slice of UID 42
    user.slice                                          loaded active active    User and Session Slice
    …​
    • UNIT - a name of a unit that also reflects the unit position in a control group hierarchy. The units relevant for resource control are a slice, a scope, and a service.
    • LOAD - indicates whether the unit configuration file was properly loaded. If the unit file failed to load, the field contains the state error instead of loaded. Other unit load states are: stub, merged, and masked.
    • ACTIVE - the high-level unit activation state, which is a generalization of SUB.
    • SUB - the low-level unit activation state. The range of possible values depends on the unit type.
    • DESCRIPTION - the description of the unit content and functionality.
  • To list inactive units, execute:

    # systemctl --all
  • To limit the amount of information in the output, execute:

    # systemctl --type service,masked

    The --type option requires a comma-separated list of unit types such as a service and a slice, or unit load states such as loaded and masked.

Additional resources

13.9. Viewing a control group version 1 hierarchy

The following procedure describes how to display control groups (cgroups) hierarchy and processes running in specific cgroups.

Procedure

  • To display the whole cgroups hierarchy on your system, execute # systemd-cgls:

    # systemd-cgls
    Control group /:
    -.slice
    ├─user.slice
    │ ├─user-42.slice
    │ │ ├─session-c1.scope
    │ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
    │ │ │ ├─1040 /usr/libexec/gdm-x-session gnome-session --autostart /usr/share/gdm/greeter/autostart
    …​
    ├─init.scope
    │ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
    └─system.slice
      …​
      ├─example.service
      │ ├─6882 /bin/bash /home/jdoe/example.sh
      │ └─6902 sleep 1
      ├─systemd-journald.service
        └─629 /usr/lib/systemd/systemd-journald
      …​

    The example output returns the entire cgroups hierarchy, where the highest level is formed by slices.

  • To display the cgroups hierarchy filtered by a resource controller, execute # systemd-cgls <resource_controller>:

    # systemd-cgls memory
    Controller memory; Control group /:
    ├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
    ├─user.slice
    │ ├─user-42.slice
    │ │ ├─session-c1.scope
    │ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
    …​
    └─system.slice
      |
      …​
      ├─chronyd.service
      │ └─844 /usr/sbin/chronyd
      ├─example.service
      │ ├─8914 /bin/bash /home/jdoe/example.sh
      │ └─8916 sleep 1
      …​

    The example output of the above command lists the services that interact with the selected controller.

  • To display detailed information about a certain unit and its part of the cgroups hierarchy, execute # systemctl status <system_unit>:

    # systemctl status example.service
    ● example.service - My example service
       Loaded: loaded (/usr/lib/systemd/system/example.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2019-04-16 12:12:39 CEST; 3s ago
     Main PID: 17737 (bash)
        Tasks: 2 (limit: 11522)
       Memory: 496.0K (limit: 1.5M)
       CGroup: /system.slice/example.service
               ├─17737 /bin/bash /home/jdoe/example.sh
               └─17743 sleep 1
    Apr 16 12:12:39 redhat systemd[1]: Started My example service.
    Apr 16 12:12:39 redhat bash[17737]: The current time is Tue Apr 16 12:12:39 CEST 2019
    Apr 16 12:12:40 redhat bash[17737]: The current time is Tue Apr 16 12:12:40 CEST 2019

Additional resources

13.10. Viewing resource controllers

The following procedure describes how to learn which processes use which resource controllers.

Procedure

  1. To view which resource controllers a process interacts with, execute the # cat proc/<PID>/cgroup command:

    # cat /proc/11269/cgroup
    12:freezer:/
    11:cpuset:/
    10:devices:/system.slice
    9:memory:/system.slice/example.service
    8:pids:/system.slice/example.service
    7:hugetlb:/
    6:rdma:/
    5:perf_event:/
    4:cpu,cpuacct:/
    3:net_cls,net_prio:/
    2:blkio:/
    1:name=systemd:/system.slice/example.service

    The example output relates to a process of interest. In this case, it is a process identified by PID 11269, which belongs to the example.service unit. You can determine whether the process was placed in a correct control group as defined by the systemd unit file specifications.

    Note

    By default, the items and their ordering in the list of resource controllers is the same for all units started by systemd, since it automatically mounts all the default resource controllers.

Additional resources

  • For more information about resource controllers in general refer to the cgroups(7) manual pages.
  • For a detailed description of specific resource controllers, see the documentation in the /usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups-v1/ directory.

13.11. Monitoring resource consumption

The following procedure describes how to view a list of currently running control groups (cgroups) and their resource consumption in real-time.

Procedure

  1. To see a dynamic account of currently running cgroups, execute the # systemd-cgtop command:

    # systemd-cgtop
    Control Group                            Tasks   %CPU   Memory  Input/s Output/s
    /                                          607   29.8     1.5G        -        -
    /system.slice                              125      -   428.7M        -        -
    /system.slice/ModemManager.service           3      -     8.6M        -        -
    /system.slice/NetworkManager.service         3      -    12.8M        -        -
    /system.slice/accounts-daemon.service        3      -     1.8M        -        -
    /system.slice/boot.mount                     -      -    48.0K        -        -
    /system.slice/chronyd.service                1      -     2.0M        -        -
    /system.slice/cockpit.socket                 -      -     1.3M        -        -
    /system.slice/colord.service                 3      -     3.5M        -        -
    /system.slice/crond.service                  1      -     1.8M        -        -
    /system.slice/cups.service                   1      -     3.1M        -        -
    /system.slice/dev-hugepages.mount            -      -   244.0K        -        -
    /system.slice/dev-mapper-rhel\x2dswap.swap   -      -   912.0K        -        -
    /system.slice/dev-mqueue.mount               -      -    48.0K        -        -
    /system.slice/example.service                2      -     2.0M        -        -
    /system.slice/firewalld.service              2      -    28.8M        -        -
    ...

    The example output displays currently running cgroups ordered by their resource usage (CPU, memory, disk I/O load). The list refreshes every 1 second by default. Therefore, it offers a dynamic insight into the actual resource usage of each control group.

Additional resources

  • For more information about dynamic monitoring of resource usage, see the systemd-cgtop(1) manual pages.