Chapter 24. Using control groups version 1 with systemd
The following sections provide an overview of tasks related to the creation, modification, and removal of control groups (cgroups). The utilities provided by the systemd system and service manager are the preferred way to manage cgroups and will be supported in the future.
24.1. Role of systemd in control groups version 1
RHEL 8 moves the resource management settings from the process level to the application level by binding the system of cgroup hierarchies with the systemd unit tree. Therefore, you can manage the system resources with the systemctl command, or by modifying the systemd unit files.
By default, the systemd system and service manager makes use of the slice, the scope, and the service units to organize and structure processes in the control groups. The systemctl command enables you to further modify this structure by creating custom slices. systemd also automatically mounts hierarchies for important kernel resource controllers.
The following systemd unit types are used for resource control:
Service - A process or a group of processes that systemd started according to a unit configuration file. Services encapsulate the specified processes so that they can be started and stopped as one set. Service unit names end with the .service suffix, for example example.service.
Scope - A group of externally created processes. Scopes encapsulate processes that are started and stopped by arbitrary processes through the fork() function and then registered by systemd at runtime. For example, user sessions, containers, and virtual machines are treated as scopes. Scope unit names end with the .scope suffix, for example session-c1.scope.
Slice - A group of hierarchically organized units. Slices organize a hierarchy in which scopes and services are placed. The actual processes are contained in scopes or in services. Every name of a slice unit corresponds to the path to a location in the hierarchy. The dash ("-") character acts as a separator of the path components to a slice from the -.slice root slice. For example, parent-name.slice is a sub-slice of parent.slice, which is a sub-slice of the -.slice root slice. parent-name.slice can have its own sub-slice named parent-name-name2.slice, and so on.
The service, the scope, and the slice units directly map to objects in the control group hierarchy. When these units are activated, they map directly to control group paths built from the unit names.
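The dash-separated naming rule can be sketched as a small shell function. This is only an illustration of how a slice name expands into nested path components; slice_to_path is a hypothetical helper, not a systemd utility, and it does not handle every unit-name escaping rule.

```shell
#!/bin/sh
# Sketch: expand a slice unit name into the nested path implied by its dashes.
# slice_to_path is a hypothetical helper, not part of systemd.
slice_to_path() (
    name=${1%.slice}
    # "-.slice" is the root slice, so it maps to the top of the hierarchy.
    if [ "$name" = "-" ]; then
        printf '/\n'
        return 0
    fi
    path=""
    prefix=""
    IFS='-'   # split on dashes; the function body runs in a subshell
    set -f    # disable globbing while the name is split
    for part in $name; do
        # Each component extends the prefix: parent, parent-name, ...
        prefix="${prefix:+$prefix-}$part"
        path="$path/$prefix.slice"
    done
    printf '%s\n' "$path"
)

slice_to_path user-42.slice            # prints /user.slice/user-42.slice
slice_to_path parent-name-name2.slice
```

The second call prints /parent.slice/parent-name.slice/parent-name-name2.slice, which mirrors the sub-slice relationship described above.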
The following is an abbreviated example of a control group hierarchy:
Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 967 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1035 /usr/libexec/gdm-x-session gnome-session --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1054 /usr/libexec/Xorg vt1 -displayfd 3 -auth /run/user/42/gdm/Xauthority -background none -noreset -keeptty -verbose 3
│ │ │ ├─1212 /usr/libexec/gnome-session-binary --autostart /usr/share/gdm/greeter/autostart
│ │ │ ├─1369 /usr/bin/gnome-shell
│ │ │ ├─1732 ibus-daemon --xim --panel disable
│ │ │ ├─1752 /usr/libexec/ibus-dconf
│ │ │ ├─1762 /usr/libexec/ibus-x11 --kill-daemon
│ │ │ ├─1912 /usr/libexec/gsd-xsettings
│ │ │ ├─1917 /usr/libexec/gsd-a11y-settings
│ │ │ ├─1920 /usr/libexec/gsd-clipboard
…
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
  ├─rngd.service
  │ └─800 /sbin/rngd -f
  ├─systemd-udevd.service
  │ └─659 /usr/lib/systemd/systemd-udevd
  ├─chronyd.service
  │ └─823 /usr/sbin/chronyd
  ├─auditd.service
  │ ├─761 /sbin/auditd
  │ └─763 /usr/sbin/sedispatch
  ├─accounts-daemon.service
  │ └─876 /usr/libexec/accounts-daemon
  ├─example.service
  │ ├─ 929 /bin/bash /home/jdoe/example.sh
  │ └─4902 sleep 1
  …
The example above shows that services and scopes contain processes and are placed in slices that do not contain processes of their own.
24.2. Creating transient control groups
Transient cgroups set limits on resources consumed by a unit (service or scope) during its runtime.
To create a transient control group, use the systemd-run command in the following format:
# systemd-run --unit=<name> --slice=<name>.slice <command>
This command creates and starts a transient service or a scope unit and runs a custom command in such a unit.
The --unit=<name> option gives a name to the unit. If --unit is not specified, the name is generated automatically.
The --slice=<name>.slice option makes your service or scope unit a member of a specified slice. Replace <name>.slice with the name of an existing slice (as shown in the output of systemctl -t slice), or create a new slice by passing a unique name. By default, services and scopes are created as members of the system.slice.
Replace <command> with the command you wish to execute in the service or the scope unit.
The following message is displayed to confirm that you created and started the service or the scope successfully:
# Running as unit <name>.service
Optionally, keep the unit running after its processes have finished to collect run-time information:
# systemd-run --unit=<name> --slice=<name>.slice --remain-after-exit <command>
The command creates and starts a transient service unit and runs a custom command in such a unit. The --remain-after-exit option ensures that the service keeps running after its processes have finished.
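As a rough illustration of the command format, the following sketch assembles a systemd-run invocation from a unit name, a slice name, and a command. The run_transient helper, the DRY_RUN variable, and the demo names are all hypothetical; by default the helper only prints the command so that the sketch can be tried without root privileges.

```shell
#!/bin/sh
# Sketch: build a systemd-run call from its parts.
# run_transient and DRY_RUN are hypothetical names used for illustration.
run_transient() (
    unit=$1
    slice=$2
    shift 2
    # Remaining arguments form the command to run inside the unit.
    set -- systemd-run --unit="$unit" --slice="$slice.slice" "$@"
    if [ "${DRY_RUN:-1}" = 1 ]; then
        # Default: only show what would be executed.
        printf '%s\n' "$*"
    else
        "$@"
    fi
)

run_transient demo-sleep demo sleep 60
# prints: systemd-run --unit=demo-sleep --slice=demo.slice sleep 60
```

Setting DRY_RUN=0 as root would execute the assembled command instead of printing it.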
24.3. Creating persistent control groups
To assign a persistent control group to a service, it is necessary to edit its unit configuration file. The configuration is preserved after the system reboot, so it can be used to manage services that are started automatically.
To create a persistent control group, execute:
# systemctl enable <name>.service
The command above automatically creates a unit configuration file in the /usr/lib/systemd/system/ directory and, by default, assigns <name>.service to the system.slice unit.
24.4. Configuring memory resource control settings on the command-line
Executing commands in the command-line interface is one way to set limits, prioritize, or control access to hardware resources for groups of processes.
To limit the memory usage of a service, run the following:
# systemctl set-property example.service MemoryMax=1500K
The command instantly assigns the memory limit of 1,500 KB to processes executed in the control group that the example.service service belongs to. The MemoryMax parameter, in this configuration variant, is defined in the /etc/systemd/system.control/example.service.d/50-MemoryMax.conf file and controls the value of the memory.limit_in_bytes file of that control group.
Optionally, to temporarily limit the memory usage of a service, run:
# systemctl set-property --runtime example.service MemoryMax=1500K
The command instantly assigns the memory limit to the example.service service. The MemoryMax parameter is defined until the next reboot in the /run/systemd/system.control/example.service.d/50-MemoryMax.conf file, so the limit does not persist across a reboot.
Note
The 50-MemoryMax.conf file stores the memory limit as a multiple of 4096 bytes - one kernel page size specific for AMD64 and Intel 64. The actual number of bytes depends on the CPU architecture.
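The two variants differ only in where the drop-in file lands. The sketch below derives the file location from the unit name and the chosen mode, using the two paths quoted above; dropin_path is a hypothetical helper and only covers the MemoryMax case shown in this section.

```shell
#!/bin/sh
# Sketch: where "systemctl set-property <unit> MemoryMax=..." stores the value.
# dropin_path is a hypothetical helper; the base directories come from the text.
dropin_path() {
    unit=$1
    mode=${2:-persistent}
    case $mode in
        runtime)    base=/run/systemd/system.control ;;   # with --runtime
        persistent) base=/etc/systemd/system.control ;;   # without --runtime
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf '%s/%s.d/50-MemoryMax.conf\n' "$base" "$unit"
}

dropin_path example.service
dropin_path example.service runtime
```

The first call prints the persistent location under /etc, the second the temporary location under /run.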
24.5. Configuring memory resource control settings with unit files
Each persistent unit is supervised by the systemd system and service manager, and has a unit configuration file in the /usr/lib/systemd/system/ directory. To change the resource control settings of a persistent unit, modify its unit configuration file either manually in a text editor or from the command-line interface.
Manually modifying unit files is one way to set limits, prioritize, or control access to hardware resources for groups of processes.
To limit the memory usage of a service, modify the /usr/lib/systemd/system/example.service file as follows:
…
[Service]
MemoryMax=1500K
…
The configuration above places a limit on the maximum memory consumption of processes executed in the control group that example.service is part of.
Note
Use the suffixes K, M, G, or T to identify kilobyte, megabyte, gigabyte, or terabyte as the unit of measurement.
Reload all unit configuration files:
# systemctl daemon-reload
Restart the service:
# systemctl restart example.service
Reboot the system.
Optionally, check that the changes took effect:
# cat /sys/fs/cgroup/memory/system.slice/example.service/memory.limit_in_bytes
1536000
The example output shows that the memory consumption was limited to around 1,500 KB.
Note
The memory.limit_in_bytes file stores the memory limit as a multiple of 4096 bytes - one kernel page size specific for AMD64 and Intel 64. The actual number of bytes depends on the CPU architecture.
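The relationship between MemoryMax=1500K and the reported value of 1536000 can be checked with shell arithmetic. The following sketch assumes that the K, M, G, and T suffixes are interpreted with base 1024 and that the page size is 4096 bytes, as on AMD64 and Intel 64; to_bytes is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: convert a size with a K/M/G/T suffix (base 1024) into bytes.
# to_bytes is a hypothetical helper used only for this illustration.
to_bytes() {
    case $1 in
        *K) echo $(( ${1%K} * 1024 )) ;;
        *M) echo $(( ${1%M} * 1024 * 1024 )) ;;
        *G) echo $(( ${1%G} * 1024 * 1024 * 1024 )) ;;
        *T) echo $(( ${1%T} * 1024 * 1024 * 1024 * 1024 )) ;;
        *)  echo $(( $1 )) ;;   # no suffix: the value is already in bytes
    esac
}

bytes=$(to_bytes 1500K)
echo "bytes: $bytes"                                   # bytes: 1536000
echo "pages: $(( bytes / 4096 )), remainder: $(( bytes % 4096 ))"
# pages: 375, remainder: 0
```

Because 1536000 bytes is exactly 375 pages of 4096 bytes, the limit in this example is stored without any adjustment.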
24.6. Removing transient control groups
You can use the systemd system and service manager to remove transient control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.
Transient cgroups are automatically released once all the processes that a service or a scope unit contains finish.
To stop the service unit with all its processes, execute:
# systemctl stop <name>.service
To terminate one or more of the unit processes, execute:
# systemctl kill <name>.service --kill-who=<who> --signal=<signal>
The command above uses the --kill-who option to select which processes of the control group receive the signal: main (only the main process), control (only the control process), or all (every process of the unit, which is the default). The --signal option determines the type of POSIX signal to be sent to the specified processes. The default signal is SIGTERM.
24.7. Removing persistent control groups
You can use the systemd system and service manager to remove persistent control groups (cgroups) if you no longer need to limit, prioritize, or control access to hardware resources for groups of processes.
Persistent cgroups are released when a service or a scope unit is stopped or disabled and its configuration file is deleted.
Stop the service unit:
# systemctl stop <name>.service
Disable the service unit:
# systemctl disable <name>.service
Remove the relevant unit configuration file:
# rm /usr/lib/systemd/system/<name>.service
Reload all unit configuration files so that changes take effect:
# systemctl daemon-reload
24.8. Listing systemd units
The following procedure describes how to use the systemd system and service manager to list its units.
To list all active units on the system, execute the systemctl command. The terminal returns output similar to the following example:
# systemctl
UNIT                                         LOAD   ACTIVE SUB     DESCRIPTION
…
init.scope                                   loaded active running System and Service Manager
session-2.scope                              loaded active running Session 2 of user jdoe
abrt-ccpp.service                            loaded active exited  Install ABRT coredump hook
abrt-oops.service                            loaded active running ABRT kernel log watcher
abrt-vmcore.service                          loaded active exited  Harvest vmcores for ABRT
abrt-xorg.service                            loaded active running ABRT Xorg log watcher
…
-.slice                                      loaded active active  Root Slice
machine.slice                                loaded active active  Virtual Machine and Container Slice
system-getty.slice                           loaded active active  system-getty.slice
system-lvm2\x2dpvscan.slice                  loaded active active  system-lvm2\x2dpvscan.slice
system-sshd\x2dkeygen.slice                  loaded active active  system-sshd\x2dkeygen.slice
system-systemd\x2dhibernate\x2dresume.slice  loaded active active  system-systemd\x2dhibernate\x2dresume>
system-user\x2druntime\x2ddir.slice          loaded active active  system-user\x2druntime\x2ddir.slice
system.slice                                 loaded active active  System Slice
user-1000.slice                              loaded active active  User Slice of UID 1000
user-42.slice                                loaded active active  User Slice of UID 42
user.slice                                   loaded active active  User and Session Slice
…
UNIT - the name of a unit, which also reflects the unit's position in the control group hierarchy. The units relevant for resource control are a slice, a scope, and a service.
LOAD - indicates whether the unit configuration file was properly loaded. If the unit file failed to load, the field contains the state error instead of loaded. Other unit load states are: stub, merged, and masked.
ACTIVE - the high-level unit activation state, which is a generalization of SUB.
SUB - the low-level unit activation state. The range of possible values depends on the unit type.
DESCRIPTION - the description of the unit content and functionality.
To list inactive units, execute:
# systemctl --all
To limit the amount of information in the output, execute:
# systemctl --type service,masked
The --type option requires a comma-separated list of unit types such as service and slice, or unit load states such as loaded and masked.
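When a listing has already been captured, the units relevant for resource control can also be picked out by the suffix in the UNIT column. The sketch below filters sample lines with awk; resource_units is a hypothetical helper, and the sample input is illustrative rather than real output. On a live system, passing service, scope, and slice to the --type option achieves the same result directly.

```shell
#!/bin/sh
# Sketch: keep only units whose name ends in .service, .scope, or .slice.
# resource_units is a hypothetical helper; the input below is sample data.
resource_units() {
    awk '$1 ~ /\.(service|scope|slice)$/ { print $1 }'
}

resource_units <<'EOF'
init.scope        loaded active running System and Service Manager
abrt-oops.service loaded active running ABRT kernel log watcher
boot.mount        loaded active mounted sample mount unit
user.slice        loaded active active  User and Session Slice
EOF
```

The filter prints init.scope, abrt-oops.service, and user.slice, and skips the mount unit.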
24.9. Viewing systemd control group hierarchy
The following procedure describes how to display the control groups (cgroups) hierarchy and the processes running in specific cgroups.
To display the whole cgroups hierarchy on your system, execute # systemd-cgls:
# systemd-cgls
Control group /:
-.slice
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
│ │ │ ├─1040 /usr/libexec/gdm-x-session gnome-session --autostart /usr/share/gdm/greeter/autostart
…
├─init.scope
│ └─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
└─system.slice
  …
  ├─example.service
  │ ├─6882 /bin/bash /home/jdoe/example.sh
  │ └─6902 sleep 1
  ├─systemd-journald.service
    └─629 /usr/lib/systemd/systemd-journald
  …
The example output returns the entire cgroups hierarchy, where the highest level is formed by slices.
To display the cgroups hierarchy filtered by a resource controller, execute # systemd-cgls <resource_controller>:
# systemd-cgls memory
Controller memory; Control group /:
├─1 /usr/lib/systemd/systemd --switched-root --system --deserialize 18
├─user.slice
│ ├─user-42.slice
│ │ ├─session-c1.scope
│ │ │ ├─ 965 gdm-session-worker [pam/gdm-launch-environment]
…
└─system.slice
  |
  …
  ├─chronyd.service
  │ └─844 /usr/sbin/chronyd
  ├─example.service
  │ ├─8914 /bin/bash /home/jdoe/example.sh
  │ └─8916 sleep 1
  …
The example output of the above command lists the services that interact with the selected controller.
To display detailed information about a certain unit and its part of the cgroups hierarchy, execute # systemctl status <system_unit>:
# systemctl status example.service
● example.service - My example service
   Loaded: loaded (/usr/lib/systemd/system/example.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2019-04-16 12:12:39 CEST; 3s ago
 Main PID: 17737 (bash)
    Tasks: 2 (limit: 11522)
   Memory: 496.0K (limit: 1.5M)
   CGroup: /system.slice/example.service
           ├─17737 /bin/bash /home/jdoe/example.sh
           └─17743 sleep 1

Apr 16 12:12:39 redhat systemd: Started My example service.
Apr 16 12:12:39 redhat bash: The current time is Tue Apr 16 12:12:39 CEST 2019
Apr 16 12:12:40 redhat bash: The current time is Tue Apr 16 12:12:40 CEST 2019
24.10. Viewing resource controllers
The following procedure describes how to learn which processes use which resource controllers.
To view which resource controllers a process interacts with, execute the # cat /proc/<PID>/cgroup command:
# cat /proc/11269/cgroup
12:freezer:/
11:cpuset:/
10:devices:/system.slice
9:memory:/system.slice/example.service
8:pids:/system.slice/example.service
7:hugetlb:/
6:rdma:/
5:perf_event:/
4:cpu,cpuacct:/
3:net_cls,net_prio:/
2:blkio:/
1:name=systemd:/system.slice/example.service
The example output relates to a process of interest. In this case, it is a process identified by PID 11269, which belongs to the example.service unit. You can determine whether the process was placed in the correct control group as defined by the systemd unit file specifications.
Note
By default, the items and their ordering in the list of resource controllers are the same for all units started by systemd, since it automatically mounts all the default resource controllers.
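Each line of the listing has the form hierarchy ID, controller list, and control group path, separated by colons. The following sketch extracts the path recorded for one controller and combines it with the /sys/fs/cgroup/memory/ location used earlier in this chapter. cgroup_of is a hypothetical helper, and the sample lines are copied from the example output above.

```shell
#!/bin/sh
# Sketch: print the cgroup path that a /proc/<PID>/cgroup listing (stdin)
# records for one controller. cgroup_of is a hypothetical helper.
cgroup_of() {
    awk -F: -v ctrl="$1" '
        {
            # The second field may list several controllers, e.g. cpu,cpuacct.
            n = split($2, names, ",")
            for (i = 1; i <= n; i++)
                if (names[i] == ctrl) {
                    # Print everything after the second colon.
                    print substr($0, length($1) + length($2) + 3)
                    exit
                }
        }'
}

sample='9:memory:/system.slice/example.service
4:cpu,cpuacct:/
1:name=systemd:/system.slice/example.service'

path=$(printf '%s\n' "$sample" | cgroup_of memory)
echo "memory cgroup: $path"
echo "v1 directory:  /sys/fs/cgroup/memory$path"
```

On a live system with control groups version 1, the same function could read /proc/<PID>/cgroup directly, for example with cgroup_of memory < /proc/11269/cgroup.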
24.11. Monitoring resource consumption
The following procedure describes how to view a list of currently running control groups (cgroups) and their resource consumption in real time.
To see a dynamic account of currently running cgroups, execute the # systemd-cgtop command:
# systemd-cgtop
Control Group                               Tasks   %CPU   Memory  Input/s Output/s
/                                             607   29.8     1.5G        -        -
/system.slice                                 125      -   428.7M        -        -
/system.slice/ModemManager.service              3      -     8.6M        -        -
/system.slice/NetworkManager.service            3      -    12.8M        -        -
/system.slice/accounts-daemon.service           3      -     1.8M        -        -
/system.slice/boot.mount                        -      -    48.0K        -        -
/system.slice/chronyd.service                   1      -     2.0M        -        -
/system.slice/cockpit.socket                    -      -     1.3M        -        -
/system.slice/colord.service                    3      -     3.5M        -        -
/system.slice/crond.service                     1      -     1.8M        -        -
/system.slice/cups.service                      1      -     3.1M        -        -
/system.slice/dev-hugepages.mount               -      -   244.0K        -        -
/system.slice/dev-mapper-rhel\x2dswap.swap      -      -   912.0K        -        -
/system.slice/dev-mqueue.mount                  -      -    48.0K        -        -
/system.slice/example.service                   2      -     2.0M        -        -
/system.slice/firewalld.service                 2      -    28.8M        -        -
...
The example output displays currently running cgroups ordered by their resource usage (CPU, memory, disk I/O load). The list refreshes every 1 second by default. Therefore, it offers a dynamic insight into the actual resource usage of each control group.
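A captured listing in this column layout can also be post-processed. As a minimal sketch, the awk script below finds the service with the largest value in the Memory column, assuming the fourth whitespace-separated field holds the memory figure with a K, M, G, or T suffix. top_memory_service is a hypothetical helper, and the input is a shortened copy of the example output.

```shell
#!/bin/sh
# Sketch: report the service with the largest Memory value in cgtop-style
# output read from stdin. top_memory_service is a hypothetical helper.
top_memory_service() {
    awk '
        # Convert a value such as 28.8M into bytes (base 1024 assumed).
        function to_bytes(s,    n, u) {
            n = s + 0
            u = substr(s, length(s))
            if (u == "K") return n * 1024
            if (u == "M") return n * 1024 * 1024
            if (u == "G") return n * 1024 * 1024 * 1024
            if (u == "T") return n * 1024 * 1024 * 1024 * 1024
            return n
        }
        # Only consider service units that report a memory figure.
        $1 ~ /\.service$/ && $4 != "-" {
            b = to_bytes($4)
            if (b > max) { max = b; name = $1; shown = $4 }
        }
        END { if (name != "") print name, shown }'
}

top_memory_service <<'EOF'
/system.slice/chronyd.service    1 - 2.0M  - -
/system.slice/cockpit.socket     - - 1.3M  - -
/system.slice/example.service    2 - 2.0M  - -
/system.slice/firewalld.service  2 - 28.8M - -
EOF
```

For the sample input, the script prints /system.slice/firewalld.service 28.8M. On a live system, a single non-interactive snapshot such as systemd-cgtop -b -n 1 could be piped into the function, assuming its columns follow the same order.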