Chapter 34. Understanding control groups

Using the control groups (cgroups) kernel functionality, you can control resource usage of applications to use them more efficiently.

You can use cgroups for the following tasks:

  • Setting limits for system resource allocation.
  • Prioritizing the allocation of hardware resources to specific processes.
  • Isolating certain processes from obtaining hardware resources.

34.1. Introducing control groups

Using the control groups Linux kernel feature, you can organize processes into hierarchically ordered groups - cgroups. You define the hierarchy (control groups tree) by providing structure to cgroups virtual file system, mounted by default on the /sys/fs/cgroup/ directory.

The systemd service manager uses cgroups to organize all units and services that it governs. Manually, you can manage the hierarchies of cgroups by creating and removing sub-directories in the /sys/fs/cgroup/ directory.

The resource controllers in the kernel then modify the behavior of processes in cgroups by limiting, prioritizing or allocating system resources, of those processes. These resources include the following:

  • CPU time
  • Memory
  • Network bandwidth
  • Combinations of these resources

The primary use case of cgroups is aggregating system processes and dividing hardware resources among applications and users. This makes it possible to increase the efficiency, stability, and security of your environment.

Control groups version 1

Control groups version 1 (cgroups-v1) provide a per-resource controller hierarchy. This means that each resource (such as CPU, memory, or I/O) has its own control group hierarchy. You can combine different control group hierarchies in a way that one controller can coordinate with another in managing their respective resources. However, when the two controllers belong to different process hierarchies, proper coordination is limited.

The cgroups-v1 controllers were developed across a large time span and as a result, the behavior and naming of their control files is not uniform.

Control groups version 2

Control groups version 2 (cgroups-v2) provide a single control group hierarchy against which all resource controllers are mounted.

The control file behavior and naming is consistent among different controllers.

Important

RHEL 9, by default, mounts and uses cgroups-v2.

Additional resources

34.2. Introducing kernel resource controllers

Kernel resource controllers enable the functionality of control groups. RHEL 9 supports various controllers for control groups version 1 (cgroups-v1) and control groups version 2 (cgroups-v2).

A resource controller, also called a control group subsystem, is a kernel subsystem that represents a single resource, such as CPU time, memory, network bandwidth or disk I/O. The Linux kernel provides a range of resource controllers that are mounted automatically by the systemd service manager. You can find a list of the currently mounted resource controllers in the /proc/cgroups file.

Controllers available for cgroups-v1:

blkio
Sets limits on input/output access to and from block devices.
cpu
Adjusts the parameters of the Completely Fair Scheduler (CFS) for a control group’s tasks. The cpu controller is mounted together with the cpuacct controller on the same mount.
cpuacct
Creates automatic reports on CPU resources used by tasks in a control group. The cpuacct controller is mounted together with the cpu controller on the same mount.
cpuset
Restricts control group tasks to run only on a specified subset of CPUs and to direct the tasks to use memory only on specified memory nodes.
devices
Controls access to devices for tasks in a control group.
freezer
Suspends or resumes tasks in a control group.
memory
Sets limits on memory use by tasks in a control group and generates automatic reports on memory resources used by those tasks.
net_cls
Tags network packets with a class identifier (classid) that enables the Linux traffic controller (the tc command) to identify packets that originate from a particular control group task. A subsystem of net_cls, the net_filter (iptables), can also use this tag to perform actions on such packets. The net_filter tags network sockets with a firewall identifier (fwid) that allows the Linux firewall to identify packets that originate from a particular control group task (by using the iptables command).
net_prio
Sets the priority of network traffic.
pids
Sets limits for a number of processes and their children in a control group.
perf_event
Groups tasks for monitoring by the perf performance monitoring and reporting utility.
rdma
Sets limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
hugetlb
Can be used to limit the usage of large size virtual memory pages by tasks in a control group.

Controllers available for cgroups-v2:

io
Sets limits on input/output access to and from block devices.
memory
Sets limits on memory use by tasks in a control group and generates automatic reports on memory resources used by those tasks.
pids
Sets limits for a number of processes and their children in a control group.
rdma
Sets limits on Remote Direct Memory Access/InfiniBand specific resources in a control group.
cpu
Adjusts the parameters of the Completely Fair Scheduler (CFS) for a control group’s tasks and creates automatic reports on CPU resources used by tasks in a control group.
cpuset
Restricts control group tasks to run only on a specified subset of CPUs and to direct the tasks to use memory only on specified memory nodes. Supports only the core functionality (cpus{,.effective}, mems{,.effective}) with a new partition feature.
perf_event
Groups tasks for monitoring by the perf performance monitoring and reporting utility. perf_event is enabled automatically on the v2 hierarchy.
Important

A resource controller can be used either in a cgroups-v1 hierarchy or a cgroups-v2 hierarchy, not simultaneously in both.

Additional resources

  • The cgroups(7) manual page
  • Documentation in /usr/share/doc/kernel-doc-<kernel_version>/Documentation/cgroups-v1/ directory (after installing the kernel-doc package).

34.3. Introducing namespaces

Namespaces are one of the most important methods for organizing and identifying software objects.

A namespace wraps a global system resource (for example, a mount point, a network device, or a hostname) in an abstraction that makes it appear to processes within the namespace that they have their own isolated instance of the global resource. One of the most common technologies that use namespaces are containers.

Changes to a particular global resource are visible only to processes in that namespace and do not affect the rest of the system or other namespaces.

To inspect which namespaces a process is a member of, you can check the symbolic links in the /proc/<PID>/ns/ directory.

Table 34.1. Supported namespaces and resources which they isolate:

NamespaceIsolates

Mount

Mount points

UTS

Hostname and NIS domain name

IPC

System V IPC, POSIX message queues

PID

Process IDs

Network

Network devices, stacks, ports, etc

User

User and group IDs

Control groups

Control group root directory

Additional resources