3.4. cpuset

The cpuset subsystem assigns individual CPUs and memory nodes to cgroups. Each cpuset can be specified according to the following parameters, each one in a separate pseudofile within the cgroup virtual file system:

Important

Some subsystems have mandatory parameters that must be set before you can move a task into a cgroup which uses any of those subsystems. For example, before you move a task into a cgroup which uses the cpuset subsystem, the cpuset.cpus and cpuset.mems parameters must be defined for that cgroup.
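For illustration, a minimal sketch of that initial setup, assuming the cpuset subsystem is mounted at /cgroup/cpuset and using a hypothetical cgroup named group1:

~]# mkdir /cgroup/cpuset/group1
~]# echo 0-2,16 > /cgroup/cpuset/group1/cpuset.cpus    # mandatory: CPUs 0, 1, 2, and 16
~]# echo 0 > /cgroup/cpuset/group1/cpuset.mems         # mandatory: memory node 0
~]# echo 1234 > /cgroup/cpuset/group1/tasks            # 1234 is an example PID

If the write to tasks is attempted before both mandatory parameters are set, it fails (typically with a "No space left on device" error).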
cpuset.cpus (mandatory)
specifies the CPUs that tasks in this cgroup are permitted to access. This is a comma-separated list, with dashes ("-") to represent ranges. For example, 0-2,16 represents CPUs 0, 1, 2, and 16.
cpuset.mems (mandatory)
specifies the memory nodes that tasks in this cgroup are permitted to access. This is a comma-separated list in ASCII format, with dashes ("-") to represent ranges. For example, 0-2,16 represents memory nodes 0, 1, 2, and 16.
cpuset.memory_migrate
contains a flag (0 or 1) that specifies whether a page in memory should migrate to a new node if the values in cpuset.mems change. By default, memory migration is disabled (0) and pages stay on the node to which they were originally allocated, even if that node is no longer among the nodes specified in cpuset.mems. If enabled (1), the system migrates pages to memory nodes within the new parameters specified by cpuset.mems, maintaining their relative placement whenever possible. For example, pages on the second node of the list originally specified by cpuset.mems are allocated to the second node of the new list, if space is available.
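As a sketch, again using the hypothetical group1 cgroup, migration can be enabled before the node list is changed so that existing pages follow the cpuset to its new nodes:

~]# echo 1 > /cgroup/cpuset/group1/cpuset.memory_migrate
~]# echo 1,3 > /cgroup/cpuset/group1/cpuset.mems    # assumes nodes 1 and 3 exist; pages migrate to them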
cpuset.cpu_exclusive
contains a flag (0 or 1) that specifies whether cpusets other than this one and its parents and children can share the CPUs specified for this cpuset. By default (0), CPUs are not allocated exclusively to one cpuset.
cpuset.mem_exclusive
contains a flag (0 or 1) that specifies whether other cpusets can share the memory nodes specified for the cpuset. By default (0), memory nodes are not allocated exclusively to one cpuset. Reserving memory nodes for the exclusive use of a cpuset (1) is functionally the same as enabling a memory hardwall with the cpuset.mem_hardwall parameter.
cpuset.mem_hardwall
contains a flag (0 or 1) that specifies whether kernel allocations of memory page and buffer data should be restricted to the memory nodes specified for the cpuset. By default (0), page and buffer data is shared across processes belonging to multiple users. With a hardwall enabled (1), each task's user allocations can be kept separate.
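All three exclusivity flags are toggled the same way, by writing 0 or 1 to the corresponding pseudofile. A sketch for the hypothetical group1 cgroup (note that such a write may fail if another cpuset, other than a direct ancestor or descendant, already shares the CPUs or memory nodes in question):

~]# echo 1 > /cgroup/cpuset/group1/cpuset.cpu_exclusive
~]# echo 1 > /cgroup/cpuset/group1/cpuset.mem_exclusive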
cpuset.memory_pressure
a read-only file that contains a running average of the memory pressure created by the processes in the cpuset. The value in this pseudofile is automatically updated when cpuset.memory_pressure_enabled is enabled; otherwise, the pseudofile contains the value 0.
cpuset.memory_pressure_enabled
contains a flag (0 or 1) that specifies whether the system should compute the memory pressure created by the processes in the cgroup. Computed values are output to cpuset.memory_pressure and represent the rate at which processes attempt to free in-use memory, reported as an integer value of attempts to reclaim memory per second, multiplied by 1000.
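A sketch of turning the accounting on and sampling the result for the hypothetical group1 cgroup; note that cpuset.memory_pressure_enabled exists only in the root cpuset of the hierarchy:

~]# echo 1 > /cgroup/cpuset/cpuset.memory_pressure_enabled    # root-level pseudofile
~]# cat /cgroup/cpuset/group1/cpuset.memory_pressure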
cpuset.memory_spread_page
contains a flag (0 or 1) that specifies whether file system buffers should be spread evenly across the memory nodes allocated to the cpuset. By default (0), no attempt is made to spread memory pages for these buffers evenly, and buffers are placed on the same node on which the process that created them is running.
cpuset.memory_spread_slab
contains a flag (0 or 1) that specifies whether kernel slab caches for file input/output operations should be spread evenly across the cpuset. By default (0), no attempt is made to spread kernel slab caches evenly, and slab caches are placed on the same node on which the process that created them is running.
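Both spread flags are set in the same way. A sketch for the hypothetical group1 cgroup, spreading page cache and slab allocations across the cgroup's memory nodes, which can help when a job's file data set is too large to fit on a single node:

~]# echo 1 > /cgroup/cpuset/group1/cpuset.memory_spread_page
~]# echo 1 > /cgroup/cpuset/group1/cpuset.memory_spread_slab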
cpuset.sched_load_balance
contains a flag (0 or 1) that specifies whether the kernel will balance loads across the CPUs in the cpuset. By default (1), the kernel balances loads by moving processes from overloaded CPUs to less heavily used CPUs.
Note, however, that setting this flag in a cgroup has no effect if load balancing is enabled in any parent cgroup, because load balancing is already being carried out at a higher level. Therefore, to disable load balancing in a cgroup, also disable load balancing in each of its parents in the hierarchy. In this case, you should also consider whether load balancing should be enabled for any siblings of the cgroup in question.
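A sketch of disabling load balancing for the hypothetical group1 cgroup in a flat hierarchy, where the root cpuset is its only parent and must therefore be changed as well:

~]# echo 0 > /cgroup/cpuset/cpuset.sched_load_balance          # the parent (root) cpuset
~]# echo 0 > /cgroup/cpuset/group1/cpuset.sched_load_balance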
cpuset.sched_relax_domain_level
contains an integer between -1 and a small positive value, which represents the width of the range of CPUs across which the kernel should attempt to balance loads. This value is meaningless if cpuset.sched_load_balance is disabled.
The precise effect of this value varies according to system architecture, but the following values are typical:
Values of cpuset.sched_relax_domain_level

Value   Effect
-1      Use the system default value for load balancing
0       Do not perform immediate load balancing; balance loads only periodically
1       Immediately balance loads across threads on the same core
2       Immediately balance loads across cores in the same package or book (in the case of the s390x architecture)
3       Immediately balance loads across books in the same package (s390x architecture only)
4       Immediately balance loads across CPUs on the same node or blade
5       Immediately balance loads across several CPUs on architectures with non-uniform memory access (NUMA)
6       Immediately balance loads across all CPUs on architectures with NUMA

Note

With the release of Red Hat Enterprise Linux 6.1, the BOOK scheduling domain was added to the list of supported domain levels. This change affected the meaning of cpuset.sched_relax_domain_level values: the effects of values 3 through 5 shifted up by one. For example, to get the old effect of value 3, which was "Immediately balance loads across CPUs on the same node or blade", select value 4. Similarly, the old 4 is now 5, and the old 5 is now 6.
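For example, to request immediate load balancing across CPUs on the same node or blade (value 4 in the table above, the post-6.1 equivalent of the old value 3) for the hypothetical group1 cgroup:

~]# echo 4 > /cgroup/cpuset/group1/cpuset.sched_relax_domain_level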