Chapter 4. Resource Management

Cgroups CPU ceiling enforcement

The Completely Fair Scheduler (CFS) in the Linux kernel is a proportional-share scheduler: it divides CPU time among tasks and task groups according to the priority (weight) of each task or the shares assigned to each group. Because CFS is work-conserving, a task group can receive more than its share of CPU time whenever enough idle CPU cycles are available in the system.

However, in the enterprise scenarios listed below, giving a task group more than its desired CPU share is not acceptable:
In enterprise systems that cater to multiple customers, cloud service providers need to assign a fixed amount of CPU time to the virtual guest based on the service level.
Service level guarantees
Customers demand a percentage of the CPU resource, without service interruptions, for each virtual guest.
In these scenarios, the scheduler needs to put a hard stop on the CPU resource consumption of a task group if it exceeds a preset limit. This is usually achieved by throttling the task group when it fully consumes its allocated CPU time.
Cgroups CPU ceiling enforcement is an important addition to the Red Hat Enterprise Linux feature set for the use cases listed above. Comparable CPU ceiling enforcement is also provided by the Credit Scheduler in Xen and by the VMware ESX scheduler.
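
As a rough illustration of how a ceiling of this kind can be expressed, the sketch below converts a percentage limit into a quota and period pair. It assumes the cgroup CPU controller exposes the ceiling through control files named cpu.cfs_period_us and cpu.cfs_quota_us; these file names, the 100 ms default period, and the helper names cfs_quota_us and settings_for are assumptions made for the example and are not taken from this chapter, so verify them on the target system.

```python
def cfs_quota_us(ceiling_percent, period_us=100000, cpus=1):
    """Return the runtime budget, in microseconds, allowed per period.

    ceiling_percent is the cap expressed as a percentage of `cpus` CPUs.
    Once a group has used this budget it is throttled until the next period.
    """
    if ceiling_percent <= 0 or period_us <= 0 or cpus <= 0:
        raise ValueError("ceiling, period and CPU count must be positive")
    return int(period_us * cpus * ceiling_percent / 100)


def settings_for(ceiling_percent, period_us=100000, cpus=1):
    """Map the assumed control-file names to the values to write to them."""
    return {
        "cpu.cfs_period_us": period_us,
        "cpu.cfs_quota_us": cfs_quota_us(ceiling_percent, period_us, cpus),
    }


if __name__ == "__main__":
    # Cap a guest at 50% of one CPU: 50 ms of runtime in every 100 ms period.
    print(settings_for(50))
```

The function only computes the values; writing them into a guest's cgroup directory is left out because the mount point and group layout differ between systems.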

Cgroups CPU controller scalability improvement on SMP systems

Red Hat Enterprise Linux 6 enabled cgroups out of the box, and libvirt adopted a cgroup-per-guest model. On large SMP systems, performance degraded as the number of cgroups increased. In Red Hat Enterprise Linux 6.2, cgroups CPU controller scalability has been significantly improved, making it possible to create and run several hundred cgroups at once without a performance penalty.

In addition to the scalability improvement, a /proc tunable parameter, sysctl_sched_shares_window, has been added, with the default set to 10 ms.
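
A minimal sketch for inspecting this tunable follows. It assumes the parameter is exposed as /proc/sys/kernel/sched_shares_window and that the stored value is in nanoseconds, as in upstream kernels of the same period; neither the path nor the unit is stated in this chapter, so treat both as assumptions and verify them on the system in question. The helper names are hypothetical.

```python
# Assumed location of the tunable; not confirmed by this chapter.
ASSUMED_PATH = "/proc/sys/kernel/sched_shares_window"


def parse_shares_window_ms(text):
    """Convert the raw file content (assumed nanoseconds) to milliseconds."""
    return int(text.strip()) / 1000000.0


def read_shares_window_ms(path=ASSUMED_PATH):
    """Read the tunable from `path` and return its value in milliseconds."""
    with open(path) as handle:
        return parse_shares_window_ms(handle.read())


if __name__ == "__main__":
    try:
        print("shares window: %.1f ms" % read_shares_window_ms())
    except (IOError, OSError) as err:
        # The file is absent on kernels that do not provide this tunable.
        print("tunable not available: %s" % err)
```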

Cgroups I/O controller performance improvement

The cgroups I/O controller design has been improved to reduce the usage of locks inside the I/O controller, resulting in improved performance. Also, the I/O controller now supports per cgroup statistics.

Cgroups memory controller performance improvement

Red Hat Enterprise Linux 6.2 reduces the memory usage overhead of the memory controller by cutting the allocation overhead of the page_cgroup array by 37%. Additionally, the direct page_cgroup-to-page pointer has been removed, which improves the performance of the memory controller.

Default value for the CFQ group_isolation variable

The default value of CFQ's group_isolation variable (/sys/block/<device>/queue/iosched/group_isolation) has been changed from 0 to 1. Testing and numerous user reports showed that a default of 1 is more useful. When the variable is set to 0, all random I/O queues become part of the root cgroup rather than the cgroup to which the application actually belongs. Consequently, applications receive no service differentiation.
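
The current setting can be checked per device with a short script such as the sketch below. The sysfs path follows the one given above; the device name sda, the sysfs_root parameter, and the helper names are hypothetical and serve only as an illustration.

```python
import os


def group_isolation_path(device, sysfs_root="/sys"):
    """Build the path of the group_isolation file for a block device."""
    return os.path.join(
        sysfs_root, "block", device, "queue", "iosched", "group_isolation"
    )


def read_group_isolation(device, sysfs_root="/sys"):
    """Return the current group_isolation value (0 or 1) for `device`."""
    with open(group_isolation_path(device, sysfs_root)) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    device = "sda"  # hypothetical device name; substitute a real one
    try:
        print("%s group_isolation = %d" % (device, read_group_isolation(device)))
    except (IOError, OSError) as err:
        # The file exists only when the device uses the CFQ I/O scheduler
        # on a kernel that provides this variable.
        print("could not read setting: %s" % err)
```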


For more information on resource management and control groups, refer to the Red Hat Enterprise Linux 6 Resource Management Guide.