Red Hat Training

A Red Hat training course is available for RHEL 8

Chapter 12. Setting the disk scheduler

The disk scheduler is responsible for ordering the I/O requests submitted to a storage device.

You can configure the scheduler in several different ways:

Note

In Red Hat Enterprise Linux 8, block devices support only multi-queue scheduling. This enables the block layer performance to scale well with fast solid-state drives (SSDs) and multi-core systems.

The traditional, single-queue schedulers, which were available in Red Hat Enterprise Linux 7 and earlier versions, have been removed.

12.1. Available disk schedulers

The following multi-queue disk schedulers are supported in Red Hat Enterprise Linux 8:

none
Implements a first-in first-out (FIFO) scheduling algorithm. It merges requests at the generic block layer through a simple last-hit cache.
mq-deadline

Attempts to provide a guaranteed latency for requests from the point at which requests reach the scheduler.

The mq-deadline scheduler sorts queued I/O requests into a read or write batch and then schedules them for execution in increasing logical block addressing (LBA) order. By default, read batches take precedence over write batches, because applications are more likely to block on read I/O operations. After mq-deadline processes a batch, it checks how long write operations have been starved of processor time and schedules the next read or write batch as appropriate.

This scheduler is suitable for most use cases, but particularly those in which the write operations are mostly asynchronous.

bfq

Targets desktop systems and interactive tasks.

The bfq scheduler ensures that a single application is never using all of the bandwidth. In effect, the storage device is always as responsive as if it was idle. In its default configuration, bfq focuses on delivering the lowest latency rather than achieving the maximum throughput.

bfq is based on cfq code. It does not grant the disk to each process for a fixed time slice but assigns a budget measured in number of sectors to the process.

This scheduler is suitable while copying large files and the system does not become unresponsive in this case.

kyber

The scheduler tunes itself to achieve a latency goal by calculating the latencies of every I/O request submitted to the block I/O layer. You can configure the target latencies for read, in the case of cache-misses, and synchronous write requests.

This scheduler is suitable for fast devices, for example NVMe, SSD, or other low latency devices.

12.2. Different disk schedulers for different use cases

Depending on the task that your system performs, the following disk schedulers are recommended as a baseline prior to any analysis and tuning tasks:

Table 12.1. Disk schedulers for different use cases

Use caseDisk scheduler

Traditional HDD with a SCSI interface

Use mq-deadline or bfq.

High-performance SSD or a CPU-bound system with fast storage

Use none, especially when running enterprise applications. Alternatively, use kyber.

Desktop or interactive tasks

Use bfq.

Virtual guest

Use mq-deadline. With a host bus adapter (HBA) driver that is multi-queue capable, use none.

12.3. The default disk scheduler

Block devices use the default disk scheduler unless you specify another scheduler.

Note

For non-volatile Memory Express (NVMe) block devices specifically, the default scheduler is none and Red Hat recommends not changing this.

The kernel selects a default disk scheduler based on the type of device. The automatically selected scheduler is typically the optimal setting. If you require a different scheduler, Red Hat recommends to use udev rules or the TuneD application to configure it. Match the selected devices and switch the scheduler only for those devices.

12.4. Determining the active disk scheduler

This procedure determines which disk scheduler is currently active on a given block device.

Procedure

  • Read the content of the /sys/block/device/queue/scheduler file:

    # cat /sys/block/device/queue/scheduler
    
    [mq-deadline] kyber bfq none

    In the file name, replace device with the block device name, for example sdc.

    The active scheduler is listed in square brackets ([ ]).

12.5. Setting the disk scheduler using TuneD

This procedure creates and enables a TuneD profile that sets a given disk scheduler for selected block devices. The setting persists across system reboots.

In the following commands and configuration, replace:

  • device with the name of the block device, for example sdf
  • selected-scheduler with the disk scheduler that you want to set for the device, for example bfq

Prerequisites

Procedure

  1. Optional: Select an existing TuneD profile on which your profile will be based. For a list of available profiles, see TuneD profiles distributed with RHEL.

    To see which profile is currently active, use:

    $ tuned-adm active
  2. Create a new directory to hold your TuneD profile:

    # mkdir /etc/tuned/my-profile
  3. Find the system unique identifier of the selected block device:

    $ udevadm info --query=property --name=/dev/device | grep -E '(WWN|SERIAL)'
    
    ID_WWN=0x5002538d00000000_
    ID_SERIAL=Generic-_SD_MMC_20120501030900000-0:0
    ID_SERIAL_SHORT=20120501030900000
    Note

    The command in the this example will return all values identified as a World Wide Name (WWN) or serial number associated with the specified block device. Although it is preferred to use a WWN, the WWN is not always available for a given device and any values returned by the example command are acceptable to use as the device system unique ID.

  4. Create the /etc/tuned/my-profile/tuned.conf configuration file. In the file, set the following options:

    1. Optional: Include an existing profile:

      [main]
      include=existing-profile
    2. Set the selected disk scheduler for the device that matches the WWN identifier:

      [disk]
      devices_udev_regex=IDNAME=device system unique id
      elevator=selected-scheduler

      Here:

      • Replace IDNAME with the name of the identifier being used (for example, ID_WWN).
      • Replace device system unique id with the value of the chosen identifier (for example, 0x5002538d00000000).

        To match multiple devices in the devices_udev_regex option, enclose the identifiers in parentheses and separate them with vertical bars:

        devices_udev_regex=(ID_WWN=0x5002538d00000000)|(ID_WWN=0x1234567800000000)
  5. Enable your profile:

    # tuned-adm profile my-profile

Verification steps

  1. Verify that the TuneD profile is active and applied:

    $ tuned-adm active
    
    Current active profile: my-profile
    $ tuned-adm verify
    
    Verification succeeded, current system settings match the preset profile.
    See TuneD log file ('/var/log/tuned/tuned.log') for details.
  2. Read the contents of the /sys/block/device/queue/scheduler file:

    # cat /sys/block/device/queue/scheduler
    
    [mq-deadline] kyber bfq none

    In the file name, replace device with the block device name, for example sdc.

    The active scheduler is listed in square brackets ([]).

Additional resources

12.6. Setting the disk scheduler using udev rules

This procedure sets a given disk scheduler for specific block devices using udev rules. The setting persists across system reboots.

In the following commands and configuration, replace:

  • device with the name of the block device, for example sdf
  • selected-scheduler with the disk scheduler that you want to set for the device, for example bfq

Procedure

  1. Find the system unique identifier of the block device:

    $ udevadm info --name=/dev/device | grep -E '(WWN|SERIAL)'
    E: ID_WWN=0x5002538d00000000
    E: ID_SERIAL=Generic-_SD_MMC_20120501030900000-0:0
    E: ID_SERIAL_SHORT=20120501030900000
    Note

    The command in the this example will return all values identified as a World Wide Name (WWN) or serial number associated with the specified block device. Although it is preferred to use a WWN, the WWN is not always available for a given device and any values returned by the example command are acceptable to use as the device system unique ID.

  2. Configure the udev rule. Create the /etc/udev/rules.d/99-scheduler.rules file with the following content:

    ACTION=="add|change", SUBSYSTEM=="block", ENV{IDNAME}=="device system unique id", ATTR{queue/scheduler}="selected-scheduler"

    Here:

    • Replace IDNAME with the name of the identifier being used (for example, ID_WWN).
    • Replace device system unique id with the value of the chosen identifier (for example, 0x5002538d00000000).
  3. Reload udev rules:

    # udevadm control --reload-rules
  4. Apply the scheduler configuration:

    # udevadm trigger --type=devices --action=change

Verification steps

  • Verify the active scheduler:

    # cat /sys/block/device/queue/scheduler

12.7. Temporarily setting a scheduler for a specific disk

This procedure sets a given disk scheduler for specific block devices. The setting does not persist across system reboots.

Procedure

  • Write the name of the selected scheduler to the /sys/block/device/queue/scheduler file:

    # echo selected-scheduler > /sys/block/device/queue/scheduler

    In the file name, replace device with the block device name, for example sdc.

Verification steps

  • Verify that the scheduler is active on the device:

    # cat /sys/block/device/queue/scheduler