Chapter 8. Setting the disk scheduler

The disk scheduler is responsible for ordering the I/O requests submitted to a storage device.

You can configure the scheduler in several different ways:

8.1. Disk scheduler changes in RHEL 8

In RHEL 8, block devices support only multi-queue scheduling. This enables the block layer performance to scale well with fast solid-state drives (SSDs) and multi-core systems.

The traditional, single-queue schedulers, which were available in RHEL 7 and earlier versions, have been removed.

8.2. Available disk schedulers

The following multi-queue disk schedulers are supported in RHEL 8:

none
Implements a first-in first-out (FIFO) scheduling algorithm. It merges requests at the generic block layer through a simple last-hit cache.
mq-deadline

Attempts to provide a guaranteed latency for requests from the point at which requests reach the scheduler.

The mq-deadline scheduler sorts queued I/O requests into a read or write batch and then schedules them for execution in increasing logical block addressing (LBA) order. By default, read batches take precedence over write batches, because applications are more likely to block on read I/O operations. After mq-deadline processes a batch, it checks how long write operations have been starved of processor time and schedules the next read or write batch as appropriate.

This scheduler is suitable for most use cases, but particularly those in which the write operations are mostly asynchronous.

bfq

Targets desktop systems and interactive tasks.

The bfq scheduler ensures that a single application is never using all of the bandwidth. In effect, the storage device is always as responsive as if it was idle. In its default configuration, bfq focuses on delivering the lowest latency rather than achieving the maximum throughput.

bfq is based on cfq code. It does not grant the disk to each process for a fixed time slice but assigns a budget measured in number of sectors to the process.

This scheduler is suitable while copying large files and the system does not become unresponsive in this case.

kyber

The scheduler tunes itself to achieve a latency goal. You can configure the target latencies for read and synchronous write requests.

This scheduler is suitable for fast devices, for example NVMe, SSD, or other low latency devices.

8.3. Different disk schedulers for different use cases

Depending on the task that your system performs, the following disk schedulers are recommended as a baseline prior to any analysis and tuning tasks:

Table 8.1. Disk schedulers for different use cases

Use caseDisk scheduler

Traditional HDD with a SCSI interface

Use mq-deadline or bfq.

High-performance SSD or a CPU-bound system with fast storage

Use none, especially when running enterprise applications. Alternatively, use kyber.

Desktop or interactive tasks

Use bfq.

Virtual guest

Use mq-deadline. With a multi-queue host bus adapter (HBA), use none.

8.4. The default disk scheduler

Block devices use the default disk scheduler unless you specify another scheduler.

The kernel selects a default disk scheduler based on the type of device. The automatically selected scheduler is typically the optimal setting. If you require a different scheduler, Red Hat recommends to use udev rules or the Tuned application to configure it. Match the selected devices and switch the scheduler only for those devices.

8.5. Determining the active disk scheduler

This procedure determines which disk scheduler is currently active on a given block device.

Procedure

  • Read the content of the /sys/block/device/queue/scheduler file:

    # cat /sys/block/device/queue/scheduler
    
    [mq-deadline] kyber bfq none

    In the file name, replace device with the block device name, for example sdc.

    The active scheduler is listed in square brackets ([ ]).

8.6. Setting the disk scheduler using Tuned

This procedure creates and enables a Tuned profile that sets a given disk scheduler for selected block devices. The setting persists across system reboots.

In the following commands and configuration, replace:

  • device with the name of the block device, for example sdf
  • selected-scheduler with the disk scheduler that you want to set for the device, for example bfq

Prerequisites

Procedure

  1. Optional: Select an existing Tuned profile on which your profile will be based. For a list of available profiles, see Section 2.3, “Tuned profiles distributed with RHEL”.

    To see which profile is currently active, use:

    $ tuned-adm active
  2. Create a new directory to hold your Tuned profile:

    # mkdir /etc/tuned/my-profile
  3. Find the World Wide Name (WWN) identifier of the selected block device:

    $ udevadm info --query=property --name=/dev/device | grep WWN=
    
    ID_WWN=0x5002538d00000000
  4. Create the /etc/tuned/my-profile/tuned.conf configuration file. In the file, set the following options:

    • Optional: Include an existing profile:

      [main]
      include=existing-profile
    • Set the selected disk scheduler for the device that matches the WWN identifier:

      [disk]
      devices_udev_regex=ID_WWN=0x5002538d00000000
      elevator=selected-scheduler

      To match multiple devices in the devices_udev_regex option, enclose the identifiers in parentheses and separate them with vertical bars:

      devices_udev_regex=(ID_WWN=0x5002538d00000000)|(ID_WWN=0x1234567800000000)
  5. Enable your profile:

    # tuned-adm profile my-profile
  6. Verify that the Tuned profile is active and applied:

    $ tuned-adm active
    
    Current active profile: my-profile
    $ tuned-adm verify
    
    Verification succeeded, current system settings match the preset profile.
    See tuned log file ('/var/log/tuned/tuned.log') for details.

Additional resources

8.7. Setting the disk scheduler using udev rules

This procedure sets a given disk scheduler for specific block devices using udev rules. The setting persists across system reboots.

In the following commands and configuration, replace:

  • device with the name of the block device, for example sdf
  • selected-scheduler with the disk scheduler that you want to set for the device, for example bfq

Procedure

  1. Find the World Wide Identifier (WWID) of the block device:

    $ udevadm info --attribute-walk --name=/dev/device | grep wwid
    
        ATTRS{wwid}=="device WWID"

    An example value of device WWID is:

    t10.ATA     SAMSUNG MZNLN256HMHQ-000L7              S2WDNX0J336519
  2. Configure the udev rule. Create the /etc/udev/rules.d/99-scheduler.rules file with the following content:

    ACTION=="add|change", SUBSYSTEM=="block", ATTRS{wwid}=="device WWID", ATTR{queue/scheduler}="selected-scheduler"

    Replace device WWID with the WWID value that you found in the previous steps.

  3. Reload udev rules:

    # udevadm control --reload-rules
  4. Apply the scheduler configuration:

    # udevadm trigger --type=devices --action=change
  5. Verify the active scheduler:

    # cat /sys/block/device/queue/scheduler

8.8. Temporarily setting a scheduler for a specific disk

This procedure sets a given disk scheduler for specific block devices. The setting does not persist across system reboots.

Procedure

  • Write the name of the selected scheduler to the /sys/block/device/queue/scheduler file:

    # echo selected-scheduler > /sys/block/device/queue/scheduler

    In the file name, replace device with the block device name, for example sdc.

Verification steps

  • Verify that the scheduler is active on the device:

    # cat /sys/block/device/queue/scheduler