NVMe performance degradation on RHEL 6.6


    Problem Statement
    We are seeing unexpected performance degradation on our NVMe device when using RHEL 6.6.

    The test scenario is an FIO random-read job with a 4 KB block size; the full job parameters are listed below.
    We do not see the problem when using RHEL 6.5 or RHEL 7.0 on the same hardware.

    System Details

    OS Level    RHEL 6.6
    Kernel      2.6.32-504.el6.x86_64
    H/W         Supermicro X10SAE motherboard
                16 GB DDR3 memory @ 1600 MHz
                Intel Xeon CPU E3-1225 v3 @ 3.20 GHz, 1 socket, 4 cores
    Device      Samsung NVMe SSD Controller 171X (rev 03)
                Dell Express Flash NVMe XS1715 SSD 400GB
                PCIe 3.0 slot; Target Link Speed: 8GT/s (from lspci)
    Driver      nvme (shipped with the kernel)

    FIO Test Results

    RHEL 6.5    2.6.32-431.23.3.el6.x86_64              750 Kiops @ 55% CPU utilization
    RHEL 6.6    2.6.32-504.el6.x86_64                   139 Kiops @ 97% CPU utilization  <-- problem
    RHEL 7.0    3.10.0-123.9.3.el7.x86_64               753 Kiops @ 59% CPU utilization
    CentOS 6.5  2.6.32-431.29.2.el6.centos.plus.x86_64  753 Kiops @ 58% CPU utilization
    CentOS 6.6  2.6.32-504.1.3.el6.x86_64               749 Kiops @ 55% CPU utilization
    CentOS 7.0  3.10.0-123.9.2.el7.x86_64               749 Kiops @ 59% CPU utilization
    

    FIO Output Sample (RHEL 6.6)
    Measure_RR_4KB_QD256: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
    ...
    Measure_RR_4KB_QD256: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
    fio-2.0.13
    Starting 4 processes
    Jobs: 4 (f=4): [rrrr] [100.0% done] [543.5M/0K/0K /s] [139K/0 /0 iops] [eta 00m:00s]
    Measure_RR_4KB_QD256: (groupid=0, jobs=4): err= 0: pid=5510: Mon Dec 8 12:24:54 2014
    read : io=161844MB, bw=552427KB/s, iops=138106 , runt=300001msec
    slat (usec): min=0 , max=94120 , avg=21.57, stdev=96.18
    clat (usec): min=7 , max=96788 , avg=1828.19, stdev=801.42
    lat (usec): min=94 , max=96875 , avg=1850.74, stdev=806.36
    clat percentiles (usec):
    | 1.00th=[ 812], 5.00th=[ 1064], 10.00th=[ 1208], 20.00th=[ 1384],
    | 30.00th=[ 1512], 40.00th=[ 1640], 50.00th=[ 1768], 60.00th=[ 1880],
    | 70.00th=[ 2024], 80.00th=[ 2192], 90.00th=[ 2448], 95.00th=[ 2672],
    | 99.00th=[ 3280], 99.50th=[ 3856], 99.90th=[11840], 99.95th=[13120],
    | 99.99th=[21632]
    bw (KB/s) : min=76848, max=154416, per=25.01%, avg=138144.24, stdev=7932.83
    lat (usec) : 10=0.01%, 50=0.01%, 100=0.01%, 250=0.01%, 500=0.03%
    lat (usec) : 750=0.56%, 1000=2.98%
    lat (msec) : 2=64.83%, 4=31.12%, 10=0.30%, 20=0.16%, 50=0.01%
    lat (msec) : 100=0.01%
    cpu : usr=20.09%, sys=77.38%, ctx=165487, majf=0, minf=354
    IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
    submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
    complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
    issued : total=r=41432135/w=0/d=0, short=r=0/w=0/d=0

    Run status group 0 (all jobs):
    READ: io=161844MB, aggrb=552426KB/s, minb=552426KB/s, maxb=552426KB/s, mint=300001msec, maxt=300001msec
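    As a sanity check, the bandwidth, IOPS, and runtime in the summary line above are mutually consistent for 4 KiB reads; a small sketch using the reported numbers:

```python
# Cross-check the fio summary line:
#   read : io=161844MB, bw=552427KB/s, iops=138106, runt=300001msec
iops = 138106
block_kb = 4                 # 4 KiB random reads
runtime_s = 300001 / 1000    # runt is reported in msec

bw_kb_s = iops * block_kb             # expected bandwidth from IOPS
io_mb = bw_kb_s * runtime_s / 1024    # expected total I/O in MB

print(f"bw ~ {bw_kb_s} KB/s")   # close to the reported 552427 KB/s
print(f"io ~ {io_mb:.0f} MB")   # close to the reported 161844 MB
```

    The numbers line up, so the 139 Kiops figure is internally consistent and not a reporting artifact.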

    FIO Job Parameters
    ;Async Test CPU Utilization
    ;======================
    ; -- start job file --
    [Measure_RR_4KB_QD256]
    ioengine=libaio
    direct=1
    rw=randread
    norandommap
    randrepeat=0
    iodepth=64
    size=25%
    numjobs=4
    bs=4k
    overwrite=1
    filename=/dev/nvme1n1
    runtime=5m
    time_based
    group_reporting
    stonewall
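    Note that the QD256 in the job name follows from the parameters above: each of the 4 jobs keeps 64 I/Os in flight, so the device sees up to 256 outstanding requests. A one-line sketch of that arithmetic:

```python
# Effective queue depth of the job above: per-job iodepth times numjobs.
iodepth = 64
numjobs = 4
effective_qd = iodepth * numjobs
print(effective_qd)  # 256, matching the QD256 in the job name
```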
