MRG 2.4 / realtime kernel - iostat shows intermittent stalled and/or stuck I/O on iSCSI device

Solution In Progress - Updated -

Issue

  • A problem shows up intermittently on my server (once every 3-24 hours).
  • The application writing to the iSCSI LUN is primarily write-biased; the data is used by a secondary node to take over when a node fails.
  • The performance monitor on the 4730 shows that during a period of '0 reads/writes' the array still reports '220 writes/sec' over its 5-second monitoring periods.
  • The 'iostat' command shows isolated 5-second intervals with 0 reads/writes while one or two commands remain queued and the device reports 100% utilization:
date                    Device  rrqm/s  wrqm/s  r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
05/20/2014 03:40:49 PM  sdb          0   607.2    0  198.4       0   6452.8     32.52      0.17   0.84   0.76   15.1
05/20/2014 03:40:54 PM  sdb          0     604    0  202.6       0   6452.8     31.85       0.2   0.98   0.79  16.06
05/20/2014 03:40:59 PM  sdb          0   594.8  0.2  183.8     1.6   6228.8     33.86      0.16   0.86   0.81   14.9
05/20/2014 03:41:04 PM  sdb          0   618.4    0  208.4       0   6614.4     31.74      0.18   0.85   0.75  15.68
05/20/2014 03:41:09 PM  sdb          0   452.8    0    155       0   4857.6     31.34       0.4   0.84   2.48  38.48
05/20/2014 03:41:14 PM  sdb          0       0    0      0       0        0         0         1      0      0    100   <--------- 100% util, 1 command queued
05/20/2014 03:41:19 PM  sdb          0  1266.8    0    385       0  13219.2     34.34     14.95  42.13   1.39  53.48
05/20/2014 03:41:24 PM  sdb          0     568    0  171.8       0   5918.4     34.45      0.15   0.87   0.84  14.48
05/20/2014 03:41:29 PM  sdb          0   597.2    0    188       0   6281.6     33.41      0.16   0.83   0.79  14.94
05/20/2014 03:41:34 PM  sdb          0   651.8    0    233       0   7078.4     30.38      0.19   0.82   0.75  17.42

05/20/2014 06:31:24 PM  sdb          0   498.8    0  177.4       0   5409.6     30.49      0.54   3.04   2.96  52.56
05/20/2014 06:31:29 PM  sdb          0   826.8    0  305.6       0   9059.2     29.64      0.23   0.74   0.64  19.64
05/20/2014 06:31:34 PM  sdb          0     657    0    243       0     7200     29.63      0.22   0.89   0.73  17.66
05/20/2014 06:31:39 PM  sdb          0   585.6    0  183.2       0   6150.4     33.57      0.15   0.84   0.75  13.72
05/20/2014 06:31:44 PM  sdb          0   582.4    0  176.6       0     6072     34.38      0.15   0.85   0.78  13.84
05/20/2014 06:31:49 PM  sdb          0   494.8    0  152.8       0     5160     33.77      0.45   0.84   1.86  28.44
05/20/2014 06:31:54 PM  sdb          0       0    0      0       0        0         0         2      0      0    100   <--------- 100% util, 2 commands queued
05/20/2014 06:31:59 PM  sdb          0  1218.6    0  366.8       0    12704     34.63      0.34   7.25   0.69  25.36
05/20/2014 06:32:04 PM  sdb          0   587.4    0    188       0   6203.2        33      0.16   0.84   0.78  14.68
05/20/2014 06:32:09 PM  sdb          0   603.6    0  204.8       0   6467.2     31.58      0.21   1.03   0.76   15.6
05/20/2014 06:32:14 PM  sdb          0   613.6    0  198.2       0   6486.4     32.73      0.18   0.88   0.85  16.94
05/20/2014 06:32:19 PM  sdb          0   587.6  0.2  181.4     1.6     6160     33.93      0.16   0.88   0.82  14.92
05/20/2014 06:32:24 PM  sdb          0     583    0  179.2       0   6097.6     34.03      0.15   0.85   0.81  14.52
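
Because the stalls occur only once every few hours, it can help to scan a long iostat capture for them automatically. The following is a minimal sketch, not part of the original report: it assumes each sample line has the layout shown above (date, time, AM/PM, device, then the eleven numeric columns), and the names `Sample`, `parse_line`, `find_stalls` and the `util_threshold` parameter are hypothetical. It flags an interval as a stall when r/s and w/s are both 0, at least one command is queued (avgqu-sz > 0) and %util is at or near 100.

```python
from typing import List, NamedTuple, Optional


class Sample(NamedTuple):
    timestamp: str
    device: str
    r_s: float
    w_s: float
    avgqu_sz: float
    util: float


def parse_line(line: str) -> Optional[Sample]:
    """Parse one timestamped iostat line; return None for headers or blanks."""
    parts = line.split()
    # date, time, AM/PM, device + 11 numeric columns = 15 tokens minimum.
    # Anything after token 15 (e.g. a "<---" annotation) is ignored.
    if len(parts) < 15:
        return None
    try:
        nums = [float(x) for x in parts[4:15]]
    except ValueError:
        return None
    # Column order: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
    #               avgrq-sz avgqu-sz await svctm %util
    return Sample(
        timestamp=" ".join(parts[:3]),
        device=parts[3],
        r_s=nums[2],
        w_s=nums[3],
        avgqu_sz=nums[7],
        util=nums[10],
    )


def find_stalls(lines: List[str], util_threshold: float = 99.0) -> List[Sample]:
    """Return samples with no completed I/O but a non-empty queue at ~100% util."""
    stalls = []
    for line in lines:
        sample = parse_line(line)
        if sample is None:
            continue
        if (sample.r_s == 0 and sample.w_s == 0
                and sample.avgqu_sz > 0 and sample.util >= util_threshold):
            stalls.append(sample)
    return stalls


# Three lines taken from the capture above.
CAPTURE = """\
05/20/2014 03:41:09 PM  sdb 0 452.8 0 155 0 4857.6 31.34 0.4 0.84 2.48 38.48
05/20/2014 03:41:14 PM  sdb 0 0 0 0 0 0 0 1 0 0 100
05/20/2014 03:41:19 PM  sdb 0 1266.8 0 385 0 13219.2 34.34 14.95 42.13 1.39 53.48
"""

if __name__ == "__main__":
    for s in find_stalls(CAPTURE.splitlines()):
        print(s.timestamp, s.device, "queued:", s.avgqu_sz)
```

Run against the full capture, a filter like this would be expected to report only the 03:41:14 PM and 06:31:54 PM intervals; the threshold of 99.0 is an arbitrary choice and may need adjusting.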

Environment

  • Red Hat Enterprise MRG Realtime 2.4
    • seen on RT kernel 3.8.13-rt27
  • Specific hardware
    • HP BL460g8 servers with Emulex 10G network cards and an HP 6120XG chassis switch. The storage in use is three HP 4730 units in a cluster.
    • Emulex card:
        driver: be2net
        version: 4.6.62.0u
        firmware-version: 4.6.247.5
  • iSCSI
    • Linux software iSCSI initiator over be2net / Emulex NIC
    • LeftHand iSCSI target (LEFTHAND iSCSIDisk)
  • Network (used by iSCSI)
    • bond with 2 interfaces
  • application with threads running at a high realtime priority
