MRG 2.4 / realtime kernel - iostat shows intermittent stalled and/or stuck I/O on iSCSI device
Issue
- I/O to an iSCSI device stalls intermittently on my server (once every 3-24 hours).
- The application writing to the iSCSI LUN is primarily write-biased; the data is written so that a secondary node can take over when a node fails.
- During these periods of 0 reads/writes, the performance monitor on the 4730 array still reports '220 writes/sec' for its 5-second monitoring periods.
- The 'iostat' command shows single intervals with 0 reads/writes but one or two commands queued and the device at 100% util:
date                    Device  rrqm/s  wrqm/s   r/s    w/s  rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
05/20/2014 03:40:49 PM  sdb          0   607.2     0  198.4       0   6452.8     32.52      0.17   0.84   0.76   15.1
05/20/2014 03:40:54 PM  sdb          0     604     0  202.6       0   6452.8     31.85       0.2   0.98   0.79  16.06
05/20/2014 03:40:59 PM  sdb          0   594.8   0.2  183.8     1.6   6228.8     33.86      0.16   0.86   0.81   14.9
05/20/2014 03:41:04 PM  sdb          0   618.4     0  208.4       0   6614.4     31.74      0.18   0.85   0.75  15.68
05/20/2014 03:41:09 PM  sdb          0   452.8     0    155       0   4857.6     31.34       0.4   0.84   2.48  38.48
05/20/2014 03:41:14 PM  sdb          0       0     0      0       0        0         0         1      0      0    100  <--- 100% util, 1 command queued
05/20/2014 03:41:19 PM  sdb          0  1266.8     0    385       0  13219.2     34.34     14.95  42.13   1.39  53.48
05/20/2014 03:41:24 PM  sdb          0     568     0  171.8       0   5918.4     34.45      0.15   0.87   0.84  14.48
05/20/2014 03:41:29 PM  sdb          0   597.2     0    188       0   6281.6     33.41      0.16   0.83   0.79  14.94
05/20/2014 03:41:34 PM  sdb          0   651.8     0    233       0   7078.4     30.38      0.19   0.82   0.75  17.42
05/20/2014 06:31:24 PM  sdb          0   498.8     0  177.4       0   5409.6     30.49      0.54   3.04   2.96  52.56
05/20/2014 06:31:29 PM  sdb          0   826.8     0  305.6       0   9059.2     29.64      0.23   0.74   0.64  19.64
05/20/2014 06:31:34 PM  sdb          0     657     0    243       0     7200     29.63      0.22   0.89   0.73  17.66
05/20/2014 06:31:39 PM  sdb          0   585.6     0  183.2       0   6150.4     33.57      0.15   0.84   0.75  13.72
05/20/2014 06:31:44 PM  sdb          0   582.4     0  176.6       0     6072     34.38      0.15   0.85   0.78  13.84
05/20/2014 06:31:49 PM  sdb          0   494.8     0  152.8       0     5160     33.77      0.45   0.84   1.86  28.44
05/20/2014 06:31:54 PM  sdb          0       0     0      0       0        0         0         2      0      0    100  <--- 100% util, 2 commands queued
05/20/2014 06:31:59 PM  sdb          0  1218.6     0  366.8       0    12704     34.63      0.34   7.25   0.69  25.36
05/20/2014 06:32:04 PM  sdb          0   587.4     0    188       0   6203.2        33      0.16   0.84   0.78  14.68
05/20/2014 06:32:09 PM  sdb          0   603.6     0  204.8       0   6467.2     31.58      0.21   1.03   0.76   15.6
05/20/2014 06:32:14 PM  sdb          0   613.6     0  198.2       0   6486.4     32.73      0.18   0.88   0.85  16.94
05/20/2014 06:32:19 PM  sdb          0   587.6   0.2  181.4     1.6     6160     33.93      0.16   0.88   0.82  14.92
05/20/2014 06:32:24 PM  sdb          0     583     0  179.2       0   6097.6     34.03      0.15   0.85   0.81  14.52
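The stall signature in the output above (no reads or writes completed, a non-zero queue, 100% util) can be picked out of a long capture automatically. The following is a minimal sketch, not a supported tool: the names `Sample`, `parse_row`, `is_stall` and `find_stalls` are hypothetical, the 99% threshold is an arbitrary assumption, and the column order is assumed to match the header shown above (timestamp in three whitespace-separated tokens, then the device, then 11 numeric fields).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Sample:
    timestamp: str       # e.g. "05/20/2014 03:41:14 PM"
    device: str          # e.g. "sdb"
    reads_per_s: float   # r/s column
    writes_per_s: float  # w/s column
    avg_queue: float     # avgqu-sz column
    util_pct: float      # %util column


def parse_row(line: str) -> Optional[Sample]:
    """Parse one data row; return None for headers or malformed lines."""
    tokens = line.split()
    # 3 timestamp tokens + device + 11 numeric columns = 15 tokens minimum.
    if len(tokens) < 15:
        return None
    try:
        values = [float(t) for t in tokens[4:15]]
    except ValueError:
        return None
    # Assumed column order: rrqm/s wrqm/s r/s w/s rsec/s wsec/s
    #                       avgrq-sz avgqu-sz await svctm %util
    _, _, r, w, _, _, _, avgqu, _, _, util = values
    return Sample(" ".join(tokens[:3]), tokens[3], r, w, avgqu, util)


def is_stall(s: Sample, util_threshold: float = 99.0) -> bool:
    """True when nothing completed but requests were still outstanding."""
    return (
        s.reads_per_s == 0
        and s.writes_per_s == 0
        and s.avg_queue > 0
        and s.util_pct >= util_threshold
    )


def find_stalls(lines: Iterable[str]) -> List[Sample]:
    """Return every interval in the capture that matches the stall signature."""
    parsed = (parse_row(line) for line in lines)
    return [s for s in parsed if s is not None and is_stall(s)]


# Three rows copied from the capture above; trailing annotations are ignored.
SAMPLE = """\
05/20/2014 03:41:09 PM sdb 0 452.8 0 155 0 4857.6 31.34 0.4 0.84 2.48 38.48
05/20/2014 03:41:14 PM sdb 0 0 0 0 0 0 0 1 0 0 100 <--- 100% util, 1 command queued
05/20/2014 03:41:19 PM sdb 0 1266.8 0 385 0 13219.2 34.34 14.95 42.13 1.39 53.48
"""

if __name__ == "__main__":
    for stall in find_stalls(SAMPLE.splitlines()):
        print(f"{stall.timestamp} {stall.device} queued={stall.avg_queue:g}")
```

Run against the three sample rows, the sketch flags only the 03:41:14 PM interval. An average queue size of 1 at 100% util over a 5-second interval is consistent with a single request staying outstanding for the whole interval without completing.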
Environment
- Red Hat Enterprise MRG Realtime 2.4
  - seen on RT kernel 3.8.13-rt27
- Specific hardware
  - HP BL460g8 servers with Emulex 10G network cards and an HP 6120XG chassis switch. The storage in use is a cluster of 3 HP 4730 units.
  - Emulex card:
      driver: be2net
      version: 4.6.62.0u
      firmware-version: 4.6.247.5
- iSCSI
  - Linux software iSCSI initiator over be2net / Emulex NIC
  - LeftHand iSCSI target (LEFTHAND iSCSIDisk)
- Network (used by iSCSI)
  - bond with 2 interfaces
- application with threads running at a high realtime priority