RHEL 6.5 - "Kernel: multipath, not tainted" alarms
Hi, this is my first post on the RHEL forums! I feel the need to post because, since the most recent kernel patch, I have been seeing the following errors:
Feb 10 05:33:18 know-iprd-testaql1 kernel: INFO: task multipathd:4620 blocked for more than 120 seconds.
Feb 10 05:33:18 know-iprd-testaql1 kernel: Not tainted 2.6.32-431.3.1.el6.x86_64 #1
Feb 10 05:33:18 know-iprd-testaql1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 10 05:33:18 know-iprd-testaql1 kernel: multipathd D 0000000000000005 0 4620 1 0x00000000
Feb 10 05:33:18 know-iprd-testaql1 kernel: ffff88105ac4f968 0000000000000082 0000000000000000 ffff880b48cf8880
Feb 10 05:33:18 know-iprd-testaql1 kernel: ffff881061c305e0 ffff881063c21000 ffffffff8137ffe0 ffff880b48ebee40
Feb 10 05:33:18 know-iprd-testaql1 kernel: ffff88105a828638 ffff88105ac4ffd8 000000000000fbc8 ffff88105a828638
Feb 10 05:33:18 know-iprd-testaql1 kernel: Call Trace:
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8137ffe0>] ? scsi_done+0x0/0x60
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff815287c5>] schedule_timeout+0x215/0x2e0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff810841dd>] ? del_timer+0x7d/0xe0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff81528443>] wait_for_common+0x123/0x180
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8152855d>] wait_for_completion+0x1d/0x20
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8126d9ec>] blk_execute_rq+0x8c/0xf0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8126d730>] ? blk_rq_map_user+0x1a0/0x280
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff81271f15>] sg_io+0x215/0x3d0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff812728a0>] scsi_cmd_ioctl+0x400/0x470
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff81272961>] scsi_cmd_blk_ioctl+0x51/0x70
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffffa00bc5bf>] sd_ioctl+0xaf/0x110 [sd_mod]
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8126f637>] __blkdev_driver_ioctl+0x67/0x80
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8126fabd>] blkdev_ioctl+0x1ed/0x6e0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff811c46dc>] block_ioctl+0x3c/0x40
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8119db52>] vfs_ioctl+0x22/0xa0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff81282715>] ? _atomic_dec_and_lock+0x55/0x80
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8119dcf4>] do_vfs_ioctl+0x84/0x580
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8118a7e1>] ? __fput+0x1a1/0x210
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8119e271>] sys_ioctl+0x81/0xa0
Feb 10 05:33:18 know-iprd-testaql1 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Feb 10 05:35:18 know-iprd-testaql1 kernel: INFO: task multipathd:4620 blocked for more than 120 seconds.
Feb 10 05:35:18 know-iprd-testaql1 kernel: Not tainted 2.6.32-431.3.1.el6.x86_64 #1
Feb 10 05:35:18 know-iprd-testaql1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 10 05:35:18 know-iprd-testaql1 kernel: multipathd D 0000000000000005 0 4620 1 0x00000000
Feb 10 05:35:18 know-iprd-testaql1 kernel: ffff88105ac4f968 0000000000000082 0000000000000000 ffff880b48cf8880
Feb 10 05:35:18 know-iprd-testaql1 kernel: ffff881061c305e0 ffff881063c21000 ffffffff8137ffe0 ffff880b48ebee40
Feb 10 05:35:18 know-iprd-testaql1 kernel: ffff88105a828638 ffff88105ac4ffd8 000000000000fbc8 ffff88105a828638
Feb 10 05:35:18 know-iprd-testaql1 kernel: Call Trace:
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8137ffe0>] ? scsi_done+0x0/0x60
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff815287c5>] schedule_timeout+0x215/0x2e0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff810841dd>] ? del_timer+0x7d/0xe0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff81528443>] wait_for_common+0x123/0x180
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8152855d>] wait_for_completion+0x1d/0x20
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8126d9ec>] blk_execute_rq+0x8c/0xf0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8126d730>] ? blk_rq_map_user+0x1a0/0x280
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff81271f15>] sg_io+0x215/0x3d0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff812728a0>] scsi_cmd_ioctl+0x400/0x470
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff81272961>] scsi_cmd_blk_ioctl+0x51/0x70
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffffa00bc5bf>] sd_ioctl+0xaf/0x110 [sd_mod]
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8126f637>] __blkdev_driver_ioctl+0x67/0x80
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8126fabd>] blkdev_ioctl+0x1ed/0x6e0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff811c46dc>] block_ioctl+0x3c/0x40
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8119db52>] vfs_ioctl+0x22/0xa0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff81282715>] ? _atomic_dec_and_lock+0x55/0x80
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8119dcf4>] do_vfs_ioctl+0x84/0x580
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8118a7e1>] ? __fput+0x1a1/0x210
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8119e271>] sys_ioctl+0x81/0xa0
Feb 10 05:35:18 know-iprd-testaql1 kernel: [<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Feb 10 05:36:17 know-iprd-testaql1 multipathd: sdq: couldn't get asymmetric access state
Feb 10 05:36:17 know-iprd-testaql1 kernel: qla2xxx [0000:06:00.0]-801c:0: Abort command issued nexus=0:1:4 -- 1 2002.
Feb 10 05:36:17 know-iprd-testaql1 multipathd: data04: load table [0 419430400 multipath 1 queue_if_no_path 0 3 2 round-robin 0 1 1 65:0 1 round-robin 0 4 1 65:192 1 66:128 1 67:64 1 68:0 1 round-robin 0 3 1 8:64 1 68:192 1 69:128 1]
Feb 10 05:36:37 know-iprd-testaql1 multipathd: data04: load table [0 419430400 multipath 1 queue_if_no_path 0 2 1 round-robin 0 4 1 65:192 1 66:128 1 67:64 1 68:0 1 round-robin 0 4 1 65:0 1 8:64 1 68:192 1 69:128 1]
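For anyone decoding those load-table lines: the string inside the brackets is a standard device-mapper multipath table. A minimal sketch that pulls apart the leading fields of the first table logged above (the field layout is the generic dm-multipath target format; the GiB conversion assumes 512-byte sectors):

```shell
# Leading fields of a dm-multipath table:
#   <start> <length> multipath <#features> <features...> <#hwhandler-args> <#path-groups> <first-group> ...
table='0 419430400 multipath 1 queue_if_no_path 0 3 2'
set -- $table
echo "start sector : $1"
echo "length       : $2 sectors ($(( $2 / 2097152 )) GiB)"  # 2097152 512-byte sectors per GiB
echo "target type  : $3"
echo "features     : $4 ($5)"
echo "path groups  : $7, first group tried: $8"
```

Read that way, the two log lines above show multipathd regrouping the same 200 GiB LUN from 3 path groups down to 2 as a path moved between groups.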
I don't know whether this was an issue before the patch, sorry. What I do know is that it worries me. Should it?
What is actually broken here, if anything? Is there a fix?
Many thanks,
Mark Hands
Responses
Hey Mark - welcome to the forum!
Whenever I see "blocked for more than 120 seconds." I become concerned.
I suspect it's not your kernel that has broken anything (although it is possible) - multipath might have been patched at the same time. Could you post the output of:
ls -l /etc/multipath.conf
grep -v ^# /etc/multipath.conf
multipath -ll -v2
Also - could you let us know what type of storage array you are using (to validate the multipath config) and whether you are booting from SAN (as the initrd would likely need some attention too).
I should preface by stating that you should absolutely open a case if this is urgent/important. We love to help folks at the forum, but certainly don't want to impact your business while we help you get through this ;-)
I have not worked with the V7000 on RHEL yet, but 8 paths to each LUN seems odd (although it is absolutely possible). The fact that they have different priorities also seems odd. I would check IBM's recommendations for the multipath config. I would also check whether the patching left behind an old version of multipath.conf, as it appears something updated that file this morning. If you use Satellite and configuration channels, there may be a copy in /var/lib/rhncfg/backups/
find /etc/ -name "multipath*" -exec ls -lh {} \;
find /var/ -name "multipath*" -exec ls -lh {} \;
Please take a look at this
http://pic.dhe.ibm.com/infocenter/storwize/ic/topic/com.ibm.storwize.v7000.doc/svc_linux_settings.html
specifically the global-device section:
device {
vendor "IBM"
product "2145"
path_grouping_policy group_by_prio
getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
features "1 queue_if_no_path"
prio alua
path_checker tur
failback immediate
no_path_retry "5"
rr_min_io 1
# polling_interval 30
dev_loss_tmo 120
}
You will likely NOT want to use getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n" - as that will change the "common name" (e.g. data02) that you have implemented elsewhere in your config, I believe.
It shouldn't change the devices' aliases. The getuid_callout is just for pulling devid (typically the LUN serial) information from the SCSI device's VPD pages. The devid is what gets populated into the parenthesized field of the multipath -ll output, and it is what the OP is using to bind his aliases (the "data02" labels) to.
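For reference, that alias binding lives in the multipaths section of /etc/multipath.conf; the WWID shown here is a placeholder, not taken from the OP's system:

```
multipaths {
    multipath {
        # WWID is whatever getuid_callout/scsi_id reports for the LUN
        # (the value shown in parentheses by multipath -ll)
        wwid   <WWID from multipath -ll>
        alias  data02
    }
}
```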
The "dev_loss_tmo 120" declaration lines up with the "INFO: task multipathd:4620 blocked for more than 120 seconds" message in the original log (the message itself comes from the kernel's hung-task watchdog, which also defaults to 120 seconds). Basically, an I/O that was sent down one of his eight paths to the LUN failed to complete within the time specified by the "dev_loss_tmo 120" declaration. Assuming the failed I/O was able to be shunted to another path and successfully committed, there should be no damage to the data. Mostly, what it points to is that there have been I/O failures down a given path in the path-group.
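If you want to see the dev_loss_tmo actually in effect on each FC remote port, it's exposed in sysfs. A sketch (the scaffold lines fake up the sysfs tree so the loop can be tried anywhere; on a live host, drop them and let SYSFS point at the real path):

```shell
SYSFS=/sys/class/fc_remote_ports
# --- scaffold: fake the sysfs layout so the loop is runnable anywhere ---
SYSFS=$(mktemp -d)
mkdir -p "$SYSFS/rport-0:0-1"
echo 120 > "$SYSFS/rport-0:0-1/dev_loss_tmo"
# ------------------------------------------------------------------------
for rport in "$SYSFS"/rport-*; do
    printf '%s dev_loss_tmo=%s\n' "${rport##*/}" "$(cat "$rport/dev_loss_tmo")"
done
```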
Looking further into the logs, it looks like the /dev/sdq devnode/path (a member of the "data04" metadevice) is what experienced the error. The fact that sdq is currently showing as good means it was likely a self-correcting/transient condition. That said, if these are regular occurrences, it would be a good idea to talk to the SAN admin and see whether the array (or just one of its HBAs) or the fabric has been experiencing issues that would make that path unavailable for more than 120 seconds at a go.
I'll have to do some more digging. I'm surprised that UDEV is still recommended - and that recommendation might be release-specific.
Queue depth is typically tuned on a per-HBA basis (e.g. if you had a database host with SAN storage plus directly attached fibre-channel tape devices, you would tune the cards differently).
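On the qla2xxx driver that appears in the log above, the per-LUN queue depth is capped with the ql2xmaxqdepth module parameter; the value 32 here is purely illustrative - take the real number from your array vendor's guide:

```
# /etc/modprobe.d/qla2xxx.conf -- illustrative value only
options qla2xxx ql2xmaxqdepth=32
```

The driver has to be reloaded (or, if booting from SAN, the initrd rebuilt and the host rebooted) before a new depth takes effect.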
Do you know what kind of HBAs you have installed? Are you using the included driver, or the stack provided by the vendor (Emulex or Qlogic, etc...)?
lspci | egrep -i 'fibre|fiber'
Queue depth is an important tuning parameter, but in general a misconfigured value makes a box run poorly rather than stopping I/O completely. When I have seen that hung_task_timeout message, it has typically indicated that the OS/kernel was not able to access something on disk.
Does the V7000 do active/active multipathing, or primary/standby? Are you able to see (on the array itself) whether the path is transferring between the array controllers? It's been about 3 years since I messed around with one of those arrays.
Hopefully I get some time to dig deeper on this and I can help you out.
I don't have any particular stance on UDEV. UDEV is quite perplexing to me, actually. It seemed as though releases had relied on UDEV heavily for a while, then they found some security issue with it, something changed, and suddenly I didn't hear about it for quite some time - then it pops up on the forum ;-)
We HAD to use UDEV for a while when Oracle ASM was not supported on RHEL 6. We have regretted the decision to use it (for this particular function).
So, to answer your question: I'm not sure whether UDEV was/is necessary. IBM's documentation is usually pretty awesome, but the docs we found might be a bit dated. We use Emulex, and the included driver stack is actually sufficient for us.
Did you happen across this doc?
https://www-304.ibm.com/partnerworld/wps/servlet/download/DownloadServlet?id=6bQjQiT_N1PiPCA$cnt&attachmentName=Linux_IO_performance.pdf&token=MTM5MjIzMzk2NjUwMA==&locale=en_ALL_ZZ
Sorry to add such ambiguity into the conversation! ;-) I wish I had a better answer.
Unless you're using an older EL5-based system, you generally don't want to muck with /etc/modprobe.conf. Its use is deprecated in EL6 and even in later EL5s.
That said, in general, if a given file doesn't exist in /etc/modprobe.d, you can just go ahead and create it, populating it with the suggested information. The fun of /etc/modprobe.d files is that, while there may be naming conventions promulgated by a module maker, there's not really any specific name you must give your files: absent a vendor filename recommendation, just pick something meaningful to you (i.e., "self-documenting").
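Creating such a file is just a matter of dropping an options line into /etc/modprobe.d. A sketch (written to a temp dir so it can be run harmlessly; the lpfc_lun_queue_depth value is illustrative, not a recommendation):

```shell
# Stand-in for /etc/modprobe.d -- use the real directory on an actual host.
TARGET=$(mktemp -d)
cat > "$TARGET/lpfc-tuning.conf" <<'EOF'
# Emulex HBA tuning -- pick a filename that documents itself.
options lpfc lpfc_lun_queue_depth=30
EOF
grep '^options' "$TARGET/lpfc-tuning.conf"
```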
