Red Hat Enterprise Linux 6.1 kdump fails to create a vmcore over NFS on an HP BL460c-G7.
Environment
- Red Hat Enterprise Linux 6.1
- kdump using net (nfs)
- SmartArray P410i
- HP BL460c-G7
Issue
- kdump hangs while loading the hpsa driver, with the following insmod stack trace, when trying to generate a vmcore over NFS on an HP BL460c-G7 system:
HP HPSA Driver (v 2.0.2-3)
hpsa 0000:0c:00.0: using doorbell to reset controller
hpsa 0000:0c:00.0: PCI INT A -> Link[LNKA] -> GSI 5 (level, low) -> IRQ 5
hpsa 0000:0c:00.0: Waiting for board to reset.
hpsa 0000:0c:00.0: board ready after hard reset.
hpsa 0000:0c:00.0: Waiting for controller to respond to no-op
hpsa 0000:0c:00.0: controller message 03:00 succeeded
hpsa 0000:0c:00.0: PCI INT A -> Link[LNKA] -> GSI 5 (level, low) -> IRQ 5
hpsa 0000:0c:00.0: MSIX
hpsa 0000:0c:00.0: hpsa0: <0x323a> at IRQ 48 using DAC
INFO: task insmod:170 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
insmod D 0000000000000000 0 170 1 0x00000000
ffff880008bbba48 0000000000000086 0000000100000000 0000000000000000
ffff880000019a50 00000037ffffffc8 ffff88000002ae40 00000000fffd0426
ffff8800089dd0b8 ffff880008bbbfd8 000000000000f598 ffff8800089dd0b8
Call Trace:
[<ffffffff814dba85>] schedule_timeout+0x215/0x2e0
[<ffffffff8111fb31>] ? __alloc_pages_nodemask+0x111/0x8b0
[<ffffffff814dad57>] ? thread_return+0x4e/0x777
[<ffffffff814db703>] wait_for_common+0x123/0x180
[<ffffffff8105dc20>] ? default_wake_function+0x0/0x20
[<ffffffffa0105a3b>] ? enqueue_cmd_and_start_io+0x11b/0x180 [hpsa]
[<ffffffff814db81d>] wait_for_completion+0x1d/0x20
[<ffffffffa0105efc>] hpsa_scsi_do_simple_cmd_with_retry+0x6c/0xf0 [hpsa]
[<ffffffffa0107d19>] hpsa_scsi_do_inquiry+0x79/0x100 [hpsa]
[<ffffffffa010c2cc>] hpsa_init_one+0xfc4/0x1294 [hpsa]
[<ffffffff811e6a45>] ? sysfs_addrm_finish+0x25/0x290
[<ffffffff81280ba7>] local_pci_probe+0x17/0x20
[<ffffffff81281d91>] pci_device_probe+0x101/0x120
[<ffffffff8133b552>] ? driver_sysfs_add+0x62/0x90
[<ffffffff8133b6f0>] driver_probe_device+0xa0/0x2a0
[<ffffffff8133b99b>] __driver_attach+0xab/0xb0
[<ffffffff8133b8f0>] ? __driver_attach+0x0/0xb0
[<ffffffff8133a954>] bus_for_each_dev+0x64/0x90
[<ffffffff8133b48e>] driver_attach+0x1e/0x20
[<ffffffff8133ad90>] bus_add_driver+0x200/0x300
[<ffffffff8133bcc6>] driver_register+0x76/0x140
[<ffffffff814e0735>] ? notifier_call_chain+0x55/0x80
[<ffffffff81281ff6>] __pci_register_driver+0x56/0xd0
[<ffffffff810943c5>] ? __blocking_notifier_call_chain+0x65/0x80
[<ffffffffa0113000>] ? hpsa_init+0x0/0x20 [hpsa]
[<ffffffffa011301e>] hpsa_init+0x1e/0x20 [hpsa]
[<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0
[<ffffffff810aca7f>] sys_init_module+0xdf/0x250
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Resolution
- Upgrade to kernel-2.6.32-220.el6 or later.
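To check whether a given kernel release already meets the version named in the resolution, the build number can be compared against 220. The following is a minimal sketch, not an official tool: the function name kernel_has_fix is hypothetical, and it assumes the usual Red Hat Enterprise Linux 6 release format 2.6.32-<build>..., as printed by uname -r.

```shell
# Hypothetical helper: succeed if a RHEL 6 kernel release string is
# 2.6.32-220 or later. Assumes the form "2.6.32-<build>[.<more>].el6...".
kernel_has_fix() {
    release="$1"
    case "$release" in
        2.6.32-*) ;;          # expected RHEL 6 kernel base version
        *) return 2 ;;        # not a RHEL 6 style release string
    esac
    build="${release#2.6.32-}"    # drop the base version prefix
    build="${build%%[!0-9]*}"     # keep only the leading build number
    [ -n "$build" ] && [ "$build" -ge 220 ]
}
```

For example, kernel_has_fix "$(uname -r)" && echo "kernel is 2.6.32-220 or later" reports on the running kernel. Only the leading build number is compared, so a release such as 2.6.32-220.4.1.el6 is treated the same as 2.6.32-220.el6.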
Workaround
- Blacklist the hpsa module in /etc/kdump.conf:
blacklist hpsa
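The workaround above can be applied by hand or scripted. The following is a minimal sketch of a script that adds the directive only if it is not already present; the function name add_kdump_blacklist is hypothetical, and the configuration file path is passed in rather than assumed.

```shell
# Hypothetical helper: append "blacklist <module>" to a kdump configuration
# file unless an equivalent line is already there.
add_kdump_blacklist() {
    conf="$1"      # path to the kdump configuration file
    module="$2"    # module name to blacklist, e.g. hpsa
    if grep -Eq "^[[:space:]]*blacklist[[:space:]]+${module}([[:space:]]|\$)" "$conf"; then
        return 0   # directive already present, nothing to do
    fi
    printf 'blacklist %s\n' "$module" >> "$conf"
}
```

A typical invocation would be add_kdump_blacklist /etc/kdump.conf hpsa, run as root. The kdump service then needs to be restarted (service kdump restart on Red Hat Enterprise Linux 6) so that the change is picked up; on this release the restart is expected to rebuild the kdump initrd when the configuration has changed.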
Root Cause
Diagnostic Steps
- Verify that the SmartArray controller supports hard reset and is on the supported list of hpsa / cciss controllers.
hpsa 0000:0c:00.0: using doorbell to reset controller
hpsa 0000:0c:00.0: PCI INT A -> Link[LNKA] -> GSI 5 (level, low) -> IRQ 5
hpsa 0000:0c:00.0: Waiting for board to reset.
hpsa 0000:0c:00.0: board ready after hard reset.
hpsa 0000:0c:00.0: Waiting for controller to respond to no-op
hpsa 0000:0c:00.0: controller message 03:00 succeeded
- Verify that the issue is not the known be2net issue. Add the following lines to /etc/kdump.conf.
extra_modules be2net
options be2net multi_rxq=0
default shell
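If the console output of the kdump kernel has been saved to a file (for example from a serial console), the reset messages from the first diagnostic step and the hung insmod signature from the Issue section can be searched for mechanically. This is a minimal sketch; the function names hpsa_reset_completed and insmod_hung are hypothetical, and the patterns are taken only from the messages quoted in this article.

```shell
# Hypothetical helper: succeed if the log shows the hpsa hard reset finishing
# and the controller answering the follow-up no-op message.
hpsa_reset_completed() {
    log="$1"
    grep -q 'hpsa .*board ready after hard reset' "$log" &&
    grep -q 'hpsa .*controller message 03:00 succeeded' "$log"
}

# Hypothetical helper: succeed if the log contains the hung-task report
# for insmod shown in the Issue section.
insmod_hung() {
    grep -q 'task insmod:[0-9]* blocked for more than' "$1"
}
```

When both functions succeed on the same log, the controller completed its reset but the driver load still stalled, which matches the trace in this article.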
