IPoIB neighbor list corruption
Issue
- The customer is seeing a list_del() call in ipoib_cm_tx_handler() trigger a general protection fault (GPF).
- It appears that an entry can be deleted from the neigh list and then, before RCU destroys it, something else can grab a reference to it.
- This may not really be "corruption" but rather list poisoning faulting on an operation that should be allowed.
- It occurs on kernels that already include the patch in 358.7.1, on both PPC and x86_64.
- Thus, even though it looks quite similar, it is different from Red Hat Bug 913645.
- The discussions in the following threads seem related to this issue:
  - Subject: list corruption in IPOIB
- The customer was monitoring that thread, but it seems to have petered out, and they have not found any other place where this issue has been discussed since.
- Here's the panic the customer is experiencing:
#5 [ffff8802338c3c70] general_protection at ffffffff815108c5
[exception RIP: list_del+16]
RIP: ffffffff81289020 RSP: ffff8802338c3d20 RFLAGS: 00010082
RAX: dead000000200200 RBX: ffff880433e60c88 RCX: 0000000000009e6c
RDX: 0000000000000246 RSI: ffff8806012ca298 RDI: ffff880433e60c88
RBP: ffff8802338c3d30 R8: ffff8806012ca2e8 R9: 00000000ffffffff
R10: 0000000000000001 R11: 0000000000000000 R12: ffff8804346b2020
R13: ffff88032a3e7540 R14: ffff8804346b26e0 R15: 0000000000000246
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#6 [ffff8802338c3d38] ipoib_cm_tx_handler at ffffffffa066fe0a [ib_ipoib]
#7 [ffff8802338c3d98] cm_process_work at ffffffffa05149a7 [ib_cm]
#8 [ffff8802338c3de8] cm_work_handler at ffffffffa05161aa [ib_cm]
#9 [ffff8802338c3e38] worker_thread at ffffffff81090e10
#10 [ffff8802338c3ee8] kthread at ffffffff81096c66
#11 [ffff8802338c3f48] kernel_thread at ffffffff8100c0ca
This is the x86_64 equivalent of what we saw on ppc64.
Here's the other way it happens:
#5 [ffff880028223d50] general_protection at ffffffff81510e15
[exception RIP: list_del+16]
RIP: ffffffff81289220 RSP: ffff880028223e08 RFLAGS: 00010092
RAX: dead000000200200 RBX: ffff88032d3b8a08 RCX: 00000000000072da
RDX: 0000000000000246 RSI: 0000000000000001 RDI: ffff88032d3b8a08
RBP: ffff880028223e18 R8: 0000000000000000 R9: 0000000000000001
R10: ffff880324abd000 R11: ffff880324174800 R12: ffff88032d3b8dc0
R13: ffff880339406340 R14: 0000000000000246 R15: ffff8803394066e0
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#6 [ffff880028223e20] ipoib_cm_handle_tx_wc at ffffffffa055b4a2 [ib_ipoib]
#7 [ffff880028223e80] ipoib_poll at ffffffffa0554e74 [ib_ipoib]
#8 [ffff880028223ee0] net_rx_action at ffffffff8144d473
#9 [ffff880028223f40] __do_softirq at ffffffff81076fa1
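In both traces RAX holds dead000000200200, which matches the x86_64 value of LIST_POISON2 (0x00200200 plus the 0xdead000000000000 poison delta) that list_del() writes into entry->prev. That is consistent with list_del() being called a second time on an entry that was already removed: the address is non-canonical, so dereferencing it raises a general protection fault. The following is a minimal user-space sketch of that mechanism, not the ib_ipoib code and not a proposed fix. list_is_poisoned(), list_del_checked() and demo_double_delete() are hypothetical helpers introduced only for illustration, and a 64-bit build is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* x86_64 poison values: 0x00100100 / 0x00200200 plus the
 * 0xdead000000000000 pointer delta (assumes 64-bit pointers). */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000100100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000200200ULL)

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert n right after head. */
static void list_add(struct list_head *n, struct list_head *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Hypothetical helper: true if the entry was already deleted. */
static bool list_is_poisoned(const struct list_head *entry)
{
    return entry->next == LIST_POISON1 || entry->prev == LIST_POISON2;
}

/* Hypothetical guarded delete. A plain list_del() would dereference
 * entry->prev here; on a second call that pointer is LIST_POISON2,
 * which is what faults in the traces above. */
static bool list_del_checked(struct list_head *entry)
{
    if (list_is_poisoned(entry))
        return false;               /* second delete of the same entry */

    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    entry->next = LIST_POISON1;     /* poison, as list_del() does */
    entry->prev = LIST_POISON2;
    return true;
}

/* Returns true if the second delete of the same entry was refused. */
static bool demo_double_delete(void)
{
    struct list_head head, a;
    bool first;

    list_init(&head);
    list_add(&a, &head);
    first = list_del_checked(&a);
    assert(first);
    (void)first;
    return !list_del_checked(&a);
}
```

The sketch only shows why the register contents point at a double removal. It does not address the ordering between the two removal paths (ipoib_cm_tx_handler() and ipoib_cm_handle_tx_wc()) or the RCU grace period, which is where the actual race would have to be resolved.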
Environment
- Red Hat Enterprise Linux 6
- kernel-2.6.32-358.7.1