IPoIB neighbor list corruption

Solution Unverified - Updated -

Issue

  • Customer is seeing a list_del() in ipoib_cm_tx_handler() cause a GPF.
  • It appears that an entry can be deleted from the neigh list and then, before RCU destroys it, something else can grab a reference to it.
  • This may not really be "corruption," but rather list poisoning tripping on an operation that should be allowed.
  • It still happens after the patch in 358.7.1, on both PPC and x86_64.
  • Thus, even though it looks quite similar, it is different from Red Hat Bug 913645.
  • The discussions in the following threads seem related to this issue:
  • The customer was monitoring that thread, but it seems to have petered out, and they have not found anywhere else where discussion of this issue has resumed.
  • Here's the panic the customer is experiencing:
  #5 [ffff8802338c3c70] general_protection at ffffffff815108c5
     [exception RIP: list_del+16]
     RIP: ffffffff81289020  RSP: ffff8802338c3d20  RFLAGS: 00010082
     RAX: dead000000200200  RBX: ffff880433e60c88  RCX: 0000000000009e6c
     RDX: 0000000000000246  RSI: ffff8806012ca298  RDI: ffff880433e60c88
     RBP: ffff8802338c3d30   R8: ffff8806012ca2e8   R9: 00000000ffffffff
     R10: 0000000000000001  R11: 0000000000000000  R12: ffff8804346b2020
     R13: ffff88032a3e7540  R14: ffff8804346b26e0  R15: 0000000000000246
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  #6 [ffff8802338c3d38] ipoib_cm_tx_handler at ffffffffa066fe0a [ib_ipoib]
  #7 [ffff8802338c3d98] cm_process_work at ffffffffa05149a7 [ib_cm]
  #8 [ffff8802338c3de8] cm_work_handler at ffffffffa05161aa [ib_cm]
  #9 [ffff8802338c3e38] worker_thread at ffffffff81090e10
 #10 [ffff8802338c3ee8] kthread at ffffffff81096c66
 #11 [ffff8802338c3f48] kernel_thread at ffffffff8100c0ca
  • The trace above is the x86 equivalent of what we saw on ppc64.
  • Here's the other way it happens:
  #5 [ffff880028223d50] general_protection at ffffffff81510e15
     [exception RIP: list_del+16]
     RIP: ffffffff81289220  RSP: ffff880028223e08  RFLAGS: 00010092
     RAX: dead000000200200  RBX: ffff88032d3b8a08  RCX: 00000000000072da
     RDX: 0000000000000246  RSI: 0000000000000001  RDI: ffff88032d3b8a08
     RBP: ffff880028223e18   R8: 0000000000000000   R9: 0000000000000001
     R10: ffff880324abd000  R11: ffff880324174800  R12: ffff88032d3b8dc0
     R13: ffff880339406340  R14: 0000000000000246  R15: ffff8803394066e0
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
  #6 [ffff880028223e20] ipoib_cm_handle_tx_wc at ffffffffa055b4a2 [ib_ipoib]
  #7 [ffff880028223e80] ipoib_poll at ffffffffa0554e74 [ib_ipoib]
  #8 [ffff880028223ee0] net_rx_action at ffffffff8144d473
  #9 [ffff880028223f40] __do_softirq at ffffffff81076fa1
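
In both traces RAX holds dead000000200200, which is the x86_64 value of LIST_POISON2, the marker that list_del() writes into entry->prev after unlinking an entry. That is consistent with list_del() being called on an entry that had already been removed from its list, with the fault coming from dereferencing the poisoned pointer. The following is a minimal user-space sketch of that mechanism, not the ib_ipoib code: list_del_once() and entry_is_poisoned() are hypothetical helpers introduced only for illustration, and the poison constants are assumed to be the x86_64 ones on a 64-bit build.

```c
#include <stdint.h>

/* Assumed x86_64 poison values (see include/linux/poison.h). */
#define POISON_POINTER_DELTA 0xdead000000000000ULL
#define LIST_POISON1 ((struct list_head *)(uintptr_t)(0x00100100ULL + POISON_POINTER_DELTA))
#define LIST_POISON2 ((struct list_head *)(uintptr_t)(0x00200200ULL + POISON_POINTER_DELTA))

struct list_head {
    struct list_head *next, *prev;
};

/* Make an empty circular list: the head points at itself. */
static inline void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static inline void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/*
 * Unlink entry, then poison its pointers, as the kernel's list_del() does.
 * A second call on the same entry would dereference LIST_POISON2 via
 * entry->prev->next, which is the non-canonical address seen in RAX.
 */
static inline void list_del(struct list_head *entry)
{
    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = LIST_POISON1;
    entry->prev = LIST_POISON2;
}

/* Hypothetical helper: has this entry already been through list_del()? */
static inline int entry_is_poisoned(const struct list_head *entry)
{
    return entry->next == LIST_POISON1 || entry->prev == LIST_POISON2;
}

/*
 * Hypothetical guarded delete: returns 0 if the entry was unlinked now,
 * -1 if it had already been deleted. Illustration only; in real code the
 * check and the delete would have to happen under the same lock.
 */
static inline int list_del_once(struct list_head *entry)
{
    if (entry_is_poisoned(entry))
        return -1;
    list_del(entry);
    return 0;
}
```

With one entry added to a list, the first list_del_once() call returns 0 and leaves the entry's prev pointer equal to LIST_POISON2; a second call returns -1 instead of faulting. Such a check only helps diagnose the double delete. It does not fix a race in which another context obtains the entry between its removal and its RCU-deferred destruction.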

Environment

  • Red Hat Enterprise Linux 6
  • kernel-2.6.32-358.7.1
