Red Hat Enterprise Linux 7 crashed in the rdma_cm kernel module

Environment

  • Red Hat Enterprise Linux (RHEL) 7

    • Specifically, kernel versions earlier than kernel-3.10.0-1160.46.1.el7
  • Infiniband/RDMA

Issue

  • The system crashed in the rdma_cm kernel module, within the cma_comp_exch function, while attempting to acquire a spin lock.

Resolution

  • Update the kernel to kernel-3.10.0-1160.46.1.el7 or later, and monitor for additional kernel panics.
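To check whether a host still runs a kernel older than the fixed release, the comparison can be sketched as below. This is a minimal sketch and not a Red Hat tool: the helper names release_key and is_affected are hypothetical, and the comparison is a simplified numeric-field comparison rather than RPM's full version-comparison algorithm. It only makes sense for RHEL 7 (el7) kernel release strings.

```python
import platform
import re

# First fixed kernel release named in this article.
FIXED_RELEASE = "3.10.0-1160.46.1.el7"


def release_key(release):
    """Return the leading numeric fields of a kernel release as a tuple of ints.

    Simplification (assumption): parsing stops at the first non-numeric
    field such as "el7", so this is not RPM's version comparison.
    """
    key = []
    for field in re.split(r"[.-]", release):
        if not field.isdigit():
            break
        key.append(int(field))
    return tuple(key)


def is_affected(release, fixed=FIXED_RELEASE):
    """Return True if an el7 kernel release sorts before the fixed release."""
    if ".el7" not in release:
        raise ValueError("not a RHEL 7 kernel release: " + release)
    return release_key(release) < release_key(fixed)


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` reports
    if ".el7" in running:
        verdict = "older than" if is_affected(running) else "at or above"
        print(running, "is", verdict, FIXED_RELEASE)
    else:
        print(running, "is not a RHEL 7 kernel; this check does not apply")
```

Run on the affected host, the script compares the running kernel only. After an update, the new kernel takes effect once the system has been rebooted into it.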

Root Cause

  • Because the RDMA Connection Manager (rdma_cm) layer is exposed to userspace through ucma, concurrent calls from userspace can operate on several structures internal to the kernel modules without holding locks that protect them. Such access can leave those structures in an inconsistent state and ultimately cause a kernel panic.
  • The patch below wraps every call into the rdma_cm layer in a mutex so that multi-threaded access from userspace is safe:

     git show 2efd16d4fba4
        RDMA/ucma: Put a lock around every call to the rdma_cm layer
    
        Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1978075
        CVE: CVE-2020-36385
    [...]
            RDMA/ucma: Put a lock around every call to the rdma_cm layer
    
            The rdma_cm must be used single threaded.
    
            This appears to be a bug in the design, as it does have lots of locking
            that seems like it should allow concurrency. However, when it is all said
            and done every single place that uses the cma_exch() scheme is broken, and
            all the unlocked reads from the ucma of the cm_id data are wrong too.
    
            syzkaller has been finding endless bugs related to this.
    
            Fixing this in any elegant way is some enormous amount of work. Take a
            very big hammer and put a mutex around everything to do with the
            ucma_context at the top of every syscall.
    [...]
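
The "very big hammer" described in the commit message can be illustrated in userspace. The following is a conceptual sketch only, not the kernel patch: Context, serialized, and state_exch are hypothetical names, and Python threads stand in for concurrent syscalls. It shows one mutex per context being taken at the top of every entry point, so a check-then-update on shared state cannot interleave with another caller.

```python
import functools
import threading


class Context:
    """Hypothetical stand-in for a per-descriptor context (not ucma_context)."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.state = "idle"
        self.transitions = 0


def serialized(entry_point):
    """Take the context's mutex before the entry point's body runs."""
    @functools.wraps(entry_point)
    def wrapper(ctx, *args, **kwargs):
        with ctx.mutex:
            return entry_point(ctx, *args, **kwargs)
    return wrapper


@serialized
def state_exch(ctx, expected, new):
    """Move ctx.state from `expected` to `new`; return True on success."""
    if ctx.state != expected:
        return False
    ctx.state = new
    ctx.transitions += 1
    return True


def run_demo(threads=8, rounds=1000):
    """Hammer one context from several threads; return it with per-thread claim counts."""
    ctx = Context()
    claims = [0] * threads  # each worker writes only its own slot

    def worker(index):
        for _ in range(rounds):
            if state_exch(ctx, "idle", "busy"):
                claims[index] += 1
                state_exch(ctx, "busy", "idle")

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return ctx, claims


if __name__ == "__main__":
    ctx, claims = run_demo()
    # Every successful claim is paired with exactly one release.
    print(ctx.state, ctx.transitions == 2 * sum(claims))
```

The design trades concurrency for simplicity: callers on the same context run one at a time, which matches the commit's stated choice to avoid a larger redesign.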
    

Diagnostic Steps

  1. If not already done, set up kdump so that a vmcore is generated when the system crashes, and install the crash utility to analyse the resulting vmcore.
  2. After loading the vmcore in crash, obtain the backtrace with the bt command:

    RIP: 0010:[<ffffffff98d17a90>]  [<ffffffff98d17a90>] native_queued_spin_lock_slowpath+0x110/0x200
    Call Trace:
    [<ffffffff9937dcf3>] queued_spin_lock_slowpath+0xb/0xf
    [<ffffffff9938bb27>] _raw_spin_lock_irqsave+0x37/0x40
    [<ffffffffc0c79218>] cma_comp_exch+0x28/0x60 [rdma_cm]   <-------------- HERE
    [<ffffffffc0c7da93>] cma_work_handler+0x33/0xa0 [rdma_cm]
    [<ffffffff98cbde8f>] process_one_work+0x17f/0x440
    [<ffffffff98cbefa6>] worker_thread+0x126/0x3c0
    [<ffffffff98cbee80>] ? manage_workers.isra.26+0x2a0/0x2a0
    [<ffffffff98cc5e61>] kthread+0xd1/0xe0
    [<ffffffff98cc5d90>] ? insert_kthread_work+0x40/0x40
    [<ffffffff99395df7>] ret_from_fork_nospec_begin+0x21/0x21
    [<ffffffff98cc5d90>] ? insert_kthread_work+0x40/0x40
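
When several vmcores need triage, a saved backtrace can be checked against the pattern shown above. The sketch below is a hypothetical helper, not part of crash: it assumes the backtrace text has been saved to a string in the same format as this article, and the names parse_frames and matches_signature are illustrative. A match only indicates that the trace resembles this issue; it does not confirm the root cause.

```python
import re

# One frame: "[<addr>] symbol+0xoff/0xsize [module]", with an optional "? "
# marking frames the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<unreliable>\?\s+)?"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def parse_frames(text):
    """Return frames in order of appearance (the RIP line, if present, comes first)."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match is None:
            continue
        frames.append({
            "symbol": match.group("symbol"),
            "module": match.group("module"),  # None for core kernel symbols
            "unreliable": match.group("unreliable") is not None,
        })
    return frames


def matches_signature(frames):
    """True if cma_comp_exch [rdma_cm] appears below a spin-lock frame."""
    reliable = [f for f in frames if not f["unreliable"]]
    for index, frame in enumerate(reliable):
        if frame["symbol"] == "cma_comp_exch" and frame["module"] == "rdma_cm":
            return any("spin_lock" in above["symbol"] for above in reliable[:index])
    return False
```

For example, passing the backtrace from this article through parse_frames and then matches_signature reports a match, because _raw_spin_lock_irqsave sits directly above cma_comp_exch in the rdma_cm module.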
    

