Dom0 Xen kernel crashes in the netloop module on Red Hat Enterprise Linux 5.7 or later
Issue
The dom0 kernel of a Xen host may crash with the following (or similar) stack trace in the netloop module:
Unable to handle kernel paging request at ffff8800238a05c0
RIP: [<ffffffff8873c208>] :netloop:loopback_start_xmit+0x123/0x2ea
PGD 105d067 PUD 105e067 PMD 117b067 PTE 0
[...]
Call Trace:
<IRQ> [<ffffffff80424b6d>] dev_hard_start_xmit+0x1b7/0x28a
[<ffffffff80230b81>] dev_queue_xmit+0x31f/0x3ef
[<ffffffff80233073>] ip_output+0x29a/0x2dd
[<ffffffff80235783>] ip_queue_xmit+0x42c/0x486
[<ffffffff8021d32d>] __mod_timer+0xff/0x10e
[<ffffffff802d269e>] __kmalloc+0x8f/0x9f
[<ffffffff80222ad2>] tcp_transmit_skb+0x646/0x67e
[<ffffffff802341b7>] __tcp_push_pending_frames+0x75d/0x849
[<ffffffff8021c647>] tcp_rcv_established+0x818/0x8bd
[<ffffffff8023cc5a>] tcp_v4_do_rcv+0x2a/0x2fa
[<ffffffff8022c1ce>] local_bh_enable+0x9/0x9c
[<ffffffff8878e164>] :ip_conntrack:ip_confirm+0x33/0x39
[<ffffffff80227cbe>] tcp_v4_rcv+0xa23/0xa77
[<ffffffff8044076c>] ip_local_deliver_finish+0x0/0x1eb
[<ffffffff80258542>] nf_hook_slow+0x58/0xbc
[<ffffffff8044076c>] ip_local_deliver_finish+0x0/0x1eb
[<ffffffff8023597c>] ip_local_deliver+0x19f/0x265
[<ffffffff80236cdc>] ip_rcv+0x539/0x57c
[<ffffffff80221590>] netif_receive_skb+0x495/0x4c4
[<ffffffff802319a4>] process_backlog+0x9b/0x104
[<ffffffff8020d0a1>] net_rx_action+0xb4/0x1c6
[<ffffffff80212f06>] __do_softirq+0x8d/0x13b
[<ffffffff8025fda4>] call_softirq+0x1c/0x278
[<ffffffff8026db69>] do_softirq+0x31/0x90
[<ffffffff8025f8d6>] do_hypervisor_callback+0x1e/0x2c
<EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
[<ffffffff8026efa8>] raw_safe_halt+0x87/0xab
[<ffffffff8026c553>] xen_idle+0x38/0x4a
[<ffffffff8024ac15>] cpu_idle+0x97/0xba
[<ffffffff80758b11>] start_kernel+0x21f/0x224
[<ffffffff807581e5>] _sinittext+0x1e5/0x1eb
Environment
- Red Hat Enterprise Linux 5.7 or a more recent release in dom0.
- The paravirtualized block devices of a Xen guest are backed by files on an NFS share mounted in dom0 ("tap:aio" scheme).
- Both network traffic and virtual block device (vbd) activity are heavy in that Xen guest (a networked database server, for example).
- A low NFS timeout ("timeo" mount parameter) can exacerbate the problem.
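To check whether a dom0 matches the last condition, the NFS mount options can be inspected for a low "timeo" value. The following is a minimal sketch, not part of the original solution: the function name low_timeo_nfs_mounts and the default threshold of 600 are illustrative assumptions, and the input is assumed to follow the /proc/mounts layout. The "timeo" value is expressed in tenths of a second.

```python
def low_timeo_nfs_mounts(mounts_text, threshold=600):
    """Return (mount_point, timeo) pairs for NFS mounts with timeo < threshold.

    mounts_text is expected in the /proc/mounts layout:
        device mount_point fstype options dump pass
    The threshold is a hypothetical cut-off chosen for illustration only.
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS client mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        for opt in fields[3].split(","):
            if opt.startswith("timeo="):
                value = int(opt.split("=", 1)[1])
                if value < threshold:
                    hits.append((fields[1], value))
    return hits
```

On a live system, the text passed to the function would typically be the contents of /proc/mounts read in dom0. Any mount point it reports that also holds guest disk images is a candidate for review.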
