System panics when booting to kernel-debug due to bug in openvswitch ovs_exit_net function in Red Hat Enterprise Linux 6.5

Solution Verified - Updated -

Issue

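After installing kernel-debug, the system panics while booting into the debug kernel. The initial suspicion was that the system was running out of memory. The console log first shows a lockdep "inconsistent lock state" warning in the bonding driver, followed by a general protection fault under ovs_exit_net: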

=================================
[ INFO: inconsistent lock state ]
2.6.32-431.20.3.el6.x86_64.debug #1
---------------------------------
inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
ip/3479 [HC0[0]:SC0[0]:HE1:SE1] takes:
 (&bond->lock){+++?..}, at: [<ffffffffa045b6b2>] bond_vlan_rx_register+0x72/0xa0 [bonding]
{IN-SOFTIRQ-R} state was registered at:
  [<ffffffff810bb9a8>] __lock_acquire+0x5d8/0x1560
  [<ffffffff810bc9d4>] lock_acquire+0xa4/0x120
  [<ffffffff8155fda9>] _read_lock+0x39/0x70
  [<ffffffffa045d00f>] bond_start_xmit+0x43f/0x640 [bonding]
  [<ffffffff8148dc7c>] dev_hard_start_xmit+0x22c/0x490
  [<ffffffff8148e3d0>] dev_queue_xmit+0x200/0x380
  [<ffffffff81495818>] neigh_resolve_output+0x108/0x2f0
  [<ffffffffa03f81c7>] ip6_output_finish+0xb7/0x130 [ipv6]
  [<ffffffffa03fb27b>] ip6_output2+0x2bb/0x2d0 [ipv6]
  [<ffffffffa03fb318>] ip6_output+0x88/0x150 [ipv6]
  [<ffffffffa041aa86>] mld_sendpack+0x466/0x4c0 [ipv6]
  [<ffffffffa041b204>] mld_ifc_timer_expire+0x284/0x310 [ipv6]
  [<ffffffff810898de>] run_timer_softirq+0x20e/0x3e0
  [<ffffffff8107ee6f>] __do_softirq+0xdf/0x210
  [<ffffffff8100c40c>] call_softirq+0x1c/0x30
  [<ffffffff8100fc3d>] do_softirq+0xad/0xe0
  [<ffffffff8107eba5>] irq_exit+0x95/0xa0
  [<ffffffff81566e7a>] smp_apic_timer_interrupt+0x4a/0x5a
  [<ffffffff8100bc93>] apic_timer_interrupt+0x13/0x20
  [<ffffffff8133f9cf>] acpi_safe_halt+0x31/0x51
  [<ffffffff8133fa87>] acpi_idle_do_entry+0x20/0x30
  [<ffffffff8133fb13>] acpi_idle_enter_c1+0x7c/0xd0
  [<ffffffff814546a7>] cpuidle_idle_call+0xa7/0x150
  [<ffffffff81009fcb>] cpu_idle+0xbb/0x110
  [<ffffffff8155506b>] start_secondary+0x2bb/0x2fe
irq event stamp: 4115
hardirqs last  enabled at (4115): [<ffffffff8155f530>] _spin_unlock_irqrestore+0x40/0x80
hardirqs last disabled at (4114): [<ffffffff8155f8c2>] _spin_lock_irqsave+0x32/0xa0
softirqs last  enabled at (4104): [<ffffffff8149eb6d>] sk_filter+0xdd/0x110
softirqs last disabled at (4102): [<ffffffff8149ead5>] sk_filter+0x45/0x110

other info that might help us debug this:
1 lock held by ip/3479:
 #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff8149a9d7>] rtnl_lock+0x17/0x20

stack backtrace:
Pid: 3479, comm: ip Not tainted 2.6.32-431.20.3.el6.x86_64.debug #1
Call Trace:
 [<ffffffff810b97d7>] ? print_usage_bug+0x177/0x180
 [<ffffffff810ba65f>] ? mark_lock+0x23f/0x430
 [<ffffffff810bba08>] ? __lock_acquire+0x638/0x1560
 [<ffffffff810a89c8>] ? sched_clock_cpu+0xb8/0x110
 [<ffffffff8155f530>] ? _spin_unlock_irqrestore+0x40/0x80
 [<ffffffff810babcd>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffff810bac1d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff810bc9d4>] ? lock_acquire+0xa4/0x120
 [<ffffffffa045b6b2>] ? bond_vlan_rx_register+0x72/0xa0 [bonding]
 [<ffffffff8155f9b6>] ? _write_lock+0x36/0x70
 [<ffffffffa045b6b2>] ? bond_vlan_rx_register+0x72/0xa0 [bonding]
 [<ffffffff8109b761>] ? queue_delayed_work+0x21/0x40
 [<ffffffffa045b6b2>] ? bond_vlan_rx_register+0x72/0xa0 [bonding]
 [<ffffffffa03ea441>] ? register_vlan_dev+0x191/0x1e0 [8021q]
 [<ffffffffa03ed018>] ? vlan_newlink+0xd8/0x100 [8021q]
 [<ffffffff8149b4af>] ? rtnl_newlink+0x4ef/0x580
 [<ffffffff8149b14c>] ? rtnl_newlink+0x18c/0x580
 [<ffffffff81251d1d>] ? selinux_netlink_recv+0x6d/0x90
 [<ffffffff8149a9d7>] ? rtnl_lock+0x17/0x20
 [<ffffffff8149acf7>] ? rtnetlink_rcv_msg+0x2d7/0x340
 [<ffffffff810b71ad>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8149aa20>] ? rtnetlink_rcv_msg+0x0/0x340
 [<ffffffff814b65a9>] ? netlink_rcv_skb+0xa9/0xd0
 [<ffffffff8149aa05>] ? rtnetlink_rcv+0x25/0x40
 [<ffffffff814b61c7>] ? netlink_unicast+0x2d7/0x320
 [<ffffffff814b717d>] ? netlink_sendmsg+0x2ed/0x440
 [<ffffffff81476173>] ? sock_sendmsg+0x123/0x150
 [<ffffffff81134cdc>] ? unlock_page+0x2c/0x40
 [<ffffffff810a0fd0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff81014bf9>] ? sched_clock+0x9/0x10
 [<ffffffff81475fc4>] ? move_addr_to_kernel+0x64/0x70
 [<ffffffff81477966>] ? __sys_sendmsg+0x406/0x420
 [<ffffffff810a6af3>] ? up_read+0x23/0x40
 [<ffffffff8104ba84>] ? __do_page_fault+0x254/0x4e0
 [<ffffffff812ba61f>] ? debug_check_no_obj_freed+0x18f/0x210
 [<ffffffff81187f69>] ? kmem_cache_free+0x129/0x2b0
 [<ffffffff810babcd>] ? trace_hardirqs_on_caller+0x14d/0x190
 [<ffffffff810bac1d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81477b89>] ? sys_sendmsg+0x49/0x90
 [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
CE: hpet increasing min_delta_ns to 2319534 nsec
powernow-k8: Found 8 AMD Opteron(TM) Processor 6276                  (64 cpu cores) (version 2.20.00)
powernow-k8: Core Performance Boosting: on.
[Firmware Bug]: powernow-k8: No compatible ACPI _PSS objects found.
[Firmware Bug]: powernow-k8: Try again with latest BIOS.
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
SELinux: initialized (dev rpc_pipefs, type rpc_pipefs), uses genfs_contexts
br-eth1: no IPv6 routers present
br-int: no IPv6 routers present
bond0: no IPv6 routers present
bond0.2300: no IPv6 routers present
Ebtables v2.0 registered
ip_tables: (C) 2000-2006 Netfilter Core Team
ip6_tables: (C) 2000-2006 Netfilter Core Team
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
SELinux: initialized (dev proc, type proc), uses genfs_contexts
SELinux: initialized (dev mqueue, type mqueue), uses transition SIDs
lo: Disabled Privacy Extensions
SELinux: initialized (dev proc, type proc), uses genfs_contexts
general protection fault: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu53/topology/thread_siblings_list
CPU 40 
Modules linked in: ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables sunrpc bonding ipv6 8021q garp stp llc openvswitch vxlan vhost_net macvtap macvlan tun kvm_amd kvm microcode tpm_infineon serio_raw sg power_meter hpilo hpwdt be2iscsi iscsi_boot_sysfs libiscsi scsi_transport_iscsi be2net fam15h_power k10temp amd64_edac_mod edac_core edac_mce_amd i2c_piix4 shpchp ext4 jbd2 mbcache sd_mod crc_t10dif hpsa ata_generic pata_acpi pata_atiixp ahci radeon ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: mperf]

Pid: 325, comm: netns Not tainted 2.6.32-431.20.3.el6.x86_64.debug #1 HP ProLiant BL685c G7
RIP: 0010:[<ffffffff812b90d4>]  [<ffffffff812b90d4>] _raw_spin_trylock+0x4/0x40
RSP: 0018:ffff8820128c1cb0  EFLAGS: 00010046
RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b68
RBP: ffff8820128c1cb0 R08: 0000000000000002 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 6b6b6b6b6b6b6b80
R13: 0000000000000000 R14: ffff8840106a18c8 R15: 6b6b6b6b6b6b6b68
FS:  00007f17ac1c5700(0000) GS:ffff8820b1000000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000e8f000 CR3: 0000004012c4a000 CR4: 00000000000407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process netns (pid: 325, threadinfo ffff8820128c0000, task ffff8820128bc680)
Stack:
 ffff8820128c1ce0 ffffffff8155f85a ffffffff8109b145 ffffffff81187d7d
<d> ffff8840106a18a8 00000000ffffffff ffff8820128c1d50 ffffffff8109b145
<d> ffff88203fc00040 0000000000000286 ffff8820128c1d10 ffffffff810bac1d
Call Trace:
 [<ffffffff8155f85a>] _spin_lock_irq+0x4a/0x80
 [<ffffffff8109b145>] ? __cancel_work_timer+0x155/0x1b0
 [<ffffffff81187d7d>] ? cache_free_debugcheck+0x1ad/0x270
 [<ffffffff8109b145>] __cancel_work_timer+0x155/0x1b0
 [<ffffffff810bac1d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81485c90>] ? cleanup_net+0x0/0xa0
 [<ffffffff8109b1d0>] cancel_work_sync+0x10/0x20
 [<ffffffffa03ccac4>] ovs_exit_net+0xc4/0xe0 [openvswitch]
 [<ffffffffa03cca00>] ? ovs_exit_net+0x0/0xe0 [openvswitch]
 [<ffffffff81485cff>] cleanup_net+0x6f/0xa0
 [<ffffffff8109a3fc>] worker_thread+0x21c/0x3d0
 [<ffffffff8109a3ab>] ? worker_thread+0x1cb/0x3d0
 [<ffffffff810bac1d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff810a0fd0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8109a1e0>] ? worker_thread+0x0/0x3d0
 [<ffffffff810a0c46>] kthread+0x96/0xa0
 [<ffffffff8100c30a>] child_rip+0xa/0x20
 [<ffffffff8100bb10>] ? restore_args+0x0/0x30
 [<ffffffff810a0bb0>] ? kthread+0x0/0xa0
 [<ffffffff8100c300>] ? child_rip+0x0/0x20
Code: 83 c7 01 48 83 c6 01 8a 07 83 ea 01 38 06 75 0f 85 d2 75 eb b8 01 00 00 00 c9 c3 0f 1f 40 00 31 c0 c9 c3 90 90 90 90 55 48 89 e5 <8b> 07 89 c2 c1 c0 10 39 c2 8d 90 00 00 01 00 75 04 f0 0f b1 17 
RIP  [<ffffffff812b90d4>] _raw_spin_trylock+0x4/0x40
 RSP <ffff8820128c1cb0>
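
Several registers in the oops above (RBX, RDI, R12, R15) hold values made up almost entirely of the byte 0x6b. That is the pattern the kernel's slab debugging code writes into freed objects (POISON_FREE), so the fault is consistent with ovs_exit_net handing cancel_work_sync a pointer read from memory that had already been freed. This is an inference from the register dump, not a confirmed root cause. A minimal Python sketch for spotting such values in a saved oops follows; the function name, the regular expression, and the "upper seven bytes are 0x6b" heuristic are illustrative assumptions, not part of any Red Hat tool.

```python
import re

# Matches "RAX: 0000000000000000"-style pairs in an x86_64 oops register dump.
# RIP/RSP lines carry a segment prefix (e.g. "0010:") and are deliberately skipped.
REGISTER_RE = re.compile(r"\b(R[A-Z0-9]{1,2}):\s*([0-9a-f]{16})\b")

POISON_FREE = 0x6B  # byte written into freed slab objects when slab debugging is on


def poisoned_registers(oops_text):
    """Return {register: value} for registers whose upper 7 bytes are all 0x6b.

    Heuristic only: the low byte is ignored because the faulting code may have
    added a small offset to a pointer loaded from poisoned memory.
    """
    hits = {}
    for name, hex_value in REGISTER_RE.findall(oops_text):
        value = int(hex_value, 16)
        upper = value.to_bytes(8, "big")[:7]
        if all(byte == POISON_FREE for byte in upper):
            hits[name] = value
    return hits


# Register lines copied from the oops in this article.
sample = """\
RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b68 RCX: 0000000000000000
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 6b6b6b6b6b6b6b68
RBP: ffff8820128c1cb0 R08: 0000000000000002 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 6b6b6b6b6b6b6b80
R13: 0000000000000000 R14: ffff8840106a18c8 R15: 6b6b6b6b6b6b6b68
"""

print(sorted(poisoned_registers(sample)))  # ['R12', 'R15', 'RBX', 'RDI']
```

A non-empty result only hints at a use-after-free; confirming it would still require a vmcore or the module source.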

Environment

  • Red Hat Enterprise Linux (RHEL) 6.5
    • Seen on kernel-2.6.32-431.17.1.el6.x86_64.debug and kernel-2.6.32-431.20.3.el6.x86_64.debug
    • Does not affect releases earlier than RHEL 6.5, because the Open vSwitch (OVS) kernel module was first introduced in 6.5.
    • OpenStack Tech Preview kernels based on RHEL 6.4, such as kernel-2.6.32-358.123.2.openstack.el6, are no longer supported.
    • Open vSwitch
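
The environment list above can be turned into a quick check. The following Python sketch tests whether a kernel release string looks like a RHEL 6.5 debug kernel (the 2.6.32-431 series with a ".debug" suffix, as in the two kernels listed) and whether the openvswitch module appears in /proc/modules-style text. The function name and regular expression are hypothetical, the pattern covers only x86_64 because that is the only architecture reported here, and a match means "resembles the reported environment", not "is confirmed affected".

```python
import re

# 2.6.32-431[.N...] is the RHEL 6.5 kernel series; ".debug" marks kernel-debug.
# Restricting to x86_64 is an assumption based on the kernels listed in this article.
RHEL65_DEBUG_RE = re.compile(r"^2\.6\.32-431(\.\d+)*\.el6\.x86_64\.debug$")


def matches_environment(kernel_release, proc_modules_text):
    """Return True if the release is a RHEL 6.5 debug kernel and openvswitch is loaded.

    kernel_release    -- string as printed by `uname -r`
    proc_modules_text -- contents of /proc/modules (module name is the first field)
    """
    is_debug_65 = bool(RHEL65_DEBUG_RE.match(kernel_release))
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return is_debug_65 and "openvswitch" in loaded


# Illustrative input; the module sizes and addresses below are made up.
modules = (
    "openvswitch 88783 0 - Live 0xffffffffa03c0000\n"
    "vxlan 24870 1 openvswitch, Live 0xffffffffa03b0000\n"
)
print(matches_environment("2.6.32-431.20.3.el6.x86_64.debug", modules))
```

On a live system the two arguments could come from `platform.release()` and the contents of `/proc/modules`.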

Special Note
Open vSwitch and OpenStack-related capabilities were added to RHEL 6.5 to enable features for Red Hat OpenStack Platform (RH-OSP) and are supported only with RH-OSP.

A Red Hat OpenStack Preview used a modified kernel, based on RHEL 6.4, that enabled various Open vSwitch and OpenStack features, but these features are tested and supported only on the Red Hat Enterprise Linux OpenStack Platform. That special RHEL 6.4 kernel, kernel-2.6.32-358.123.2.openstack.el6, was intended only as a Tech Preview and is no longer supported by Red Hat engineering.

This is further explained in the RHEL 6.5 Technical Release Notes:

Open vSwitch (OVS) is an open-source, multi-layer software switch designed to be used as a virtual switch in virtualized server environments. Starting with Red Hat Enterprise Linux 6.4, the Open vSwitch kernel module is included as an enabler for Red Hat Enterprise Linux OpenStack Platform. Open vSwitch is only supported in conjunction with Red Hat products containing the accompanying user-space packages. Without these packages, Open vSwitch will not function and cannot be used with other Red Hat Enterprise Linux variants.

The various OpenStack community platforms, including the RDO distribution of OpenStack, are not supported by Red Hat Global Support Services.
Please refer to "Is the RDO distribution of OpenStack supported by Red Hat?".
