[RHEL6.0] Bringing e1000 NIC up/down using ifconfig in RHEL5.6 KVM guest causes data corruption

Solution Verified

Issue

  • A RHEL5.6 KVM guest running on a RHEL6.0 host sometimes panics when an e1000 NIC is brought up or down with the ifconfig command.
  • The panic occurs in various places in the kernel, which suggests that some kind of data corruption is the underlying cause.
  • Other symptoms may also be observed, such as system stalls, page cache corruption, and error messages like the following:
    swap_free: Bad swap offset entry 01000000
    mm/memory.c:120: bad pmd ffff81003e1146c8(0000000100000000)

  • The problem does not occur on a bare-metal kernel.
  • The problem also occurs on a RHEL5.6 guest running on a RHEL6.1 beta host.
  • The problem does not occur with a virtio or rtl8139 virtual NIC, although a different problem occurs with the rtl8139 virtual NIC.
  • The problem occurs on a RHEL5.5 guest running on a RHEL5.5 host.
  • The following panic message is seen frequently:
    Unable to handle kernel paging request at 00000001000000cc RIP:
    [<ffffffff8022f131>] consume_skb+0xe/0x61
    PGD 21e3e067 PUD 0
    Oops: 0000 [1] SMP
    last sysfs file: /devices/pci0000:00/0000:00:08.0/irq
    CPU 0
    Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi dm_mirror dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport floppy snd_intel8x0 snd_ac97_codec ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device e1000 snd_pcm_oss snd_mixer_oss tpm_tis snd_pcm snd_timer virtio_balloon tpm snd i2c_piix4 soundcore ide_cd i2c_core snd_page_alloc serio_raw tpm_bios cdrom pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ata_piix libata sd_mod scsi_mod virtio_blk virtio_pci virtio_ring virtio ext3 jbd uhci_hcd ohci_hcd ehci_hcd
    Pid: 10334, comm: ifcnofig Not tainted 2.6.18-238.el5 #1
    RIP: 0010:[<ffffffff8022f131>]  [<ffffffff8022f131>] consume_skb+0xe/0x61
    RSP: 0018:ffff810020ca5d18  EFLAGS: 00010206
    RAX: 0000000000000246 RBX: 0000000100000000 RCX: 00000000000000cd
    RDX: ffffc20000026000 RSI: 0000000000000000 RDI: 0000000100000000
    RBP: 00000000000000ce R08: 000000000000000f R09: ffff81003f2d0860
    R10: ac00810020ca5d20 R11: 0000000080358b00 R12: ffff81003f2d0500
    R13: ffff81003f2d0860 R14: ffff81003f2d0000 R15: ffff81003f4a5010
    FS:  00002b47130565e0(0000) GS:ffffffff80425000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 00000001000000cc CR3: 0000000020ce2000 CR4: 00000000000006e0
    Process ifcnofig (pid: 10334, threadinfo ffff810020ca4000, task ffff810022f8c7a0)
    Stack:  0000000080358b00 ffffc20000028008 00000000000000ce ffffffff88297026
    ffff81003e951680 ffffffff88297064 00000000000000b6 ffff81003f2d0500
    0000000000000001 ffff81003f2d0860 0000000000000000 ffffffff88298159
    Call Trace:
    [<ffffffff88297026>] :e1000:e1000_unmap_and_free_tx_resource+0x49/0x5a
    [<ffffffff88297064>] :e1000:e1000_clean_tx_ring+0x2d/0x8a
    [<ffffffff88298159>] :e1000:e1000_down+0xee/0x102
    [<ffffffff88299697>] :e1000:e1000_close+0x4d/0xc5
    [<ffffffff80233b84>] dev_close+0x53/0x77
    [<ffffffff80232c87>] dev_change_flags+0x5a/0x119
    [<ffffffff80267545>] devinet_ioctl+0x235/0x59c
    [<ffffffff8022a579>] sock_ioctl+0x1c1/0x1e5
    [<ffffffff8004226a>] do_ioctl+0x21/0x6b
    [<ffffffff8003026e>] vfs_ioctl+0x457/0x4b9
    [<ffffffff800b9609>] audit_syscall_entry+0x1a4/0x1cf
    [<ffffffff8004c73b>] sys_ioctl+0x59/0x78
    [<ffffffff8005d28d>] tracesys+0xd5/0xe0
    
    
    Code: 8b 87 cc 00 00 00 ff c8 75 05 0f ae e8 eb 0e f0 ff 8f cc 00
    RIP  [<ffffffff8022f131>] consume_skb+0xe/0x61
    RSP <ffff810020ca5d18>
    
  • Additional information: when the problem occurs, exactly one page in the guest is filled with the following pattern:
    0x0000000000000000      0x0000000100000000
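
The fill pattern and the oops can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not a diagnostic tool from this article. It assumes 4 KiB pages, little-endian quadwords, and that the bytes on the Code: line begin at the faulting instruction. The function names and the idea of scanning a raw memory image are illustrative assumptions.

```python
import struct

PAGE_SIZE = 4096  # assumption: 4 KiB pages, the usual x86_64 base page size

# The two quadwords reported above, packed little-endian as x86_64 stores them.
PATTERN = struct.pack("<QQ", 0x0000000000000000, 0x0000000100000000)


def page_is_filled(page: bytes) -> bool:
    """True if `page` is one full page made only of the repeated pattern."""
    return len(page) == PAGE_SIZE and page == PATTERN * (PAGE_SIZE // len(PATTERN))


def find_filled_pages(image: bytes):
    """Yield offsets of page-aligned pages in `image` filled with the pattern."""
    for offset in range(0, len(image) - PAGE_SIZE + 1, PAGE_SIZE):
        if page_is_filled(image[offset:offset + PAGE_SIZE]):
            yield offset


def decode_mov_disp32(code: bytes):
    """Decode `8b /r` (mov r32, r/m32) with a 32-bit displacement.

    Returns (reg, rm, displacement). Only this one encoding is handled.
    """
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0b10 or rm == 0b100:
        raise ValueError("not a plain mov r32, [reg + disp32]")
    (disp,) = struct.unpack_from("<i", code, 2)
    return reg, rm, disp


# Values copied from the oops above.
RDI = 0x0000000100000000
CR2 = 0x00000001000000CC
CODE = bytes.fromhex("8b 87 cc 00 00 00")  # first six bytes of the Code: line

reg, rm, disp = decode_mov_disp32(CODE)
# rm == 7 selects rdi as the base register, reg == 0 selects eax.
fault_address = RDI + disp
print(hex(fault_address), fault_address == CR2)
```

If the assumption about the Code: line holds, the faulting instruction is a load from 0xcc(%rdi), and RDI plus 0xcc equals CR2. The pointer passed to consume_skb therefore has the same value as the second quadword of the fill pattern, as does the bad pmd value 0000000100000000 reported above. This is consistent with the kernel reading a pointer from the overwritten page, though the trace alone does not prove it.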
    

Environment

  • Host: Red Hat Enterprise Linux 6.0 / RHEL6.1 beta
  • Guest: RHEL5.6
