Bad page state/map or Bad rss-counter state followed by kernel crash or soft lockup in Red Hat Enterprise Linux

Solution Unverified

Issue

  • The system crashes with the following log:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff81079d42>] dup_mm+0x272/0x520
PGD 404edde067 PUD 404f5b2067 PMD 0 
Oops: 0000 [#1] SMP 

Pid: 33631, comm: rhsmcertd-worke Not tainted 2.6.32-696.3.1.el6.x86_64 #1 HP ProLiant XL170r Gen9/ProLiant XL170r Gen9
RIP: 0010:[<ffffffff81079d42>]  [<ffffffff81079d42>] dup_mm+0x272/0x520
RSP: 0018:ffff88404c523d70  EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff88404c6c7700 RCX: 0000000000000000
RDX: ffff882050740e80 RSI: ffff88204c794c78 RDI: ffff88404ed2d530
RBP: ffff88404c523de0 R08: ffff882050d340c0 R09: 0000000000000000
R10: 00007fb777d01000 R11: ffff88404eca46e8 R12: ffff88404ed2dc50
R13: ffff88404ed2d530 R14: ffff88204c794c78 R15: ffff88404ed2dc38
FS:  00007fb782286700(0000) GS:ffff8820f0dc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 000000404d271000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Process rhsmcertd-worke (pid: 33631, threadinfo ffff88404c520000, task ffff88404eac4040)
Stack:
 00000000000000d0 0000000000000000 ffff88404c6c7768 ffff88404fb38c68
<d> ffff88404fb38c00 ffff88404ed2dc70 ffff88404ed2dc78 0000000000000000
<d> 00007fb7822869d0 0000000001200011 ffff88404cf89520 0000000000000000
Call Trace:
 [<ffffffff8107b112>] copy_process+0xe12/0x1520
 [<ffffffff8107b8b6>] do_fork+0x96/0x4c0
 [<ffffffff811ba222>] ? alloc_fd+0x92/0x160
 [<ffffffff81196997>] ? fd_install+0x47/0x90
 [<ffffffff81009598>] sys_clone+0x28/0x30
 [<ffffffff8100b3f3>] stub_clone+0x13/0x20
 [<ffffffff8100b0d2>] ? system_call_fastpath+0x16/0x1b
Code: 49 8b 45 30 49 8b 95 98 00 00 00 49 c7 45 20 00 00 00 00 49 c7 45 18 00 00 00 00 80 e4 df 48 85 d2 49 89 45 30 74 6c 48 8b 42 18 <48> 8b 48 10 48 8b 82 b8 00 00 00 f0 48 ff 42 30 41 f6 45 31 08 
RIP  [<ffffffff81079d42>] dup_mm+0x272/0x520
 RSP <ffff88404c523d70>
CR2: 0000000000000010
  • Another pattern:
RIP  [<ffffffff81140dd8>] free_pcppages_bulk+0x318/0x470
 RSP <ffff880434b9bb68>
CR2: ffffffff0042cdd8
  • Another pattern:
WARNING: CPU: 20 PID: 186113 at mm/mmap.c:3031 exit_mmap+0x196/0x1a0
BUG: Bad rss-counter state mm:ffff881fbe5a3200 idx:0 val:1209
BUG: Bad rss-counter state mm:ffff881fbe5a3200 idx:1 val:2865
BUG: unable to handle kernel paging request at 0000000000400000
IP: [<ffffffff810864c7>] dup_mm+0x237/0x6f0
PGD 8000005fbeae4067 PUD 5fbc697067 PMD 5fb863d067 PTE 5fb9c7c025
Oops: 0003 [#1] SMP
  • Another pattern:
 BUG: Bad rss-counter state mm:ffff8f06643b3e80 idx:1 val:3
 NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u128:1:76]
Call Trace:
 [<ffffffff903a4ce1>] pagevec_lookup_tag+0x21/0x30
 [<ffffffffc0103acc>] mpage_prepare_extent_to_map+0xfc/0x2e0 [ext4]
 [<ffffffff903faf82>] ? kmem_cache_alloc+0x1c2/0x1f0
 [<ffffffffc00d2a93>] ? jbd2__journal_start+0xf3/0x1f0 [jbd2]
 [<ffffffffc010857c>] ? ext4_writepages+0x42c/0xd40 [ext4]
 [<ffffffffc01366a9>] ? __ext4_journal_start_sb+0x69/0xe0 [ext4]
 [<ffffffffc01085a7>] ext4_writepages+0x457/0xd40 [ext4]
 [<ffffffff903a3b81>] do_writepages+0x21/0x50
 [<ffffffff9044cfc0>] __writeback_single_inode+0x40/0x260
 [<ffffffff9044da54>] writeback_sb_inodes+0x1c4/0x490
 [<ffffffff9044ddbf>] __writeback_inodes_wb+0x9f/0xd0
 [<ffffffff9044e5f3>] wb_writeback+0x263/0x2f0
 [<ffffffff9043ab0c>] ? get_nr_inodes+0x4c/0x70
 [<ffffffff9044ef7b>] bdi_writeback_workfn+0x2cb/0x460
 [<ffffffff902b613f>] process_one_work+0x17f/0x440
 [<ffffffff902b71d6>] worker_thread+0x126/0x3c0
 [<ffffffff902b70b0>] ? manage_workers.isra.24+0x2a0/0x2a0
 [<ffffffff902bdf21>] kthread+0xd1/0xe0
 [<ffffffff902bde50>] ? insert_kthread_work+0x40/0x40
 [<ffffffff909255f7>] ret_from_fork_nospec_begin+0x21/0x21
 [<ffffffff902bde50>] ? insert_kthread_work+0x40/0x40

  • Another pattern:
CPU: 2 PID: 77019 Comm: vertica Kdump: loaded Tainted: G    B          ------------   3.10.0-862.14.4.el7.x86_64 #1
Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v1.0 11/26/2012
 Call Trace:
  [<ffffffff91713754>] dump_stack+0x19/0x1b
  [<ffffffff911c3991>] print_bad_pte+0x1f1/0x290
  [<ffffffff911c65bf>] unmap_page_range+0xaff/0xc30
  [<ffffffff911c6771>] unmap_single_vma+0x81/0xf0
  [<ffffffff911c7a49>] unmap_vmas+0x49/0x90
  [<ffffffff911cdb3e>] unmap_region+0xbe/0x140
  [<ffffffff911ce131>] ? __vma_rb_erase+0x121/0x220
  [<ffffffff911d0145>] do_munmap+0x295/0x470
  [<ffffffff911d0385>] vm_munmap+0x65/0xb0
  [<ffffffff911d15a2>] SyS_munmap+0x22/0x30
  [<ffffffff9172579b>] system_call_fastpath+0x22/0x27
  • Another pattern:
 BUG: Bad page state in process vertica pfn:aafbff
 page:ffffe455eabeffc0 count:0 mapcount:-1 mapping:ffff8f05131bceb8 index:0x23fe
 page flags: 0x2fffff0002001c(referenced|uptodate|dirty|mappedtodisk)
 page dumped because: non-NULL mapping
Call Trace:
  [<ffffffff90913754>] dump_stack+0x19/0x1b
  [<ffffffff9090eba8>] bad_page.part.76+0xdc/0xf9
  [<ffffffff9039f5e0>] free_pages_prepare+0x170/0x190
  [<ffffffff903a0054>] free_hot_cold_page+0x74/0x160
  [<ffffffff903a4ef3>] __put_single_page+0x23/0x30
  [<ffffffff903a4f37>] put_page+0x37/0x50
  [<ffffffff9040ad27>] __split_huge_page+0x357/0x850
  [<ffffffff9040b296>] split_huge_page_to_list+0x76/0xf0
  [<ffffffff9040bd90>] __split_huge_page_pmd+0x1d0/0x5c0
  [<ffffffff903a0186>] ? free_hot_cold_page_list+0x46/0xa0
  [<ffffffff903c669d>] unmap_page_range+0xbdd/0xc30
  [<ffffffff903dddbd>] ? free_pages_and_swap_cache+0xad/0xd0
  [<ffffffff903c6771>] unmap_single_vma+0x81/0xf0
  [<ffffffff903c7bad>] zap_page_range+0x11d/0x190
  [<ffffffff9055f3d8>] ? call_rwsem_down_read_failed+0x18/0x30
  [<ffffffff903c2add>] SyS_madvise+0x3cd/0x9c0
  [<ffffffff902d0438>] ? task_sched_runtime+0xa8/0x110
  [<ffffffff9092579b>] system_call_fastpath+0x22/0x27
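
The signatures above can be searched for in a saved kernel log. The following is a minimal sketch, not a Red Hat tool: the function name `scan`, the `PATTERNS` table, and the regular expressions are illustrative and were written against the excerpts in this article. The "Bad page map" pattern is assumed from the message the kernel prints in `print_bad_pte()`, which is not quoted above, and the rss-counter index names are assumed from the upstream `MM_FILEPAGES`/`MM_ANONPAGES`/`MM_SWAPENTS` enum and may differ between kernel builds.

```python
import re

# Regular expressions for the messages quoted in the Issue section.
# "bad_page_map" is an assumption: that message text is not quoted in the
# article, only the print_bad_pte() call trace is.
PATTERNS = {
    "bad_page_state": re.compile(
        r"BUG: Bad page state in process (?P<comm>\S+)\s+pfn:(?P<pfn>[0-9a-f]+)"),
    "bad_page_map": re.compile(
        r"BUG: Bad page map in process (?P<comm>\S+)"),
    "bad_rss_counter": re.compile(
        r"BUG: Bad rss-counter state mm:(?P<mm>[0-9a-f]+) "
        r"idx:(?P<idx>\d+) val:(?P<val>-?\d+)"),
    "soft_lockup": re.compile(
        r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"),
    "oops": re.compile(
        r"BUG: unable to handle kernel "
        r"(?:NULL pointer dereference|paging request) at (?P<addr>[0-9a-f]+)"),
}

# Assumed meaning of the rss-counter index (upstream enum order); verify
# against the source of the kernel that produced the log.
RSS_COUNTER_NAMES = {0: "MM_FILEPAGES", 1: "MM_ANONPAGES", 2: "MM_SWAPENTS"}


def scan(log_text):
    """Return (line_number, pattern_name, captured_fields) for each match."""
    hits = []
    for lineno, line in enumerate(log_text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                hits.append((lineno, name, match.groupdict()))
    return hits


if __name__ == "__main__":
    # Sample lines copied from the log excerpts in this article.
    sample = "\n".join([
        "BUG: Bad rss-counter state mm:ffff881fbe5a3200 idx:0 val:1209",
        "BUG: unable to handle kernel paging request at 0000000000400000",
        " BUG: Bad page state in process vertica pfn:aafbff",
    ])
    for lineno, name, fields in scan(sample):
        if name == "bad_rss_counter":
            # Attach the assumed counter name for readability.
            fields["counter"] = RSS_COUNTER_NAMES.get(int(fields["idx"]), "unknown")
        print(lineno, name, fields)
```

To check a real system, pass the contents of a saved log (for example the output of `dmesg` or a copy of `/var/log/messages`) to `scan()`. A "Bad page" or "Bad rss-counter" hit that precedes an oops or soft lockup hit in the same log corresponds to the sequence described in this article.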

Environment

  • Red Hat Enterprise Linux 6 and 7
  • Seen on 2.6.32-696.3.1.el6, 2.6.32-696.30.1.el6, and 2.6.32-754.el6
  • Seen on 3.10.0-693.11.6.el7 through 3.10.0-957.12.1
  • VMware guest, Microsoft Hyper-V guest, or physical machine
  • Docker containers, Vertica database
