Several "page allocation failure. order:1, mode:0x20" messages are seen on the console after upgrade to Red Hat Enterprise Linux 6.2
Environment
- Red Hat Enterprise Linux 6 kernels after 2.6.32-220.el6
Issue
The following warning messages are seen on the console after upgrading to Red Hat Enterprise Linux 6.2:
Mar 18 14:39:17 hostname kernel: glusterfsd: page allocation failure. order:1, mode:0x20
Mar 18 14:39:17 hostname kernel: swapper: page allocation failure. order:1, mode:0x20
Mar 21 11:27:06 hostname kernel: swapper: page allocation failure. order:1, mode:0x20
Mar 21 18:25:45 hostname kernel: swapper: page allocation failure. order:1, mode:0x20
Mar 21 21:59:18 hostname kernel: swapper: page allocation failure. order:1, mode:0x20
- Swapper is unable to allocate memory due to page allocation failure:
kernel: swapper: page allocation failure. order:1, mode:0x20
kernel: Pid: 0, comm: swapper Not tainted 2.6.32-358.2.1.el6.x86_64 #1
kernel: Call Trace:
kernel: <IRQ> [<ffffffff8112c207>] ? __alloc_pages_nodemask+0x757/0x8d0
kernel: [<ffffffff81166ab2>] ? kmem_getpages+0x62/0x170
kernel: [<ffffffff811676ca>] ? fallback_alloc+0x1ba/0x270
kernel: [<ffffffff8116711f>] ? cache_grow+0x2cf/0x320
kernel: [<ffffffff81167449>] ? ____cache_alloc_node+0x99/0x160
kernel: [<ffffffff811683cb>] ? kmem_cache_alloc+0x11b/0x190
kernel: [<ffffffff81439d58>] ? sk_prot_alloc+0x48/0x1c0
kernel: [<ffffffff8143ae32>] ? sk_clone+0x22/0x2e0
kernel: [<ffffffff81489d66>] ? inet_csk_clone+0x16/0xd0
kernel: [<ffffffff814a2c73>] ? tcp_create_openreq_child+0x23/0x450
kernel: [<ffffffff814a046d>] ? tcp_v4_syn_recv_sock+0x4d/0x310
kernel: [<ffffffff814a2a16>] ? tcp_check_req+0x226/0x460
kernel: [<ffffffff8149ff0b>] ? tcp_v4_do_rcv+0x35b/0x430
kernel: [<ffffffff81082034>] ? mod_timer+0x144/0x220
kernel: [<ffffffff814a171e>] ? tcp_v4_rcv+0x4fe/0x8d0
kernel: [<ffffffff814a171e>] ? tcp_v4_rcv+0x4fe/0x8d0
kernel: [<ffffffff8147f50d>] ? ip_local_deliver_finish+0xdd/0x2d0
kernel: [<ffffffff8147f798>] ? ip_local_deliver+0x98/0xa0
kernel: [<ffffffff8147ec5d>] ? ip_rcv_finish+0x12d/0x440
kernel: [<ffffffff8147f1e5>] ? ip_rcv+0x275/0x350
kernel: [<ffffffff814483bb>] ? __netif_receive_skb+0x4ab/0x750
kernel: [<ffffffff8144a798>] ? netif_receive_skb+0x58/0x60
kernel: [<ffffffffa008b975>] ? vmxnet3_rq_rx_complete+0x365/0x890 [vmxnet3]
kernel: [<ffffffff8128d2b0>] ? swiotlb_map_page+0x0/0x100
kernel: [<ffffffffa008c0f3>] ? vmxnet3_poll_rx_only+0x43/0xc0 [vmxnet3]
kernel: [<ffffffff8144cf63>] ? net_rx_action+0x103/0x2f0
kernel: [<ffffffff81076fb1>] ? __do_softirq+0xc1/0x1e0
kernel: [<ffffffff810e1720>] ? handle_IRQ_event+0x60/0x170
kernel: [<ffffffff8100c1cc>] ? call_softirq+0x1c/0x30
kernel: [<ffffffff8100de05>] ? do_softirq+0x65/0xa0
kernel: [<ffffffff81076d95>] ? irq_exit+0x85/0x90
kernel: [<ffffffff81516f15>] ? do_IRQ+0x75/0xf0
kernel: [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
kernel: <EOI> [<ffffffff8103b90b>] ? native_safe_halt+0xb/0x10
kernel: [<ffffffff8101495d>] ? default_idle+0x4d/0xb0
kernel: [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
kernel: [<ffffffff81506d9c>] ? start_secondary+0x2ac/0x2ef
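To see how often the console messages at the top of this section occur and which processes report them, the log can be summarized with a small helper. The following is only a sketch: the function name summarize_alloc_failures is hypothetical, and the byte size assumes a 4096-byte page (the usual x86_64 value; check with getconf PAGESIZE). An order:N request asks for 2^N physically contiguous pages, so order:1 is two pages.

```shell
# Summarize "page allocation failure" messages read from standard input.
# Prints one line per (process, order, mode) combination, prefixed with a
# count. The size in bytes ASSUMES 4096-byte pages; adjust page_size if
# "getconf PAGESIZE" reports something else.
summarize_alloc_failures() {
    awk -v page_size=4096 '
        /page allocation failure/ {
            comm = ""; order = ""; mode = ""
            for (i = 1; i <= NF; i++) {
                # The process name is the field just before "page".
                if ($i == "page" && i > 1 && comm == "") {
                    comm = $(i - 1); sub(/:$/, "", comm)
                }
                if ($i ~ /^order:/) {
                    order = $i; sub(/^order:/, "", order); sub(/,$/, "", order)
                }
                if ($i ~ /^mode:/) {
                    mode = $i; sub(/^mode:/, "", mode)
                }
            }
            bytes = (2 ^ order) * page_size
            key = comm " order=" order " mode=" mode " bytes=" bytes
            count[key]++
        }
        END { for (k in count) print count[k], k }
    ' | sort -rn
}
```

A possible invocation is summarize_alloc_failures < /var/log/messages, assuming kernel messages are logged to that file on the system in question.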
Resolution
Fix
Update to kernel-2.6.32-358.el6 or higher, which contains the enhancement described in the Root Cause section below.
- Note that this update (or a newer one) does not completely eliminate the possibility of page allocation failures.
- The workaround described below also applies to 2.6.32-358.el6 and newer if the issue persists after the update.
Workaround
The following tunables can be used in an attempt to alleviate or prevent the reported condition:
- Increase the vm.min_free_kbytes value, for example to a value higher than a single allocation request.
- Change vm.zone_reclaim_mode to 1 if it is set to zero, so that the system can reclaim memory from cached memory.
Both settings can be set in /etc/sysctl.conf and loaded using sysctl -p /etc/sysctl.conf.
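As an illustration of the workaround, the sketch below prints the current values of the two tunables and generates the corresponding /etc/sysctl.conf lines. The function names show_vm_tunables and render_sysctl_lines are hypothetical, and the min_free_kbytes value is supplied by the caller; this article does not recommend a specific number, so any value used here is a placeholder to be sized for the system.

```shell
# Print the current values of the two tunables (read-only, no root needed).
# vm.zone_reclaim_mode may be absent on kernels built without NUMA support.
show_vm_tunables() {
    for name in min_free_kbytes zone_reclaim_mode; do
        if [ -r "/proc/sys/vm/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "/proc/sys/vm/$name")"
        else
            printf 'vm.%s = (not available)\n' "$name"
        fi
    done
}

# Emit sysctl.conf lines for a caller-chosen min_free_kbytes value.
# The value is a placeholder chosen by the administrator, not a recommendation.
render_sysctl_lines() {
    min_free_kbytes=$1
    case $min_free_kbytes in
        ''|*[!0-9]*)
            echo "usage: render_sysctl_lines <kbytes>" >&2
            return 1
            ;;
    esac
    printf 'vm.min_free_kbytes = %s\n' "$min_free_kbytes"
    printf 'vm.zone_reclaim_mode = 1\n'
}
```

As root, the generated lines could be appended with render_sysctl_lines 65536 >> /etc/sysctl.conf (65536 being an arbitrary example value) and then loaded with sysctl -p /etc/sysctl.conf. Running show_vm_tunables before and after confirms whether the change took effect.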
For more information on these tunables, install the kernel-doc package and refer to the file /usr/share/doc/kernel-doc-2.6.32/Documentation/sysctl/vm.txt.
Root Cause
Before RHEL 6.4, kswapd does not try to free contiguous pages. This can cause GFP_ATOMIC allocation requests to fail repeatedly when nothing else in the system defragments memory. With RHEL 6.4 and newer, kswapd compacts (defragments) free memory when required.
Please note that allocation failures can still happen, for example when a large burst of GFP_ATOMIC allocations occurs that kswapd struggles to keep up with. However, these allocations should eventually succeed.
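Because the failures described above concern contiguous (order 1 and higher) pages, it can help to look at how fragmented free memory is. The kernel exposes per-zone free block counts in /proc/buddyinfo, one column per order starting at order 0. The helper below is a rough sketch (the name summarize_buddyinfo is hypothetical) that contrasts single free pages with larger free blocks; it is a diagnostic aid and not part of the fix.

```shell
# Read /proc/buddyinfo-style lines on standard input and print, for each
# zone, the number of free blocks at order 0 and the total number of free
# blocks at order 1 or higher. Many order-0 blocks combined with few larger
# blocks suggests fragmented free memory.
summarize_buddyinfo() {
    awk '
        $1 == "Node" && $3 == "zone" {
            node = $2; sub(/,$/, "", node)
            higher = 0
            # Field 5 is order 0; fields 6..NF are orders 1 and up.
            for (i = 6; i <= NF; i++) higher += $i
            printf "node %s zone %s: order0=%d order1plus=%d\n", node, $4, $5, higher
        }
    '
}
```

On a live system it could be run as summarize_buddyinfo < /proc/buddyinfo.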
There are also other more specific cases that can result in page allocation failures and cause additional issues. Please refer to the following articles for more information:
- Failed GFP_ATOMIC allocations by the network stack result in dropped packets, which will be received on a subsequent retransmit.
- Network problems with page allocation failures using the mlx4_en driver
- Stale TCP connections with tg3 on Red Hat Enterprise Linux 6
- "page allocation failure" messages occurring on machine for mount.nfs process
- cmahostd, sosreport and cat being seen with page allocation failure error messages in RHEL 6.4
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.