Red Hat Customer Portal

Skip to main content

Why is there high CPU usage after inserting the leap second?

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 6 GA
  • Red Hat Enterprise Linux 6.1
  • Red Hat Enterprise Linux 6.2
  • Red Hat Enterprise Linux 6.3

Issue

  • After the insertion of the leap second several processes, notably java, were reporting a high CPU usage.
  • Logged messages shows back trace from ktime_get.

Resolution

Fixing the issue when it occurs:

  • Manually reset the date by executing any of the below commands:
date -s "`LC_ALL=C date`"
date `date +'%m%d%H%M%C%y.%S'`

Resetting the date will resolve any present issues.

Preventing the issue from occurring:

  • RHEL 6.1 EUS: Update to kernel-2.6.32-131.30.2 (from RHBA-2012-1197) or later.
  • RHEL 6.2 EUS: Update to kernel-2.6.32-220.25.1.el6 (from RHBA-2012-1198) or later.
  • RHEL 6.3: Update to kernel-2.6.32-279.5.2 (from RHBA-2012-1199) or later.
  • RHEL 6.4GA and later: already contain the fix for this issue.

Workaround:

If no updated kernel can be installed, ntpd can be used in slew mode which will slew the leapsecond slowly instead of stepping it. When slew mode is used, the kernel discipline is disabled, preventing this bug from occurring.

Root Cause

* Previously in the kernel, when the leap second hrtimer was started, it was
possible that the kernel livelocked on the xtime_lock variable. This update
fixes the problem by using a mixture of separate subsystem locks (timekeeping
and ntp) and removing the xtime_lock variable, thus avoiding the livelock
scenarios that could occur in the kernel. (BZ#840950)

* After the leap second was inserted, applications calling system calls that
used futexes consumed almost 100% of available CPU time. This occurred because
the kernel's timekeeping structure update did not properly update these futexes.
The futexes repeatedly expired, re-armed, and then expired immediately again.
This update fixes the problem by properly updating the futex expiration times by
calling the clock_was_set_delayed() function, an interrupt-safe method of the
clock_was_set() function. (BZ#847366)

For more information about the leap second issue, subscribed customers may refer to the following article - Resolve Leap Second Issues in Red Hat Enterprise Linux.

Diagnostic Steps

  • The following messages were frequently seen in the logs:
Jul  1 02:23:12 server kernel: [1121001.934357] INFO: task java:60028 blocked for more than 120 seconds.
Jul  1 02:23:12 server kernel: [1121001.937203] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul  1 02:23:12 server kernel: [1121001.940147] java          D 0000000000000007     0 60028      1 0x00000080
Jul  1 02:23:12 server kernel: [1121001.943176]  ffff880d775419c8 0000000000000082 0000000501bfe000 ffffc900180cb058
Jul  1 02:23:12 server kernel: [1121001.946293]  ffffc900180cb020 ffffea002b821aa0 ffff880d34fe9580 000000d00002dc40
Jul  1 02:23:12 server kernel: [1121001.949488]  ffff880d34fe9b38 ffff880d77541fd8 000000000000f4e8 ffff880d34fe9b38
Jul  1 02:23:12 server kernel: [1121001.952757] Call Trace:
Jul  1 02:23:12 server kernel: [1121001.956042]  [<ffffffff8109b949>] ? ktime_get_ts+0xa9/0xe0
Jul  1 02:23:21 server kernel: [1121001.959347]  [<ffffffffa04bac10>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
Jul  1 02:23:21 server kernel: [1121001.962724]  [<ffffffff814ed833>] io_schedule+0x73/0xc0
Jul  1 02:23:21 server kernel: [1121001.966138]  [<ffffffffa04bac1e>] nfs_wait_bit_uninterruptible+0xe/0x20 [nfs]
Jul  1 02:23:21 server kernel: [1121001.969643]  [<ffffffff814ee1ef>] __wait_on_bit+0x5f/0x90
Jul  1 02:23:21 server kernel: [1121001.973190]  [<ffffffffa04bac10>] ? nfs_wait_bit_uninterruptible+0x0/0x20 [nfs]
Jul  1 02:23:21 server kernel: [1121001.976738]  [<ffffffff814ee298>] out_of_line_wait_on_bit+0x78/0x90
Jul  1 02:23:21 server kernel: [1121001.980253]  [<ffffffff81090d70>] ? wake_bit_function+0x0/0x50
Jul  1 02:23:21 server kernel: [1121001.983813]  [<ffffffff81110b6e>] ? find_get_page+0x1e/0xa0
Jul  1 02:23:21 server kernel: [1121001.987403]  [<ffffffffa04babff>] nfs_wait_on_request+0x2f/0x40 [nfs]
Jul  1 02:23:21 server kernel: [1121001.990991]  [<ffffffffa04c145e>] nfs_updatepage+0x28e/0x590 [nfs]
Jul  1 02:23:21 server kernel: [1121001.994526]  [<ffffffffa04af5c2>] nfs_write_end+0x152/0x2b0 [nfs]
Jul  1 02:23:21 server kernel: [1121001.998009]  [<ffffffff811116c4>] generic_file_buffered_write+0x174/0x2a0
Jul  1 02:23:21 server kernel: [1121002.001488]  [<ffffffff81112fb0>] __generic_file_aio_write+0x250/0x480
Jul  1 02:23:21 server kernel: [1121002.004944]  [<ffffffff8111324f>] generic_file_aio_write+0x6f/0xe0
Jul  1 02:23:21 server kernel: [1121002.008370]  [<ffffffffa04aeffe>] nfs_file_write+0xde/0x1f0 [nfs]
Jul  1 02:23:21 server kernel: [1121002.011760]  [<ffffffff8117661a>] do_sync_write+0xfa/0x140
Jul  1 02:23:21 server kernel: [1121002.015276]  [<ffffffff81090d30>] ? autoremove_wake_function+0x0/0x40
Jul  1 02:23:21 server kernel: [1121002.018657]  [<ffffffff8120c646>] ? security_file_permission+0x16/0x20
Jul  1 02:23:21 server kernel: [1121002.022012]  [<ffffffff81176918>] vfs_write+0xb8/0x1a0
Jul  1 02:23:21 server kernel: [1121002.025341]  [<ffffffff810d4932>] ? audit_syscall_entry+0x272/0x2a0
Jul  1 02:23:21 server kernel: [1121002.028661]  [<ffffffff81177321>] sys_write+0x51/0x90
Jul  1 02:23:21 server kernel: [1121002.031953]  [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.