5. Kernel-Related Updates

5.1. All Architectures

Bugzilla #467714
The ibmphp module is not safe to unload. Previously, the mechanism that prevented the ibmphp module from unloading was insufficient, and eventually triggered a bug halt. With this update, the method to prevent this module from unloading has been improved, preventing the bug halt. However, attempting to unload the module may produce a warning in the message log, indicating that the module is not safe to unload. This warning can be safely ignored.
Bugzilla #461564
With this update, physical memory is limited to 64GB for 32-bit x86 kernels running on systems with more than 64GB. The kernel splits memory into two separate regions: Lowmem and Highmem. Lowmem is mapped into the kernel address space at all times. Highmem, however, is mapped into a kernel virtual window a page at a time as needed. If physical memory is allowed to exceed 64GB, the mem_map (also known as the page array) size can approach or even exceed the size of Lowmem. If this happens, the kernel panics during boot or starts prematurely. In the latter case, the kernel fails to allocate kernel memory after booting and either panics or hangs.
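The relationship between installed memory and the size of the page array can be sketched with some rough arithmetic. The 4KB page size is standard on 32-bit x86; the 32-byte size of each page array entry and the 896MB Lowmem size are assumptions chosen for illustration and depend on the kernel configuration.

```python
# Rough estimate of the mem_map (page array) size for a given amount of RAM.
# Assumptions (not taken from the release note): 4 KiB pages, 32 bytes per
# page array entry, and roughly 896 MiB of Lowmem on a 32-bit x86 kernel.
PAGE_SIZE = 4 * 1024
PAGE_ENTRY_SIZE = 32            # assumed size of one mem_map entry, in bytes
LOWMEM_BYTES = 896 * 1024**2    # assumed Lowmem size


def mem_map_bytes(ram_bytes):
    """Return the estimated mem_map size in bytes for ram_bytes of RAM."""
    pages = ram_bytes // PAGE_SIZE
    return pages * PAGE_ENTRY_SIZE


for ram_gib in (16, 64, 128):
    size = mem_map_bytes(ram_gib * 1024**3)
    share = 100 * size / LOWMEM_BYTES
    print(f"{ram_gib:>3} GiB RAM: mem_map is about {size // 1024**2} MiB "
          f"({share:.0f}% of the assumed Lowmem)")
```

Under these assumptions, the page array already consumes more than half of Lowmem at 64GB and would not fit at all at 128GB.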
Bugzilla #246233
Previously, if a user pressed the arrow keys continuously on a Hardware Virtual Machine (HVM), an interrupt race condition between the hardware interrupt and the timer interrupt was encountered. As a result, the keyboard driver reported unknown keycode events. With this update, the i8042 polling timer has been removed, which resolves this issue.
Bugzilla #435705
With this update, the diskdump utility (which provides the ability to create and collect vmcore kernel dumps) is supported for use with the sata_svw driver.
Bugzilla #439043
With this update, the "swap_token_timeout" parameter has been added to /proc/sys/vm.
This file contains the valid hold time of the swap-out protection token. The Linux Virtual Memory (VM) subsystem has a token-based thrashing control mechanism and uses the token to prevent unnecessary page faults in thrashing situations. The value is specified in seconds and can be used to tune thrashing behavior. Setting it to 0 disables the swap token mechanism.
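As a minimal sketch, the parameter could be inspected and adjusted from a script as follows. The path is the one given above; the helper names are hypothetical, and writing to the file requires root privileges.

```python
# Read and set the swap token hold time through /proc/sys/vm.
# The helper names are illustrative; only the path comes from the note above.
from pathlib import Path

SWAP_TOKEN_TIMEOUT = Path("/proc/sys/vm/swap_token_timeout")


def read_timeout(path=SWAP_TOKEN_TIMEOUT):
    """Return the hold time in seconds as an integer."""
    return int(path.read_text().strip())


def write_timeout(seconds, path=SWAP_TOKEN_TIMEOUT):
    """Set the hold time in seconds; 0 disables the swap token mechanism."""
    if seconds < 0:
        raise ValueError("hold time must be non-negative")
    path.write_text(f"{seconds}\n")


if SWAP_TOKEN_TIMEOUT.exists():
    print("current hold time:", read_timeout(), "seconds")
else:
    print(SWAP_TOKEN_TIMEOUT, "is not present on this kernel")
```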
Bugzilla #439431
Previously, when an NFSv4 (Network File System Version 4) client encountered issues while processing a directory using readdir(), an error for the entire readdir() call was returned. With this update, the fattr4_rdattr_error flag is now set when readdir() is called, instructing the server to continue on and only report an error on the specific directory entry that was causing the issue.
Bugzilla #443655
Previously, the NFS (Network File System) client was not handling malformed replies from the readdir() function. Consequently, the reply from the server would indicate that the call to the readdir() function was successful, but the reply would contain no entries. With this update, the readdir() reply parsing logic has been changed, such that when a malformed reply is received, the client returns an EIO error.
Bugzilla #448076
The RPC client stores the result of a portmap call at a place in memory that can be freed and reallocated under the right circumstances. However, under some circumstances, the result of the portmap call was freed from memory too early, which may have resulted in memory corruption. With this update, reference counting has been added to the memory location where the portmap result is stored, so that the memory is freed only after the result has been used.
Bugzilla #450743
Under some circumstances, the allocation of some data structures for RPC calls may have been blocked when the system memory was low. Consequently, deadlock may have been encountered under heavy memory pressure when there were a large number of NFS pages awaiting writeback. With this update, the allocation of these data structures is now non-blocking, which resolves this issue.
Bugzilla #451088
Previously, degraded performance may have been encountered when writing to an LVM mirrored volume synchronously (using the O_SYNC flag). Consequently, every write I/O to a mirrored volume was delayed by 3ms, resulting in the mirrored volume being approximately 5-10 times slower than a linear volume. With this update, I/O queue unplugging has been added to the dm-raid1 driver, and the performance of mirrored volumes has been improved to be comparable with that of linear volumes.
Bugzilla #476997
A new tuning parameter has been added to allow system administrators to change the maximum number of modified pages kupdate writes to disk each time it runs. This new tunable (/proc/sys/vm/max_writeback_pages) defaults to a value of 1024 (4MB) so that a maximum of 1024 pages get written out by each iteration of kupdate. Increasing this value alters how aggressively kupdate flushes modified pages and decreases the potential amount of data loss if the system crashes between kupdate runs. However, increasing the max_writeback_pages value may have negative performance consequences on systems that are sensitive to I/O loads.
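To relate a max_writeback_pages setting to an amount of data, the conversion can be sketched as follows. The 4KB page size is an assumption that matches the "1024 (4MB)" default quoted above; the other values in the loop are arbitrary examples, not recommendations.

```python
# Convert a max_writeback_pages value into the amount of data kupdate may
# write per iteration. The 4 KiB page size is an assumption consistent with
# the default of 1024 pages being described as 4MB.
PAGE_SIZE = 4 * 1024  # bytes; assumed


def writeback_bytes(max_writeback_pages, page_size=PAGE_SIZE):
    """Return the per-iteration writeback limit in bytes."""
    return max_writeback_pages * page_size


for pages in (1024, 4096, 16384):
    mib = writeback_bytes(pages) / 1024**2
    print(f"max_writeback_pages={pages}: up to {mib:.0f} MiB per kupdate iteration")
```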
Bugzilla #456911
A new allowable value has been added to the /proc/sys/kernel/wake_balance tunable parameter. Setting wake_balance to a value of 2 will instruct the scheduler to run the thread on any available CPU rather than scheduling it on the optimal CPU. Setting this kernel parameter to 2 will force the scheduler to reduce the overall latency even at the cost of total system throughput.
Bugzilla #475715
When checking a directory tree, the kernel module could, in some circumstances, incorrectly decide the tree was not busy. An active offset mount with an open file handle being used for expires caused the file handle to not count toward the busyness check. This resulted in mount requests being made for already mounted offsets. With this update, the kernel module check has been corrected and incorrect mount requests are no longer generated.
Bugzilla #453470
During system initialization, the CPU vendor was detected after the initialization of the Advanced Programmable Interrupt Controllers (APICs). Consequently, on x86_64 AMD systems with more than 8 cores, APIC clustered mode was used, resulting in suboptimal system performance. With this update, the CPU vendor is now queried prior to initializing the APICs, resulting in APIC physical flat mode being used by default, which resolves this issue.
Bugzilla #462459
The Common Internet File System (CIFS) code has been updated in Red Hat Enterprise Linux 4.8, fixing a number of bugs that had been repaired upstream, including the following change:
Previously, when mounting a server without Unix extensions, it was possible to change the mode of a file. However, this mode change could not be permanently stored, and may have changed back to the original mode at any time. With this update, the mode of the file cannot be temporarily changed by default; chmod() calls will return success, but have no effect. A new mount option, dynperm, needs to be used if the old behavior is required.
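Because chmod() reports success even when the change is ignored, a script that depends on the new mode may want to verify it afterwards. The following is a minimal sketch of such a check; the helper name is hypothetical, and the temporary file used here lives on a local file system rather than a CIFS mount.

```python
# Apply a mode with chmod() and confirm via stat() that it is visible.
# On a CIFS mount without Unix extensions and without dynperm, the note
# above implies the check would report that the change did not take effect.
import os
import stat
import tempfile


def chmod_took_effect(path, mode):
    """Apply mode with chmod() and report whether stat() now shows it."""
    os.chmod(path, mode)  # may succeed even when the change is ignored
    return stat.S_IMODE(os.stat(path).st_mode) == mode


with tempfile.NamedTemporaryFile() as tmp:
    print("mode change visible:", chmod_took_effect(tmp.name, 0o640))
```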
Bugzilla #451819
Previously, a race condition may have been encountered in the kernel between dio_bio_end_aio() and dio_await_one(). This may have led to a situation where direct I/O was left waiting indefinitely on an I/O process that had already completed. With this update, these reference counting operations are now locked so that the submission and completion paths see a unified state, which resolves this issue.
Bugzilla #249775
Previously, upgrading a fully virtualized guest system from Red Hat Enterprise Linux 4.6 (with the kmod-xenpv package installed) to newer versions of Red Hat Enterprise Linux 4 resulted in an improper module dependency between the built-in kernel modules (xen-vbd.ko and xen-vnif.ko) and the older xen-platform-pci.ko module. Consequently, file systems mounted via the xen-vbd.ko block driver, and guest networking using the xen-vnif.ko network driver, would fail.
In Red Hat Enterprise Linux 4.7, the functionality in the xen-platform-pci.ko module was built into the kernel. However, when a formerly loadable kernel module becomes a part of the kernel, the symbol dependency check for existing loadable modules is not accounted for correctly in the module-init-tools. With this update, the xen-platform-pci.ko functionality has been removed from the built-in kernel and placed back into a loadable module, allowing the module-init-tools to check and create the proper dependencies during a kernel upgrade.
Bugzilla #463897
Previously, attempting to mount disks or partitions in a 32-bit Red Hat Enterprise Linux 4.6 fully virtualized guest using the paravirtualized block driver (xen-vbd.ko) on a 64-bit host would fail. With this update, the block front driver (block.c) has been updated to inform the block back driver that the guest is using the 32-bit protocol, which resolves this issue.
Bugzilla #460984
Previously, installing the pv-on-hvm drivers on a bare-metal kernel automatically created the /proc/xen directory. Consequently, applications that verify if the system is running a virtualized kernel by checking for the existence of the /proc/xen directory may have incorrectly assumed that the virtualized kernel is being used. With this update, the pv-on-hvm drivers no longer create the /proc/xen directory, which resolves this issue.
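The kind of check that such applications perform can be sketched as follows. The function name and the proc_root parameter are hypothetical and exist only to make the check easy to exercise; real applications may combine this test with other indicators.

```python
# Test for the presence of the /proc/xen directory, as some applications do
# to decide whether a virtualized kernel is running.
import os


def proc_xen_present(proc_root="/proc"):
    """Return True if <proc_root>/xen exists as a directory."""
    return os.path.isdir(os.path.join(proc_root, "xen"))


print("/proc/xen present:", proc_xen_present())
```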
Bugzilla #455756
Previously, paravirtualized guests could only have a maximum of 16 disk devices. In this update, this limit has been increased to a maximum of 256 disk devices.
Bugzilla #523930
In some circumstances, write operations to a particular TTY device opened by more than one user (for example, one opened it as /dev/console and the other opened it as /dev/ttyS0) were blocked. If one user opened the TTY terminal without setting the O_NONBLOCK flag, this user's write operations were suspended if the output buffer was full or if a STOP (Ctrl-S) signal was sent. As well, because the O_NONBLOCK flag was not respected, write operations for user terminals opened with the O_NONBLOCK flag set were also blocked. This update re-implements TTY locks, ensuring O_NONBLOCK works as expected, even if a STOP signal is sent from another terminal.
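The expected O_NONBLOCK behavior, where a write returns immediately instead of suspending the caller when the output buffer is full, can be illustrated with a short sketch. A pipe stands in for the TTY here, which is an assumption made so that the example runs without a serial device; it does not reproduce the TTY locking issue itself.

```python
# Demonstrate non-blocking write semantics: once the buffer is full, the
# write fails at once with EAGAIN instead of suspending the process.
# A pipe is used as a stand-in for a TTY (an assumption for illustration).
import os


def fill_nonblocking(fd, chunk=b"x" * 4096):
    """Write to fd until the kernel would block; return the bytes written."""
    total = 0
    while True:
        try:
            total += os.write(fd, chunk)
        except BlockingIOError:  # errno EAGAIN: buffer full, call returns at once
            return total


read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)  # sets O_NONBLOCK on the descriptor
written = fill_nonblocking(write_fd)
print(f"wrote {written} bytes before a further write would have blocked")
os.close(read_fd)
os.close(write_fd)
```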
Bugzilla #519692
Previously, the get_random_int() function returned the same number until the jiffies counter (which ticks at a clock interrupt frequency) or process ID (PID) changed, making it possible to predict the random numbers. This may have weakened the ASLR security feature. With this update, get_random_int() is more random and no longer uses a common seed value. This reduces the possibility of predicting the values get_random_int() returns.
Bugzilla #518707
ib_mthca, the driver for Host Channel Adapter (HCA) cards based on the Mellanox Technologies MT25408 InfiniHost III Lx HCA integrated circuit device, uses kmalloc() to allocate large bitmasks. This ensures allocated memory is a contiguous physical block, as is required by DMA devices such as these HCA cards.
Previously, the largest allocation that kmalloc() allowed was 128kB. If ib_mthca was set to allocate more than 128kB (for example, by setting the num_mutt option to "num_mutt=2097152", causing kmalloc() to allocate 256kB), the driver failed to load, returning the message:
Failed to initialize memory region table, aborting.
This update alters the allocation methods of the ib_mthca driver. When mthca_buddy_init() wants more than a page, memory is allocated directly from the page allocator, rather than using kmalloc(). It is now possible to pin large amounts of memory for use by the ib_mthca driver by increasing the values assigned to num_mutt and num_mtt.
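The 256kB figure in the example above can be reproduced with a short calculation. The sketch assumes a bitmask holding one bit per entry, which is an assumption that happens to match the quoted numbers rather than a description of the driver's actual data layout.

```python
# Size of a bitmask with one bit per entry, compared with the previous
# 128 kB kmalloc() ceiling. "One bit per entry" is an assumption that
# reproduces the 256 kB figure quoted in the note above.
KMALLOC_MAX = 128 * 1024  # bytes; the previous limit described above


def bitmask_bytes(entries):
    """Return the bytes needed for a bitmask with one bit per entry."""
    return (entries + 7) // 8


for entries in (1048576, 2097152):
    size = bitmask_bytes(entries)
    verdict = "fits" if size <= KMALLOC_MAX else "exceeds the old limit"
    print(f"{entries} entries: {size // 1024} kB bitmask ({verdict})")
```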
Bugzilla #519446
Previously, there were some instances in the kernel where the __ptrace_unlink() function (part of the ptrace system call) used REMOVE_LINKS and SET_LINKS, rather than add_parent and remove_parent, while changing the parent of a process. This approach could abuse the global process list and, as a consequence, create deadlocked and unkillable processes in some circumstances. With this update, __ptrace_unlink() now uses add_parent and remove_parent in every instance, ensuring that deadlocked and unkillable processes cannot be created.

Note

Unkillable or deadlocked processes created by this bug had no effect on system availability.