Udev rules are getting reloaded frequently and causing Oracle ASM outage
Red Hat Insights can detect this issue
Environment
- Red Hat Enterprise Linux 9
- Red Hat Enterprise Linux 8
- Red Hat Enterprise Linux 7
- Red Hat Enterprise Linux 6
- Oracle DB using ASM disks managed by udev
Issue
- Udev rules are getting reloaded frequently and causing Oracle ASM outage
- After installing tuned on the system, we are getting high SYS CPU at specific times when the tuned-mpath-iosched rule is triggered from /lib/udev.
Resolution
-
There is a known issue with the inotify watch events configured in the udev rules. When a process opens a device for writing and then closes it, udev can synthesise a change event, and this change event causes udev to re-run the rules that match ACTION=="add|change". The udev man page describes the watch and nowatch options for 'inotify':
$ man udev
[...]
watch
    Watch the device node with inotify, when closed after being opened for writing, a change uevent will be synthesised.
nowatch
    Disable the watching of a device node with inotify.
[...]
The 'inotify' watch, and with it the synthesised change event, is turned on for most devices:
$ fgrep watch -r /lib/udev/rules.d/* | fgrep OPTIONS
lib/udev/rules.d/10-dm.rules:OPTIONS:="nowatch"
lib/udev/rules.d/11-dm-lvm.rules:OPTIONS:="nowatch"
lib/udev/rules.d/13-dm-disk.rules:OPTIONS+="watch"
lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="xvd*|sr*", OPTIONS+="watch"
lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
lib/udev/rules.d/64-md-raid.rules:OPTIONS+="watch"
lib/udev/rules.d/80-udisks.rules:# KERNEL=="dm-*", OPTIONS+="watch"
-
To suppress the false positive 'change' events, disable the 'inotify' watch for the devices used by Oracle ASM.
-
Before applying any changes, read through the following KCS documents to ensure you understand which Oracle ASM devices are being used in your environment, and that Oracle's afd_filter issue has been addressed:
- systemd-udev is causing high CPU utilization on RHEL with Oracle Database server
- How to create Oracle ASM disks using disk or multipath devices in Red Hat Enterprise Linux?
- How to make sure Oracle ASM devices pointing to multipath devices and not scsi paths, sd devices when using ASMLib to manage ASM disks?
-
Create a udev rule file /etc/udev/rules.d/96-asm-device.rules [FN-1] and set the nowatch option:
ACTION=="add|change", KERNEL=="sd*", OPTIONS:="nowatch"
If you're using ASM on multipath devices, you'll need to change the "sd" to "dm" as shown in the following example:
ACTION=="add|change", KERNEL=="dm-*", OPTIONS:="nowatch"
If the Oracle AFD driver (oracleafd) is in use, then set the nowatch option on the disks created under the /dev/oracleafd/disks path as well:
ACTION=="add|change", KERNEL=="oracleafd/.*", OPTIONS:="nowatch"
ACTION=="add|change", KERNEL=="oracleafd/*", OPTIONS:="nowatch"
ACTION=="add|change", KERNEL=="oracleafd/disks/*", OPTIONS:="nowatch"
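As a minimal sketch of this step, the shell function below writes one nowatch rule per kernel-name pattern into a rules file. The function name write_nowatch_rules is hypothetical and not part of udev; only the rule syntax and the file name come from this article, and the patterns still have to be adapted to the devices that Oracle ASM actually uses.

```shell
#!/bin/sh
# write_nowatch_rules FILE PATTERN...
# Hypothetical helper (not part of udev): writes one nowatch rule per
# KERNEL pattern to FILE, replacing any previous contents of FILE.
write_nowatch_rules() {
    rules_file=$1
    shift
    # Start from an empty file so that repeated runs do not duplicate rules.
    printf '' > "$rules_file" || return 1
    for pattern in "$@"; do
        printf 'ACTION=="add|change", KERNEL=="%s", OPTIONS:="nowatch"\n' \
            "$pattern" >> "$rules_file"
    done
}

# Example (run as root), using the patterns from this article:
#   write_nowatch_rules /etc/udev/rules.d/96-asm-device.rules 'sd*'
#   write_nowatch_rules /etc/udev/rules.d/96-asm-device.rules 'dm-*'
```

Once the file is in place, running udevadm test against the sysfs path of one affected device (for example /sys/block/sdb, a placeholder name) simulates an event for that device and prints debug output, which may help confirm that the new rules file is being read.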
-
Then use the commands below to reload the udev rules configuration:
$ /sbin/udevadm control --reload-rules
$ /sbin/udevadm trigger --type=devices --action=change
Note:
-
The above commands reload the complete udev configuration and trigger all the udev rules. On a busy production system this could disrupt ongoing I/O operations, so run them only during a scheduled maintenance window.
-
The udev rule shown above is provided as an example; further changes may be required to accommodate the devices used by Oracle ASM. Contact a Red Hat support representative for assistance in setting up udev rules that selectively disable the 'inotify' watch for the devices used by Oracle ASM.
-
The 'nowatch' option only disables the 'inotify' watch events. The inotify API provides a mechanism for monitoring filesystem events on individual files or on directories. Setting the 'nowatch' option for a device therefore does not disable any existing device-specific udev rules, and any change to the underlying disk (e.g. add/remove/change) still triggers the udev rules as expected.
-
It is recommended that the Oracle nowatch options be placed in their own rules file, separate from other Oracle rules, and that the file number ('96' in the above example) be larger than that of any other Oracle rules file. You can name the file whatever you want, as long as the lines inside the file match the nowatch rules above and the file is processed later (has a higher ordinal value) in the list of numbered udev rules, so that the other Oracle rules are applied first. Typically the other Oracle ASM ownership and symlink rules are numbered in the 50s or 80s, but sometimes they are numbered 99 (rules are processed in ordinal order, 00-99). In that case, renumbering the existing ASM ownership/symlink rules file to something like 95-98 and then numbering the nowatch rules file 99 is recommended.
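The ordering advice above can be checked with a small shell sketch. The helper names rules_order and nowatch_is_last are hypothetical, and the check assumes that the other Oracle rules files have "oracle" or "asm" somewhere in their file names, which may not hold on every system.

```shell
#!/bin/sh
# rules_order DIR...
# Prints the *.rules file names found in the given directories, sorted by
# file name regardless of directory, which is how udev orders rules files.
rules_order() {
    for dir in "$@"; do
        for f in "$dir"/*.rules; do
            [ -e "$f" ] && basename "$f"
        done
    done | LC_ALL=C sort -u
}

# nowatch_is_last NOWATCH_FILE DIR...
# Succeeds if NOWATCH_FILE sorts after every other rules file whose name
# contains "oracle" or "asm" (a naming assumption made for this sketch).
nowatch_is_last() {
    nowatch=$1
    shift
    last=$(rules_order "$@" | grep -i -E 'oracle|asm' | tail -n 1)
    [ "$last" = "$nowatch" ]
}

# Example:
#   nowatch_is_last 96-asm-device.rules /lib/udev/rules.d /etc/udev/rules.d \
#       && echo "nowatch rules are processed last"
```

If the check fails, the output of rules_order shows which Oracle rules file sorts after the nowatch file and therefore needs renumbering.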
Root Cause
-
Monitoring the udev events with the commands below showed that frequent 'change' events were triggered for the devices and that the rules were getting reloaded.
$ udevadm monitor --property &>> /tmp/udevadm_monitor_property
$ udevadm monitor --kernel &>> /tmp/udevadm_monitor_kernel
$ udevadm monitor --udev &>> /tmp/udevadm_monitor_udev
-
Here is an example of a test case that opens the device read-write and closes it without writing anything, which still triggers the rule:
./check_udev /dev/mapper/mpathmn
Opened file /dev/mapper/mpathmn with descriptor 3

KERNEL[1433182713.102117] change /devices/virtual/block/dm-1 (block)
UDEV [1433182713.823458] change /devices/virtual/block/dm-1 (block)

'check_udev.c' program used for binary 'check_udev':
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>  /* close() */

int main(int argc, char *argv[])
{
    int fd;

    /* Open the device read-write; no data is ever written. */
    fd = open(argv[1], O_RDWR, 0660);
    printf("Opened file %s with descriptor %d\n", argv[1], fd);
    /* Closing a descriptor that was opened for writing triggers the inotify watch. */
    close(fd);
    return 0;
}
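The same open-and-close test can be approximated without compiling anything. The sketch below relies on the shell's '<>' redirection, which opens a file for reading and writing; the function name open_close_rw is hypothetical, and the equivalence to check_udev.c is an assumption rather than something stated in the original test case.

```shell
#!/bin/sh
# open_close_rw PATH
# Opens an existing PATH read-write and closes it again without writing
# any data, similar to what check_udev.c does.
open_close_rw() {
    # '<>' would create a missing file, so refuse paths that do not exist.
    [ -e "$1" ] || { echo "no such file: $1" >&2; return 1; }
    # The subshell opens PATH read-write on stdin and closes it when it exits.
    ( : <> "$1" ) || return 1
    echo "Opened and closed $1 read-write"
}

# Example (run as root on a test system, with 'udevadm monitor' running
# in another terminal):
#   open_close_rw /dev/mapper/mpathmn
```

On a device that is still being watched, a 'change' event should appear in the udevadm monitor output shortly after the function returns. Once a nowatch rule has been applied to that device, no such event is expected.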
-
While investigating the reason for the 'change' events generated during device open/close operations, it was found that an 'inotify' watch is enabled through a couple of rules:
$ fgrep watch -r /lib/udev/rules.d/* | fgrep OPTIONS
lib/udev/rules.d/10-dm.rules:OPTIONS:="nowatch"
lib/udev/rules.d/11-dm-lvm.rules:OPTIONS:="nowatch"
lib/udev/rules.d/13-dm-disk.rules:OPTIONS+="watch"
lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="xvd*|sr*", OPTIONS+="watch"
lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
lib/udev/rules.d/64-md-raid.rules:OPTIONS+="watch"
lib/udev/rules.d/80-udisks.rules:# KERNEL=="dm-*", OPTIONS+="watch"
The OPTIONS+="watch" setting causes udev to watch the device with 'inotify'.
- So, if any program opens a disk device in writable mode with open() and then closes it with close(), udev synthesises a 'change' uevent. This is because, if the disk was modified, the disk label, UUID, partitions or anything else on the disk could have changed. The udev daemon has to rescan the disk and update the udev database accordingly.
- Due to this change event on disk devices, any udev rules having ACTION=="add|change" are run again.
Diagnostic Steps
-
Run the following commands from separate terminals and monitor the output to verify whether frequent change events are triggered for the disks:
# udevadm monitor --property
# udevadm monitor --kernel
# udevadm monitor --udev
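When the monitor output is saved to a file, the number of 'change' events per device can be summarised with a short shell sketch. The function name count_change_events is hypothetical, and the parsing assumes the event line layout shown in the Root Cause example (a KERNEL[...] or UDEV [...] prefix followed by the action and the device path).

```shell
#!/bin/sh
# count_change_events LOGFILE
# Prints "<count> <devpath>" for every device that has 'change' events in
# a saved 'udevadm monitor' log, busiest device first.
count_change_events() {
    awk '
        # Event lines start with KERNEL[timestamp] or UDEV [timestamp];
        # property lines and the monitor banner do not match.
        /^(KERNEL|UDEV)[ ]*\[/ {
            for (i = 1; i < NF; i++) {
                # The field after the action name is the device path.
                if ($i == "change") { count[$(i + 1)]++; break }
            }
        }
        END { for (dev in count) print count[dev], dev }
    ' "$1" | sort -k1,1nr -k2,2
}

# Example, using one of the log files captured in the Root Cause section:
#   count_change_events /tmp/udevadm_monitor_kernel
```

A log that contains both KERNEL and UDEV lines counts each event twice, so it is simpler to summarise the --kernel and --udev captures separately.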