Udev rules are getting reloaded frequently and causing Oracle ASM outage

Solution Verified - Updated August 15 2024 at 12:35 PM

Environment

Red Hat Enterprise Linux 9
Red Hat Enterprise Linux 8
Red Hat Enterprise Linux 7
Red Hat Enterprise Linux 6
Oracle DB using ASM disks managed by udev

Issue

Udev rules are getting reloaded frequently and causing Oracle ASM outage
After installing tuned on the system, high SYS CPU usage is seen at specific times when the tuned-mpath-iosched rule is triggered from /lib/udev.

Resolution

This is a known issue with the inotify watch that the udev rules configure on device nodes. When a process opens a device for writing and then closes it, udev synthesises a change event, even if nothing was written. That change event re-runs the udev rules configured with ACTION=="add|change".

The udev man page describes the watch and nowatch options for 'inotify':

$ man udev
[...]
   watch
       Watch the device node with inotify, when closed after being opened for writing, a change uevent will be synthesised.

   nowatch
       Disable the watching of a device node with inotify.
[...]

The inotify watch, and with it the synthesised change event, is enabled for most devices:

$ fgrep watch -r /lib/udev/rules.d/* | fgrep OPTIONS
/lib/udev/rules.d/10-dm.rules:OPTIONS:="nowatch"
/lib/udev/rules.d/11-dm-lvm.rules:OPTIONS:="nowatch"
/lib/udev/rules.d/13-dm-disk.rules:OPTIONS+="watch"
/lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="xvd*|sr*", OPTIONS+="watch"
/lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
/lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
/lib/udev/rules.d/64-md-raid.rules:OPTIONS+="watch"
/lib/udev/rules.d/80-udisks.rules:# KERNEL=="dm-*", OPTIONS+="watch"

To suppress the false positive 'change' events, disable the 'inotify' watch for the devices used by Oracle ASM.
1. Before applying any changes, read through the following KCS documents to ensure you understand what Oracle ASM devices are being used in your environment, and that Oracle's afd_filter issue has been addressed:
2. Create a udev rules file /etc/udev/rules.d/96-asm-device.rules and set the nowatch option:
```
ACTION=="add|change", KERNEL=="sd*", OPTIONS:="nowatch"
```
  If you are using ASM on multipath devices, change the "sd" to "dm-" as shown in the following example:
```
ACTION=="add|change", KERNEL=="dm-*", OPTIONS:="nowatch"
```
  If the Oracle AFD driver (oracleafd) is in use, set the nowatch option for the disks created under the /dev/oracleafd/disks path as well:
```
ACTION=="add|change", KERNEL=="oracleafd/.*", OPTIONS:="nowatch"
ACTION=="add|change", KERNEL=="oracleafd/*", OPTIONS:="nowatch"
ACTION=="add|change", KERNEL=="oracleafd/disks/*", OPTIONS:="nowatch"
```
3. Then reload the udev rules configuration with the following commands:
```
$ /sbin/udevadm control --reload-rules
$ /sbin/udevadm trigger --type=devices --action=change
```
Note:
- The commands above reload the complete udev configuration and trigger all udev rules. On a busy production system this could disrupt ongoing I/O, so run them only during a scheduled maintenance window.
- The udev rules shown above are examples; further changes may be required to match the devices used by Oracle ASM. Contact a Red Hat support representative for assistance in setting up udev rules that selectively disable the inotify watch for those devices.
- The nowatch option only disables the inotify watch on the device node. The inotify API is a mechanism for monitoring filesystem events on individual files or directories. Setting nowatch therefore does not disable any existing device-specific udev rules, and real changes to the underlying disk (e.g. add/remove/change) still trigger the udev rules as expected.
- Place the Oracle nowatch rules in their own rules file, separate from other Oracle rules, and give that file a number ('96' in the above example) larger than that of any other Oracle rules file. The file name is arbitrary as long as its contents match the nowatch rules above and it sorts later than the other Oracle rules files, so that those rules are processed first. The Oracle ASM ownership and symlink rules are typically numbered in the 50s or 80s, but are sometimes numbered 99 (rules files are processed in order, 00-99). In that case, renumber the existing ASM ownership/symlink rules file to something in the 95-98 range and number the nowatch rules file 99.
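
The ordering requirement in the last note can be checked with a small helper. The following is a minimal sketch, not a supported tool: the function name `rules_after` is hypothetical, and it inspects a single directory only, whereas udev orders the rules files from all of its rules directories together by file name. It lists the rules files that sort after the nowatch file; any Oracle ownership or symlink rules file in that list would need renumbering.

```shell
#!/bin/sh
# Hypothetical helper: list the *.rules files in one directory whose names
# sort after a given rules file, i.e. the files udev would process later.
# Assumption: only the directory passed in $1 is inspected.
rules_after() {
    # $1: rules directory, $2: file name of the nowatch rules file
    (
        cd "$1" || exit 1
        for f in *.rules; do
            # Skip the literal pattern when no rules file exists
            [ -e "$f" ] && printf '%s\n' "$f"
        done
    ) | LC_ALL=C sort |
        # Print every name that follows the target name in sorted order
        awk -v target="$2" 'seen { print } $0 == target { seen = 1 }'
}
```

For example, `rules_after /etc/udev/rules.d 96-asm-device.rules` prints nothing when the nowatch file is the last one in that directory.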

Root Cause

Monitoring the udev events with the commands below showed that 'change' events were triggered frequently for the devices and that the rules were being re-applied each time.

$ udevadm monitor --property  &>>  /tmp/udevadm_monitor_property
$ udevadm monitor --kernel  &>>  /tmp/udevadm_monitor_kernel
$ udevadm monitor --udev  &>> /tmp/udevadm_monitor_udev

Here is a test case in which merely opening the device for writing and closing it, without writing any data, triggers the rule:

./check_udev /dev/mapper/mpathmn
 Opened file /dev/mapper/mpathmn with descriptor 3

KERNEL[1433182713.102117] change   /devices/virtual/block/dm-1 (block)
UDEV  [1433182713.823458] change   /devices/virtual/block/dm-1 (block)

'check_udev.c' program used for binary 'check_udev':
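 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <unistd.h>   /* close() */

 int main(int argc, char *argv[])
 {
     int fd;

     if (argc < 2) {
         fprintf(stderr, "Usage: %s <device>\n", argv[0]);
         return EXIT_FAILURE;
     }
     /* Open read-write without writing any data; closing a descriptor
        opened for writing is what triggers the inotify watch. */
     fd = open(argv[1], O_RDWR);
     if (fd < 0) {
         perror("open");
         return EXIT_FAILURE;
     }
     printf("Opened file %s with descriptor %d\n", argv[1], fd);
     close(fd);
     return EXIT_SUCCESS;
 }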
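
Where compiling a test program is inconvenient, the same open-and-close sequence can be approximated from the shell. This is a rough sketch under the assumption that the shell's `<>` redirection (open for reading and writing, no truncation) is an acceptable stand-in for the open() call above; the function name `open_close_rw` is hypothetical. Like check_udev, it will synthesise a change event on a watched device, so use it on a test device only.

```shell
#!/bin/sh
# Hypothetical shell counterpart of check_udev: open a path read-write and
# close it again without writing any data.
open_close_rw() {
    # Refuse paths that do not exist, because '<>' would otherwise create them.
    [ -e "$1" ] || { echo "no such path: $1" >&2; return 1; }
    # '<>' opens the file for reading and writing without truncating it;
    # descriptor 3 is closed again as soon as the command group ends.
    { :; } 3<>"$1" && echo "Opened and closed $1"
}
```

Running `open_close_rw /dev/mapper/mpathmn` as root while `udevadm monitor` is active should show whether a change event follows.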

While investigating why 'change' events were generated during device open/close operations, it was found that an inotify watch is enabled through a couple of rules:
```
$ fgrep watch -r /lib/udev/rules.d/* | fgrep OPTIONS
/lib/udev/rules.d/10-dm.rules:OPTIONS:="nowatch"
/lib/udev/rules.d/11-dm-lvm.rules:OPTIONS:="nowatch"
/lib/udev/rules.d/13-dm-disk.rules:OPTIONS+="watch"
/lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="xvd*|sr*", OPTIONS+="watch"
/lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
/lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
/lib/udev/rules.d/64-md-raid.rules:OPTIONS+="watch"
/lib/udev/rules.d/80-udisks.rules:# KERNEL=="dm-*", OPTIONS+="watch"
```
The OPTIONS+="watch" setting causes udev to watch the device node with inotify.

- If any program opens a disk device in writable mode with open() and then closes it with close(), udev synthesises a 'change' uevent. This is because the disk label, UUID, partitions or anything else on the disk could have changed, so the udev daemon has to rescan the disk and update the udev database accordingly.

- Because of this change event, every udev rule matching ACTION=="add|change" is run again for the disk device.

Diagnostic Steps

Run the following commands in separate terminals and monitor the output to check whether change events are being triggered frequently for the disks:
```
# udevadm monitor --property
# udevadm monitor --kernel
# udevadm monitor --udev
```
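
When the monitor output has been captured to a file, as with /tmp/udevadm_monitor_udev in the Root Cause section, a per-device count makes frequent change events easier to spot. The helper below is a minimal sketch; the function name `count_change_events` is hypothetical, and it assumes event lines in the format shown in the Root Cause section.

```shell
#!/bin/sh
# Hypothetical helper: count 'change' events per device in a saved
# `udevadm monitor` capture. Assumes event lines of the form
#   UDEV  [1433182713.823458] change   /devices/virtual/block/dm-1 (block)
count_change_events() {
    # $1: path to the capture file
    # Only lines whose first field is exactly "UDEV" are counted, so
    # KERNEL[...] lines and property lines are ignored.
    awk '$1 == "UDEV" && $3 == "change" { n[$4]++ }
         END { for (d in n) print n[d], d }' "$1" |
        sort -k1,1nr -k2,2
}
```

Calling `count_change_events /tmp/udevadm_monitor_udev` prints one line per device, with the highest count first.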

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.