'udev' rules continuously being reloaded resulted in Oracle ASM diskgroup outage

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux 6
  • Red Hat Enterprise Linux 7
  • Oracle DB using ASM disks created using udev
  • DM-Multipath

Issue

  • The udev rules created for ASM disks are continuously reloaded, which results in an Oracle ASM diskgroup outage.
  • After installing tuned on the system, SYS CPU usage is high at specific times when the tuned-mpath-iosched rule is triggered from /lib/udev.

Resolution

  • When an Oracle process opens a device for writing and then closes it, udev synthesizes a change event, and any udev rule with ACTION=="add|change" is run again.

    The udev man page describes the watch and nowatch options for inotify:

    $ man udev
    [...]
       watch
           Watch the device node with inotify, when closed after being opened for writing, a change uevent will be synthesised.
    
       nowatch
           Disable the watching of a device node with inotify.
    [...]
    

    The inotify watch, and with it the synthesized change event, is enabled for most devices:

    $ fgrep watch -r /lib/udev/rules.d/* | fgrep OPTIONS
    lib/udev/rules.d/10-dm.rules:OPTIONS:="nowatch"
    lib/udev/rules.d/11-dm-lvm.rules:OPTIONS:="nowatch"
    lib/udev/rules.d/13-dm-disk.rules:OPTIONS+="watch"
    lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="xvd*|sr*", OPTIONS+="watch"
    lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
    lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
    lib/udev/rules.d/64-md-raid.rules:OPTIONS+="watch"
    lib/udev/rules.d/80-udisks.rules:# KERNEL=="dm-*", OPTIONS+="watch"
    

    To suppress the false positive change events, disable the inotify watch for the devices used for Oracle ASM with the following steps:

    [1] Add the following line at the end of /etc/udev/rules.d/96-asm-device.rules:

    ACTION=="add|change", KERNEL=="sd*", OPTIONS:="nowatch"
    

    [2] After adding the above line, reload the udev rules and trigger a change event so that the change takes effect:

    $ /sbin/udevadm control --reload-rules
    $ /sbin/udevadm trigger --type=devices --action=change
    
    Note:

    o The above commands reload the complete udev configuration and trigger all udev rules. On a busy production system this could disrupt ongoing operations and applications running on the server, so run them only during a scheduled maintenance window.

    o The above udev rule that disables the inotify watch is provided only as an example; further changes may be required depending on how the Oracle ASM devices are configured on the server. Contact a Red Hat support representative if assistance is required in setting up a udev rule that selectively disables the inotify watch for the disk devices used with Oracle ASM. The inotify watch triggers false positive change events when IO is done on an ASM disk and the device is closed upon completion of the IO.

    o The 'nowatch' option only disables the 'inotify' watch. The 'inotify' API provides a mechanism for monitoring filesystem events. It can be used to monitor individual files or directories. When a directory is monitored, inotify returns events for the directory itself and for files inside the directory.

    o So with 'nowatch', actual changes to the underlying disks that generate kernel change events still trigger the rules as expected.
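
    As a rough illustration of a more selective rule, the shell sketch below prints 'nowatch' rule lines for a chosen match key instead of disabling the watch for every sd* device. The function name nowatch_rule and the pattern 'asm*' are hypothetical. ENV{DM_NAME} is assumed to be populated by the device-mapper rules for multipath devices on the system, which should be confirmed with 'udevadm info' before relying on it.

```shell
# Print a udev rule line that disables the inotify watch for one device pattern.
#   $1: match key, for example KERNEL or ENV{DM_NAME}
#   $2: glob pattern for that key (hypothetical examples below)
nowatch_rule() {
    printf 'ACTION=="add|change", %s=="%s", OPTIONS:="nowatch"\n' "$1" "$2"
}

# Same rule as in step [1], matching all sd* devices.
nowatch_rule 'KERNEL' 'sd*'

# Assumption: the ASM multipath aliases all start with "asm".
nowatch_rule 'ENV{DM_NAME}' 'asm*'
```

    The printed lines could be reviewed and then appended to /etc/udev/rules.d/96-asm-device.rules as in step [1]. 'udevadm trigger' also accepts --subsystem-match=block, which may help limit the trigger in step [2] to block devices; adding --dry-run --verbose lists the affected devices without triggering anything.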

Root Cause

  • Monitoring udev with the following commands showed that change events were frequently triggered for disk devices and that the udev rules were being reloaded.

    $ udevadm monitor --property  &>>  /tmp/udevadm_monitor_property
    $ udevadm monitor --kernel  &>>  /tmp/udevadm_monitor_kernel
    $ udevadm monitor --udev  &>> /tmp/udevadm_monitor_udev
    
  • Here is an example of a test case in which the device is only opened and closed, with no write, and the rule is still triggered:

    ./check_udev /dev/mapper/mpathmn
     Opened file /dev/mapper/mpathmn with descriptor 3
    
    KERNEL[1433182713.102117] change   /devices/virtual/block/dm-1 (block)
    UDEV  [1433182713.823458] change   /devices/virtual/block/dm-1 (block)
    
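    'check_udev.c' program used for binary 'check_udev':
     #include <stdio.h>
     #include <stdlib.h>
     #include <sys/types.h>
     #include <sys/stat.h>
     #include <fcntl.h>
     #include <unistd.h>   /* close() */

     int main(int argc, char *argv[])
     {
         int fd;

         if (argc < 2) {
             fprintf(stderr, "Usage: %s <device>\n", argv[0]);
             return EXIT_FAILURE;
         }
         /* Open the device read-write and close it again without writing. */
         fd = open(argv[1], O_RDWR);
         if (fd < 0) {
             perror("open");
             return EXIT_FAILURE;
         }
         printf("Opened file %s with descriptor %d\n", argv[1], fd);
         close(fd);
         return EXIT_SUCCESS;
     }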
    
  • While checking why change events are generated when a device is opened, it was found that an inotify watch is enabled for devices through a couple of rules:

    $ fgrep watch -r /lib/udev/rules.d/* | fgrep OPTIONS
    lib/udev/rules.d/10-dm.rules:OPTIONS:="nowatch"
    lib/udev/rules.d/11-dm-lvm.rules:OPTIONS:="nowatch"
    lib/udev/rules.d/13-dm-disk.rules:OPTIONS+="watch"
    lib/udev/rules.d/60-persistent-storage.rules:KERNEL!="xvd*|sr*", OPTIONS+="watch"
    lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}!="partition", ATTR{removable}!="1", OPTIONS+="watch"
    lib/udev/rules.d/60-persistent-storage.rules:KERNEL=="xvd*", ENV{DEVTYPE}=="partition", OPTIONS+="watch"
    lib/udev/rules.d/64-md-raid.rules:OPTIONS+="watch"
    lib/udev/rules.d/80-udisks.rules:# KERNEL=="dm-*", OPTIONS+="watch"
    

    The OPTIONS+="watch" setting causes udev to watch the device node with inotify.

    o So, if any program opens a disk device in writable mode with open() and then calls close() on it, the udev daemon synthesizes a kernel change uevent. This is because, once the disk may have been modified, the disk label, UUID, partitions, or anything else about the disk could have changed, so the udev daemon has to rescan the disk and update the udev database accordingly.

    o Due to this change event on disk devices, any udev rules having ACTION=="add|change" are reloaded.
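
    The open-and-close test performed by check_udev can also be approximated in shell without compiling anything. This is a minimal sketch, not the program used in the original test; the function name open_close_rw is hypothetical.

```shell
# Open a path read-write and close it again without writing anything,
# mirroring what check_udev does with open(O_RDWR) followed by close().
open_close_rw() {
    # "<>" would create a missing path, which open(O_RDWR) does not do,
    # so refuse paths that do not already exist.
    [ -e "$1" ] || return 1
    # "<>" opens the path for reading and writing without truncating it;
    # the descriptor is closed as soon as the "true" no-op returns.
    true <> "$1"
}
```

    For example, with 'udevadm monitor --kernel' running in one terminal, calling open_close_rw /dev/mapper/mpathmn in another should show a change event for the device while the inotify watch is enabled, and none once 'nowatch' is in effect.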

Diagnostic Steps

  • Run the following commands from separate terminals to check whether change events are being triggered for disk devices; such events trigger any udev rules that have ACTION=="add|change":

    $ udevadm monitor --property
    $ udevadm monitor --kernel
    $ udevadm monitor --udev
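
    If the monitor output is captured to a file, as with /tmp/udevadm_monitor_kernel in the Root Cause section, a per-device count of change events can show which devices are affected most. The sketch below assumes the 'KERNEL[timestamp] action devpath (subsystem)' line format shown in the example output above; the function name count_change_events is hypothetical.

```shell
# Count kernel "change" uevents per device path in captured
# `udevadm monitor --kernel` output read from stdin.
# Prints "<count> <devpath>", highest count first.
count_change_events() {
    awk '/^KERNEL\[/ && $2 == "change" { n[$3]++ }
         END { for (d in n) print n[d], d }' | sort -k1,1nr -k2,2
}
```

    Usage: count_change_events < /tmp/udevadm_monitor_kernel. Only KERNEL lines are counted, so an event that also appears as a UDEV line is not counted twice.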
    

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
