Sudo hangs when IO logging is enabled

Solution Verified - Updated -

Environment

  • Red Hat Enterprise Linux (RHEL) 7.4
  • sudo-1.8.19p2 with IO logging enabled

Issue

When sudo is configured with IO logging enabled and a command that produces output to terminal is run, it does sometimes hang in a poll() system call indefinitely. Only sending a SIGKILL to the main sudo process terminates it.
Sometimes the child process (the actual command being run in sudo) is left in a zombie (Z) state.

Resolution

Update to sudo-1.8.19p2-14.el7_5 shipped with Advisory RHBA-2018:1992 or newer.

Before the fix was available temporarily disabling IO logging in /etc/sudoers using visudo served as a workaround.

Root Cause

The issue is a blocking poll() system call for input from master PTY /dev/ptmx (the PTY multiplexer) that will not time out. It occurs once the child process terminates, and the input and output buffers are being flushed.
The root cause is the open master PTY file descriptor at the time of the buffer flushing and a subsequent blocking attempt to read output from the already terminated child process.

Diagnostic Steps

  1. Verify in /etc/sudoers, i.e. using visudo, that you have IO logging enabled. Look for log_input and log_output keywords. Configuration of IO logging may look similar to the following:

    Defaults    iolog_dir=/var/log/sudo-io/%{user}, log_input, log_output
    
  2. If you have gdb and the appropriate debuginfo packages installed, a backtrace of the hanged sudo process can be analyzed. You can get a backtrace of a running process using the following command, where $PID is the PID of the hanged sudo process:

    # gdb `which sudo` $PID
    

    A backtrace of the hanged process will look similar to the following:

    # gdb `which sudo` 2516
    GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
    Copyright (C) 2013 Free Software Foundation, Inc.
    ... lines omitted ...
    0x00007fd72683ca20 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
    81  T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
    (gdb) bt
    #0  0x00007fd72683ca20 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
    #1  0x00007fd726d21675 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
    #2  sudo_ev_scan_impl (base=base@entry=0x55ce7db9a9e0, flags=flags@entry=0) at ./event_poll.c:155
    #3  0x00007fd726d1ac9e in sudo_ev_loop_v1 (base=base@entry=0x55ce7db9a9e0, flags=flags@entry=0) at ./event.c:302
    #4  0x000055ce7c04a5ba in del_io_events (nonblocking=nonblocking@entry=false) at ./exec_pty.c:1045
    #5  0x000055ce7c04e142 in pty_close (cstat=cstat@entry=0x7ffe00867180) at ./exec_pty.c:895
    #6  0x000055ce7c049917 in sudo_execute (details=details@entry=0x55ce7c262fa0 <command_details>, cstat=cstat@entry=0x7ffe00867180) at ./exec.c:537
    #7  0x000055ce7c054222 in run_command (details=0x55ce7c262fa0 <command_details>) at ./sudo.c:1161
    #8  0x000055ce7c04689c in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ./sudo.c:311
    
  3. Verify what file descriptors we are polling() for:

    (gdb) up
    #1  0x00007fd726d21675 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
    46  /usr/include/bits/poll2.h: No such file or directory.
    (gdb) up
    #2  sudo_ev_scan_impl (base=base@entry=0x55ce7db9a9e0, flags=flags@entry=0) at ./event_poll.c:155
    155     nready = poll(base->pfds, base->pfd_high + 1, timeout);
    (gdb) print base->pfd_high
    $15 = 0
    (gdb) print base->pfds[0]
    $18 = {fd = 14, events = 1, revents = 0}
    
  4. Check what file the file descriptor belongs to. You can do that by issuing info os files in gdb and looking for sudo process, or you can check lsof. For example by running the follwing command (our PID was 2516, and fd = 14):

    # lsof | grep 2516 | grep '14' | grep 'CHR' | awk '{ print $1,$2,$4,$9 }'
    sudo 2516 14u /dev/ptmx
    
  5. If the file descriptor belongs to /dev/ptmx, then you are likely hitting this issue.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.