Poor NFS umount-Handling by systemd?

Has anyone run into any issues with NFS mounts and systemd? I have an application that's slow to shut down. The application is hosted, in part, on an NFS share. As a result of the application's slow stop, the umount operation is delayed. Unfortunately, this seems to result in an ugly collision of issues.

systemd seems to stop the network stack before the umount completes. As a result, the NFS client can no longer talk to the NFS server (no network connectivity), and the umount hangs. Complicating things, there doesn't seem to be a timeout on the NFS client shutdown and related actions, so the umount operation hangs "forever". The only resolution is to hard-cycle the stuck system.

I found an issue thread in the systemd GitHub project that seemed to match the above. After finding that issue, I tried manually shutting down my application before issuing my reboot. With the application offline, the reboot proceeded without issue. A second reboot, without the application taken offline first, caused the issue to reappear.

So, I got to thinking "how can I ensure the application exits quickly enough that the umount can happen unfettered" (since there didn't seem to be a good way to make the RHEL 7 version of systemd enforce a proper dependency chain between NFS and networking). Digging through the systemd man pages, I found the TimeoutStopSec= option. I experimented with various values, but found that adding TimeoutStopSec=10 to my application's unit file was the most reliable.
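For anyone wanting to try the same thing, here is a minimal sketch of the setting as a drop-in rather than an edit to the unit file itself. The unit name myapp.service and the drop-in file name are placeholders I'm assuming for illustration, not my real ones.

```ini
# Assumed path: /etc/systemd/system/myapp.service.d/timeout.conf
# "myapp.service" is a hypothetical unit name; substitute your own.
[Service]
# Give the service 10 seconds to stop after the stop request.
# If it is still running after that, systemd terminates it forcibly.
TimeoutStopSec=10
```

After adding or changing the file, run systemctl daemon-reload so systemd picks it up. The 10-second value is just what worked in my testing, not a recommendation.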

Unfortunately, adding that option means that the application gets terminated at ten seconds rather than being allowed to execute a full, graceful shutdown. Thus far, I'm not seeing problems with this rude stop, but having to do this doesn't give me the warm fuzzies. So, I wanted to see if anyone has better ideas for working around this issue (and also whether any RH people who view this thread can check for open issues in Bugzilla).

Responses