OpenShift nodes hang on reboot due to lazy NFS umounts triggered by oci-umount, a container tools package
Issue
- OpenShift nodes hang on reboot due to lazy umounts triggered by oci-umount. In the vmcore, several tasks, including poweroff, are in uninterruptible (UN) state:
crash> ps -m | grep UN
[0 00:10:44.049] [UN] PID: 2136 TASK: ffff99f3fef64f10 CPU: 17 COMMAND: "poweroff" <
[0 00:10:55.262] [UN] PID: 31589 TASK: ffff99ff78ba0000 CPU: 7 COMMAND: "java"
[0 00:10:55.345] [UN] PID: 62574 TASK: ffff99f44baa0000 CPU: 17 COMMAND: "java"
[0 00:11:02.028] [UN] PID: 63909 TASK: ffff99ed7cffaf70 CPU: 10 COMMAND: "prometheus"
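When the list of blocked tasks is long, it can help to rank them by how long ago each one last ran. The following is a minimal sketch, assuming the output of `ps -m` has been saved as text and that the bracketed timestamp is in the days hh:mm:ss.ms form shown above. The helper names `parse_ps_m` and `longest_blocked` are hypothetical and not part of crash.

```python
import re

# Matches one line of saved `ps -m` output, for example:
# [0 00:10:44.049] [UN] PID: 2136 TASK: ffff99f3fef64f10 CPU: 17 COMMAND: "poweroff"
LINE_RE = re.compile(
    r'\[(\d+) (\d+):(\d+):(\d+)\.(\d+)\]\s+\[(\w+)\]\s+'
    r'PID:\s*(\d+)\s+TASK:\s*([0-9a-f]+)\s+CPU:\s*(\d+)\s+COMMAND:\s*"([^"]*)"'
)


def parse_ps_m(text):
    """Return one dict per task line found in the text (hypothetical helper)."""
    tasks = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip prompts and unrelated lines
        days, hh, mm, ss, ms = (int(m.group(i)) for i in range(1, 6))
        # Assumption: the timestamp is the time since the task last ran.
        idle = ((days * 24 + hh) * 60 + mm) * 60 + ss + ms / 1000.0
        tasks.append({
            "state": m.group(6),
            "pid": int(m.group(7)),
            "cpu": int(m.group(9)),
            "command": m.group(10),
            "idle_seconds": idle,
        })
    return tasks


def longest_blocked(tasks, state="UN"):
    """Tasks in the given state, longest-idle first."""
    matching = [t for t in tasks if t["state"] == state]
    return sorted(matching, key=lambda t: t["idle_seconds"], reverse=True)


if __name__ == "__main__":
    import sys
    for t in longest_blocked(parse_ps_m(sys.stdin.read())):
        print(t["pid"], t["command"], round(t["idle_seconds"], 3))
```

Feeding the saved output to the script on standard input prints one line per UN task, so the tasks that have been blocked longest appear first.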
The poweroff task, a child of shutdown, is also seen blocked in sync(), waiting in sync_inodes_sb() for the inodes of an NFS superblock to be written back:
crash> ps -p 2136
PID: 0 TASK: ffffffff93016480 CPU: 0 COMMAND: "swapper/0"
PID: 1 TASK: ffff99e9adb28000 CPU: 2 COMMAND: "shutdown"
PID: 2136 TASK: ffff99f3fef64f10 CPU: 17 COMMAND: "poweroff"
crash> bt 2136
PID: 2136 TASK: ffff99f3fef64f10 CPU: 17 COMMAND: "poweroff"
#0 [ffff99f0c577faf8] __schedule at ffffffff92b128d4
#1 [ffff99f0c577fb88] schedule at ffffffff92b12f49
#2 [ffff99f0c577fb98] schedule_timeout at ffffffff92b108b9
#3 [ffff99f0c577fc40] io_schedule_timeout at ffffffff92b1245d
#4 [ffff99f0c577fc70] io_schedule at ffffffff92b124f8
#5 [ffff99f0c577fc80] bit_wait_io at ffffffff92b10ee1
#6 [ffff99f0c577fc98] __wait_on_bit at ffffffff92b10a07
#7 [ffff99f0c577fcd8] wait_on_page_bit at ffffffff92592f11
#8 [ffff99f0c577fd30] __filemap_fdatawait_range at ffffffff92593041
#9 [ffff99f0c577fe08] filemap_fdatawait_keep_errors at ffffffff92595ff7
#10 [ffff99f0c577fe18] sync_inodes_sb at ffffffff9264a395
#11 [ffff99f0c577fee0] sync_inodes_one_sb at ffffffff9264ec39
#12 [ffff99f0c577fef0] iterate_supers at ffffffff9261f093
#13 [ffff99f0c577ff30] sys_sync at ffffffff9264ef14
#14 [ffff99f0c577ff50] system_call_fastpath at ffffffff92b1f7d5
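The pattern in the backtrace above (sys_sync reaching sync_inodes_sb and then sleeping in wait_on_page_bit) can be checked for mechanically when many vmcores need triage. This is a minimal sketch, assuming `bt` output saved as text; `parse_frames` and `is_sync_waiting_on_writeback` are hypothetical helper names. The check only recognizes the call chain. It does not identify the filesystem, which still has to be confirmed from the superblock in the vmcore.

```python
import re

# Matches one frame of saved `bt` output, for example:
# #10 [ffff99f0c577fe18] sync_inodes_sb at ffffffff9264a395
FRAME_RE = re.compile(r'#(\d+)\s+\[[0-9a-f]+\]\s+(\S+)\s+at\s+([0-9a-f]+)')


def parse_frames(bt_text):
    """Return (frame_number, function_name) pairs in the order printed."""
    return [(int(num), func) for num, func, _addr in FRAME_RE.findall(bt_text)]


def is_sync_waiting_on_writeback(frames):
    """True if the stack shows sys_sync blocked waiting for page writeback."""
    funcs = [func for _num, func in frames]
    required = ("sys_sync", "sync_inodes_sb", "wait_on_page_bit")
    return all(name in funcs for name in required)
```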
Environment
- OpenShift v3.11.154-1
- Red Hat Enterprise Linux 7.5
- kernel 3.10.0-862.el7.x86_64
- oci-umount-2.3.4-2.git87f9237.el7.x86_64
- CRI-O 1.11.16
- docker-1.13.1-104.git4ef4b30.el7.x86_64