Cron jobs are not running according to their schedule, and no error messages are logged
Environment
- Red Hat Enterprise Linux (RHEL) 6
  - cronie-1.4.4-16.el6_8.2
- Dell Authentication Services (a.k.a. QAS or VAS)
  - vasclnt-4.1.0-21267.x86_64
  - vasgp-4.1.0-21267.x86_64
- or SMB domain authentication
Issue
- A production system executes cron jobs inconsistently, at random, without leaving traces in the system logs.
- vasd-related errors appear in /var/log/messages.
- The same behavior is observed when executing jobs for SMB accounts.
Resolution
Fix the authentication backend issue, e.g.:
- Stop the vasd daemon;
- Monitor cron job execution;
- Optionally start the vasd daemon again.
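The monitoring step can be sketched as a small helper that lists when crond actually started a given job, so the timestamps can be compared against the expected schedule. This is a minimal sketch, not a definitive procedure: it assumes the default RHEL logging setup where crond writes "CMD (...)" entries to /var/log/cron, the function name `cron_runs` is hypothetical, and the commented `service vasd` commands assume the vendor init script is installed under the name `vasd`.

```shell
# Assumption: the VAS init script is named "vasd".
#   # service vasd stop       (before monitoring)
#   # service vasd start      (optionally, afterwards)

# Print the timestamps at which crond started a command containing PATTERN.
# Usage: cron_runs PATTERN [LOGFILE]   (LOGFILE defaults to /var/log/cron)
cron_runs() {
    # Keep only job-start entries, then only those naming the job,
    # and print the syslog timestamp (month, day, time).
    grep -F 'CMD (' "${2:-/var/log/cron}" \
        | grep -F -- "$1" \
        | awk '{print $1, $2, $3}'
}
```

For example, `cron_runs /usr/local/bin/job.sh` (a hypothetical job path) prints one line per recorded start of that job; gaps relative to the crontab schedule would suggest runs that were skipped.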
Bugzilla 1630855 has been filed against cronie, requesting a backport of the patch that fixes the sensitivity to SIGPIPE received during authentication.
Root Cause
Before version 1.4.7 cronie was known to be sensitive to SIGPIPE. This was fixed by upstream commit ee4cbe7.
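As a rough check, the installed cronie version can be compared against 1.4.7. The sketch below assumes GNU `sort -V` is available, and the function name `cronie_predates_sigpipe_fix` is hypothetical. It compares upstream version numbers only: a distribution package may carry a backported fix under an older version number, so the result is a hint rather than proof.

```shell
# Succeeds (exit status 0) when VERSION is older than upstream 1.4.7.
# Usage: cronie_predates_sigpipe_fix VERSION
cronie_predates_sigpipe_fix() {
    fixed=1.4.7
    # Older means: not equal to the fixed version, and sorting first
    # when both versions are ordered with version-aware sort.
    [ "$1" != "$fixed" ] &&
        [ "$(printf '%s\n' "$1" "$fixed" | sort -V | head -n 1)" = "$1" ]
}

# Possible use on an RPM-based system:
#   cronie_predates_sigpipe_fix "$(rpm -q --qf '%{VERSION}' cronie)" \
#       && echo "cronie predates the upstream SIGPIPE fix"
```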
- The system is experiencing failures in the PAM stack when interacting with:
  - the VASd (third-party) caching daemon. For further analysis of the VASd issue, please engage the Authentication Services vendor support team.
  - or the SMB domain authentication backend.
Diagnostic Steps
- Review the /var/log/messages system log file for VASd errors:
Sep 6 03:50:04 server .vgptool[61551]: [ERROR vashostaccess.cpp:107] Result: process wait_for failed: Error: No child processes#012 failed to wait for process: /opt/quest/libexec/vas/vasd/vasac_helper
Sep 6 06:50:19 server .vgptool[22573]: [ERROR vashostaccess.cpp:107] Result: process wait_for failed: Error: No child processes#012 failed to wait for process: /opt/quest/libexec/vas/vasd/vasac_helper
Sep 6 11:20:43 server .vgptool[43715]: [ERROR vashostaccess.cpp:107] Result: process wait_for failed: Error: No child processes#012 failed to wait for process: /opt/quest/libexec/vas/vasd/vasac_helper
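Entries like the ones above can be pulled out of the log with a simple filter. The following is a minimal sketch: the function names `vas_errors` and `vas_error_times` are hypothetical, and the pattern is derived only from the sample lines shown here, so other VAS components may log in a different shape.

```shell
# Print syslog lines that mention the VAS tooling together with ERROR.
# Usage: vas_errors [LOGFILE]   (LOGFILE defaults to /var/log/messages)
vas_errors() {
    grep -E '(vgptool|vasd).*ERROR' "${1:-/var/log/messages}"
}

# Print only the timestamps of those lines, which makes it easier to
# line them up against the times at which cron jobs were scheduled.
# Usage: vas_error_times [LOGFILE]
vas_error_times() {
    vas_errors "$1" | awk '{print $1, $2, $3}'
}
```

If the error timestamps cluster around the minutes at which jobs were expected to start, that would be consistent with the authentication backend interfering with cron.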
- Run crond in debug mode:
# service crond stop
# strace -ttffo /tmp/crond.strace /usr/sbin/crond -x ext,sch,proc,pars,load,misc,test,bit
- Review the output files generated by strace:
. . .
16:50:01.965611 open("/opt/quest/lib64/tls/x86_64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
16:50:01.965661 stat("/opt/quest/lib64/tls/x86_64", 0x7fffbb237e90) = -1 ENOENT (No such file or directory)
16:50:01.965719 open("/opt/quest/lib64/tls/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
16:50:01.965768 stat("/opt/quest/lib64/tls", 0x7fffbb237e90) = -1 ENOENT (No such file or directory)
16:50:01.965811 open("/opt/quest/lib64/x86_64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
16:50:01.965853 stat("/opt/quest/lib64/x86_64", 0x7fffbb237e90) = -1 ENOENT (No such file or directory)
16:50:01.965909 open("/opt/quest/lib64/librt.so.1", O_RDONLY) = -1 ENOENT (No such file or directory)
16:50:01.965949 stat("/opt/quest/lib64", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
. . .
16:50:01.973671 connect(3, {sa_family=AF_FILE, path="/var/opt/quest/vas/vasd/.vasd40_ipc_sock"}, 110) = 0
16:50:01.973762 fcntl(3, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
16:50:01.973795 fcntl(3, F_SETFL, O_RDWR) = 0
16:50:01.973825 umask(0) = 022
16:50:01.973862 pipe([6, 7]) = 0
16:50:01.973896 poll([{fd=3, events=POLLPRI|POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
16:50:01.973935 sendmsg(3, {msg_name(0)=NULL, msg_iov(1)=[{"\1", 1}], msg_controllen=24, {cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, {7}}, msg_flags=0}, MSG_NOSIGNAL) = 1
16:50:01.974020 poll([{fd=3, events=POLLOUT|POLLWRBAND}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT|POLLWRBAND|POLLHUP}])
16:50:01.974103 write(3, "\3\0\0\0 \0\0\0\0\0\0\0\0\0\0\0\0\0\0\0VIPC\1\0\0\0\f\0\0\0", 32) = -1 EPIPE (Broken pipe)
16:50:01.974202 --- SIGPIPE (Broken pipe) @ 0 (0) ---
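Because strace was run with -ff, it writes one output file per traced process, named after the -o prefix with the PID appended. A quick way to find the processes that received SIGPIPE is sketched below; the function name `strace_sigpipe_files` is hypothetical, and the default prefix is the one used in the strace command above.

```shell
# List the per-process strace files in which a SIGPIPE delivery was recorded.
# Usage: strace_sigpipe_files [PREFIX]   (PREFIX defaults to /tmp/crond.strace)
strace_sigpipe_files() {
    # -l prints matching file names only; -F treats the pattern as a
    # fixed string; -e keeps the leading dashes from being read as options.
    grep -l -F -e '--- SIGPIPE' "${1:-/tmp/crond.strace}".* 2>/dev/null
}
```

Each file it lists can then be opened to inspect the system calls leading up to the signal, such as the failed write to the vasd IPC socket shown above.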
This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.