- Red Hat Enterprise Linux (RHEL) 6
- Red Hat Satellite Proxy 5.7.0
Sosreport fails after satellite plugin timed out
Running 58/81: rpm...
Running 59/81: sar...
Running 60/81: satellite...
[plugin:satellite] command 'spacewalk-debug --dir /var/tmp/sos.1234/sosreport-test-xxx/sos_commands/satellite/spacewalk-debug' timed out after 300s
Running 61/81: scsi...
...
Running 81/81: yum...
Creating compressed archive...
Traceback (most recent call last):
  File "/usr/sbin/sosreport", line 25, in <module>
    main(sys.argv[1:])
  File "/usr/lib/python2.6/site-packages/sos/sosreport.py", line 1520, in main
    sos.execute()
  File "/usr/lib/python2.6/site-packages/sos/sosreport.py", line 1499, in execute
    return self.final_work()
  File "/usr/lib/python2.6/site-packages/sos/sosreport.py", line 1411, in final_work
    checksum = self._create_checksum(archive, hash_name)
  File "/usr/lib/python2.6/site-packages/sos/sosreport.py", line 1349, in _create_checksum
    archive_fp = open(archive, 'rb')
IOError: [Errno 2] No such file or directory: '/var/tmp/sos.1234/sosreport-test-xxx/sosreport-test-xxx.tar.xz'
- The timeouts observed in the satellite sos plug-in can be resolved by updating to sos-3.2-63.el6, released with advisory RHBA-2018:1920, or newer, which includes related enhancements.
The satellite plugin runs the spacewalk-debug command, which attempts to archive a very large volume of data (more than 10 GB) and times out, causing sosreport to fail.
sosreport detects that the archive is missing only when it attempts to calculate the archive's checksum (md5sum), but the problem occurred earlier, while the archive was being compressed.
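The point at which the error surfaces can be illustrated with a short Python sketch. This is not the sos implementation: the function name `create_checksum`, the default `sha256` hash name, and the choice to return `None` for a missing file are all assumptions made for illustration. It shows that a checksum step fails with `ENOENT` at `open()` when the earlier compression step did not leave an archive behind.

```python
import errno
import hashlib
import os


def create_checksum(archive_path, hash_name="sha256"):
    """Return the hex digest of archive_path, or None if the file is missing.

    Hypothetical helper for illustration only; not part of sos.
    """
    digest = hashlib.new(hash_name)
    try:
        with open(archive_path, "rb") as archive_fp:
            # Read in 1 MiB chunks so large archives are not loaded at once.
            for chunk in iter(lambda: archive_fp.read(1024 * 1024), b""):
                digest.update(chunk)
    except (IOError, OSError) as e:
        if e.errno == errno.ENOENT:
            # The archive was never created: the real failure happened
            # earlier, while the archive was being built or compressed.
            return None
        raise
    return digest.hexdigest()
```

A caller that receives `None` could then report that archive creation failed, rather than surfacing a traceback from the checksum step.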
/usr/lib/python2.6/site-packages/sos/sosreport.py

  50  # file system errors that should terminate a run
  51  fatal_fs_errors = (errno.ENOSPC, errno.EROFS)
...
1372          # package up and compress the results
1373          if not self.opts.build:
1374              old_umask = os.umask(0o077)
1375              if not self.opts.quiet:
1376                  print(_("Creating compressed archive..."))
1377              # compression could fail for a number of reasons
1378              try:
1379                  archive = self.archive.finalize(
1380                      self.opts.compression_type)
1381              except (OSError, IOError) as e:
1382                  if e.errno in fatal_fs_errors:  # <--- only fatal filesystem errors are handled
1383                      print("")
1384                      print(_(" %s while finalizing archive" % e.strerror))
1385                      print("")
1386                      self._exit(1)
1387              except:
1388                  if self.opts.debug:  # <--- unless being run with --debug
1389                      raise
1390                  else:
1391                      return False
1392              finally:
1393                  os.umask(old_umask)
1394          else:
...
1408          if not self.opts.build:
1409              # compute and store the archive checksum
1410              hash_name = self.policy.get_preferred_hash_name()
1411              checksum = self._create_checksum(archive, hash_name)  # <--- exception
1412              self._write_checksum(archive, hash_name, checksum)
In the case where the failure was caused by the satellite plugin, spacewalk-debug was found to be attempting to archive 11 GB of log files when it timed out. Examining the work that sosreport had completed up to that point may provide insight into why a particular plugin times out or fails.
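One way to examine what sosreport had collected is to list the largest entries in its working directory. The following Python sketch does that under stated assumptions: the helper names `dir_size` and `largest_entries` are hypothetical, and the directory to inspect is whatever temporary path sosreport printed on the affected system.

```python
import os


def dir_size(path):
    """Total size in bytes of regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file vanished or is unreadable; ignore it.
                pass
    return total


def largest_entries(base, top=10):
    """Return (size, name) pairs for entries directly under base, largest first."""
    sizes = []
    for name in os.listdir(base):
        full = os.path.join(base, name)
        if os.path.islink(full):
            continue
        try:
            if os.path.isdir(full):
                size = dir_size(full)
            else:
                size = os.path.getsize(full)
        except OSError:
            continue
        sizes.append((size, name))
    sizes.sort(reverse=True)
    return sizes[:top]
```

For example, calling `largest_entries('/var/tmp/sos.1234/sosreport-test-xxx/sos_commands')` on the system from the output above would show which plugin's command output takes the most space, which is a reasonable first candidate when a plugin times out.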