Sync fails with "PLP0000: CRC check failed" error

Solution Unverified

Environment

Red Hat Satellite or Capsule 6.x

Issue

Since last Friday, syncing the RHEL 6 base repository has been failing. The task errors with:

PLP0000: CRC check failed 0x971ce6a4 != 0x846637c6L

There are four other tasks in error showing the same CRC check failed:

Id: 81f5fa04-ca3e-4726-abf8-6fb0a7ef6fb2
Label: Actions::Katello::Repository::MetadataGenerate
Name: Metadata generate
Owner: foreman_admin
Execution type: Delayed
Start at: 2017-12-08 08:30:35 UTC
Start before: -
Started at: 2017-12-08 08:30:35 UTC
Ended at:
State: paused
Result: error
Params: {"services_checked"=>["pulp", "pulp_auth"], "locale"=>"en"}
Errors:
PLP0000: CRC check failed 0x971ce6a4 != 0x846637c6L

Resuming the tasks does not help; they fail again with the same error.

The underlying issue is an extremely low 'open files' limit (ulimit), as shown in the Diagnostic Steps below.

Resolution

Increase the open files limit from 1024 (extremely low) to a much larger value (in this case, 65536):

Set it permanently by editing the file /etc/security/limits.conf:

# vi /etc/security/limits.conf
...
# domain  type  item  value  ("-" sets both the soft and hard limit)
*     -   nofile    65536
...

Log out and log back in for the change to take effect in new sessions. Already-running services keep the limit they started with and pick up the new value when restarted in the steps below.
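As a quick check after logging back in, the new limit can be verified for the shell and for an already-running process (a sketch; the shell's own PID is used as an example, substitute a Pulp worker PID on the Satellite):

```shell
#!/bin/sh
# Show the open-files soft limit for the current shell.
ulimit -n
# A running process keeps the limit it was started with; inspect it via /proc.
# $$ (this shell) is an example PID; use a pulp process PID when diagnosing.
grep 'Max open files' /proc/$$/limits
```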

After this, run a few foreman-rake tasks (delete orphaned content, clean backend objects, and reimport content), restart the Satellite services, and run the installer on the Satellite if a configuration change is applied (even when staying on the current version of Satellite):

# foreman-rake katello:delete_orphaned_content --trace
# foreman-rake katello:clean_backend_objects --trace   <===== if there are objects to be cleaned, re-run this command with the added option COMMIT=true
# foreman-rake katello:reimport --trace
# satellite-maintain service restart
# satellite-installer        <=====   to run if there are changes in the config

Root Cause

The open files limit was set extremely low (in this case, 1024):

# ulimit -n
1024

Diagnostic Steps

Looking at /var/log/messages:

...
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008) [Errno 28] No space left on device
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008) Traceback (most recent call last):
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)   File "/usr/lib/python2.7/site-packages/pulp/plugins/util/metadata_writer.py", line 91, in finalize
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)     self._close_metadata_file_handle()
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)   File "/usr/lib/python2.7/site-packages/pulp/plugins/util/metadata_writer.py", line 433, in _close_metadata_file_handle
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)     super(FastForwardXmlFileContext, self)._close_metadata_file_handle()
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)   File "/usr/lib/python2.7/site-packages/pulp/plugins/util/metadata_writer.py", line 169, in _close_metadata_file_handle
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)     self.metadata_file_handle.flush()
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)   File "/usr/lib64/python2.7/gzip.py", line 383, in flush
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008)     self.fileobj.write(self.compress.flush(zlib_mode))
Dec 12 12:11:21 *** pulp: pulp.plugins.util.metadata_writer:ERROR: (9396-71008) IOError: [Errno 28] No space left on device
Dec 12 12:11:21 *** rsyslogd: fopen() failed: 'No space left on device', path: '/var/lib/rsyslog/imjournal.state.tmp'  [v8.24.0 try http://www.rsyslog.com/e/2013 ]
Dec 12 12:11:21 *** rsyslogd: fopen() failed: 'No space left on device', path: '/var/lib/rsyslog/imjournal.state.tmp'  [v8.24.0 try http://www.rsyslog.com/e/2013 ]
...

While it appears that there is enough disk space:

# df -h

Filesystem                         Size  Used Avail Use% Mounted on
...
/dev/mapper/vg_data01-lv_pulp      295G  227G   69G  77% /var/lib/pulp   <-----------
...
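"No space left on device" (ENOSPC) can also be raised when a filesystem runs out of inodes rather than blocks, so checking inode usage is a quick way to rule that out (a companion check; the root cause here turned out to be the open files limit):

```shell
#!/bin/sh
# Check both block and inode usage on a mount point (default "/";
# pass /var/lib/pulp as an argument on the Satellite).
mount=${1:-/}
df -h "$mount"
df -i "$mount"   # 100% IUse% also yields "No space left on device"
```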

Another thing to check is the open files limit:

# ulimit -n
1024

This is far too low for a Satellite server and should be increased to a much larger value (the Resolution above uses 65536). Errors such as 'too many open files' or 'could not open socket' are classic symptoms of this limit being exhausted.
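To see how close a process is to its limit, its open descriptors can be counted from /proc (a diagnostic sketch; the shell's own PID stands in for a Pulp worker PID):

```shell
#!/bin/sh
# Count open file descriptors for a process and compare to its soft limit.
pid=$$                               # example PID; use a pulp worker PID
open=$(ls /proc/$pid/fd | wc -l)
limit=$(ulimit -n)
echo "PID $pid: $open open fds (soft limit $limit)"
```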

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
