What is taking up so much space in /var/lib/pulp?

Greetings, hope someone can share some insight here. Running Red Hat Satellite 6.2.9.
The installation doc states that with RHEL 5, 6, and 7 synchronized, the runtime size is 500GB. I understand that is an estimate. I only have RHEL 6.3-6.9 and 7.1-7.3 synchronized, with optional/debug repos enabled only for 7.2 and 7.3, plus some HA/RS repos and a capsule server. Yet /var/lib/pulp is pushing 600GB and I can't expand it right now because my hardware is maxed out. I have disabled the repos I don't need and manually run the cron job to clear orphaned content, yet I was only able to free up about 1GB. Is there any other tool I can run to find unnecessary content, or a certain configuration I should avoid that could be wasting space? Just looking for some guidance, thank you.

Responses

Historically, debug repos are huge. We have at least 1 TB allocated for /var/lib/pulp on our primary prod Satellite (RHEL 5, 6, 7 {optional,extras,supplementary} and the kickstart trees for each major release, as well as RHV, RHV-M, and Satellite repos). I think the docs fail to provide an accurate approximation for storage. They are pretty emphatic about using LVM storage and making sure that pulp is expandable, though.

I inadvertently synced a Lifecycle Environment to a capsule the other day, and had to work pretty hard to back it out using the KBs and Discussions on cleaning up orphaned RPMs. In the end, I don't think that cleanup completely frees up the space used by RPMs or content that are no longer in use.

I have the following files under /var/lib/pulp/nodes/published/https/repos, even though the RHEL 6u1 and 6u2 repositories were disabled, and the related Content Views were deleted. Is it OK to just remove these files? The command to remove orphaned content (foreman-rake katello:delete_orphaned_content RAILS_ENV=production) did not remove them. If so, after removing them would a katello:reindex be necessary?

CV_RHEL6u1_SOE-i386
CV_RHEL6u1_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_Kickstart_i386_6_1
CV_RHEL6u1_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_i386_6_1
CV_RHEL6u1_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_i386_6Server
CV_RHEL6u1_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_6_Server_RPMs_i386
CV_RHEL6u1_SOE-x86_64
CV_RHEL6u1_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_Kickstart_x86_64_6_1
CV_RHEL6u1_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6Server
CV_RHEL6u1_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_6_Server_RPMs_x86_64
CV_RHEL6u1_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_7_Server_RPMs_x86_64
CV_RHEL6u2_SOE-i386
CV_RHEL6u2_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_Kickstart_i386_6_2
CV_RHEL6u2_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_i386_6_2
CV_RHEL6u2_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_i386_6Server
CV_RHEL6u2_SOE-i386-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_6_Server_RPMs_i386
CV_RHEL6u2_SOE-x86_64
CV_RHEL6u2_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_Kickstart_x86_64_6_2
CV_RHEL6u2_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6_2
CV_RHEL6u2_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6Server
CV_RHEL6u2_SOE-x86_64-Red_Hat_Enterprise_Linux_Server-Red_Hat_Satellite_Tools_6_2_for_RHEL_6_Server_RPMs_x86_64
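
Before removing anything, it may help to see how much space each of these leftover directories actually occupies. The following is a minimal sketch, assuming GNU du and sort as shipped on RHEL. largest_dirs is a hypothetical helper name, not a Satellite or Pulp tool. It only reports sizes, deletes nothing, and does not tell you whether removal is safe.

```shell
#!/bin/bash
# largest_dirs is a hypothetical helper, not part of Satellite or Pulp.
# Print a directory and its immediate children with sizes in KiB, largest first.
# Usage: largest_dirs DIR [COUNT]
largest_dirs() {
    local dir="$1" count="${2:-10}"
    # -x: stay on one filesystem; -k: report KiB; --max-depth=1: direct children only
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}

# Example (path taken from the post above):
#   largest_dirs /var/lib/pulp/nodes/published/https/repos 20
```

The first line of output is the total for the directory itself; the lines after it are the individual subdirectories. Running the same helper against /var/lib/pulp shows which top-level area is growing.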

Pulp retains content by default. A Bugzilla was raised for /var/lib/pulp growth on Sat 6.1.10.

In Satellite 6.2, there are a couple of features that can be set on repositories and may be useful in managing space:
1. 'Mirror on Sync': With this feature, each time a repository is synchronized, if a new version of a package is downloaded, the old version is removed.
2. 'On Demand' (Download Policy): With this feature, the content only gets downloaded to the filesystem when a client requests it.

Mirror on sync means 'make the repository on disk match what is in the upstream repository'. It is useful when there are issues with a repo. For example, if Red Hat mistakenly releases a package with the Beta repo, you need a means to fix the downstream repos (in your Satellite) after we fix it upstream.

How would I go about setting "Mirror on Sync" for the repositories?

This is what we run to make universal repo changes.

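#!/bin/bash
# Collect the numeric repository IDs from the first column of the hammer table output.
mapfile -t repo_list < <(hammer repository list | awk '/^[0-9]/ {print $1}')
# Enable "Mirror on Sync" on every repository in the list.
for i in "${repo_list[@]}"; do
    hammer repository update --id "$i" --mirror-on-sync true
done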

You can see the full list of available options with "hammer repository update --help".
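
The same loop could be adapted for the 'On Demand' download policy mentioned earlier in the thread. This is only a sketch: it assumes your hammer version exposes a --download-policy option on "hammer repository update" (check the --help output first), and repo_ids, set_download_policy_all, and the HAMMER override variable are hypothetical names introduced here for illustration.

```shell
#!/bin/bash
# repo_ids and set_download_policy_all are hypothetical helpers for illustration.

# Extract numeric repository IDs from `hammer repository list` table output on stdin.
repo_ids() {
    awk '/^[0-9]/ {print $1}'
}

# Apply a download policy to every repository ID read from stdin.
# HAMMER may be overridden, e.g. HAMMER="echo hammer", to print the commands
# instead of running them. --download-policy is assumed to exist; verify with --help.
set_download_policy_all() {
    local policy="${1:-on_demand}" id
    while read -r id; do
        # </dev/null keeps the command from consuming the ID list on stdin
        ${HAMMER:-hammer} repository update --id "$id" --download-policy "$policy" </dev/null
    done
}

# Dry run first; drop the HAMMER override to apply the change:
#   hammer repository list | repo_ids | HAMMER="echo hammer" set_download_policy_all on_demand
```

Doing a dry run first lets you review exactly which repository IDs would be changed before anything is modified.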