A typical workflow for a Satellite 5 installation involves maintaining strict control over exactly what changes are available to registered systems. This is accomplished by cloning the channels synchronized from Red Hat Network, and limiting the clones to a given subset of the 'current' state of the original channel.
spacewalk-clone-by-date is a tool available as part of the Satellite 5 subscription which aims to ease the process of creating and maintaining cloned channels. However, the results of using spacewalk-clone-by-date can differ from what the user expects.
EXECUTIVE SUMMARY (aka, tl;dr):
spacewalk-clone-by-date (CBD) does the best it can with the data it has available, but sometimes channel metadata can force it to make decisions that are surprising to the end-user.
There are two important things to be aware of when understanding how CBD works.
First: CBD drives its functionality off of the dates associated with errata, not packages. So when the user specifies a date, CBD is going to try to produce a cloned channel that includes only RPMs that are associated with errata that were issued/updated prior to that date.
Second: Not every RPM in a channel is associated with an erratum. When a channel first appears (e.g., when RHEL6 was initially released), the channel is filled with RPMs that have no errata associated with them. This is a channel's 'original state'.
OK - keeping this in mind, here's the naive description of what CBD does, generally speaking:
1: Create the new clone-channel
2: Link the 'original state' RPMs of the source-channel to the destination-channel
3: Find all errata in the source-channel whose issue/update date is prior to the date selected
4: Clone all those errata into the destination
5: Link all the RPMs associated with those errata into the cloned channel.
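The five steps above can be sketched in a few lines of Python. This is a toy in-memory model, not the tool's actual implementation: the Channel and Erratum classes, the package names, and the advisory names are all invented for illustration, each erratum carries a single date standing in for its issue/update date, and treating the cutoff date as inclusive is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Erratum:
    advisory: str
    issued: date        # stands in for the erratum's issue/update date
    packages: set[str]  # RPMs this erratum delivers


@dataclass
class Channel:
    label: str
    packages: set[str] = field(default_factory=set)
    errata: dict[str, Erratum] = field(default_factory=dict)


def naive_clone(source: Channel, dest_label: str, to_date: date) -> Channel:
    # Step 1: create the new clone channel.
    dest = Channel(dest_label)

    # Step 2: link the RPMs that no erratum in the source accounts for.
    delivered = set().union(*(e.packages for e in source.errata.values()))
    dest.packages |= source.packages - delivered

    # Step 3: select errata dated up to the cutoff (inclusive: an assumption).
    selected = [e for e in source.errata.values() if e.issued <= to_date]

    for erratum in selected:
        # Step 4: clone the erratum into the destination.
        dest.errata[erratum.advisory] = erratum
        # Step 5: link the erratum's RPMs that exist in the source channel.
        dest.packages |= erratum.packages & source.packages
    return dest


# Hypothetical source channel: two original RPMs and two later updates.
source = Channel(
    "rhel6-base",
    packages={"bash-4.1-1", "bash-4.1-2", "zlib-1.2-1", "zlib-1.2-2"},
    errata={
        "RHBA-2011:0001": Erratum("RHBA-2011:0001", date(2011, 3, 1), {"bash-4.1-2"}),
        "RHSA-2012:0002": Erratum("RHSA-2012:0002", date(2012, 9, 1), {"zlib-1.2-2"}),
    },
)
clone = naive_clone(source, "clone-rhel6-base", date(2011, 12, 31))
print(sorted(clone.packages))  # ['bash-4.1-1', 'bash-4.1-2', 'zlib-1.2-1']
```

With a cutoff of 2011-12-31, the clone holds the two 'original state' RPMs plus the bash update; the 2012 zlib erratum and its RPM stay behind.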
In the best of all possible worlds, CBD would now be done, and everything would make sense. However, there are two additional wrinkles:
Third: the end-state of a cloned channel must result in installable RPMs, or the channel is considered 'broken'. There is not a lot of point in being subscribed to a channel to get content, only to have yum install foo fail because the channel is missing dependencies that foo.rpm relies on!
Fourth: Over the years, RPMs have been added to channels without having an associated erratum, for a variety of reasons. Unfortunately, there is no way for CBD to tell the difference between the actual 'original state' RPMs and later additions.
As a result of these, in between steps 2) and 3), we must add these:
2.1: Do dependency-resolution on the current state of the channel
2.2: For each RPM that depsolving says is required to resolve dependencies:
2.2.1: Find the erratum that delivered that RPM to the channel
2.2.2: Clone that erratum into the destination-channel, and add it to the list of errata cloned
2.2.3: Link that erratum's RPMs into the destination-channel
2.2.4: Recursively depsolve on the new state of the channel
2.3: When there are no more 'missing' RPMs, unwind and continue.
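A minimal sketch of the loop in steps 2.1 to 2.3, again on invented data. The Rpm class, capability strings, and advisory names are hypothetical. The 'depsolver' here simply pulls in every source RPM that provides an unmet requirement, where a real depsolver is more selective. Linking a required RPM directly when no erratum claims it is an assumption, since the steps above do not cover that case. The recursion of step 2.2.4 is written as a loop.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rpm:
    nvr: str
    provides: frozenset[str] = frozenset()
    requires: frozenset[str] = frozenset()


def missing_rpms(dest: set[Rpm], source: set[Rpm]) -> list[Rpm]:
    """Step 2.1: source RPMs that provide something dest requires but lacks."""
    provided = set().union(*(r.provides for r in dest))
    unmet = set().union(*(r.requires for r in dest)) - provided
    found = [r for r in source if r not in dest and r.provides & unmet]
    return sorted(found, key=lambda r: r.nvr)


def close_dependencies(dest, source, erratum_of, errata_rpms):
    """Grow dest in place until nothing is missing; return the errata cloned."""
    cloned: list[str] = []
    while True:
        missing = missing_rpms(dest, source)    # 2.1, repeated for 2.2.4
        if not missing:
            return cloned                       # 2.3: nothing left to resolve
        for rpm in missing:                     # 2.2
            dest.add(rpm)
            advisory = erratum_of.get(rpm.nvr)  # 2.2.1
            if advisory is None:
                continue                        # no erratum: link RPM only (assumption)
            if advisory not in cloned:
                cloned.append(advisory)         # 2.2.2
            dest.update(errata_rpms[advisory])  # 2.2.3


# Hypothetical data: tool-2.0 was added without an erratum, but needs libfoo.
app = Rpm("app-1.0", frozenset({"app"}))
tool = Rpm("tool-2.0", frozenset({"tool"}), frozenset({"libfoo.so.2"}))
libfoo = Rpm("libfoo-2.0", frozenset({"libfoo.so.2"}), frozenset({"libbar.so.1"}))
libfoo_devel = Rpm("libfoo-devel-2.0", frozenset({"libfoo-devel"}), frozenset({"libfoo.so.2"}))
libbar = Rpm("libbar-1.5", frozenset({"libbar.so.1"}))

source = {app, tool, libfoo, libfoo_devel, libbar}
dest = {app, tool}  # the apparent 'original state' after step 2
erratum_of = {
    "libfoo-2.0": "RHBA-2013:0009",
    "libfoo-devel-2.0": "RHBA-2013:0009",
    "libbar-1.5": "RHEA-2013:0010",
}
errata_rpms = {"RHBA-2013:0009": {libfoo, libfoo_devel}, "RHEA-2013:0010": {libbar}}

cloned = close_dependencies(dest, source, erratum_of, errata_rpms)
print(cloned)                        # ['RHBA-2013:0009', 'RHEA-2013:0010']
print(sorted(r.nvr for r in dest))
```

In this example a single erratum-less RPM (tool-2.0) drags two errata into the clone, regardless of the date the user asked for: the first to satisfy its own requirement, the second to satisfy a requirement of the first.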
Executing these steps will result in the destination-channel having unexpected RPMs included, because they are required to resolve dependencies of RPMs that were added post-original-channel-creation, but without matching errata. As it says in its man-page, "Spacewalk-clone-by-date is tool that provides a best-effort attempt at cloning valid and dependency-complete channels". The key phrase here is "best effort".
There is one last wrinkle that affects the results of CBD:
Fifth: sometimes, errata are republished or otherwise modified. If RPMs are added to an erratum after its initial publish-date, it is possible for such an erratum to suddenly find itself dependent on RPMs from newer errata. As a result, CBD must do a second depsolve step once it has its initial list of errata-to-clone, executing steps 2.1-2.3 as 4.1-4.3.
There isn't enough metadata available to Spacewalk/Satellite5/RHN to be able to recreate a channel in the exact state it was in, as of a given date.
spacewalk-clone-by-date does the best it can with the information it has available to it. As a result:
- destination-channels can have newer RPMs than expected due to RPMs being delivered 'naked' (without an erratum) into the source-channel
- destination-channels can have errata newer than expected due to older errata having been modified after the fact
and, most importantly, from the man-page:
"This is a tool to assist Administrators with channel creation, it cannot replace them."
After running spacewalk-clone-by-date, the channel administrator can either accept the result, including its surprises, or manage the channel content 'by hand', removing undesired RPMs/errata.
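One way to start such a manual review is to list the errata in the clone that are dated after the requested cutoff, since those are the ones that depsolving pulled in. The sketch below assumes the clone's errata have already been fetched into a plain dictionary; the dictionary layout, the function name, and the advisory names are hypothetical.

```python
from datetime import date


def errata_past_cutoff(cloned_errata, to_date):
    """Return (advisory, date, sorted packages) for errata dated after to_date."""
    late = [
        (advisory, info["date"], sorted(info["packages"]))
        for advisory, info in cloned_errata.items()
        if info["date"] > to_date
    ]
    return sorted(late, key=lambda row: row[1])  # oldest surprise first


# Hypothetical contents of a clone requested with a 2011-12-31 cutoff.
cloned_errata = {
    "RHBA-2011:0001": {"date": date(2011, 3, 1), "packages": {"bash-4.1-2"}},
    "RHBA-2013:0009": {
        "date": date(2013, 2, 5),
        "packages": {"libfoo-2.0", "libfoo-devel-2.0"},
    },
}

for advisory, when, packages in errata_past_cutoff(cloned_errata, date(2011, 12, 31)):
    print(advisory, when.isoformat(), ", ".join(packages))
# RHBA-2013:0009 2013-02-05 libfoo-2.0, libfoo-devel-2.0
```

Anything this reports was included to keep the channel dependency-complete, so removing it by hand may leave other RPMs in the clone uninstallable.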