What's the coolest thing you've done with Satellite?


The world of RHEL, Linux, and Open Source is very big, with dozens of paths to be successful with the same task. Customers who run very large enterprises frequently use our Red Hat Satellite tool to help keep their systems in order and running their best. It's so popular we actually have a whole discussion group dedicated just to Satellite, where we talk about all things configuration management, patching, and mass system administration:

 [https://access.redhat.com/groups/red-hat-network-satellite](https://access.redhat.com/groups/red-hat-network-satellite)

With so many folks using Satellite, there must be a lot of good stories out there about it. So please share with us: what are some of the coolest things you've done with Satellite to make your job easier and keep your organization running?

Responses

OK - I'll take the plunge:

Working at a trading house, I had a requirement to completely rebuild the grid processing system at the drop of a hat (rowdy developers tried to create demands that they thought RHEL couldn't meet so that they could end up using Ubuntu). Grid processing ran on 128 blades. Each blade then ran 3 Xen domains (Dev, Test, Prod) on local disk to ensure good separation of environments. I created a kickstart for the physical hosts and a kickstart for the virtual guests.

I could re-provision the physical hosts via SSM in Satellite. On first boot, they would all create their three Xen domains and initiate an installation of each of them. Total systems provisioned - 512 (128 physical, 384 guests). Build time - 40 minutes.
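
As an aside, that step doesn't have to be driven through SSM; the same bulk re-kickstart can also be scheduled through the Satellite XML-RPC API. A rough sketch of that alternative, with made-up group and profile names (exact method signatures vary between Satellite versions):

```python
#!/usr/bin/python
# Rough sketch: schedule a re-kickstart of every system in a Satellite group.
# Group and profile names are illustrative, not real ones.
import xmlrpclib  # Python 2 era; xmlrpc.client on Python 3

SATELLITE_URL = "https://satellite.example.com/rpc/api"  # hypothetical Satellite
GROUP = "grid-blades"                                    # hypothetical system group
PROFILE = "grid-physical-host"                           # hypothetical kickstart profile

client = xmlrpclib.ServerProxy(SATELLITE_URL)
key = client.auth.login("admin", "password")
try:
    for system in client.systemgroup.listSystems(key, GROUP):
        # Schedules the kickstart; each blade reboots into the installer on its own.
        client.system.provisionSystem(key, system["id"], PROFILE)
        print("scheduled re-provision of system %d" % system["id"])
finally:
    client.auth.logout(key)
```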

Took some tweaking of the Satellite, but the biggest factors we found in the installation time were the three-minute built-in reboot delay once a system starts its reprovisioning, and the boot time of the blades. The blades we had used firmware with a particularly frustrating BIOS boot time. Booting to start the install and then booting again at the end of the install was painful.

Issues: every now and then an installation would hang due to a package not being available for download from the Satellite. The system would sit with an interactive prompt on the screen, so I'd have to access the iLO or the virt console and just hit OK to try the package again.

Also a slight niggle: the old versions of the guests all needed deleting from the Satellite by hand. This was also done via SSM.

The developers had demanded that a complete rebuild take no longer than 30 minutes. But management had my back and we used the Satellite.

Not the biggest estate in the world, but I was impressed at the ability to rebuild 512 servers so quickly.

D

Wow! Don't downplay yourself, Duncan, that's totally awesome! Not many folks can say they built 512 machines in about 40 minutes. Very impressive.

-Chris

This would make for an impressive presentation at one of the RH conferences. Someone at RH should make an attempt to convince Duncan to do this...

I would love to get more details on how you put this together.

--Cary

I'm not a presenter, so I wouldn't like to take a front seat on anything like that. Sorry.

The key thing that facilitated all this was strict use of DHCP and a good algorithm to determine MAC addresses for guests. The physical machines were already in DHCP for auto network config. When each of them booted, they ran simultaneous scripts to create and kickstart 3 Xen guests on local disk. (Thinking about it, I only hit the fastest rebuild times when I demo'd the thing to the key decision makers. When I did that, I tweaked things to make sure the Dev and UAT systems were *cough* left out. As long as they saw the Prod grid system come back to life, that's what I cared about :-) )

Anyway - each Xen creation script used the hostname of the physical host to calculate the MAC address to be used for each Xen guest. That way, they would all be rebuilt and come back to life in exactly the same state they had been in before.
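
Roughly, the idea was along these lines. This is a reconstruction, not the original script; the environment names, MAC scheme, kickstart URLs and virt-install flags (RHEL 5 Xen era) are all illustrative:

```python
#!/usr/bin/python
# Sketch: derive stable, locally administered MAC addresses from the physical
# host's name so each Xen guest always comes back with the same identity, then
# kick off the three installs in parallel. All names, URLs and sizes are illustrative.
import socket
import subprocess

ENVS = ["dev", "uat", "prd"]            # illustrative environment labels
SATELLITE = "satellite.example.com"     # hypothetical Satellite hostname

def guest_mac(hostname, env):
    """Deterministic MAC: Xen OUI + fixed octet + blade number + environment."""
    blade_no = int("".join(c for c in hostname if c.isdigit()) or "0")  # "blade042" -> 42
    return "00:16:3e:00:%02x:%02x" % (blade_no & 0xff, ENVS.index(env))

def create_guest(env):
    host = socket.gethostname().split(".")[0]
    name = "%s-%s" % (host, env)
    ks = "http://%s/ks/cfg/org/1/label/grid-guest-%s" % (SATELLITE, env)
    return subprocess.Popen([
        "virt-install", "--paravirt", "--nographics",
        "--name", name,
        "--ram", "4096",
        "--mac", guest_mac(host, env),
        "--file", "/var/lib/xen/images/%s.img" % name,
        "--file-size", "20",
        "--location", "http://%s/ks/dist/ks-rhel-x86_64-server-5" % SATELLITE,
        "--extra-args", "ks=%s console=xvc0" % ks,
    ])

if __name__ == "__main__":
    # Start all three guest installs in parallel and wait for them to finish.
    procs = [create_guest(env) for env in ENVS]
    for proc in procs:
        proc.wait()
```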

The physical installs rarely failed due to Satellite overload, but the virtual guests did quite a bit. Was a pain to have to jump on a serial console to the relevant box just to hit the "retry" option on the screen. Think I raised tickets about this at the time to ask if these could be replaced with automatic retry after a certain time delay. Don't remember what came of that.

I do remember raising tickets to boost the throughput of the Satellite though.

I'm curious, were you able to determine at what point the demand caused the package delivery to need a retry?

Anyone have Satellite set up in a clustered HA / load-balanced configuration?

Ultimately, no. The support ticket lost all momentum when a different direction was required. The Xen guests were binned and the grid nodes were built on the physical hardware only. This allowed the devs to use all of the physical RAM per node for their Java stack rather than having it split equally between Dev, UAT & Prod. From what I remember, this gave a bigger performance boost than running more nodes with less virtual RAM.

By this point I suspected further Apache tuning and more RAM for the Satellite would help, but I never got a chance to investigate further.

Will be interesting to see if the provisioning approach used by Katello will scale any better.

I also remember a story from a Red Hatter who wrapped up recorded episodes of Dr Who as RPMs, which he would upload to his Satellite and push out to his girlfriend's laptop so she could watch them. That always struck me as pretty cool - although for different reasons.

Haha that's a novel approach...

I've built a spacecmd script that builds my complete RHEL SOE. In case of a disaster scenario, I can install an empty RHN Satellite and run a single command to create the SOE from scratch, including getting software channels, configuration channels, kickstart profiles and snippets, activation keys, and groups set up properly.
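
To give a flavour of what the script sets up, here is a stripped-down sketch of the same idea driven directly against the XML-RPC API instead of spacecmd. Every label, key and channel name below is illustrative, and the call signatures may differ between Satellite versions:

```python
#!/usr/bin/python
# Stripped-down sketch of recreating an SOE skeleton on a fresh Satellite.
# All labels, keys and channel names are illustrative.
import xmlrpclib  # Python 2 era; xmlrpc.client on Python 3

client = xmlrpclib.ServerProxy("https://satellite.example.com/rpc/api")
key = client.auth.login("admin", "password")
try:
    # Custom child channel for locally built SOE packages.
    client.channel.software.create(
        key, "soe-rhel6-x86_64", "SOE packages RHEL6",
        "Locally built SOE packages", "channel-x86_64", "rhel-x86_64-server-6")

    # Configuration channel for the baseline config files.
    client.configchannel.create(
        key, "soe-config", "SOE config", "Baseline configuration files")

    # System group, plus an activation key that drops new builds into it.
    group = client.systemgroup.create(key, "soe-servers", "All SOE builds")
    akey = client.activationkey.create(
        key, "soe-rhel6", "SOE RHEL6 build key",
        "rhel-x86_64-server-6", ["provisioning_entitled"], False)
    client.activationkey.addServerGroups(key, akey, [group["id"]])
finally:
    client.auth.logout(key)
```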

Wow! That IS pretty cool, Magnus! I bet your Business Continuity folks love that. Hopefully you never need to use it, but if you do, it certainly sounds like you're well-prepared.

It also simplifies having 3 separate RHN Satellites. One for development, one for test and one for production.

I wrote a script to export all Satellite channels as bog-standard YUM repositories so that any host could access them (registered to SAT or not). See http://pastebin.com/download.php?i=UUmcF4yT

Lots of people using Spacewalk leverage 'mrepo', but that leaves you with 2x the disk footprint. If you use this script, that problem is solved too.
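
The pastebin link has the real script; the general shape of the approach is roughly the sketch below, where the channel label, paths and returned field names are illustrative and may differ between Satellite versions:

```python
#!/usr/bin/python
# Sketch: expose one Satellite software channel as a plain yum repository by
# symlinking its packages out of /var/satellite and running createrepo over them.
# Channel label, paths and returned field names are illustrative.
import os
import subprocess
import xmlrpclib  # Python 2 era; xmlrpc.client on Python 3

CHANNEL = "rhel-x86_64-server-6"                       # channel to export
PKG_STORE = "/var/satellite"                           # where the Satellite keeps RPMs
EXPORT_DIR = "/var/www/html/pub/repos/%s" % CHANNEL    # served under /pub by Apache

client = xmlrpclib.ServerProxy("https://satellite.example.com/rpc/api")
key = client.auth.login("admin", "password")
try:
    if not os.path.isdir(EXPORT_DIR):
        os.makedirs(EXPORT_DIR)
    for pkg in client.channel.software.listAllPackages(key, CHANNEL):
        details = client.packages.getDetails(key, pkg["id"])
        src = os.path.join(PKG_STORE, details["path"])  # path relative to the package store
        dst = os.path.join(EXPORT_DIR, os.path.basename(src))
        if not os.path.lexists(dst):
            os.symlink(src, dst)  # symlink rather than copy, so no 2x disk footprint
finally:
    client.auth.logout(key)

# Generate the yum metadata over the symlinked packages.
subprocess.check_call(["createrepo", EXPORT_DIR])
```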

Very creative, Matthew. Thanks for sharing!

-CRob

For each host registered in Sat, I added links on the description page - one link to our Wiki page for the host, the other to the Nagios page for the host. The trouble is that there's no easy way to do this - you have to hack up the internal page-generation code. We put in an Enhancement Request, but nothing ever came of it.

On the other side, we added links from the Wiki and Nagios to the SatServer page for the corresponding host. The Wiki page for a host also links to the corresponding Nagios page, and the Nagios page for a host links to the corresponding Wiki page.

When it's all set up, it works great. We haven't taken the time to retrofit this back into Sat following our last upgrade, so that isn't working right now, but we keep hoping Red Hat will add a feature that allows definition of links like this.

Have tried to do this too. Our method used Wiki pages and HP SIM server links, but it was essentially the same. Would actually be a killer feature in my view.

Don't think we raised any tickets. To be able to provide links to Wiki/Nagios/SIM/vSphere/RHEV etc pages relevant to your system would really help integrate the Satellite with other corporate tools.

D

We've done similar things to system profile pages as well as other pages on the Satellite web front end. Persistence across upgrades is fairly easy with our implementation.

Rather than hacking the page-generation code, we insert JavaScript. That is, we append JavaScript code to /javascript/check_all.js and /templates/footer.pxt. In footer.pxt it's a simple HTML script element that loads your custom JavaScript; in check_all.js you do the same thing from JavaScript (on the onLoad event, appendChild to the page body, where the new child is your script).

Your JavaScript can then figure out which page you are on and, if appropriate, make page modifications using DOM insertions. Since the Red Hat pages have already loaded Prototype, working with the DOM is a bit easier. Your currently logged-in session can make API (or CGI) calls from JavaScript, or even load other Satellite web pages to get the data needed to update the page you are currently viewing.

This method gives us the two maintenance insertion points to get our customizations working again after an upgrade.

Another related insertion we use is adding one line to /css/rhn-base.css to load some additional tweaks we like to make.

I've talked to Red Hat about making this even easier in CloudForms. I hope they are listening.

I LOVE it! Now that is something cool and out-of-the-box. I love the work integrating all these different systems under "one pane of glass." It looks like a few of you all have had similar thoughts. Great work! Let's hope you get your Sat. issue worked out so you can get this totally back online.

Cheers,

CRob

So, not necessarily very cool, but a colleague and I had some fun putting together a system profile that was used to provision all the Linux workstations on our campus. We registered all the desktops' MAC addresses in cobbler, then toggled them to PXE boot to that profile, and within a few minutes users could log in to a workstation that:

  • Was joined to the school's windows AD domain.

  • Had all the software they needed, like the Citrix XenApp client (for which we created a custom RPM), the Flash plugin, etc.

  • Could print to the appropriate print queues.

The ease and reproducibility of provisioning we achieved got management buy-in to expand the number of Linux workstations available to our community.
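
For reference, registering the MACs and pointing them at a profile can be scripted against cobbler's XML-RPC interface, conceptually like this sketch (every hostname, MAC, profile name and credential below is made up, and details vary by cobbler version):

```python
#!/usr/bin/python
# Sketch: register workstation MACs as cobbler systems pointing at one profile,
# so they can all be PXE-booted into the same kickstart. All values are made up.
import xmlrpclib  # Python 2 era; xmlrpc.client on Python 3

COBBLER_URL = "http://cobbler.example.edu/cobbler_api"  # hypothetical cobbler server
PROFILE = "rhel6-workstation"                           # hypothetical profile name
WORKSTATIONS = {                                        # hostname -> MAC (illustrative)
    "lab-ws01": "52:54:00:aa:00:01",
    "lab-ws02": "52:54:00:aa:00:02",
}

server = xmlrpclib.ServerProxy(COBBLER_URL)
token = server.login("cobbler", "password")

for name, mac in WORKSTATIONS.items():
    handle = server.new_system(token)
    server.modify_system(handle, "name", name, token)
    server.modify_system(handle, "profile", PROFILE, token)
    server.modify_system(handle, "modify_interface",
                         {"macaddress-eth0": mac}, token)
    server.modify_system(handle, "netboot_enabled", True, token)
    server.save_system(handle, token)

server.sync(token)  # regenerate PXE/DHCP config so the boxes pick up the profile
```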

I worked at a company that followed the DTAP method (Development, Testing, Acceptance, Production).

We wanted RHEL updates to become available first on Development systems only, then on Testing, Acceptance, and finally Production. For that I made four channels in Satellite - Development, Testing, Acceptance, Production - and wrote a Python script which enables the updates and errata for the channel given as an argument. Production can only get updates and errata from Acceptance, Acceptance from Testing, and so on.

This improves the testing of new applications for the company, because they can track which OS updates are or aren't installed on a system and whether that makes a difference to its behavior.
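
For anyone wanting to build something similar today, the core of such a promotion script can be sketched with the channel merge calls in the Spacewalk API. The channel labels below are made up, and these calls only appeared in later Satellite releases:

```python
#!/usr/bin/python
# Sketch: promote packages and errata one stage "upstream" in a DTAP chain.
# Channel labels are illustrative; the merge calls need a reasonably recent API.
import sys
import xmlrpclib  # Python 2 era; xmlrpc.client on Python 3

# Each stage may only receive content from the stage directly before it.
PROMOTION_SOURCE = {
    "testing": "development",
    "acceptance": "testing",
    "production": "acceptance",
}
CHANNEL = "dtap-rhel6-%s"   # e.g. dtap-rhel6-development (made-up labels)

def promote(target):
    source = PROMOTION_SOURCE.get(target)
    if source is None:
        sys.exit("usage: promote.py testing|acceptance|production")
    client = xmlrpclib.ServerProxy("https://satellite.example.com/rpc/api")
    key = client.auth.login("admin", "password")
    try:
        # Copy anything in the source channel that the target does not have yet.
        pkgs = client.channel.software.mergePackages(key, CHANNEL % source, CHANNEL % target)
        errata = client.channel.software.mergeErrata(key, CHANNEL % source, CHANNEL % target)
        print("merged %d packages and %d errata into %s"
              % (len(pkgs), len(errata), CHANNEL % target))
    finally:
        client.auth.logout(key)

if __name__ == "__main__":
    promote(sys.argv[1] if len(sys.argv) > 1 else "")
```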

Did you look at '/usr/bin/spacewalk-clone-by-date'?

It was a while ago, and I think at the time I created that Python script there wasn't a tool like that.

https://rhn.redhat.com/errata/RHEA-2012-0330.html

Yes :-) It was before version 5.4 :-)

That's interesting. Same goal as what Magnus chatted about earlier, but all on one box. It really comes down to what you're comfortable with and what your requirements for separation might be around management of your servers. Both approaches are good. One benefit of having multiple Satellites is that if you experience some type of massive hardware failure, with some scripting and DNS work you can fairly quickly move a downed Satellite over to one of your other online boxes. But excellent idea, thanks for sharing with us all!

-CRob