9. Example: Remedying Resource Drift

Maintaining servers, particularly production or business-critical servers and applications, requires keeping a tight rein on the configuration files and packages on those systems. When an unexpected change occurs, the system moves away from the administrator-defined state. That is configuration drift.
JBoss ON can monitor configuration files and target directories and track any changes to those areas. It does this through a drift definition, which sets where the JBoss ON agent monitors configuration and how frequently. If drift is detected, the JBoss ON server can fire an alert and run an alert CLI script that reverts, or remedies, the changed configuration files.

9.1. The Plan for the Scripts

Two different scripts are in play because managing drift involves two different situations:
  • First, there is a script to set up drift for a resource. This shell script runs through a series of setup steps at once:
    1. It creates the drift definition for a resource (through the driftDef.js CLI script).
    2. It creates a generic deploy.xml recipe, zips the drift directory, and creates a new bundle and bundle deployment (through the createBundle.js CLI script).
    3. After waiting for the initial snapshot, it then pins the snapshot to the definition (through the snapshot.js CLI script).
    All of those files are generated by the shell script.
  • Second, there is a script to remedy drift. The alert definition itself has to be created through the UI (not the CLI), but it can be configured to use any drift detection as a condition and then to run a server-side script in response. This second script simply deploys the bundle that was made of the pristine base directory and overwrites the drift.

9.2. Setting up the Drift Definition and Preparing the Bundle

The setup script actually runs through three CLI scripts and some system commands. Having all of the steps in a single script makes it possible to set up a drift definition and a backup bundle by running a single command:
[root@server ~]# ./driftBundle.sh

NOTE

Both drift definitions and bundle deployments take a lot of resource- and infrastructure-specific settings. The driftBundle.sh script in this example defines a lot of variables in the script to account for each piece of information.
The variables could be defined using a .conf or even a set of .conf files (cf. Section 6.2, “Creating the Wrapper Script and .conf File”), but for simplicity in this example, all of the variables are defined in the driftBundle.sh script itself.
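As a rough sketch of the .conf alternative, the script could source an optional file and fall back to built-in defaults. The load_conf helper and the throwaway file used here are hypothetical and not part of JBoss ON; the default values are the ones used in this example.

```shell
#!/bin/bash
# Hypothetical helper: read settings from a .conf file if it exists,
# then fall back to defaults for anything the file did not set.
load_conf() {
    local conf_file="$1"
    if [ -r "$conf_file" ]; then
        # the .conf file is plain shell: one VAR='value' per line
        . "$conf_file"
    fi
    # defaults taken from this example
    : "${OPTS:=-u rhqadmin -p rhqadmin}"
    : "${SCRIPTS:=/opt}"
    : "${BASEDIR:=/opt/drift}"
}

# demonstration with a throwaway file that overrides one value
demo_conf=$(mktemp)
echo "BASEDIR='/etc/myapp'" > "$demo_conf"
load_conf "$demo_conf"
rm -f "$demo_conf"
echo "BASEDIR=$BASEDIR SCRIPTS=$SCRIPTS"
```

Because the file is sourced as shell code, it should be readable and writable only by the user who runs the script.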
The first part of the script simply defines the connection settings to use when running the JBoss ON CLI. This example only defines a username and password, so it assumes that the script is run on a system which also has a JBoss ON server running locally. The options could be edited to supply a remote JBoss ON server hostname and port.
There are three general variables defined:
  • The location of the rhq-cli.sh script
  • Any options, such as the username and password, to pass with the CLI command
  • The directory to use both to save the generated JavaScript files and to use for the path to JavaScript files
#!/bin/bash
# options for the CLI
CLI='cliRoot/rhq-remoting-cli-3.1.2.GA/bin/rhq-cli.sh'
OPTS=' -u rhqadmin -p rhqadmin'
SCRIPTS='/opt'
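If the JBoss ON server runs on another machine, the options string could be extended along these lines. This is only a sketch: it assumes the CLI's -s (host) and -t (port) options, the hostname jon.example.com is a placeholder, and 7080 is simply the usual default port, so check all three against your own installation.

```shell
#!/bin/bash
# placeholder connection settings (assumptions for illustration)
JONHOST='jon.example.com'
JONPORT='7080'

# same credentials as the example, plus the remote host and port
OPTS=" -u rhqadmin -p rhqadmin -s $JONHOST -t $JONPORT"
echo "$OPTS"
```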
The next part of the script sets up the drift definition. By default, drift is only enabled for a handful of resource types (JBoss servers, Tomcat servers, and platforms), so it is easiest to identify the resource based on a combination of its resource type and name.
Once the resource is identified, then the definition can be created. The full list of possible definition settings is covered in the drift documentation, but a general definition will identify the base directory to monitor, set some rules about what files or subdirectories to ignore (like log files), and set an interval or frequency for drift detection scans.
All of these definition parameters are defined as individual variables in the shell script. In this example, drift is configured for a platform.
# set parameters for the drift definition
RESTYPE='Linux'
RESPLUGIN='Platforms'
RESNAME="server.example.com"
NAME='example drift'
DESC='drift from script'
BASEDIR='/opt/drift'
BASEDIRTYPE='fileSystem'
EXCLUDE='./logs/'
PATTERN=
MODE='normal'
INTERVAL='3600'
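Since a mistyped value only shows up later as a broken drift definition, it may be worth checking the variables before any CLI script is generated. The check_drift_vars function below is a hypothetical helper, not part of the example script; it only covers the two handling modes named in this section (normal and plannedChanges) and the 30-second minimum scan interval this section relies on.

```shell
#!/bin/bash
# Hypothetical pre-flight check for the drift definition variables.
# Returns 0 if the values look usable, 1 otherwise.
check_drift_vars() {
    local mode="$1" interval="$2"
    case "$mode" in
        normal|plannedChanges) ;;
        *) echo "MODE must be normal or plannedChanges: $mode" >&2; return 1 ;;
    esac
    case "$interval" in
        ''|*[!0-9]*) echo "INTERVAL must be a whole number of seconds: $interval" >&2; return 1 ;;
    esac
    # 30 seconds is the shortest interval this example relies on
    if [ "$interval" -lt 30 ]; then
        echo "INTERVAL is shorter than 30 seconds: $interval" >&2
        return 1
    fi
    return 0
}

check_drift_vars 'normal' '3600' && echo "drift variables look valid"
```

In the real script, the call would be check_drift_vars "$MODE" "$INTERVAL" || exit 1, placed directly after the variable definitions.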
The shell script eventually generates a CLI script and runs it automatically in the CLI. The first part of that CLI script looks up the resource type for the platform and then uses a criteria search to find the platform resource itself.
driftDef() {
cat <<-EOF

//set the resource type
var resType = ResourceTypeManager.getResourceTypeByNameAndPlugin("$RESTYPE","$RESPLUGIN");

//get the resource to associate with the drift definition
rcrit = ResourceCriteria()
rcrit.addFilterResourceTypeName("$RESTYPE")
rcrit.addFilterName("$RESNAME")
var resources = ResourceManager.findResourcesByCriteria(rcrit)
var res = resources.get(0)

TIP

This script searches for a single resource to configure for drift. You could also write the script to search for multiple resources, add them to a compatible group, and then iterate through the compatible group to add the drift definition to each resource.
The next part configures the drift definition itself. A DriftDefinition is a wrapper for a Configuration object. The CLI script first retrieves the default drift template for the given resource type and then creates a definition object based on that template.
//get the default template for the resource type
criteria = DriftDefinitionTemplateCriteria()
criteria.addFilterResourceTypeId(resType.id)
templates = DriftTemplateManager.findTemplatesByCriteria(criteria)
template = templates.get(0)
//create a new drift definition instance, based on the template
definition = template.createDefinition()
Once the configuration object is created, then the definition options are assigned values.
This script creates a real drift definition with one exception: it sets a very low scan interval, 30 seconds. In fact, that is the shortest configurable interval. This allows the agent to collect the initial snapshot fairly quickly, which helps the overall setup go faster. This interval will be reset to a more reasonable value (the one defined in the variables) at the end of the script execution.
//set the drift definition configuration options
definition.resource = res
definition.name = '$NAME'
definition.description = '$DESC'
definition.setAttached(false) // this is false so that template changes don't affect the definition
// this is set low to trigger an early initial detection run
definition.setInterval(30)
var basedir = new DriftDefinition.BaseDirectory(DriftConfigurationDefinition.BaseDirValueContext.valueOf('$BASEDIRTYPE'),'$BASEDIR')
definition.basedir = basedir

// there can be multiple exclude statements made, as desired
var f = new Filter("$EXCLUDE", "$PATTERN") // location, pattern
definition.addExclude(f)

//this defaults to normal, which means that any changes will
// trigger an alert. plannedChanges is the other option, which 
// disables alerting for drift changes.
definition.setDriftHandlingMode(DriftConfigurationDefinition.DriftHandlingMode.valueOf('$MODE'))
Once the configuration is complete, it needs to be written to the definition.
//apply the new definition to the resource
DriftManager.updateDriftDefinition(EntityContext.forResource(res.id),definition)

EOF
}
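The driftDef function works because its here-document delimiter is unquoted: the shell substitutes $RESTYPE, $RESNAME, and the other variables before the JavaScript is written out. The stand-alone sketch below contrasts that with a quoted delimiter, which passes the text through untouched. The demoScript and demoLiteral functions are hypothetical and exist only for illustration.

```shell
#!/bin/bash
RESNAME='server.example.com'

# unquoted delimiter: $RESNAME is expanded by the shell
demoScript() {
cat <<EOF
rcrit.addFilterName("$RESNAME")
EOF
}

# quoted delimiter: the text is passed through literally
demoLiteral() {
cat <<'EOF'
rcrit.addFilterName("$RESNAME")
EOF
}

demoScript    # prints rcrit.addFilterName("server.example.com")
demoLiteral   # prints rcrit.addFilterName("$RESNAME")
```

One consequence is that a value containing a quotation mark would break the generated JavaScript, so names and descriptions need to be chosen or escaped accordingly.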

NOTE

The drift definition uses an entity context rather than the resource ID alone to identify the resource. An entity context first identifies the type of object (the entity) and then its associated inventory ID.
Creating a "bundle" takes several steps because a bundle is made up of several parts. The script first makes a ZIP archive of the given drift base directory; that is the bundle archive. Defining the bundle in JBoss ON then takes two steps. One defines the bundle destination, which is a compatible group to which bundles (any bundles) can be deployed plus the location on those resources for deploying the bundles. The other uploads the package itself as a bundle version.
The variables define both the information for the bundle version and bundle archive and for the bundle destination.
There is one other variable included: the path to the CLI's samples directory. Helper functions to create bundle versions, to create bundle destinations, and to deploy specified bundles are already defined in the bundles.js sample script. Using those functions makes deploying bundles very easy.
# options for the bundle
SAMPLES='cliRoot/rhq-remoting-cli-3.1.2.GA/samples'
DESTNAME='drift destination'
BUNDLEDESC='bundle to remediate drift'
BUNDLENAME='driftBundle'
GROUPNAME='Linux Group'
ZIP='driftBundle.zip'
BVER='1.0'
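# BUNDLE is the file that is uploaded (recipe plus archive);
# ARCHIVE is the zipped base directory named in the recipe.
# The two must be different files; the dist- prefix is only an example.
BUNDLE='/opt/bundles/dist-'$ZIP
ARCHIVE='/opt/bundles/'$ZIP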
This particular bundle deployment is simple: the target bundle destination is the same as the drift base directory.
Since there are no tokens to realize and no external content to pull in, only the backup archive itself, the recipe can be minimal. The script creates the recipe (deploy.xml), which is packaged together with the bundle archive.
deploy() {
cat << _EOF_
<?xml version="1.0"?>
<project name="$BUNDLENAME" default="main"
        xmlns:rhq="antlib:org.rhq.bundle">
    <rhq:bundle name="$BUNDLENAME" version="$BVER" description="$BUNDLEDESC">
        <rhq:deployment-unit name="drift" manageRootDir="true">
            <rhq:archive name="$ZIP" exploded="true">
            </rhq:archive>
        </rhq:deployment-unit>
    </rhq:bundle>
<target name="main" />

</project>
_EOF_
}
The bundles.js sample script already defines all of the functions required to deploy bundles, but it relies on the util.js sample script. When the CLI is run non-interactively, there is no way to import an external script that another script requires.
So, this shell script first concatenates the bundles.js and util.js scripts together, and then appends the calls to create the bundle version and the bundle destination.
createBundle() {
cat "$SAMPLES/util.js" "$SAMPLES/bundles.js"
cat  << _EOF_

// set the location of the bundle archive
var path = '$BUNDLE'

// create the bundle version in JON
createBundleVersion(path)

// set all of the variables for the bundle destination
var destinationName = '$DESTNAME'
var description = '$BUNDLEDESC'
var bundleName = '$BUNDLENAME'
var groupName = '$GROUPNAME'
var baseDirName = '$BASEDIR'
var deployDir = "."

// create the new destination in JON
createBundleDestination(destinationName, description, bundleName, groupName, baseDirName, deployDir)

_EOF_
}

NOTE

Make sure that the resource already belongs to a compatible group and that the compatible group has a unique enough name so that it is the only one returned in the search.
The last CLI script created by the shell script pins the initial snapshot to the new drift definition. A snapshot, as the name implies, is a picture of the current settings of the base directory. Pinning a snapshot to a definition sets a baseline, or comparison, for the agent to use to evaluate drift. A pinned snapshot is a specific and identified configuration that must be maintained (as opposed to rolling changes).
Once the snapshot is pinned, this script then resets the drift definition configuration so that it uses a longer (more realistic) interval between scans.
snapshot() {
cat <<- _EOF_
//find the resource
rcrit = ResourceCriteria()
rcrit.addFilterResourceTypeName("$RESTYPE")
rcrit.addFilterName("$RESNAME")
var resources = ResourceManager.findResourcesByCriteria(rcrit)
var res = resources.get(0)

//find the new drift definition
criteria = DriftDefinitionCriteria()
criteria.addFilterName('$NAME')
criteria.addFilterResourceIds(res.id)
def = DriftManager.findDriftDefinitionsByCriteria(criteria)
definition = def.get(0)
definition.setInterval($INTERVAL)

// it is necessary to redefine the complete configuration when you're 
// resetting the interval or the other values will be overwritten with default 
// or set to null
var basedir = new DriftDefinition.BaseDirectory(DriftConfigurationDefinition.BaseDirValueContext.valueOf('$BASEDIRTYPE'),'$BASEDIR')
definition.basedir = basedir
definition.name = '$NAME'
// there can be multiple exclude statements made, as desired
var f = new Filter("$EXCLUDE", "$PATTERN") // location, pattern
definition.addExclude(f)
DriftManager.updateDriftDefinition(EntityContext.forResource(res.id),definition)

// pin to the initial snapshot, which is version 0
// this gets the most recent snapshot if that is the better version to use
// snap = DriftManager.getSnapshot(DriftSnapshotRequest(definition.id))
DriftManager.pinSnapshot(definition.id,0)
_EOF_
}
The last part of the script actually runs all of the defined JBoss ON CLI scripts and sets up both the drift definition and the bundle definition (as a backup in case any drift is detected).
There are two sets of system commands sandwiched between the JBoss ON CLI scripts. The first is the pair of zip commands that create the bundle archive and the bundle file. The second is a sleep command, which pauses the script to give the JBoss ON agent time to collect the initial snapshot for drift before attempting to pin the snapshot.
# create the drift definition

driftDef > $SCRIPTS/driftDef.js
$CLI $OPTS -f $SCRIPTS/driftDef.js


# create the recipe file and then zip up the 
# drift base directory to make the bundle archive

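deploy > /deploy.xml
# zip the contents of the base directory, with paths relative to it
(cd "$BASEDIR" && zip -r "$ARCHIVE" .)
# -j stores the archive and the recipe at the top level of the bundle file
zip -j "$BUNDLE" "$ARCHIVE" /deploy.xml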

# create the bundle from the recipe and archive
# and then create the bundle definition 

createBundle > $SCRIPTS/createBundle.js
$CLI $OPTS -f $SCRIPTS/createBundle.js


# sleep to allow the server to get the first snapshot
# this only sleeps for a minute, but it really depends on your environment
# whether that is long enough

sleep 1m

# this pins the new snapshot to the new drift definition
# and then changes the drift interval to the longer, variable-specified
# value

snapshot > $SCRIPTS/snapshot.js
$CLI $OPTS -f $SCRIPTS/snapshot.js
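A fixed one-minute sleep is a guess about how long the first snapshot takes. A common alternative is to poll until a check succeeds or a limit is reached. The wait_for helper below is a hypothetical, generic sketch; the check it runs here is a stand-in, and a real check (for example, a small CLI script that looks for the snapshot) would have to be written separately.

```shell
#!/bin/bash
# Hypothetical helper: retry a command until it succeeds or the
# attempts run out. Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND...
wait_for() {
    local attempts="$1" delay="$2"
    shift 2
    local i
    for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
            return 0
        fi
        # do not sleep after the final failed attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    return 1
}

# stand-in check: the temporary file already exists, so this succeeds at once
marker=$(mktemp)
wait_for 3 1 test -e "$marker" && echo "ready"
rm -f "$marker"
```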
There is no error handling in this shell script. If any step fails, like the initial snapshot taking longer than the sleep period, there is no indication of what went wrong aside from malformed drift or bundle configuration.
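A minimal way to add some error handling is to wrap each step so that a failure is reported by name. The run_step function below is a hypothetical sketch, demonstrated with stand-in commands rather than the real CLI calls.

```shell
#!/bin/bash
# Hypothetical wrapper: run one setup step and report which step failed.
run_step() {
    local label="$1"
    shift
    if "$@"; then
        echo "OK: $label"
    else
        local status=$?
        echo "FAILED: $label (exit status $status)" >&2
        return "$status"
    fi
}

# demonstration with stand-in commands instead of the CLI calls
run_step "create drift definition" true
run_step "pin snapshot" false || echo "stopping after failed step"
```

In the setup script, a call might look like run_step "create drift definition" $CLI $OPTS -f $SCRIPTS/driftDef.js || exit 1. This only helps if the CLI returns a non-zero exit status when a script fails, which should be verified for the CLI version in use.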

9.3. Remedying Drift

To remediate drift, define an alert in the UI and upload a CLI script which can be run, automatically, whenever drift is detected. All the script has to do is deploy the backup bundle to the resource, and there are several different ways to do that.
This example goes through all the basic steps: it pulls the resource information from the alert, searches for the bundle version, and then deploys it to the resource. One nifty thing about this script is that it writes a log file, capturing the alert information that triggered the remediation.
This script can be uploaded directly when the alert definition is created. Before uploading the script, be sure to set the variables to the bundle destination and bundle version that you created when the drift definition was set up.
// - The 'alert' variable is seeded by the alert sender

// SET THESE VARIABLES
var bundleDestinationName = 'drift destination'
var bundleVersion = '1.0' // a string, so that it matches the bundle version exactly
var logFile = '/tmp/alert-cli-demo/logs/alert-' + alert.id + '.log'

// Log what we're doing to a file tied to the fired alert id
//
var e = exporter
e.setTarget( 'raw', logFile )

// Dump the alert
//
e.write( alert )

// get a proxy for the alerted-on Resource
//
var alertResource = ProxyFactory.getResource(alert.alertDefinition.resource.id)

// Dump the resource
//
e.write( " " )
e.write( alertResource )


// Remediate file

// Find the Bundle Destination
//
var destCrit = new BundleDestinationCriteria()
destCrit.addFilterName( bundleDestinationName )
var result = BundleManager.findBundleDestinationsByCriteria( destCrit )
var dest = result.get( 0 )

// Find the Bundle Version
//
var versionCrit = new BundleVersionCriteria()
versionCrit.addFilterVersion( bundleVersion )
result = BundleManager.findBundleVersionsByCriteria( versionCrit )
var ver = result.get( 0 )

// Create a new Deployment for the bundle version and the destination
//
var deployment = BundleManager.createBundleDeployment(ver.getId(), dest.getId(), 'remediate drift', new Configuration())

// Schedule a clean deploy of the deployment. This will wipe out the edited file and lay down a clean copy
//
BundleManager.scheduleBundleDeployment(deployment.getId(), true)

e.write( " " )
e.write( "REMEDIATION COMPLETE!" )
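Because the alert script writes a log file for each fired alert, a small shell check can confirm that a remediation run reached its last line. The remediation_finished function below is a hypothetical helper; it assumes the log is plain text containing the marker string written by the script above.

```shell
#!/bin/bash
# Hypothetical check: succeed only if the log written by the alert
# script contains the completion marker.
remediation_finished() {
    local log_file="$1"
    [ -r "$log_file" ] && grep -q 'REMEDIATION COMPLETE!' "$log_file"
}

# demonstration against a throwaway file
demo_log=$(mktemp)
echo 'REMEDIATION COMPLETE!' > "$demo_log"
remediation_finished "$demo_log" && echo "remediation logged"
rm -f "$demo_log"
```

The marker only shows that the deployment was scheduled, not that it succeeded, so the bundle deployment status in JBoss ON remains the authoritative record.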