8.5.4. The Watchman Tool

The Watchman tool is a daemon that protects your OpenShift Enterprise instance against common issues identified by Red Hat. Watchman resolves these issues autonomously, and includes the following built-in features:
  • Watchman searches the cgroup event flow in syslog to determine when a gear is destroyed. If the pattern does not match a clean gear removal, the gear is restarted.
  • Watchman monitors the application server logs for messages that indicate an out-of-memory condition, and restarts the gear if needed.
  • Watchman compares the user-defined status of a gear with the actual status of the gear, and fixes any discrepancies.
  • Watchman checks running processes to ensure they belong to the correct cgroup. It kills abandoned processes associated with a stopped gear, and restarts a gear that has zero running processes.
  • Watchman monitors the usage rate of CPU cycles and restricts a gear's CPU consumption if the rate of change is too aggressive.
Watchman capabilities can be expanded with plug-ins. See Section 8.5.4.2, “Supported Watchman Plug-ins” for more information.
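
The cgroup membership that the process check in the list above depends on can also be inspected by hand, because the Linux kernel lists each process's cgroups in /proc/<pid>/cgroup. The following is a minimal sketch for manual inspection only, and is not Watchman's own implementation. The helper name in_cgroup is hypothetical, and the pattern to match (for example, part of a gear's cgroup path) depends on how cgroups are laid out on your node host.

```shell
# in_cgroup PID PATTERN
# Hypothetical helper, not part of Watchman: succeeds if any cgroup path
# listed for process PID contains the fixed string PATTERN.
in_cgroup() {
    pid=$1
    pattern=$2
    # Each line of /proc/PID/cgroup has the form hierarchy-ID:controllers:path.
    # Keep only the path (field 3 onward) and search it for the pattern.
    cut -d: -f3- "/proc/$pid/cgroup" 2>/dev/null | grep -qF -- "$pattern"
}
```

For example, `in_cgroup 1234 openshift && echo "matched"` prints "matched" only if process 1234 is in a cgroup whose path contains the string "openshift". Both the process ID and the string are placeholders.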

8.5.4.1. Enabling Watchman

Watchman is an optional tool that monitors the state of gears and cartridges on a node. It is primarily used to automatically attempt to resolve problems and, if required, restore any gears that have ceased to function.
Enable the Watchman tool persistently using the following command on a node host:
# chkconfig openshift-watchman on
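
Note that chkconfig only controls whether the service starts at boot; it does not start the daemon in the current session. The following sketch combines the command above with a start step and a verification step. It assumes a SysV init node host where the openshift-watchman init script supports the standard start action. The function name watchman_enable is hypothetical, and the commands must be run as root.

```shell
# Hypothetical helper: enable Watchman at boot, start it now, and show
# its runlevel configuration. Assumes SysV init (chkconfig and service).
watchman_enable() {
    # Persist the service across reboots (the documented step).
    chkconfig openshift-watchman on || return 1
    # Assumption: the init script supports the standard "start" action.
    service openshift-watchman start || return 1
    # List the runlevels in which the service is enabled.
    chkconfig --list openshift-watchman
}
```

After defining the function in a root shell on the node host, run watchman_enable. The final command should report the service as "on" for the default runlevels.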