pgsql resource monitor operations time out in a RHEL 6 High Availability cluster with pacemaker

Solution In Progress - Updated -

Issue

  • Pacemaker on the primary node tried to restart postgres after the resource agent's monitor operation returned a number of unknown errors.
  • Why might postgres become unresponsive to the cluster's monitoring checks and time out?
  • Our pgsql resource timed out in a monitor operation, and the logs show:
  Jun 29 00:05:06 [19486] node1.example.com       lrmd:  warning: child_timeout_callback:   postgres-service_monitor_4000 process (PID 3908) timed out
  Jun 29 00:05:06 [19486] node1.example.com       lrmd:  warning: operation_finished:   postgres-service_monitor_4000:3908 - timed out after 60000ms
  Jun 29 00:05:06 [19489] node1.example.com       crmd:    error: process_lrm_event:    Operation postgres-service_monitor_4000: Timed Out (node=node1.example.com call=41, timeout=60000ms)

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • pacemaker
  • One or more pgsql resources defined in the CIB to be managed by the cluster
