pgsql resource monitor operations time out in a RHEL 6 High Availability cluster with pacemaker

Solution In Progress - Updated -

Issue

  • pacemaker on the primary node tried to restart postgres after the monitoring agent returned a number of unknown errors.
  • A monitor operation on our pgsql resource timed out.
  • Why might postgres become unresponsive to monitoring checks from the cluster and time out?
  Jun 29 00:05:06 [19486] node1.example.com       lrmd:  warning: child_timeout_callback:   postgres-service_monitor_4000 process (PID 3908) timed out
  Jun 29 00:05:06 [19486] node1.example.com       lrmd:  warning: operation_finished:   postgres-service_monitor_4000:3908 - timed out after 60000ms
  Jun 29 00:05:06 [19489] node1.example.com       crmd:    error: process_lrm_event:    Operation postgres-service_monitor_4000: Timed Out (node=node1.example.com call=41, timeout=60000ms)
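
When checking how often these timeouts occur, it can help to pull the operation name, PID, and timeout value out of the lrmd messages. The following is a minimal sketch in Python, assuming the log lines have the same shape as the excerpt above. The names TIMEOUT_RE and find_timeouts are hypothetical, chosen for illustration, and are not part of pacemaker.

```python
import re

# Matches the lrmd "operation_finished" line shape shown in the excerpt above.
# Assumption: other pacemaker versions may format this message differently.
TIMEOUT_RE = re.compile(
    r"operation_finished:\s+(?P<op>\S+):(?P<pid>\d+) - timed out after (?P<ms>\d+)ms"
)


def find_timeouts(lines):
    """Return a list of (operation, pid, timeout_ms) for each timed-out operation."""
    results = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            results.append(
                (match.group("op"), int(match.group("pid")), int(match.group("ms")))
            )
    return results


# Sample input copied from the log excerpt in the Issue section.
sample = [
    "Jun 29 00:05:06 [19486] node1.example.com       lrmd:  warning: "
    "child_timeout_callback:   postgres-service_monitor_4000 process (PID 3908) timed out",
    "Jun 29 00:05:06 [19486] node1.example.com       lrmd:  warning: "
    "operation_finished:   postgres-service_monitor_4000:3908 - timed out after 60000ms",
]

for op, pid, ms in find_timeouts(sample):
    print(f"{op} (PID {pid}) timed out after {ms} ms")
```

Only the operation_finished line is matched, so each timeout is counted once even though lrmd and crmd log it several times.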

Environment

  • Red Hat Enterprise Linux (RHEL) 6 with the High Availability Add-On
  • pacemaker
  • One or more pgsql resources defined in the CIB to be managed by the cluster
