crmd times out when communicating with lrmd, causing the crmd process to respawn

Solution Verified - Updated -

Issue

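  • crmd times out waiting for a reply from lrmd, cannot recover, and is respawned by pacemakerd. Messages similar to the following are logged: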

    2018-07-10T19:48:39.636600-04:00 node42 lrmd[29721]:  notice: Lost reply from stonith-ng (0x5638def4e7f0) finally arrived, sending re-enabled
    2018-07-10T19:49:00.616169-04:00  node42 crmd[27617]: warning: Request 291 to lrmd (0x55caa97e0070) failed: Connection timed out (-110) after 20000ms
    2018-07-10T19:49:00.616466-04:00 node42 crmd[27617]:   error: Couldn't perform lrmd_rsc_exec operation (timeout=20000): -110: Connection timed out (110)
    2018-07-10T19:49:00.616616-04:00 node42 crmd[27617]:   error: Operation monitor on dlm failed: -70
    2018-07-10T19:49:00.616755-04:00 node42 crmd[27617]: warning: Input I_FAIL received in state S_NOT_DC from do_lrm_rsc_op
    2018-07-10T19:49:00.616892-04:00 node42 crmd[27617]:  notice: State transition S_NOT_DC -> S_RECOVERY
    2018-07-10T19:49:00.617041-04:00 node42 crmd[27617]: warning: Fast-tracking shutdown in response to errors
    2018-07-10T19:49:00.617231-04:00 node42 crmd[27617]:   error: Input I_TERMINATE received in state S_RECOVERY from do_recover
    2018-07-10T19:49:00.617403-04:00 node42 crmd[27617]:  notice: Stopped 0 recurring operations at shutdown (1 remaining)
    2018-07-10T19:49:00.617538-04:00 node42 crmd[27617]:   error: 1 pending LRM operation at shutdown
    2018-07-10T19:49:00.617682-04:00 node42 crmd[27617]:   error: Pending action: scsi:873 (scsi_start_0)
    2018-07-10T19:49:00.617865-04:00 node42 crmd[27617]:  notice: Disconnected from the LRM
    2018-07-10T19:49:00.618004-04:00 node42 crmd[27617]:  notice: Disconnected from Corosync
    2018-07-10T19:49:00.618712-04:00 node42 cib[29717]: warning: new_event_notification (29717-27617-11): Broken pipe (32)
    2018-07-10T19:49:00.618874-04:00 node42 cib[29717]: warning: A-Sync reply to crmd failed: No message of desired type
    2018-07-10T19:49:00.619007-04:00 node42 crmd[27617]:  notice: Disconnected from the CIB
    2018-07-10T19:49:00.619172-04:00 node42 crmd[27617]:   error: Could not recover from internal error
    2018-07-10T19:49:00.619954-04:00 node42 pacemakerd[29387]:   error: The crmd process (27617) exited: Generic Pacemaker error (201)
    2018-07-10T19:49:00.620163-04:00 node42 pacemakerd[29387]:  notice: Respawning failed child process: crmd
    
  • pacemaker-based cannot compress a message to fit within the IPC buffer limit. Errors similar to the following are logged:

    Oct 15 14:58:18.773 node42 pacemaker-based     [124259] (pcmk__compress)    error: Compression of 2490302 bytes failed: output data will not fit into the buffer provided | bzerror=-8
    Oct 15 14:58:18.773 node42 pacemaker-based     [124259] (pcmk__ipc_prepare_iov)     error: Could not compress 2490302-byte message into less than IPC limit of 131072 bytes; set PCMK_ipc_buffer to higher value (9961208 bytes suggested)
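
To check whether a node is hitting the IPC limit shown above, the log can be scanned for this message. The following is a minimal sketch, not a supported tool: the function name find_ipc_buffer_hints and the regular expression are illustrative assumptions based only on the message wording shown in this article, and other Pacemaker versions may word the message differently.

```python
import re

# Assumption: this pattern mirrors the pcmk__ipc_prepare_iov message exactly
# as it appears in the log excerpt above; it is not taken from Pacemaker source.
IPC_LIMIT_RE = re.compile(
    r"Could not compress (?P<size>\d+)-byte message into less than IPC limit "
    r"of (?P<limit>\d+) bytes; set PCMK_ipc_buffer to higher value "
    r"\((?P<suggested>\d+) bytes suggested\)"
)


def find_ipc_buffer_hints(lines):
    """Return one dict (size, limit, suggested; all in bytes) per matching line."""
    hints = []
    for line in lines:
        match = IPC_LIMIT_RE.search(line)
        if match:
            hints.append({key: int(value) for key, value in match.groupdict().items()})
    return hints


# Sample input: the second log line from the excerpt above.
sample = [
    "Oct 15 14:58:18.773 node42 pacemaker-based     [124259] "
    "(pcmk__ipc_prepare_iov)     error: Could not compress 2490302-byte "
    "message into less than IPC limit of 131072 bytes; set PCMK_ipc_buffer "
    "to higher value (9961208 bytes suggested)",
]

hints = find_ipc_buffer_hints(sample)
for hint in hints:
    print(
        f"message={hint['size']} bytes, limit={hint['limit']} bytes, "
        f"suggested PCMK_ipc_buffer={hint['suggested']} bytes"
    )
```

In practice the function could be fed the lines of the Pacemaker log file. The largest suggested value found gives a sense of how far the current IPC limit is from what the cluster needs.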
    

Environment

  • Red Hat Enterprise Linux (RHEL) 7, 8, or 9 with the High Availability or Resilient Storage Add-On
  • OpenStack scale-out environments
