crmd times out when communicating with lrmd, causing the crmd process to respawn
Issue
- The following issue occurred:
  2018-07-10T19:48:39.636600-04:00 node42 lrmd[29721]: notice: Lost reply from stonith-ng (0x5638def4e7f0) finally arrived, sending re-enabled
  2018-07-10T19:49:00.616169-04:00 node42 crmd[27617]: warning: Request 291 to lrmd (0x55caa97e0070) failed: Connection timed out (-110) after 20000ms
  2018-07-10T19:49:00.616466-04:00 node42 crmd[27617]: error: Couldn't perform lrmd_rsc_exec operation (timeout=20000): -110: Connection timed out (110)
  2018-07-10T19:49:00.616616-04:00 node42 crmd[27617]: error: Operation monitor on dlm failed: -70
  2018-07-10T19:49:00.616755-04:00 node42 crmd[27617]: warning: Input I_FAIL received in state S_NOT_DC from do_lrm_rsc_op
  2018-07-10T19:49:00.616892-04:00 node42 crmd[27617]: notice: State transition S_NOT_DC -> S_RECOVERY
  2018-07-10T19:49:00.617041-04:00 node42 crmd[27617]: warning: Fast-tracking shutdown in response to errors
  2018-07-10T19:49:00.617231-04:00 node42 crmd[27617]: error: Input I_TERMINATE received in state S_RECOVERY from do_recover
  2018-07-10T19:49:00.617403-04:00 node42 crmd[27617]: notice: Stopped 0 recurring operations at shutdown (1 remaining)
  2018-07-10T19:49:00.617538-04:00 node42 crmd[27617]: error: 1 pending LRM operation at shutdown
  2018-07-10T19:49:00.617682-04:00 node42 crmd[27617]: error: Pending action: scsi:873 (scsi_start_0)
  2018-07-10T19:49:00.617865-04:00 node42 crmd[27617]: notice: Disconnected from the LRM
  2018-07-10T19:49:00.618004-04:00 node42 crmd[27617]: notice: Disconnected from Corosync
  2018-07-10T19:49:00.618712-04:00 node42 cib[29717]: warning: new_event_notification (29717-27617-11): Broken pipe (32)
  2018-07-10T19:49:00.618874-04:00 node42 cib[29717]: warning: A-Sync reply to crmd failed: No message of desired type
  2018-07-10T19:49:00.619007-04:00 node42 crmd[27617]: notice: Disconnected from the CIB
  2018-07-10T19:49:00.619172-04:00 node42 crmd[27617]: error: Could not recover from internal error
  2018-07-10T19:49:00.619954-04:00 node42 pacemakerd[29387]: error: The crmd process (27617) exited: Generic Pacemaker error (201)
  2018-07-10T19:49:00.620163-04:00 node42 pacemakerd[29387]: notice: Respawning failed child process: crmd
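To check whether a node is hitting the same condition, the crmd timeout warning quoted above can be searched for in the system log. The following is a minimal sketch, not a supported tool: the function name `find_lrmd_timeouts` is hypothetical, and the pattern is derived only from the single warning line shown in this article, so other Pacemaker versions may word the message differently.

```python
import re

# Pattern built from the crmd warning quoted in this article (an assumption:
# other Pacemaker releases may format this message differently).
TIMEOUT = re.compile(
    r"crmd\[(?P<pid>\d+)\]: warning: Request (?P<request>\d+) to (?P<daemon>\S+) "
    r"\(0x[0-9a-f]+\) failed: (?P<reason>.+?) \((?P<rc>-?\d+)\) after (?P<ms>\d+)ms"
)


def find_lrmd_timeouts(lines):
    """Return (request id, daemon, return code, timeout in ms) per matching line."""
    hits = []
    for line in lines:
        match = TIMEOUT.search(line)
        if match:
            hits.append(
                (
                    int(match["request"]),
                    match["daemon"],
                    int(match["rc"]),
                    int(match["ms"]),
                )
            )
    return hits


# Two lines copied from the log excerpt above; only the first should match.
sample = [
    "2018-07-10T19:49:00.616169-04:00 node42 crmd[27617]: warning: Request 291 "
    "to lrmd (0x55caa97e0070) failed: Connection timed out (-110) after 20000ms",
    "2018-07-10T19:49:00.616616-04:00 node42 crmd[27617]: error: Operation "
    "monitor on dlm failed: -70",
]
timeouts = find_lrmd_timeouts(sample)
print(timeouts)
```

In practice the `lines` argument could be an open log file, since iterating a file object yields one line at a time.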
- The following error occurred:
  Oct 15 14:58:18.773 node42 pacemaker-based [124259] (pcmk__compress) error: Compression of 2490302 bytes failed: output data will not fit into the buffer provided | bzerror=-8
  Oct 15 14:58:18.773 node42 pacemaker-based [124259] (pcmk__ipc_prepare_iov) error: Could not compress 2490302-byte message into less than IPC limit of 131072 bytes; set PCMK_ipc_buffer to higher value (9961208 bytes suggested)
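The pcmk__ipc_prepare_iov message above names the current IPC limit and a suggested `PCMK_ipc_buffer` value. The sketch below extracts those numbers from a log line so they can be compared across nodes. The function name `suggested_ipc_buffer` is hypothetical, the pattern is taken only from the message quoted in this article, and the printed `PCMK_ipc_buffer=` line merely echoes the value the log suggests; whether and where to apply it (the Pacemaker environment file, commonly `/etc/sysconfig/pacemaker` on RHEL, is an assumption here) should be confirmed against the product documentation.

```python
import re

# Pattern derived from the pcmk__ipc_prepare_iov error quoted in this article
# (an assumption: the wording may differ between Pacemaker versions).
IPC_LIMIT = re.compile(
    r"Could not compress (?P<size>\d+)-byte message into less than IPC limit of "
    r"(?P<limit>\d+) bytes; set PCMK_ipc_buffer to higher value "
    r"\((?P<suggested>\d+) bytes suggested\)"
)


def suggested_ipc_buffer(log_line):
    """Return message size, current limit, and suggested buffer, or None."""
    match = IPC_LIMIT.search(log_line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


# Line copied from the error shown above.
line = (
    "Oct 15 14:58:18.773 node42 pacemaker-based [124259] (pcmk__ipc_prepare_iov) "
    "error: Could not compress 2490302-byte message into less than IPC limit of "
    "131072 bytes; set PCMK_ipc_buffer to higher value (9961208 bytes suggested)"
)
result = suggested_ipc_buffer(line)
if result is not None:
    # Echo the value suggested by the log; this does not change any configuration.
    print(f"PCMK_ipc_buffer={result['suggested']}")
```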
Environment
- Red Hat Enterprise Linux (RHEL) 7, 8, or 9 with the High Availability or Resilient Storage Add-On
- OpenStack scale-out environments