PCSD Status detection is slow

Environment

Red Hat Enterprise Linux (RHEL) 7 with the High Availability or Resilient Storage Add-On
pcs package version pcs-0.9.143* or older

Issue

Command pcs status does not return PCSD Status
Command pcs status is very slow
Command pcs status takes too long to complete

Resolution

Update pcs to the version listed below. Starting with that version, the PCSD Status output is displayed only when the --full option is given, and pcs includes optimizations such as running the pcsd status checks in parallel. Without the --full option, the pcs status command does not check the pcsd status on each cluster node.

# pcs status --full

Red Hat Enterprise Linux 7

The issue (bz1207405) is resolved in erratum RHSA-2016:2596, in package pcs-0.9.152-10.el7 and later.
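
To see whether a node already has the fixed package, the installed pcs version can be compared with pcs-0.9.152-10.el7. The following is a minimal sketch, not an official Red Hat procedure: the helper name version_at_least is hypothetical, and sort -V only approximates RPM version ordering, which is close enough for versions of this shape.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is the same as or newer than $2.
# Assumption: sort -V (GNU coreutils) approximates RPM version ordering.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

fixed="0.9.152-10.el7"

# rpm -q exits non-zero when the package is not installed.
if command -v rpm >/dev/null 2>&1 &&
   installed="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' pcs 2>/dev/null)"; then
    if version_at_least "$installed" "$fixed"; then
        echo "pcs $installed already includes the fix"
    else
        echo "pcs $installed predates $fixed; update pcs"
    fi
else
    echo "pcs is not installed (or rpm is unavailable)"
fi
```

The comparison is kept in a separate function so that it can be reused for other packages or checked against known version strings.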

The pcs status command on pcs version pcs-0.9.152-10.el7 or higher will report the following:

# pcs status
Cluster name: cluster
Stack: corosync
Current DC: node1 partition with quorum
Last updated: Tue May 21 11:08:08 2019

Online: [ node1.example.com node2.example.com ]

Full list of resources:

 fence01    (stonith:fence_xvm):    Started node1.example.com
 fence02    (stonith:fence_xvm):    Started node2.example.com

Daemon Status:
   corosync: active/enabled
   pacemaker: active/enabled
   pcsd: active/enabled

The PCSD Status can still be checked by adding the --full option:

# pcs status --full
Cluster name: cluster
Stack: corosync
Current DC: node1 partition with quorum
Last updated: Tue May 21 11:08:08 2019

Online: [ node1.example.com node2.example.com ]

Full list of resources:

 fence01    (stonith:fence_xvm):    Started node1.example.com
 fence02    (stonith:fence_xvm):    Started node2.example.com

PCSD Status:
  node1.example.com: Online
  node2.example.com: Online

Daemon Status:
   corosync: active/enabled
   pacemaker: active/enabled
   pcsd: active/enabled
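
For scripted checks, the PCSD Status section can be pulled out of the --full output. The following is a minimal sketch that assumes the output layout shown above (a "PCSD Status:" header, one indented "node: state" line per node, then a blank line); the function name pcsd_states is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `pcs status --full` output on stdin and prints
# one "node state" pair per line from the PCSD Status section.
pcsd_states() {
    awk '
        /^PCSD Status:/ { in_section = 1; next }           # section header
        in_section && /^[[:space:]]*$/ { in_section = 0 }  # blank line ends the section
        in_section { sub(/:$/, "", $1); print $1, $2 }     # drop the colon after the node name
    '
}

# Example with the section shown in this article:
printf 'PCSD Status:\n  node1.example.com: Online\n  node2.example.com: Online\n\n' | pcsd_states
```

On a cluster node the function would be fed from the real command, for example: pcs status --full | pcsd_states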

Root Cause

This issue can occur in the following situations:

Immediately after a node goes down, the next pcs status takes a very long time to complete
While a node is rejoining the cluster
During a network communication failure between the nodes

The issue happens because, in older pcs package versions, the command has no timeout parameter to cut the check short. It waits until the other cluster nodes reply with their pcsd service status.
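
Until pcs can be updated, the wait can be bounded from outside the command. The following is a minimal sketch that uses timeout from GNU coreutils; it is a stopgap assumed here for illustration, not the fix described in this article, and the function name run_bounded and the 30-second limit are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: runs a command with an upper time limit in seconds.
# timeout (GNU coreutils) exits with status 124 when the limit is reached.
run_bounded() {
    limit="$1"
    shift
    rc=0
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "command did not finish within ${limit}s" >&2
    fi
    return "$rc"
}

# On a cluster node (assumed limit of 30 seconds):
# run_bounded 30 pcs status
```

A status of 124 only shows that the command was cut off; it does not identify which node failed to answer.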

In addition, the pcs status command needed several optimizations to return the PCSD Status output faster.

This solution is part of Red Hat’s fast-track publication program, providing a huge library of solutions that Red Hat engineers have created while supporting our customers. To give you the knowledge you need the instant it becomes available, these articles may be presented in a raw and unedited form.
