CTDB resource start fails with Timed Out status in a RHEL High Availability cluster


Issue

  • The CTDB resource fails to start with the following error:
 ctdb_start_0 on node1 'unknown error' (1): call=39, status=Timed Out, exitreason='none',
  • The ctdb logs repeatedly report a failure to take the recovery lock:
2018/09/09 15:45:53.460791 ctdbd[46960]: Starting CTDBD (Version 4.7.1) as PID: 46960
2018/09/09 15:45:53.460960 ctdbd[46960]: Removed stale socket /tmp/ctdb.socket
2018/09/09 15:45:53.461820 ctdbd[46960]: connect() failed, errno=2
2018/09/09 15:45:54.641746 ctdbd[46960]: This node (1) is now the recovery master
2018/09/09 15:45:57.644921 ctdb-recoverd[47023]: Election period ended
2018/09/09 15:45:57.666176 ctdb-recoverd[47023]: Unable to take recovery lock - contention
2018/09/09 15:45:57.666220 ctdb-recoverd[47023]: Unable to get recovery lock - retrying recovery
2018/09/09 15:45:58.649349 ctdb-recoverd[47023]: Unable to take recovery lock - contention
2018/09/09 15:45:58.649383 ctdb-recoverd[47023]: Unable to get recovery lock - retrying recovery
2018/09/09 15:45:59.650827 ctdb-recoverd[47023]: Unable to take recovery lock - contention
2018/09/09 15:45:59.650872 ctdb-recoverd[47023]: Unable to get recovery lock - retrying recovery
  • Each node simultaneously marks itself as the recovery master (see the diagnostic sketch after this list).

Environment

  • Red Hat Enterprise Linux Server 7 (with the Resilient Storage Add-On)
