CTDB resource fails to start with status "Timed Out" in a RHEL High Availability cluster
Issue
- CTDB resource start failed with the following error:
ctdb_start_0 on node1 'unknown error' (1): call=39, status=Timed Out, exitreason='none'
- The ctdb logs repeatedly report a failure to take the recovery lock:
2018/09/09 15:45:53.460791 ctdbd[46960]: Starting CTDBD (Version 4.7.1) as PID: 46960
2018/09/09 15:45:53.460960 ctdbd[46960]: Removed stale socket /tmp/ctdb.socket
2018/09/09 15:45:53.461820 ctdbd[46960]: connect() failed, errno=2
2018/09/09 15:45:54.641746 ctdbd[46960]: This node (1) is now the recovery master
2018/09/09 15:45:57.644921 ctdb-recoverd[47023]: Election period ended
2018/09/09 15:45:57.666176 ctdb-recoverd[47023]: Unable to take recovery lock - contention
2018/09/09 15:45:57.666220 ctdb-recoverd[47023]: Unable to get recovery lock - retrying recovery
2018/09/09 15:45:58.649349 ctdb-recoverd[47023]: Unable to take recovery lock - contention
2018/09/09 15:45:58.649383 ctdb-recoverd[47023]: Unable to get recovery lock - retrying recovery
2018/09/09 15:45:59.650827 ctdb-recoverd[47023]: Unable to take recovery lock - contention
2018/09/09 15:45:59.650872 ctdb-recoverd[47023]: Unable to get recovery lock - retrying recovery
- Each node simultaneously declares itself the recovery master.
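The contention messages above come from CTDB's recovery lock: ctdb-recoverd serializes recovery by holding an exclusive POSIX fcntl lock on a file in shared storage, and only one node's recovery daemon can hold it at a time. A rough sketch of that mechanism (plain Python against a local file, not CTDB code; a child process stands in for a second node):

```python
# Illustrative sketch only: an exclusive, non-blocking POSIX record lock,
# the same primitive ctdb-recoverd uses for its recovery-lock file.
import fcntl
import os
import tempfile

def try_take_reclock(path):
    """Non-blocking exclusive lock attempt, as a recovery daemon would make."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True, fd
    except OSError:
        # Contention: another process already holds the lock.
        os.close(fd)
        return False, None

# Stand-in for the recovery-lock file on shared storage.
reclock = os.path.join(tempfile.mkdtemp(), "reclock")

# "Node 1" wins the election and takes the lock first.
took_lock, fd = try_take_reclock(reclock)

# "Node 2" is simulated by a child process: POSIX record locks are
# per-process, so the child's attempt must fail with contention.
pid = os.fork()
if pid == 0:
    ok, _ = try_take_reclock(reclock)
    os._exit(0 if not ok else 1)  # exit 0 when contention was observed
_, status = os.waitpid(pid, 0)
child_saw_contention = os.waitstatus_to_exitcode(status) == 0

print("first taker holds lock:", took_lock)                   # True
print("second taker sees contention:", child_saw_contention)  # True
```

In the failing cluster, however, *every* node's attempt fails: if the shared filesystem does not provide cluster-coherent fcntl locking (or the lock file is unreachable), no node can ever become the lock holder, recovery keeps retrying, and the resource start eventually times out as shown in the error above.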
Environment
- Red Hat Enterprise Linux Server 7 (with the Resilient Storage Add-On)