nova-compute resource times out on stop triggering a controller fence with RHOSP 8

Solution In Progress - Updated -

Issue

  • We issued a pcs resource restart on our openstack-nova-api resource, and all the compute nodes were fenced by Pacemaker and all the VM instances stopped.
  • The nova-compute resource regularly times out on stop during a resource restart or cluster node shutdown, causing controller node fencing.
  • nova-compute fails with OCF_TIMEOUT.
  • Our nova-compute resource is failing to stop due to a timeout error:
  Sep 03 15:25:03 [3480] node1    pengine:  warning: unpack_rsc_op_failure: Processing failed op stop for nova-compute:0 on compute-l01: OCF_TIMEOUT (198)
  Sep 03 15:25:03 [3480] node1    pengine:  warning: pe_fence_node: Node compute1 will be fenced because of resource failure(s)
  Sep 01 16:46:44 [3485] node1       crmd:     info: process_graph_event:   Detected action (296.568) nova-compute_stop_0.73608=OCF_TIMEOUT: failed
  Sep 01 16:46:44 [3485] node1       crmd:  warning: status_from_rc:        Action 568 (nova-compute_stop_0) on compute-l02 failed (target: 0 vs. rc: 198): Error
  Sep 01 16:46:44 [3485] node1       crmd:     info: abort_transition_graph:        Transition aborted by nova-compute_stop_0 'modify' on node2: Event failed (magic=2:198;568:296:0:3dd776e6-e59b-4d8b-b1b8-fbce832e096f, cib=0.316.2, source=match_graph_event:381, 0)
  Sep 01 16:46:44 [3485] node1       crmd:     info: match_graph_event:     Action nova-compute_stop_0 (568) confirmed on compute-l02 (rc=198)
  Sep 01 16:46:44 [3485] node1       crmd:     info: update_failcount:      Updating failcount for nova-compute on compute-l02 after failed stop: rc=198 (update=INFINITY, time=1472716004)
  Sep 01 16:46:44 [3485] node1       crmd:     info: process_graph_event:   Detected action (296.568) nova-compute_stop_0.73608=OCF_TIMEOUT: failed
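
To check whether a cluster log shows the same symptom, the messages above can be matched mechanically. The following is a minimal sketch, not a supported tool: the function names and regular expressions are hypothetical and were written only against the message formats shown in this article, so they may need adjusting for other Pacemaker versions. The caller supplies the log lines, since the log location depends on the cluster configuration.

```python
import re
from typing import Iterable, List, Tuple

# Patterns derived from the pengine messages quoted above (an assumption:
# other Pacemaker versions may word these messages differently).
FAILED_OP = re.compile(
    r"Processing failed op (\w+) for (\S+) on (\S+): (\w+) \((\d+)\)"
)
FENCE = re.compile(r"pe_fence_node: Node (\S+) will be fenced")


def find_stop_timeouts(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (resource, node) pairs for stop operations that hit OCF_TIMEOUT."""
    hits = []
    for line in lines:
        match = FAILED_OP.search(line)
        if match and match.group(1) == "stop" and match.group(4) == "OCF_TIMEOUT":
            hits.append((match.group(2), match.group(3)))
    return hits


def find_fenced_nodes(lines: Iterable[str]) -> List[str]:
    """Return the names of nodes that the policy engine scheduled for fencing."""
    fenced = []
    for line in lines:
        match = FENCE.search(line)
        if match:
            fenced.append(match.group(1))
    return fenced


# The two pengine lines from the Issue section, used as sample input.
SAMPLE = [
    "Sep 03 15:25:03 [3480] node1    pengine:  warning: unpack_rsc_op_failure: "
    "Processing failed op stop for nova-compute:0 on compute-l01: OCF_TIMEOUT (198)",
    "Sep 03 15:25:03 [3480] node1    pengine:  warning: pe_fence_node: "
    "Node compute1 will be fenced because of resource failure(s)",
]

if __name__ == "__main__":
    print(find_stop_timeouts(SAMPLE))
    print(find_fenced_nodes(SAMPLE))
```

To scan a real log, pass an open file object in place of SAMPLE; both functions accept any iterable of strings.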

Environment

  • Red Hat OpenStack Platform (RHOSP) 8
  • Red Hat Enterprise Linux (RHEL) 7 with the High Availability Add-On for RHEL-OSP controller nodes
  • nova-compute managed by the High Availability cluster
